diff --git a/handbook/engineering/README.md b/handbook/engineering/README.md
index 47fec5cb51..239863c7a3 100644
--- a/handbook/engineering/README.md
+++ b/handbook/engineering/README.md
@@ -7,8 +7,8 @@ This handbook page details processes specific to working [with](#contact-us) and
 | Chief Technology Officer (CTO) | [Luke Heath](https://www.linkedin.com/in/lukeheath/) _([@lukeheath](https://github.com/lukeheath))_
 | Client Platform Engineer & Community Advocate | [JD Strong](https://www.linkedin.com/in/jackdaniyelstrong/) _([@spokanemac](https://github.com/spokanemac/spokanemac))_
 | Engineering Manager (EM) | _See [🛩️ Product groups](https://fleetdm.com/handbook/company/product-groups#current-product-groups)_
-| Quality Assurance Engineer (QA) | _See [🛩️ Product groups](https://fleetdm.com/handbook/company/product-groups#current-product-groups)_
-| Software Engineer | _See [🛩️ Product groups](https://fleetdm.com/handbook/company/product-groups#current-product-groups)_
+| Quality Assurance Engineer (QA) | _See [🛩️ Product groups](https://fleetdm.com/handbook/company/product-groups#current-product-groups)_
+| Software Engineer | _See [🛩️ Product groups](https://fleetdm.com/handbook/company/product-groups#current-product-groups)_
 ## Contact us
@@ -112,11 +112,6 @@ If an announcement is found for either data source that may impact data feed ava
 If code changes are found for any `fleetd` components, create a new release QA issue to update `fleetd`. Delete the top section for Fleet core, and retain the bottom section for `fleetd`. Populate the necessary version changes for each `fleetd` component.
-### Create release QA issue
-Next, create a new GitHub issue using the [Release QA template](https://github.com/fleetdm/fleet/issues/new?assignees=&labels=&projects=&template=release-qa.md).
Add the release version to the title, and assign the quality assurance members of the [MDM](https://fleetdm.com/handbook/company/development-groups#mdm-group) and [Endpoint ops](https://fleetdm.com/handbook/company/product-groups#endpoint-ops-group) product groups.
-
-The issue's template will contain validation steps for Fleet and individual `fleetd` components. Remove any instructions that do not apply to this release.
-
 ### Indicate your product group is release-ready
 Once a product group completes its QA process during the freeze period, its QA lead moves the smoke testing ticket to the "Ready for release" column on their ZenHub board. They then notify the release ritual DRI by tagging them in a comment, indicating that their group is prepared for release. The release ritual DRI starts the [release process](https://github.com/fleetdm/fleet/blob/main/docs/Contributing/Releasing-Fleet.md) after all QA leads have made these updates and confirmed their readiness for release.
@@ -418,27 +413,15 @@ Steps to renew the certificate:
 11. Adjust calendar event to be between 2-4 weeks before the next expiration.
 ### Perform an incident postmortem
-
-At Fleet, we take customer incidents very seriously. After working with customers to resolve issues, we will conduct an internal postmortem to determine any process, documentation, or coding changes to prevent similar incidents from happening in the future. Why? We strive to make Fleet the best osquery management platform globally, and we sincerely believe that starts with sharing lessons learned with the community to become stronger together.
+Conduct a postmortem meeting for every service or feature outage and every critical bug, whether it's in a customer's environment or on fleetdm.com.
-At Fleet, we do postmortem meetings for every service or feature outage and every critical bug, whether it's a customer's environment or on fleetdm.com.
-
- **Postmortem documentation**
-Before running the postmortem meeting, copy this [postmortem template](https://docs.google.com/document/d/1Ajp2LfIclWfr4Bm77lnUggkYNQyfjePiWSnBv1b1nwM/edit?usp=sharing) document and populate it with some initial data to enable a productive conversation.
-
-- **Postmortem meeting**
-Invite all stakeholders, typically the team involved and QA representatives.
-
-Follow the document topic by topic. Keep the goal in mind which is to take action items for addressing the root cause and making sure a similar incident will not happen again.
-
-Distinguish between the root cause of the bug, which by that time was solved and released, and the root cause of why this issue reached our customers. These could be different issues. (e.g. the root cause of the bug was a coding issue, but the root causes (plural) of the event may be that the test plan did not cover a specific scenario, a lack of testing, and a lack of metrics to identify the issue quickly).
+1. Copy this [postmortem template](https://docs.google.com/document/d/1Ajp2LfIclWfr4Bm77lnUggkYNQyfjePiWSnBv1b1nwM/edit?usp=sharing) document and pre-populate it where possible.
+2. Invite stakeholders, typically the EM, PM, QA, and engineers involved. For a customer incident, include the CSM.
+3. Follow and populate the document topic by topic. Determine the root cause (why it happened), as well as why our controls did not catch it before release.
+4. Assign each action item an owner who is responsible for creating a GitHub issue promptly and working with the relevant PM/EM to prioritize it.
 [Example Finished Document](https://docs.google.com/document/d/1YnETKhH9R7STAY-PaFnPy2qxhNht2EAFfkv-kyEwebQ/edit?usp=share_link)
-- **Postmortem action items**
-Each action item will have an owner that will be responsible for creating a Github issue promptly after the meeting. This Github issue should be prioritized with the relevant PM/EM.
-
-
 ### Process incoming equipment
 Upon receiving any device, follow these steps to process incoming equipment.
 1. Search for the SN of the physical device in the ["Company equipment" spreadsheet](https://docs.google.com/spreadsheets/d/1hFlymLlRWIaWeVh14IRz03yE-ytBLfUaqVz0VVmmoGI/edit#gid=0) to confirm the correct equipment was received.
@@ -451,7 +434,6 @@ Upon receiving any device, follow these steps to process incoming equipment.
 9. Follow the prompts to activate the device and reinstall the appropriate version of macOS.
 > If you are prevented from completing the steps above, create a ["💻 IT support issue](https://github.com/fleetdm/confidential/issues/new?assignees=%40spokanemac&labels=%3Ahelp-it&projects=&template=request-it-support.md&title=%F0%9F%92%BB+Request+IT+support) for IT, for the device to be scheduled for troubleshooting and remediation. Please note in the issue where you encountered blockers to completing the steps.
-
 ### Ship approved equipment
 Once the Business Operations department approves inventory to be shipped from Fleet IT, follow these step to ship the equipment.
 1. Compare the equipment request issue with the ["Company equipment" spreadsheet](https://docs.google.com/spreadsheets/d/1hFlymLlRWIaWeVh14IRz03yE-ytBLfUaqVz0VVmmoGI/edit#gid=0) and verify physical inventory.
@@ -462,7 +444,6 @@ Once the Business Operations department approves inventory to be shipped from Fl
 6. Ship via FedEx to the address listed in the equipment request.
 7. Add a comment to the equipment request issue, at-mentioning the requestor with the FedEx tracking info and close the issue.
-
 ## Rituals