fleet/server
Lucas Manuel Rodriguez 608038a1bb
Fix deadlock when deleting software during data ingestion (#15459)
This fixes the deadlock reported in #14779.

We found a deadlock in software ingestion during load tests performed in
October:
```
2023-10-26T17:20:41.719627Z 0 [Note] [MY-012468] [InnoDB] Transactions deadlock detected, dumping detailed information. (lock0lock.cc:6482)
2023-10-26T17:20:41.719661Z 0 [Note] [MY-012469] [InnoDB]  *** (1) TRANSACTION:  (lock0lock.cc:6496)
TRANSACTION 3069866646, ACTIVE 0 sec starting index read
mysql tables in use 2, locked 2
LOCK WAIT 8 lock struct(s), heap size 1136, 18 row lock(s), undo log entries 10
MySQL thread id 95, OS thread handle 70431326097136, query id 340045 10.12.3.105 fleet executing
DELETE FROM software WHERE id IN (165, 79, 344, 47, 212, 21, 60, 127, 173, 145) AND
        NOT EXISTS (
                SELECT 1 FROM host_software hsw WHERE hsw.software_id = software.id
        )
2023-10-26T17:20:41.719700Z 0 [Note] [MY-012469] [InnoDB]  *** (1) HOLDS THE LOCK(S):  (lock0lock.cc:6496)
RECORD LOCKS space id 932 page no 8 n bits 256 index PRIMARY of table `fleet`.`software` trx id 3069866646 lock_mode X locks rec but not gap
Record lock, heap no 22 PHYSICAL RECORD: n_fields 11; compact format; info bits 0
 0: len 8; hex 0000000000000015; asc         ;;
 1: len 6; hex 0000a74c4a7c; asc    LJ|;;
 2: len 7; hex 82000000d00264; asc       d;;
 3: len 26; hex 616e74692d76697275735f666f725f736f70686f735f686f6d65; asc anti-virus_for_sophos_home;;
 4: len 5; hex 322e322e36; asc 2.2.6;;
 5: len 4; hex 61707073; asc apps;;
 6: len 0; hex ; asc ;;
 7: len 0; hex ; asc ;;
 8: len 0; hex ; asc ;;
 9: len 0; hex ; asc ;;
 10: len 0; hex ; asc ;;

Record lock, heap no 48 PHYSICAL RECORD: n_fields 11; compact format; info bits 0
 0: len 8; hex 000000000000002f; asc        /;;
 1: len 6; hex 0000a74c4aad; asc    LJ ;;
 2: len 7; hex 81000000e30220; asc        ;;
 3: len 10; hex 7265616c706c61796572; asc realplayer;;
 4: len 11; hex 31322e302e312e31373338; asc 12.0.1.1738;;
 5: len 4; hex 61707073; asc apps;;
6: len 0; hex ; asc ;;
 7: len 0; hex ; asc ;;
 8: len 0; hex ; asc ;;
 9: len 0; hex ; asc ;;
 10: len 0; hex ; asc ;;

Record lock, heap no 61 PHYSICAL RECORD: n_fields 11; compact format; info bits 0
 0: len 8; hex 000000000000003c; asc        <;;
 1: len 6; hex 0000a74c4afb; asc    LJ ;;
 2: len 7; hex 820000017501ba; asc     u  ;;
 3: len 7; hex 636f6e6e656374; asc connect;;
 4: len 5; hex 332e322e37; asc 3.2.7;;
 5: len 4; hex 61707073; asc apps;;
 6: len 0; hex ; asc ;;
 7: len 0; hex ; asc ;;
 8: len 0; hex ; asc ;;
 9: len 0; hex ; asc ;;
 10: len 0; hex ; asc ;;

Record lock, heap no 80 PHYSICAL RECORD: n_fields 11; compact format; info bits 0
 0: len 8; hex 000000000000004f; asc        O;;
 1: len 6; hex 0000a74c4b32; asc    LK2;;
 2: len 7; hex 820000008a01cb; asc        ;;
 3: len 7; hex 68697063686174; asc hipchat;;
 4: len 4; hex 342e3330; asc 4.30;;
 5: len 4; hex 61707073; asc apps;;
 6: len 0; hex ; asc ;;
 7: len 0; hex ; asc ;;
 8: len 0; hex ; asc ;;
 9: len 0; hex ; asc ;;
 10: len 0; hex ; asc ;;

2023-10-26T17:20:41.720564Z 0 [Note] [MY-012469] [InnoDB]  *** (1) WAITING FOR THIS LOCK TO BE GRANTED:  (lock0lock.cc:6496)
RECORD LOCKS space id 695 page no 5994 n bits 1000 index host_software_software_id_fk of table `fleet`.`host_software` trx id 3069866646 lock mode S waiting
Record lock, heap no 31 PHYSICAL RECORD: n_fields 2; compact format; info bits 32
 0: len 8; hex 000000000000004f; asc        O;;
 1: len 4; hex 0000000c; asc     ;;

2023-10-26T17:20:41.720650Z 0 [Note] [MY-012469] [InnoDB]  *** (2) TRANSACTION:  (lock0lock.cc:6496)
TRANSACTION 3069866680, ACTIVE 0 sec starting index read
mysql tables in use 2, locked 2
LOCK WAIT 7 lock struct(s), heap size 1136, 12 row lock(s), undo log entries 8
MySQL thread id 98, OS thread handle 70375801900784, query id 340524 10.12.3.9 fleet executing
DELETE FROM software WHERE id IN (49, 113, 183, 187, 223, 79, 81, 116) AND
        NOT EXISTS (
                SELECT 1 FROM host_software hsw WHERE hsw.software_id = software.id
        )
2023-10-26T17:20:41.720682Z 0 [Note] [MY-012469] [InnoDB]  *** (2) HOLDS THE LOCK(S):  (lock0lock.cc:6496)
RECORD LOCKS space id 695 page no 5994 n bits 1000 index host_software_software_id_fk of table `fleet`.`host_software` trx id 3069866680 lock_mode X locks rec but not gap
Record lock, heap no 31 PHYSICAL RECORD: n_fields 2; compact format; info bits 32
 0: len 8; hex 000000000000004f; asc        O;;
 1: len 4; hex 0000000c; asc     ;;

2023-10-26T17:20:41.720760Z 0 [Note] [MY-012469] [InnoDB]  *** (2) WAITING FOR THIS LOCK TO BE GRANTED:  (lock0lock.cc:6496)
RECORD LOCKS space id 932 page no 8 n bits 256 index PRIMARY of table `fleet`.`software` trx id 3069866680 lock_mode X locks rec but not gap waiting
Record lock, heap no 80 PHYSICAL RECORD: n_fields 11; compact format; info bits 0
 0: len 8; hex 000000000000004f; asc        O;;
 1: len 6; hex 0000a74c4b32; asc    LK2;;
 2: len 7; hex 820000008a01cb; asc        ;;
 3: len 7; hex 68697063686174; asc hipchat;;
 4: len 4; hex 342e3330; asc 4.30;;
 5: len 4; hex 61707073; asc apps;;
 6: len 0; hex ; asc ;;
 7: len 0; hex ; asc ;;
 8: len 0; hex ; asc ;;
 9: len 0; hex ; asc ;;
 10: len 0; hex ; asc ;;

2023-10-26T17:20:41.720984Z 0 [Note] [MY-012469] [InnoDB] *** WE ROLL BACK TRANSACTION (2)  (lock0lock.cc:6496)
```

I was able to reproduce this issue on `main` with the added test. The
solution is to remove the deletion (cleanup) of `software` to a separate
transaction after the main transaction is done.

- [X] Changes file added for user-visible changes in `changes/` or
`orbit/changes/`.
See [Changes
files](https://fleetdm.com/docs/contributing/committing-changes#changes-files)
for more information.
- [X] Added/updated tests
- [X] Manual QA for all new/changed functionality
2023-12-07 09:34:53 -03:00
..
authz feat: device health endpoint (#15432) 2023-12-06 14:42:29 -05:00
bindata Allow users to be readded if they were ever removed (#1945) 2021-09-07 13:33:40 -03:00
config Validate that WSTEP is configured before enabling Windows MDM (#14858) 2023-11-09 10:08:54 -03:00
contexts Fix edge case of AppConfig changes getting lost in cached mysql. (#15352) 2023-11-29 10:09:37 -05:00
datastore Fix deadlock when deleting software during data ingestion (#15459) 2023-12-07 09:34:53 -03:00
errorstore Enable errcheck linter for golangci-lint (#8899) 2022-12-05 16:50:49 -06:00
fleet Allow filtering hosts by software_version_id and software_title_id. (#15433) 2023-12-06 14:59:00 -05:00
health Separate health checks for MySQL and Redis (#6468) 2022-07-01 08:08:03 -03:00
launcher Ingest pending MDM hosts (#9065) 2022-12-26 15:32:39 -06:00
live_query Bump go to 1.19.1 (#7690) 2022-09-12 20:32:43 -03:00
logging chore: remove refs to deprecated io/ioutil (#14485) 2023-10-27 15:28:54 -03:00
mail 14729 smtp settings validation for TLS (#15029) 2023-11-21 11:48:21 -07:00
mdm Add validations to disallow custom MDM profiles that contain names reserved by Fleet (#15373) 2023-11-30 17:19:18 -06:00
mock feat: device health endpoint (#15432) 2023-12-06 14:42:29 -05:00
policies Refactor webhooks cron to new schedule package (#7840) 2022-09-20 14:26:36 -05:00
ptr Add Description text to CVE Metadata (#13856) 2023-09-15 11:24:10 -06:00
pubsub Provide more feedback to the user when there's a Redis connection issue when running live queries (#11947) 2023-06-01 16:11:55 -03:00
service Update fleetctl get software to list titles and versions. (#15444) 2023-12-06 16:07:03 -05:00
sso chore: remove refs to deprecated io/ioutil (#14485) 2023-10-27 15:28:54 -03:00
test Prevent empty logging_type when creating and editing queries (#14575) 2023-10-16 19:33:39 -03:00
vulnerabilities HotFix - ambiguous policy search name (#15312) 2023-11-27 12:21:39 -07:00
webhooks chore: remove refs to deprecated io/ioutil (#14485) 2023-10-27 15:28:54 -03:00
websocket Enable errcheck linter for golangci-lint (#8899) 2022-12-05 16:50:49 -06:00
worker chore: remove refs to deprecated io/ioutil (#14485) 2023-10-27 15:28:54 -03:00
utils.go allow to set mdm.windows_settings.custom_settings in configs (#15145) 2023-11-15 13:58:46 -03:00
utils_test.go allow to set mdm.windows_settings.custom_settings in configs (#15145) 2023-11-15 13:58:46 -03:00