fleet/changes/14779-fix-host_batteries-deadlock
Lucas Manuel Rodriguez 57351011fa
Fix deadlock when replacing (upserting) host_batteries (#15447)
#14779

This PR fixes the deadlock when upserting to `host_batteries`,
which probably happens because InnoDB uses row-level locking.

I was able to reproduce the deadlock on `main` with the new test
`TestHosts/ReplaceHostBatteriesDeadlock`.
I refactored `ds.ReplaceHostBatteries` to use the same upsert pattern as
`ds.ReplaceHostDeviceMapping` (given that `battery` is assumed to return
just a few rows per host). With this pattern the test no longer fails
with deadlock errors.
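
To illustrate the general idea, here is a minimal sketch of one way such a replace can be structured: read the host's current rows, diff them in memory against the reported rows, and write only what changed. This is an assumption about the shape of the pattern, not code taken from this PR; the `Battery` type and `diffBatteries` helper are hypothetical.

```go
package main

import "fmt"

// Battery is a hypothetical, simplified stand-in for a host_batteries row.
type Battery struct {
	SerialNumber string
	CycleCount   int
	Health       string
}

// diffBatteries compares the rows currently stored for a host (existing)
// with the rows just reported by the host (incoming). It returns the rows
// that are new or changed, and the serial numbers that are no longer reported.
func diffBatteries(existing, incoming []Battery) (toUpsert []Battery, toDelete []string) {
	current := make(map[string]Battery, len(existing))
	for _, b := range existing {
		current[b.SerialNumber] = b
	}

	reported := make(map[string]struct{}, len(incoming))
	for _, b := range incoming {
		reported[b.SerialNumber] = struct{}{}
		// Write only rows that are missing or whose values differ.
		if old, ok := current[b.SerialNumber]; !ok || old != b {
			toUpsert = append(toUpsert, b)
		}
	}

	for _, b := range existing {
		if _, ok := reported[b.SerialNumber]; !ok {
			toDelete = append(toDelete, b.SerialNumber)
		}
	}
	return toUpsert, toDelete
}

func main() {
	existing := []Battery{{"0000", 505, "Good"}, {"0001", 730, "Good"}}
	incoming := []Battery{{"0000", 506, "Good"}, {"0002", 12, "Good"}}

	toUpsert, toDelete := diffBatteries(existing, incoming)
	fmt.Println("upsert:", toUpsert)
	fmt.Println("delete:", toDelete)
}
```

With only a few rows per host, the extra read is cheap, and a host whose batteries are unchanged issues no write at all.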

Here are some of the techniques MySQL recommends for handling deadlocks:
https://dev.mysql.com/doc/refman/5.7/en/innodb-deadlocks-handling.html
In short, changing the upsert pattern makes the deadlock go away (it's
hard to know exactly why the original code deadlocks).

Here's the deadlock trace from the load test performed in October:
```
2023-10-26T17:19:17.244707Z 0 [Note] [MY-012468] [InnoDB] Transactions deadlock detected, dumping detailed information. (lock0lock.cc:6482)
2023-10-26T17:19:17.244756Z 0 [Note] [MY-012469] [InnoDB]  *** (1) TRANSACTION:  (lock0lock.cc:6496)
TRANSACTION 3069771944, ACTIVE 0 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 7 lock struct(s), heap size 1136, 5 row lock(s), undo log entries 1
MySQL thread id 75, OS thread handle 70369297350384, query id 658 10.12.3.201 fleet update
INSERT INTO
      host_batteries (
        host_id,
        serial_number,
        cycle_count,
        health
      )
    VALUES
      (27472, '0000', 505, 'Good'),(27472, '0001', 730, 'Good')
    ON DUPLICATE KEY UPDATE
      cycle_count = VALUES(cycle_count),
      health = VALUES(health),
      updated_at = CURRENT_TIMESTAMP
2023-10-26T17:19:17.244800Z 0 [Note] [MY-012469] [InnoDB]  *** (1) HOLDS THE LOCK(S):  (lock0lock.cc:6496)
RECORD LOCKS space id 867 page no 320 n bits 280 index PRIMARY of table `fleet`.`host_batteries` trx id 3069771944 lock_mode X locks gap before rec
Record lock, heap no 205 PHYSICAL RECORD: n_fields 9; compact format; info bits 0
 0: len 4; hex 00526996; asc  Ri ;;
 1: len 6; hex 0000b6f900d0; asc       ;;
 2: len 7; hex 82000033370110; asc    37  ;;
 3: len 4; hex 0000d829; asc    );;
 4: len 4; hex 30303030; asc 0000;;
 5: len 4; hex 8000065b; asc    [;;
 6: len 4; hex 506f6f72; asc Poor;;
 7: len 4; hex 653a9f95; asc e:  ;;
 8: len 4; hex 653a9f95; asc e:  ;;

2023-10-26T17:19:17.245027Z 0 [Note] [MY-012469] [InnoDB]  *** (1) WAITING FOR THIS LOCK TO BE GRANTED:  (lock0lock.cc:6496)
RECORD LOCKS space id 867 page no 320 n bits 280 index PRIMARY of table `fleet`.`host_batteries` trx id 3069771944 lock_mode X locks gap before rec insert intention waiting
Record lock, heap no 205 PHYSICAL RECORD: n_fields 9; compact format; info bits 0
 0: len 4; hex 00526996; asc  Ri ;;
 1: len 6; hex 0000b6f900d0; asc       ;;
 2: len 7; hex 82000033370110; asc    37  ;;
 3: len 4; hex 0000d829; asc    );;
 4: len 4; hex 30303030; asc 0000;;
 5: len 4; hex 8000065b; asc    [;;
 6: len 4; hex 506f6f72; asc Poor;;
 7: len 4; hex 653a9f95; asc e:  ;;

2023-10-26T17:19:17.245239Z 0 [Note] [MY-012469] [InnoDB]  *** (2) TRANSACTION:  (lock0lock.cc:6496)
TRANSACTION 3069771958, ACTIVE 0 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 7 lock struct(s), heap size 1136, 5 row lock(s), undo log entries 1
MySQL thread id 9, OS thread handle 70369296809712, query id 708 10.12.2.156 fleet update
INSERT INTO
      host_batteries (
        host_id,
        serial_number,
        cycle_count,
        health
      )
    VALUES
      (59161, '0000', 1384, 'Fair'),(59161, '0001', 396, 'Good')
    ON DUPLICATE KEY UPDATE
      cycle_count = VALUES(cycle_count),
      health = VALUES(health),
      updated_at = CURRENT_TIMESTAMP
2023-10-26T17:19:17.245272Z 0 [Note] [MY-012469] [InnoDB]  *** (2) HOLDS THE LOCK(S):  (lock0lock.cc:6496)
RECORD LOCKS space id 867 page no 320 n bits 280 index PRIMARY of table `fleet`.`host_batteries` trx id 3069771958 lock_mode X locks gap before rec
Record lock, heap no 205 PHYSICAL RECORD: n_fields 9; compact format; info bits 0
 0: len 4; hex 00526996; asc  Ri ;;
 1: len 6; hex 0000b6f900d0; asc       ;;
 2: len 7; hex 82000033370110; asc    37  ;;
 3: len 4; hex 0000d829; asc    );;
 4: len 4; hex 30303030; asc 0000;;
 5: len 4; hex 8000065b; asc    [;;
 6: len 4; hex 506f6f72; asc Poor;;
 7: len 4; hex 653a9f95; asc e:  ;;
 8: len 4; hex 653a9f95; asc e:  ;;

2023-10-26T17:19:17.245504Z 0 [Note] [MY-012469] [InnoDB]  *** (2) WAITING FOR THIS LOCK TO BE GRANTED:  (lock0lock.cc:6496)
RECORD LOCKS space id 867 page no 320 n bits 280 index PRIMARY of table `fleet`.`host_batteries` trx id 3069771958 lock_mode X locks gap before rec insert intention waiting
Record lock, heap no 205 PHYSICAL RECORD: n_fields 9; compact format; info bits 0
 0: len 4; hex 00526996; asc  Ri ;;
 1: len 6; hex 0000b6f900d0; asc       ;;
 2: len 7; hex 82000033370110; asc    37  ;;
 3: len 4; hex 0000d829; asc    );;
 4: len 4; hex 30303030; asc 0000;;
 5: len 4; hex 8000065b; asc    [;;
 6: len 4; hex 506f6f72; asc Poor;;
 7: len 4; hex 653a9f95; asc e:  ;;
 8: len 4; hex 653a9f95; asc e:  ;;

2023-10-26T17:19:17.245730Z 0 [Note] [MY-012469] [InnoDB] *** WE ROLL BACK TRANSACTION (2)  (lock0lock.cc:6496)
```

- [X] Changes file added for user-visible changes in `changes/` or
`orbit/changes/`.
See [Changes
files](https://fleetdm.com/docs/contributing/committing-changes#changes-files)
for more information.
- [X] Input data is properly validated, `SELECT *` is avoided, SQL
injection is prevented (using placeholders for values in statements)
- [X] Added/updated tests
- [X] Manual QA for all new/changed functionality
2023-12-05 18:24:58 -03:00

* Fix possible deadlocks when upserting to `host_batteries` (found during load test).