* test: fix TestSelfHealExponentialBackoff to test exceeding Backoff.Cap
Signed-off-by: Michal Ryšavý <michal.rysavy@ext.csas.cz>
* fix: fix calculating SelfHealBackOff delay when exceeding maximum
Signed-off-by: Michal Ryšavý <michal.rysavy@ext.csas.cz>
---------
Signed-off-by: Michal Ryšavý <michal.rysavy@ext.csas.cz>
Co-authored-by: Michal Ryšavý <michal.rysavy@ext.csas.cz>
* chore: simplify sync status comparison
Signed-off-by: Michael Crenshaw <350466+crenshaw-dev@users.noreply.github.com>
* add tests
Signed-off-by: Michael Crenshaw <350466+crenshaw-dev@users.noreply.github.com>
* more tests, some docs
Signed-off-by: Michael Crenshaw <350466+crenshaw-dev@users.noreply.github.com>
---------
Signed-off-by: Michael Crenshaw <350466+crenshaw-dev@users.noreply.github.com>
* add lastTransitionTime to health status
Signed-off-by: Manuel Kieweg <mail@manuelkieweg.de>
* address first feedback
Signed-off-by: Manuel Kieweg <mail@manuelkieweg.de>
* set transition time if health status is unknown
Signed-off-by: Manuel Kieweg <mail@manuelkieweg.de>
* extend health improvement tests
Signed-off-by: Manuel Kieweg <mail@manuelkieweg.de>
* add application controller test
Signed-off-by: Manuel Kieweg <mail@manuelkieweg.de>
* use require for NoError
Signed-off-by: Manuel Kieweg <mail@manuelkieweg.de>
* more extensive tests for health state changes
Signed-off-by: Manuel Kieweg <mail@manuelkieweg.de>
* Apply suggestions from code review
Co-authored-by: Blake Pettersson <blake.pettersson@gmail.com>
Signed-off-by: Manuel Kieweg <2939765+mkieweg@users.noreply.github.com>
* Code review suggestions
Signed-off-by: Manuel Kieweg <mail@manuelkieweg.de>
* remove obsolete assert
Signed-off-by: Manuel Kieweg <mail@manuelkieweg.de>
* Change LastTransitionTime field to pointer type
Due to implementation limitations, setting LastTransitionTime at the resource level is challenging.
Converting it to a pointer type allows it to be skipped at the resource level and prevents it from appearing
in .status.resources of the Application CR. Additionally, it doesn’t provide much value or have a known
use case right now.
Signed-off-by: Siddhesh Ghadi <sghadi1203@gmail.com>
* Resolve rebase conflicts
Signed-off-by: Siddhesh Ghadi <sghadi1203@gmail.com>
* Address review comment
Signed-off-by: Siddhesh Ghadi <sghadi1203@gmail.com>
* Trigger CI
Signed-off-by: Siddhesh Ghadi <sghadi1203@gmail.com>
---------
Signed-off-by: Manuel Kieweg <mail@manuelkieweg.de>
Signed-off-by: Manuel Kieweg <2939765+mkieweg@users.noreply.github.com>
Signed-off-by: Siddhesh Ghadi <sghadi1203@gmail.com>
Co-authored-by: Blake Pettersson <blake.pettersson@gmail.com>
Co-authored-by: Siddhesh Ghadi <sghadi1203@gmail.com>
* feat: option to disable writing k8s events
Also add an option to write logs for k8s events.
Each option is passed as an environment variable and defaults to true;
disabling it requires explicitly setting the option to false.
Signed-off-by: Jack-R-lantern <tjdfkr2421@gmail.com>
* feat: option to disable writing k8s events
fix unit test
- application_test
- applicationset_test
- project_test
- appcontroller_test
- audit_logger_test
Signed-off-by: Jack-R-lantern <tjdfkr2421@gmail.com>
* rebase
Signed-off-by: Jack-R-lantern <tjdfkr2421@gmail.com>
---------
Signed-off-by: Jack-R-lantern <tjdfkr2421@gmail.com>
Closes #13096
Implement a new metric exposing Applications conditions.
This is particularly useful for SRE teams to be able
to set up alerts on issues that aren't displayed via
"health_status" and "sync_status" in the metric "argocd_app_info".
Signed-off-by: Foyer Unix <foyerunix@foyer.lu>
Co-authored-by: Foyer Unix <foyerunix@foyer.lu>
* chore(deps): bump k8s libs from 0.29.6 to 0.30.2
Signed-off-by: Michael Crenshaw <350466+crenshaw-dev@users.noreply.github.com>
* latest commit
Signed-off-by: Michael Crenshaw <350466+crenshaw-dev@users.noreply.github.com>
* update known types
Signed-off-by: Michael Crenshaw <350466+crenshaw-dev@users.noreply.github.com>
* bump controller-runtime to a version that's compatible with client-go 0.30.x
Signed-off-by: Michael Crenshaw <350466+crenshaw-dev@users.noreply.github.com>
* update go-to-protobuf flag
Signed-off-by: Michael Crenshaw <350466+crenshaw-dev@users.noreply.github.com>
* handle new requirements for proto file locations
Signed-off-by: Michael Crenshaw <350466+crenshaw-dev@users.noreply.github.com>
* bump gitops-engine version
Signed-off-by: Michael Crenshaw <350466+crenshaw-dev@users.noreply.github.com>
* fix openapigen
Signed-off-by: Michael Crenshaw <350466+crenshaw-dev@users.noreply.github.com>
* remove toolchain
Signed-off-by: Michael Crenshaw <350466+crenshaw-dev@users.noreply.github.com>
* bump gitops-engine version
Signed-off-by: Michael Crenshaw <350466+crenshaw-dev@users.noreply.github.com>
* chore: enable lint for deprecated symbols
Signed-off-by: Michael Crenshaw <350466+crenshaw-dev@users.noreply.github.com>
* chore: bump to k8s 1.31
Signed-off-by: Michael Crenshaw <350466+crenshaw-dev@users.noreply.github.com>
* codegen
Signed-off-by: Michael Crenshaw <350466+crenshaw-dev@users.noreply.github.com>
* don't be generic if you don't have to be
Signed-off-by: Michael Crenshaw <350466+crenshaw-dev@users.noreply.github.com>
* don't be generic if you don't have to be
Signed-off-by: Michael Crenshaw <350466+crenshaw-dev@users.noreply.github.com>
* new commit
Signed-off-by: Michael Crenshaw <350466+crenshaw-dev@users.noreply.github.com>
* use gitops-engine commit
Signed-off-by: Michael Crenshaw <350466+crenshaw-dev@users.noreply.github.com>
---------
Signed-off-by: Michael Crenshaw <350466+crenshaw-dev@users.noreply.github.com>
* fix(controller): selfHeal respected in multi-source apps
when there is object change in cluster
Signed-off-by: Eric Lin <38420555+Ezzahhh@users.noreply.github.com>
* fix: tests for multi source selfheal
Signed-off-by: Ezzahhh <38420555+Ezzahhh@users.noreply.github.com>
---------
Signed-off-by: Eric Lin <38420555+Ezzahhh@users.noreply.github.com>
Signed-off-by: Ezzahhh <38420555+Ezzahhh@users.noreply.github.com>
* feat(sourceNamespace): Regex Support
Signed-off-by: Arthur <arthur@arthurvardevanyan.com>
* feat(sourceNamespace): Separate exactMatch into patternMatch
Signed-off-by: Arthur <arthur@arthurvardevanyan.com>
---------
Signed-off-by: Arthur <arthur@arthurvardevanyan.com>
Adding an app to both the operation queue and the refresh queue could leave a sync waiting on a resource for minutes to tens of minutes.
Sync state operates on resources gathered during reconciliation, so if the app operation event is processed before the refresh event (when triggered by a resource update or creation), the refresh doesn’t help the sync progress and the sync essentially has to wait for another app refresh.
The fix seems to be to schedule the app operation event after the refresh event has finished processing. There’s one place where an operation event is scheduled without a refresh event (which can be kept there), and one place where a refresh event is scheduled without an operation event, during app deletion handling: 3e2cfb1387/controller/appcontroller.go (L2177). It’s probably safe to schedule an operation event after that refresh as well, since the operation code checks whether the app was deleted. If not, the refresh queue can be changed to store a tuple of the app key and a bool indicating whether to enqueue an app operation.
If there are issues: try keeping both old places that add to the app operation queue, in addition to the new enqueue after refresh.
Note on cherry-pick: add to as many releases as you can. This can be a significant performance boost.
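The ordering described above can be sketched with a toy model. Everything here (the `controller` struct, its slice-backed queues, and the method names) is hypothetical and heavily simplified; it only shows the operation event being enqueued once the refresh for the same app has finished.

```go
package main

import "fmt"

// controller is a toy stand-in with two FIFO queues keyed by app name.
type controller struct {
	refreshQueue   []string
	operationQueue []string
	refreshed      map[string]bool // apps whose state has been reconciled
}

// requestRefresh enqueues only a refresh; no operation event is added yet.
func (c *controller) requestRefresh(appKey string) {
	c.refreshQueue = append(c.refreshQueue, appKey)
}

// processNextRefresh reconciles one app and only then schedules its operation,
// so the operation always sees state gathered by the refresh.
func (c *controller) processNextRefresh() bool {
	if len(c.refreshQueue) == 0 {
		return false
	}
	appKey := c.refreshQueue[0]
	c.refreshQueue = c.refreshQueue[1:]
	c.refreshed[appKey] = true // placeholder for the real reconciliation work
	c.operationQueue = append(c.operationQueue, appKey)
	return true
}

// processNextOperation pops one operation and reports whether fresh state was available.
func (c *controller) processNextOperation() (appKey string, fresh bool, ok bool) {
	if len(c.operationQueue) == 0 {
		return "", false, false
	}
	appKey = c.operationQueue[0]
	c.operationQueue = c.operationQueue[1:]
	return appKey, c.refreshed[appKey], true
}

func main() {
	c := &controller{refreshed: map[string]bool{}}
	c.requestRefresh("argocd/guestbook")
	c.processNextRefresh()
	key, fresh, _ := c.processNextOperation()
	fmt.Println(key, "operation saw refreshed state:", fresh)
}
```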
Signed-off-by: Andrii Korotkov <andrii.korotkov@verkada.com>
Closes #18929
Helps with #18500
Use iterate hierarchy v2 so that getting the resource tree takes roughly linear time instead of up to quadratic time.
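As a generic illustration of the complexity difference (not the actual iterate hierarchy v2 code), one common way to avoid quadratic behavior is to build a parent-to-children index once and then walk it, instead of rescanning all resources for every parent. The `node` type and `descendants` function are hypothetical.

```go
package main

import "fmt"

// node is a hypothetical resource with an ID and the ID of its owner ("" for none).
type node struct {
	id     string
	parent string
}

// descendants returns root and everything below it.
// The children index is built in one pass over nodes, and each node is visited
// at most once, so the total work is roughly linear in the number of nodes.
func descendants(nodes []node, root string) []string {
	children := make(map[string][]string, len(nodes))
	for _, n := range nodes {
		children[n.parent] = append(children[n.parent], n.id)
	}
	var out []string
	visited := map[string]bool{root: true}
	stack := []string{root}
	for len(stack) > 0 {
		cur := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		out = append(out, cur)
		for _, child := range children[cur] {
			if !visited[child] {
				visited[child] = true
				stack = append(stack, child)
			}
		}
	}
	return out
}

func main() {
	nodes := []node{
		{id: "deploy"},
		{id: "rs", parent: "deploy"},
		{id: "pod-1", parent: "rs"},
		{id: "pod-2", parent: "rs"},
		{id: "unrelated"},
	}
	fmt.Println(descendants(nodes, "deploy"))
}
```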
Signed-off-by: Andrii Korotkov <andrii.korotkov@verkada.com>
The timer was started before reconciliation actually began, so it included the time spent getting the item from the queue. This led to very large reported times that did not reflect what was actually going on.
Signed-off-by: Andrii Korotkov <andrii.korotkov@verkada.com>
Closes #18923
There are some gaps in debugging information for long reconciliations. Fill in a lot of those gaps by adding more debug logs with timing information about different code execution steps.
Also, fix a flaky test in app_test.go.
Signed-off-by: Andrii Korotkov <andrii.korotkov@verkada.com>
* chore: Improve appcontroller logs further - Closes [#18113]
Add application fully qualified name as a logrus field to several places that were missing it.
Remove appNamespace field and replace application name with a fully qualified name in a few places for consistency.
Add app project to the log fields.
Signed-off-by: Andrii Korotkov <andrii.korotkov@verkada.com>
* chore: Switch to using separate field for application name and qualified name, add a common function for app log entry - Closes [#18113]
Signed-off-by: Andrii Korotkov <andrii.korotkov@verkada.com>
---------
Signed-off-by: Andrii Korotkov <andrii.korotkov@verkada.com>
* Adding app list to sharding cache
Signed-off-by: Andrew Lee <andrewkl@enclavenet.com>
* Add shard by apps test
Signed-off-by: Andrew Lee <andrewkl@enclavenet.com>
* Fix lint
Signed-off-by: Andrew Lee <andrewkl@enclavenet.com>
* Add coverage to test
Signed-off-by: Andrew Lee <andrewkl@enclavenet.com>
* Fix lint
Signed-off-by: Andrew Lee <andrewkl@enclavenet.com>
* Converted cluster/app accessors to private, add apps-in-any-namespace support in shardingcache init, added read lock to GetAppDistribution
Signed-off-by: Andrew Lee <andrewkl@enclavenet.com>
* Fix tests
Signed-off-by: Andrew Lee <andrewkl@enclavenet.com>
---------
Signed-off-by: Andrew Lee <andrewkl@enclavenet.com>
* feat(sharding): use a cache
Signed-off-by: Alexandre Gaudreault <alexandre.gaudreault@logmein.com>
* cluster cmd
Signed-off-by: Alexandre Gaudreault <alexandre.gaudreault@logmein.com>
* - Assign shard 0 to in-cluster cluster and nil check updates
- Caching clusters while sharding: Fixing unit tests
- Update generated docs
- Debug e2e tests
- Default the shardNumber to the number of replicas if it is calculated to a higher value
- deferred Unlock only when a lock is set
- Disabling temporarily other versions of k3s to check if e2e passes
- Do not fail if hostname format is not abc-n
- Fix unit test and skip some e2e
- Skip TestGitSubmoduleHTTPSSupport test
- Remove breaking defer c.lock.Unlock()
- Reverting testing all k3s version
- Default sharding fix
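As a hedged illustration of the "Do not fail if hostname format is not abc-n" item, a minimal sketch of inferring a shard number from a hostname suffix and falling back to shard 0 instead of returning an error. The function name `inferShard` and the fallback behavior are assumptions made for this example, not a copy of the controller code.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// inferShard extracts the numeric suffix after the last "-" in hostname.
// When the hostname does not follow the "name-n" pattern it returns 0
// rather than failing. Hypothetical helper for illustration.
func inferShard(hostname string) int {
	idx := strings.LastIndex(hostname, "-")
	if idx < 0 || idx == len(hostname)-1 {
		return 0
	}
	n, err := strconv.Atoi(hostname[idx+1:])
	if err != nil || n < 0 {
		return 0
	}
	return n
}

func main() {
	for _, h := range []string{"argocd-application-controller-2", "localhost", "controller-abc"} {
		fmt.Println(h, "->", inferShard(h))
	}
}
```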
Signed-off-by: Akram Ben Aissi <akram.benaissi@gmail.com>
* fixes related to code review: renaming structure param, moving db initialisation
Signed-off-by: Akram Ben Aissi <akram.benaissi@gmail.com>
* Code review
Signed-off-by: Akram Ben Aissi <akram.benaissi@gmail.com>
* Set default shard to 0
Signed-off-by: Akram Ben Aissi <akram.benaissi@gmail.com>
* Set different default value for Sts and Deployment mode
Signed-off-by: Akram Ben Aissi <akram.benaissi@gmail.com>
* Expose ClusterShardingCache
Signed-off-by: Akram Ben Aissi <akram.benaissi@gmail.com>
* Removing use of argoDB.db for DistributionFunction
Signed-off-by: Akram Ben Aissi <akram.benaissi@gmail.com>
* Update generated documentation
Signed-off-by: Akram Ben Aissi <akram.benaissi@gmail.com>
* Fix comment about NoShardingDistributionFunction and NoShardingAlgorithm
Signed-off-by: Akram Ben Aissi <akram.benaissi@gmail.com>
---------
Signed-off-by: Alexandre Gaudreault <alexandre.gaudreault@logmein.com>
Signed-off-by: Akram Ben Aissi <akram.benaissi@gmail.com>
Co-authored-by: Alexandre Gaudreault <alexandre.gaudreault@logmein.com>