3 commits

Author SHA1 Message Date
Abdul Fatir
ad410c9c0a
Add pipeline.embed support for Chronos-Bolt (#247) 2024-12-22 13:56:41 +01:00
Abdul Fatir
86f755c179
Fix auto eval workflow (#224)
*Issue #, if available:*

*Description of changes:* This PR fixes the auto evaluation workflow.
The second workflow step did not work because it did not know the right
PR number to post the comment on. The fix is to include the PR number in
the CSV file name and read it in the second workflow.
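
The fix described above can be illustrated with a minimal Python sketch. The file-name pattern `eval-metrics-pr-<number>.csv` and both helper names are hypothetical, chosen only for illustration; the actual naming used by the workflow is not shown in this commit message.

```python
import re
from pathlib import Path
from typing import Optional

# Hypothetical naming scheme (an assumption, not the workflow's real file name).
_PATTERN = re.compile(r"^eval-metrics-pr-(\d+)\.csv$")


def metrics_filename(pr_number: int) -> str:
    """First workflow: embed the PR number in the artifact's file name."""
    return f"eval-metrics-pr-{pr_number}.csv"


def pr_number_from_filename(path: str) -> Optional[int]:
    """Second workflow: recover the PR number from the downloaded file name.

    Returns None when the name does not follow the assumed pattern.
    """
    match = _PATTERN.match(Path(path).name)
    return int(match.group(1)) if match else None
```

Carrying the PR number in the file name means the second workflow needs no extra artifact or metadata lookup to decide where to post the comment.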

PS: This is a really poor user experience because there's no way to test
that this works right without merging!


By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

---------

Co-authored-by: Abdul Fatir Ansari <ansarnd@amazon.de>
2024-12-02 11:47:29 +01:00
Abdul Fatir
eac768ce28
Add workflow to run evaluation on a subset of datasets (#222)
*Issue #, if available:*

*Description of changes:* This PR adds a workflow that will run the
evaluation script on `chronos-bolt-small` for a subset of datasets
specified in `ci/evaluate/backtest_configs.yaml`. After evaluation, a
comment will be made on the PR. The workflow will only run if the
`run-eval` label is present on a PR. The end-to-end workflow has been
split into two workflows:

- `eval-model.yml`: only has read access (can be run from forks). This
will evaluate the model and upload the metrics CSV file as a GitHub
artifact.
- `eval-pr-comment.yml`: has read and write access (can only be run when
in the `main` branch). This is triggered when the first job finishes,
downloads the CSV from the eval job, and makes the comment.
According to [this
post](https://securitylab.github.com/resources/github-actions-preventing-pwn-requests/),
splitting into two jobs as done here is the recommended and secure way
to do this.
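
As a rough illustration of the comment step, the sketch below turns the contents of a metrics CSV into a Markdown table that could serve as the comment body. The function name and the example column names are hypothetical; the real CSV layout and the commenting code are not shown in this commit message.

```python
import csv
import io


def csv_to_markdown(csv_text: str) -> str:
    """Render CSV text as a Markdown table (the first row is the header)."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return "_No metrics found._"
    header, *body = rows
    lines = [
        "| " + " | ".join(header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    lines += ["| " + " | ".join(row) + " |" for row in body]
    return "\n".join(lines)


# Example with made-up column names and values.
print(csv_to_markdown("dataset,metric\nexample_dataset,1.0\n"))
```

Only this formatting and posting step needs write access, which fits the split described above: the workflow that runs code from the PR stays read-only.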

**NOTE**: The first step works as expected, but we can only test the
second step after merging because this workflow needs to be part of
the `main` branch for this to work.

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

---------

Co-authored-by: Abdul Fatir Ansari <ansarnd@amazon.de>
2024-12-02 10:05:57 +01:00