chronos-forecasting

mirror of https://github.com/amazon-science/chronos-forecasting synced 2026-05-24 01:58:27 +00:00

Author	SHA1	Message	Date
Abdul Fatir	4c43cfbdac	Return predictions in fp32 on CPU (#219) Issue #, if available: N/A Description of changes: This PR ensures that predictions are returned in FP32 and on the CPU device. This choice is now better because we have two types of models which have different types of forecasts (samples vs. quantiles). Furthermore, `int64` input_type (our README example is one such case) ran into issues with `predict_quantiles` before. This choice also fixes that. By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice. --------- Co-authored-by: Abdul Fatir Ansari <ansarnd@amazon.de>	2024-11-29 16:54:21 +01:00
Abdul Fatir	72ab64166c	⚡ Add support for Chronos-Bolt models (#204) Issue #, if available: N/A Description of changes: This PR adds support for Chronos-Bolt models. TODOs: - [x] Update evaluation script - [x] Fix and add tests for Bolt - [x] Update docstrings - [x] Update README example and mention Chronos-Bolt - [x] Update results bar plot in README - [x] Add versions for libraries in `pyproject.toml` - [x] Check that the training and eval scripts work - [x] Change `autogluon` -> `amazon` in model names Post Merge: - [ ] Update Citation style in README, both Github and HuggingFace repos - [ ] Remove note about AutoGluon - [ ] Update READMEs of original Chronos models to refer to Chronos-Bolt NOTE: To be merged after Chronos-Bolt models are available under the `amazon` namespace on HF. By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice. --------- Co-authored-by: Abdul Fatir Ansari <ansarnd@amazon.de> Co-authored-by: Caner Turkmen <turkmen.ac@gmail.com> Co-authored-by: Lorenzo Stella <stellalo@amazon.com>	2024-11-26 17:47:14 +01:00
Lorenzo Stella	d2eef92009	Force context scaling and quantization in float32, add assertions to tests (#197) Issue #, if available: Fixes #193 Description of changes: Passing in contexts in lower precision than float32 may result in a drop of accuracy. This change ensures that the tokenizer (which does scaling and quantization) operates on a float32 batch. Tested across GPU/CPU and different context dtypes with
```python
from itertools import product

import pandas as pd
import torch
from chronos import ChronosPipeline
import matplotlib.pyplot as plt  # requires: pip install matplotlib
import numpy as np

df = pd.read_csv("https://raw.githubusercontent.com/AileenNielsen/TimeSeriesAnalysisWithPython/master/data/AirPassengers.csv")

for context_dtype, context_device, model_dtype, model_device in product(
    [torch.bfloat16, torch.float16, torch.float32],
    ["cpu"],  # only cpu input supported at the moment
    [torch.bfloat16, torch.float16, torch.float32],
    ["cpu", "cuda"],
):
    pipeline = ChronosPipeline.from_pretrained(
        "amazon/chronos-t5-tiny",
        device_map=model_device,
        torch_dtype=model_dtype,
    )
    forecast = pipeline.predict(
        context=torch.tensor(df["#Passengers"]).to(dtype=context_dtype, device=context_device),
        prediction_length=65,
        num_samples=20,
        limit_prediction_length=False,
    )

    assert forecast.dtype == context_dtype, f"{forecast.dtype=} but {context_dtype=}"
    assert str(forecast.device) == context_device, f"{forecast.device=} but {context_device=}"

    forecast_index = range(len(df), len(df) + 65)
    low, median, high = np.quantile(forecast[0].to(device="cpu", dtype=torch.float32).numpy(), [0.1, 0.5, 0.9], axis=0)

    plt.figure(figsize=(8, 4))
    plt.plot(df["#Passengers"], color="royalblue", label="historical data")
    plt.plot(forecast_index, median, color="tomato", label="median forecast")
    plt.fill_between(forecast_index, low, high, color="tomato", alpha=0.3, label="80% prediction interval")
    plt.legend()
    plt.grid()
    plt.show()
```
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.	2024-11-18 09:55:54 +01:00
Alvaro Perez-Diaz	ac6ee36ace	Fix number of quantisation buckets (#182) Fixes https://github.com/amazon-science/chronos-forecasting/issues/181. Chronos' tokenizer has a vocabulary size of `n_tokens`. Among these, there are `n_special_tokens` reserved for EOS, PAD, etc. and `n_tokens - n_special_tokens` allocated to numerical values. However, the provided `MeanScaleUniformBins` tokenizer creates `n_tokens - n_special_tokens + 1` different buckets, resulting in a total of `n_tokens + 1` possible tokens. This causes training and inference errors when one of the data points gets allocated to the largest bucket, as the model requires `0 <= token_id < n_tokens`. This PR modifies the `MeanScaleUniformBins` tokenizer, so that it creates one fewer bucket for numerical values. --- By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice. --------- Co-authored-by: Lorenzo Stella <lorenzostella@gmail.com>	2024-10-04 23:00:42 +02:00
Abdul Fatir	223e576e2e	Split `input_transform` into `context_input_transform` and `label_input_transform` (#82) Description of changes: This splits `input_transform` into `context_input_transform` and `label_input_transform`. Previously, `input_transform` was being used for both context and label during training which would lead to incorrect results where `prediction_length` > `context_length`. TODO: - [x] Update docstrings - [x] Test the training script By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice. --------- Co-authored-by: Abdul Fatir Ansari <ansarnd@amazon.com>	2024-05-28 09:58:22 +02:00
HugoSenetaire	3fe24ff8cd	Fix output transform, add test to enforce tokenizer consistency (#73) Description of changes: The bin indexes were shifted by one between input transform and output transform. Subtracting 1 from the sampled tokens in the output transform leads to the correct reconstruction of the signal. Adds a test to ensure the consistency of the Chronos tokenizer. By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice. Co-authored-by: Lorenzo Stella <stellalo@amazon.com> and Abdul Fatir Ansari <ansarnd@amazon.com>	2024-05-17 15:29:18 +02:00
Lorenzo Stella	4b1d1c818b	Fix types, add mypy to workflow (#42) Description of changes: Fix some type checking issues, add mypy to GitHub workflow, apply black. By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.	2024-04-05 15:36:39 +02:00
Abdul Fatir	0595bd872b	Add pipeline.embed (#24) Description of changes: This PR adds `pipeline.embed` which extracts encoder embeddings from the model. These embeddings may be useful for some downstream tasks such as classification, so this is useful to have. By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice. --------- Co-authored-by: Abdul Fatir Ansari <ansarnd@amazon.de>	2024-03-25 13:18:50 +01:00
Lorenzo Stella	7ba945c995	Upload code	2024-03-13 09:58:39 +01:00

9 commits
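
The off-by-one described in the "Fix number of quantisation buckets" commit can be illustrated with a small standalone sketch. It does not use the `chronos` package or reproduce `MeanScaleUniformBins`; the helper names (`make_boundaries`, `to_token_id`), the vocabulary sizes, and the value range are all hypothetical and chosen only for illustration. The point it shows is general: B sorted boundaries split the real line into B + 1 buckets, so the number of boundaries has to be one less than the number of numeric token ids available.

```python
from bisect import bisect_right


def make_boundaries(low, high, count):
    """Return `count` evenly spaced boundaries from low to high (count >= 2)."""
    step = (high - low) / (count - 1)
    return [low + i * step for i in range(count)]


def to_token_id(value, boundaries, n_special_tokens):
    """Map a value to a token id, placing numeric ids after the special ones.

    bisect_right returns an index in 0..len(boundaries), so a list of
    B boundaries yields B + 1 distinct buckets.
    """
    return n_special_tokens + bisect_right(boundaries, value)


# Hypothetical vocabulary: 10 ids in total, 2 of them reserved (e.g. PAD, EOS).
N_TOKENS = 10
N_SPECIAL_TOKENS = 2
n_numeric = N_TOKENS - N_SPECIAL_TOKENS  # 8 ids left for numeric values

# One boundary per numeric id gives 9 buckets: one too many.
too_many = make_boundaries(-15.0, 15.0, n_numeric)
# One boundary fewer gives exactly 8 buckets.
just_right = make_boundaries(-15.0, 15.0, n_numeric - 1)

large_value = 1e6  # lands in the last bucket in both cases
bad_id = to_token_id(large_value, too_many, N_SPECIAL_TOKENS)
good_id = to_token_id(large_value, just_right, N_SPECIAL_TOKENS)

print(bad_id, bad_id < N_TOKENS)    # id equals N_TOKENS, outside the vocabulary
print(good_id, good_id < N_TOKENS)  # id is N_TOKENS - 1, the last valid id
```

With the oversized boundary list, only values that fall into the largest bucket produce an invalid id, which matches the commit's description of errors appearing only for some data points.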