mirror of https://github.com/amazon-science/chronos-forecasting (synced 2026-05-24 01:58:27 +00:00)
Update notebook
parent 9e34e8aa38
commit 649d1bc620
1 changed file with 46 additions and 9 deletions
@@ -29,28 +29,41 @@
 "**[Serverless Inference](https://docs.aws.amazon.com/sagemaker/latest/dg/serverless-endpoints.html)** [(Section 2)](#Section-2:-Serverless-Inference)\n",
 "- ✅ Pay only for active inference time, no infrastructure management\n",
 "- ✅ Cost-efficient for intermittent or unpredictable traffic\n",
-"- ❌ Cold start latency on first request after idle, CPU only, 6GB memory limit\n",
+"- ❌ Cold start latency on first request after idle, CPU only, lowest throughput of all options\n",
 "- ❌ [More complex setup](#Setup-for-Serverless-and-Batch-Transform) (requires repackaging model artifacts)\n",
 "\n",
 "**[Batch Transform](https://docs.aws.amazon.com/sagemaker/latest/dg/batch-transform.html)** [(Section 3)](#Section-3:-Batch-Transform)\n",
 "- ✅ Pay only for active compute time, no persistent infrastructure\n",
 "- ✅ Cost-efficient for large-scale batch prediction jobs\n",
-"- ❌ Highest latency (not for real-time use), CPU only, requires data in S3\n",
+"- ❌ Initialization takes several minutes for each job (not for real-time use), CPU only, requires data in S3\n",
 "- ❌ [More complex setup](#Setup-for-Serverless-and-Batch-Transform) (requires repackaging model artifacts)\n",
 "\n",
-"**Reference benchmark** on M5 dataset (30K daily retail time series, prediction_length=28):\n",
-"| Mode | Instance | Time |\n",
+"**Reference benchmark** on a dataset with 1M rows (2000 time series with 500 observations each) and prediction length of 28:\n",
+"| Mode | Instance | Inference time (s) |\n",
 "|------|----------|------|\n",
-"| Real-time (GPU) | ml.g5.2xlarge | X min |\n",
-"| Real-time (CPU) | ml.c5.4xlarge | X min |\n",
-"| Serverless | 6GB memory | X min |\n",
-"| Batch Transform | ml.c5.4xlarge | X min |\n",
+"| Real-time (GPU) | ml.g5.2xlarge | 18 |\n",
+"| Real-time (CPU) | ml.c5.4xlarge | 50 |\n",
+"| Serverless | 6GB memory | 120 |\n",
+"| Batch Transform | ml.c5.4xlarge | 60 (+200s setup) |\n",
 "\n",
 "We recommend starting with **Real-time Inference** as it offers the simplest setup and highest throughput. Consider Serverless or Batch Transform when you need to optimize costs and don't require GPU acceleration.\n",
 "\n",
 "For a complete specification of all supported request parameters, see the [Endpoint API Reference](#Endpoint-API-Reference) at the end of this notebook."
 ]
 },
+{
+"cell_type": "code",
+"execution_count": null,
+"id": "15b5fd55",
+"metadata": {},
+"outputs": [],
+"source": [
+"# GPU: 18s\n",
+"# CPU: 50s\n",
+"# Serverless: 120s\n",
+"# Batch transform: 60s (+200s setup)"
+]
+},
 {
 "cell_type": "markdown",
 "id": "78b40323",
@@ -879,7 +892,7 @@
 "metadata": {},
 "source": [
 "---\n",
-"## Setup for Serverless and Batch Transform\n",
+"## Setup for Serverless Inference and Batch Transform\n",
 "\n",
 "Serverless Inference and Batch Transform only support CPU instances. Unlike real-time inference with JumpStart, these modes require you to create a custom SageMaker Model with repackaged artifacts.\n",
 "\n",
@@ -999,6 +1012,30 @@
 "chronos_model.create()"
 ]
 },
+{
+"cell_type": "markdown",
+"id": "129bc389",
+"metadata": {},
+"source": [
+"Alternatively, you can load an existing model as follows:"
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"id": "f7cb0f14",
+"metadata": {},
+"outputs": [],
+"source": [
+"# model_info = boto3.client(\"sagemaker\").describe_model(ModelName=\"chronos-2-cpu\")\n",
+"# model = Model(\n",
+"# model_data=model_info[\"PrimaryContainer\"][\"ModelDataUrl\"],\n",
+"# image_uri=model_info[\"PrimaryContainer\"][\"Image\"],\n",
+"# role=model_info[\"ExecutionRoleArn\"],\n",
+"# name=model_info[\"ModelName\"],\n",
+"# )"
+]
+},
 {
 "cell_type": "markdown",
 "id": "ba12b52d",
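Note on the commented-out cell added in the last hunk: it could be written as a small helper that keeps the field mapping separate from the AWS calls, so the mapping can be checked offline. This is a sketch and not the notebook's code. It assumes a single-container model whose `DescribeModel` response carries `PrimaryContainer.ModelDataUrl`. The function names and all values in `example_info` are hypothetical placeholders.

```python
def model_kwargs_from_description(model_info):
    """Map a SageMaker DescribeModel response to sagemaker Model keyword arguments.

    Assumes a single-container model that exposes PrimaryContainer.ModelDataUrl.
    """
    container = model_info["PrimaryContainer"]
    return {
        "model_data": container["ModelDataUrl"],
        "image_uri": container["Image"],
        "role": model_info["ExecutionRoleArn"],
        "name": model_info["ModelName"],
    }


def load_existing_model(model_name):
    """Rebuild a sagemaker Model object for an already-created SageMaker model."""
    # Imported lazily so the mapping helper above works without AWS dependencies.
    import boto3
    from sagemaker.model import Model

    model_info = boto3.client("sagemaker").describe_model(ModelName=model_name)
    return Model(**model_kwargs_from_description(model_info))


# Offline check with a made-up response; every value here is a placeholder.
example_info = {
    "ModelName": "chronos-2-cpu",
    "ExecutionRoleArn": "arn:aws:iam::123456789012:role/ExampleRole",
    "PrimaryContainer": {
        "Image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/example:latest",
        "ModelDataUrl": "s3://example-bucket/model.tar.gz",
    },
}
kwargs = model_kwargs_from_description(example_info)
print(kwargs["name"], kwargs["model_data"])
```

With AWS credentials configured, `load_existing_model("chronos-2-cpu")` would perform the same steps as the commented-out cell.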