mirror of https://github.com/amazon-science/chronos-forecasting (synced 2026-05-23 09:39:35 +00:00)
Address PR comment
parent 113d218ae6
commit 3f43317206
1 changed file with 21 additions and 15 deletions
@@ -21,22 +21,22 @@
 "### Deployment Options\n",
 "This notebook covers three deployment modes on SageMaker:\n",
 "\n",
-"**[Real-time Inference](https://docs.aws.amazon.com/sagemaker/latest/dg/realtime-endpoints.html)** (Section 1)\n",
-"- ✅ Highest throughput, consistently low latency, supports both GPU and CPU instances\n",
-"- ✅ Simple setup via JumpStart\n",
-"- ❌ By default, you pay for the time the endpoint is running (can be configured to [scale to zero](https://docs.aws.amazon.com/sagemaker/latest/dg/endpoint-auto-scaling-zero-instances.html))\n",
+"1. **[Real-time Inference](https://docs.aws.amazon.com/sagemaker/latest/dg/realtime-endpoints.html)**\n",
+" - ✅ Highest throughput, consistently low latency, supports both GPU and CPU instances\n",
+" - ✅ Simple setup via JumpStart\n",
+" - ❌ By default, you pay for the time the endpoint is running (can be configured to [scale to zero](https://docs.aws.amazon.com/sagemaker/latest/dg/endpoint-auto-scaling-zero-instances.html))\n",
 "\n",
-"**[Serverless Inference](https://docs.aws.amazon.com/sagemaker/latest/dg/serverless-endpoints.html)** (Section 2)\n",
-"- ✅ Pay only for active inference time, no infrastructure management\n",
-"- ✅ Cost-efficient for intermittent or unpredictable traffic\n",
-"- ❌ Cold start latency on first request after idle, CPU only, lowest throughput of all options\n",
-"- ❌ More complex setup (requires repackaging model artifacts)\n",
+"2. **[Serverless Inference (CPU only)](https://docs.aws.amazon.com/sagemaker/latest/dg/serverless-endpoints.html)**\n",
+" - ✅ Pay only for active inference time, no infrastructure management\n",
+" - ✅ Cost-efficient for intermittent or unpredictable traffic\n",
+" - ❌ Cold start latency on first request after idle, lowest throughput of all options\n",
+" - ❌ More complex setup (requires repackaging model artifacts)\n",
 "\n",
-"**[Batch Transform](https://docs.aws.amazon.com/sagemaker/latest/dg/batch-transform.html)** (Section 3)\n",
-"- ✅ Pay only for active compute time, no persistent infrastructure\n",
-"- ✅ Cost-efficient for large-scale batch prediction jobs\n",
-"- ❌ Initialization takes several minutes for each job (not for real-time use), CPU only, requires data in S3\n",
-"- ❌ More complex setup (requires repackaging model artifacts)\n",
+"3. **[Batch Transform](https://docs.aws.amazon.com/sagemaker/latest/dg/batch-transform.html)**\n",
+" - ✅ Pay only for active compute time, no persistent infrastructure\n",
+" - ✅ Cost-efficient for large-scale batch prediction jobs\n",
+" - ❌ Initialization takes several minutes for each job (not for real-time use), requires data in S3\n",
+" - ❌ More complex setup (requires repackaging model artifacts)\n",
 "\n",
 "**Reference benchmark** on a dataset with 1M rows (2000 time series with 500 observations each) and prediction length of 28:\n",
 "| Mode | Instance | Inference time (s) |\n",
@@ -1410,8 +1410,14 @@
 "main_language": "python",
 "notebook_metadata_filter": "-all"
 },
+"kernelspec": {
+"display_name": "ag",
+"language": "python",
+"name": "python3"
+},
 "language_info": {
-"name": "python"
+"name": "python",
+"version": "3.12.9"
 }
 },
 "nbformat": 4,
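The trade-offs listed in the "Deployment Options" cell touched by this commit can be read as a small decision rule. The sketch below is only an illustration of that reading, not code from the notebook: the `Workload` class, its fields, and `choose_deployment_mode` are hypothetical names, and the rule order is an assumption. It makes no SageMaker calls.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Hypothetical description of a forecasting workload (not part of the notebook)."""
    latency_sensitive: bool = False  # callers wait on each response
    needs_gpu: bool = False          # workload requires a GPU instance
    offline_batch: bool = False      # large job whose input already sits in S3


def choose_deployment_mode(w: Workload) -> str:
    """Map a workload to one of the three modes named in the notebook cell."""
    # Batch Transform jobs take several minutes to initialize,
    # so they only fit offline work that is not latency sensitive.
    if w.offline_batch and not w.latency_sensitive:
        return "batch-transform"
    # Serverless is CPU only and has cold starts, so GPU or
    # low-latency requirements point to a real-time endpoint.
    if w.latency_sensitive or w.needs_gpu:
        return "real-time"
    # Remaining case: intermittent traffic that tolerates cold starts.
    return "serverless"


if __name__ == "__main__":
    for workload in (
        Workload(latency_sensitive=True),
        Workload(),
        Workload(offline_batch=True),
    ):
        print(workload, "->", choose_deployment_mode(workload))
```

The GPU flag is used here only to rule out serverless, which the updated cell labels "CPU only"; cost and traffic volume would need their own inputs in a fuller version.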