*Issue #, if available:* On Linux, the final call to `.to` creates
trouble when input tensors are integer. For example:
```
>>> a = torch.tensor([1])
>>> b = torch.stack([torch.full((1,), torch.nan), a])
>>> b
tensor([[nan],
[1.]])
>>> b.to(a)
tensor([[-9223372036854775808],
[ 1]])
```
By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.
*Issue #, if available:* N/A
*Description of changes:* This PR ensures that predictions are returned
in FP32 and on the CPU device. This choice is now better because we have
two types of models which have different types of forecasts (samples vs.
quantiles). Furthermore, `int64` input_type (our README example is one
such case) ran into issues with `predict_quantiles` before. This choice
also fixes that.
By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.
---------
Co-authored-by: Abdul Fatir Ansari <ansarnd@amazon.de>