Optimize your inference jobs using dynamic batch inference with TorchServe on Amazon SageMaker
In deep learning, batch processing refers to feeding multiple inputs into a model at once. Although it's essential during training, it can also help manage cost and improve throughput at inference time. Hardware accelerators are optimized for parallelism, and batching helps saturate their compute capacity, which often leads to higher throughput. Batching […]
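To make the idea concrete, here is a minimal, illustrative PyTorch sketch (not part of the original post) that contrasts per-request inference with a single batched forward pass; the ResNet-18 model, the input shapes, and the batch size of three are assumptions chosen purely for illustration.

```python
import torch
import torchvision.models as models

# Illustrative model choice (assumption): a pretrained ResNet-18 in eval mode.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.eval()

# Three individual preprocessed inputs, each of shape (3, 224, 224).
inputs = [torch.rand(3, 224, 224) for _ in range(3)]

with torch.no_grad():
    # Unbatched: one forward pass per input.
    single_outputs = [model(x.unsqueeze(0)) for x in inputs]

    # Batched: stack the inputs into one (3, 3, 224, 224) tensor and run a
    # single forward pass, letting the accelerator process them in parallel.
    batch = torch.stack(inputs)
    batch_outputs = model(batch)

print(batch_outputs.shape)  # torch.Size([3, 1000])
```

On an accelerator, the batched call typically completes in far less time than three separate calls, which is the throughput benefit the paragraph above describes; the exact speedup depends on the model and hardware.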