AWS Deep Learning AMIs (DLAMI) and AWS Deep Learning Containers now support the SOCI snapshotter and SOCI indexes. Seekable OCI (SOCI) is a technology that enables efficient container image management through selective file downloading. It uses a layer-based indexing system to map file locations within container images, allowing containers to start with only the necessary files loaded (lazy loading). This approach reduces network bandwidth usage and improves container startup times, making it particularly valuable for organizations managing large container images in cloud environments.
In this post, we look at how to use SOCI on publicly available Deep Learning AMIs and Containers, when to use each of the SOCI modes the tool provides, and how to start using it quickly and efficiently in your workloads today.
Background
As organizations deploy artificial intelligence (AI) and machine learning (ML) workloads at scale, container startup time has become a bottleneck in production environments. Whether it's spinning up training jobs, serving inference endpoints, or scaling GPU clusters automatically, the time spent downloading multi-gigabyte container images directly affects cost, user experience, and operational efficiency. Traditional container deployment approaches force teams to download entire images before workloads can begin. For images commonly used in production, this can take several minutes. During development, a few minutes of wait time is barely noticeable. In production, those same minutes add up fast.
Organizations deploying deep learning infrastructure at scale typically encounter several critical challenges:
- Prolonged cold start times. Standard Docker image pulls of 15–20 GB can take 4–6 minutes per instance, delaying training jobs and inference endpoints during scaling events.
- Wasted compute resources. GPU instances sit idle during image pulls, burning through expensive compute hours while waiting for container initialization to finish.
- Scaling bottlenecks. When demand spikes trigger automatic scaling, slow container startup times prevent a rapid response, leading to degraded performance or dropped requests.
- Bandwidth constraints. Large-scale deployments pulling massive images simultaneously can saturate network bandwidth, creating cascading delays across the infrastructure.
- Developer productivity. Data scientists and ML engineers waste valuable time waiting for containers to start during iterative development and experimentation cycles.
Container pulling mechanisms
When pulling a container for your workloads, AWS Deep Learning AMIs (DLAMI) and Deep Learning Containers offer three options: the standard Docker pull, SOCI parallel pull, and SOCI lazy loading through a SOCI index. Think of these as a sliding scale of tradeoffs. Standard Docker pulls download and extract layers with limited concurrency, so they are the slowest option. SOCI parallel pull provides faster startup times by chunking downloads, at the cost of compute resources. SOCI lazy loading provides near-instant container startup but requires files to be fetched on demand. You can use the following guidance to choose the appropriate mechanism for your workloads:
- The choice between lazy loading and parallel pull modes depends on the image, instance specifications, and storage configuration. Lazy loading requires images to have a SOCI index. Without one, the system falls back to standard pulling.
- Lower-spec instances should use lazy loading to conserve resources, while high-spec instances with many vCPUs and high network bandwidth benefit from parallel pull mode. Storage performance varies: EBS volumes are bounded by their provisioned IOPS and volume type, potentially creating bottlenecks during unpacking, while NVMe instance store delivers maximum I/O performance at the cost of data persistence across instance stop/start cycles.
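The guidance above can be condensed into a small decision helper. This is only a sketch: the function name, the mode labels, and the vCPU and bandwidth thresholds are hypothetical placeholders chosen for illustration, not values recommended by the SOCI project or AWS, so benchmark on your own instance types before relying on them.

```python
def choose_pull_mode(has_soci_index: bool, vcpus: int, network_gbps: float,
                     min_vcpus: int = 16, min_network_gbps: float = 10.0) -> str:
    """Pick a pull mechanism from image and instance characteristics.

    The default thresholds are illustrative assumptions, not official guidance.
    """
    # High-spec instances can afford the extra CPU and bandwidth of parallel pull.
    if vcpus >= min_vcpus and network_gbps >= min_network_gbps:
        return "parallel-pull"
    # Lower-spec instances conserve resources with lazy loading,
    # which only works when the image has a SOCI index.
    if has_soci_index:
        return "lazy-loading"
    # Without an index, the snapshotter falls back to a standard pull.
    return "standard-pull"


print(choose_pull_mode(has_soci_index=True, vcpus=8, network_gbps=5.0))
print(choose_pull_mode(has_soci_index=True, vcpus=32, network_gbps=25.0))
print(choose_pull_mode(has_soci_index=False, vcpus=8, network_gbps=5.0))
```

Storage is deliberately left out of this sketch. If unpacking is bounded by EBS IOPS or throughput, a high-spec instance may gain less from parallel pull than the thresholds suggest.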
The following example shows the various mechanisms based on the vLLM Deep Learning Container:

Deep Learning Container pull mechanisms
Solution architecture
The following diagram shows the architecture for using SOCI with DLAMI and Deep Learning Containers.

Container startup time comparison with SOCI snapshotter
The following benchmarks compare standard Docker pulls against SOCI snapshotter in both lazy loading and parallel pull modes.
Lazy loading mode
Lazy loading mode starts containers immediately by fetching only the necessary data on demand, with remaining layers loaded in the background as needed.
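To make the idea concrete, here is a toy model of on-demand fetching. It is not the SOCI index format or the snapshotter's implementation: the class name, the file paths, and the sizes are all made up for illustration, and a plain dictionary stands in for the registry.

```python
from dataclasses import dataclass, field


@dataclass
class LazyImage:
    """Toy stand-in for a lazily loaded image backed by a remote store."""
    remote_files: dict            # path -> bytes held by the (simulated) registry
    local_cache: dict = field(default_factory=dict)
    bytes_fetched: int = 0

    def read(self, path: str) -> bytes:
        # Fetch from the simulated registry on first access only,
        # then serve every later read from the local cache.
        if path not in self.local_cache:
            data = self.remote_files[path]
            self.local_cache[path] = data
            self.bytes_fetched += len(data)
        return self.local_cache[path]


# Hypothetical image contents: a small entrypoint plus large files it never touches.
image = LazyImage(remote_files={
    "/usr/bin/python3": b"x" * 6_000,
    "/opt/model/weights.bin": b"w" * 900_000,
    "/usr/share/doc/README": b"d" * 94_000,
})
total_bytes = sum(len(data) for data in image.remote_files.values())

image.read("/usr/bin/python3")  # startup touches only the entrypoint
image.read("/usr/bin/python3")  # repeat reads cost no additional transfer

print(f"fetched {image.bytes_fetched} of {total_bytes} bytes before the workload started")
```

In this toy case, startup transfers only the bytes the entrypoint actually reads. The flip side, as with real lazy loading, is that the first access to any other file pays the fetch cost at that moment.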
Prerequisites
SOCI index required
Important: Lazy loading mode requires the container image to have a SOCI index stored in the registry. Without a SOCI index, the snapshotter falls back to standard pull behavior, and you won't see any performance improvement. AWS Deep Learning Containers (DLCs) with the -soci tag suffix come with SOCI indexes pre-created and pushed to the registry, enabling lazy loading out of the box. For custom images, you must create and push SOCI indexes yourself.
Environment
- Instance Type: g5.2xlarge
- EBS: 500GiB, 3000 IOPS, 125 MB/s throughput
- AMI: Deep Learning Base OSS Nvidia Driver GPU AMI (Ubuntu 24.04) 20260413 (ami-06abbbf2049359343)
- Docker Image: public.ecr.aws/deep-learning-containers/vllm:0.19.0-gpu-py312-ec2-soci
- Image Size: 9.72GB (compressed), 32.7GB (disk usage)
- Network: Corp
Start container with Docker (non-SOCI)
We use Docker to start the inference server directly. Because no image exists locally, Docker pulls and extracts the entire image before starting the container.
Total time: 6m59.099s.
Start container with SOCI snapshotter (lazy loading)
We use nerdctl with SOCI snapshotter to start the inference container. Although no image exists locally, the SOCI-indexed image allows nerdctl to pull only the index and the layers needed to start the container, and to lazily load the remaining layers.
Total time: 21.125s.
Lazy loading abstract
Using SOCI snapshotter with lazy loading, the container started in 21.125 seconds, compared to 6 minutes 59.099 seconds with standard Docker. The improvement comes from SOCI pulling only the layers needed to start the container and loading the remaining layers on demand.
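As a quick check on the figures above, the speedup can be computed directly from the two reported wall-clock times. The helper name parse_duration is illustrative and only handles the "XmY.YYYs" style used in this post.

```python
import re


def parse_duration(text: str) -> float:
    """Convert a duration such as '6m59.099s' or '21.125s' to seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?\s*(\d+(?:\.\d+)?)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    minutes = int(match.group(1) or 0)
    return minutes * 60 + float(match.group(2))


docker_seconds = parse_duration("6m59.099s")  # standard Docker start
soci_seconds = parse_duration("21.125s")      # SOCI lazy loading start
speedup = docker_seconds / soci_seconds

print(f"{docker_seconds:.3f}s / {soci_seconds:.3f}s = {speedup:.1f}x faster")
```

The ratio comes out just under 20x for this single run. Results will vary with image contents, instance type, and network conditions.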
Parallel pull mode
While lazy loading mode starts containers immediately by fetching only the required data on demand, parallel pull mode downloads the entire image before startup but does so with higher concurrency than standard Docker pulls. This mode is ideal when you need the full image available at startup or when running I/O-intensive workloads.
Environment
- Instance Type: g5.4xlarge
- EBS: 500GiB gp3, 16000 IOPS, 1000 MB/s throughput
- AMI: Deep Learning Base OSS Nvidia Driver GPU AMI (Ubuntu 24.04) 20260413 (ami-06abbbf2049359343)
- Docker Image: 763104351884.dkr.ecr.us-east-1.amazonaws.com/sglang:0.5.10-gpu-py312-cu129-ubuntu24.04-sagemaker
- Image Size: 19.32GB (compressed), 60.4GB (disk usage)
- Network: Corp
Note: We use a private ECR image for this benchmark because public ECR is fronted by Amazon CloudFront, which limits network bandwidth and affects parallel mode performance. Private ECR is served directly from Amazon Simple Storage Service (Amazon S3), providing higher throughput.
Enabling parallel pull mode
The SOCI snapshotter on the Deep Learning AMI defaults to lazy loading mode. To enable parallel pull mode, edit the configuration file at /etc/soci-snapshotter-grpc/config.toml.
Apply the configuration by restarting the SOCI snapshotter service.
Tip: You can tune max_concurrent_downloads_per_image and max_concurrent_unpacks_per_image based on your instance type and network bandwidth. For detailed tuning guidance, see Introducing Seekable OCI Parallel Pull Mode for Amazon EKS.
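As a rough sketch of what such a configuration fragment might look like, the helper below renders the two tuning keys named in the tip. The table name [pull_modes.parallel_pull_unpack] and the enable key are assumptions based on the upstream SOCI snapshotter documentation, so verify them against the snapshotter version installed on your AMI. The function name and the values 8 and 4 are placeholders, not recommendations.

```python
def render_parallel_pull_config(downloads_per_image: int, unpacks_per_image: int) -> str:
    """Return a TOML fragment that enables parallel pull with the given limits.

    The table name and the 'enable' key are assumptions to verify before use.
    """
    if downloads_per_image < 1 or unpacks_per_image < 1:
        raise ValueError("concurrency limits must be positive integers")
    return "\n".join([
        "[pull_modes.parallel_pull_unpack]  # assumed table name, verify before use",
        "enable = true  # assumed key, verify before use",
        f"max_concurrent_downloads_per_image = {downloads_per_image}",
        f"max_concurrent_unpacks_per_image = {unpacks_per_image}",
    ])


# Placeholder values; tune them for your instance type and network bandwidth.
fragment = render_parallel_pull_config(downloads_per_image=8, unpacks_per_image=4)
print(fragment)
```

Generating the fragment from two parameters makes it easy to try several concurrency settings during benchmarking and keep the one that performs best on a given instance type.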
Verifying parallel mode is active
Monitor the SOCI snapshotter logs during an image pull to confirm that parallel mode is enabled, and look for log entries that indicate parallel pull and unpack activity.
Pull image with Docker (non-SOCI)
A standard Docker pull downloads and extracts layers with limited concurrency.
Whole time: 4m 44.163s
Pull image with SOCI parallel mode
nerdctl with SOCI parallel pull mode uses higher concurrency for both download and unpack operations.
Whole time: 2m 12.846s
Parallel pull summary
Using SOCI parallel pull mode reduced image pull time from 4 minutes 44 seconds to 2 minutes 13 seconds, roughly a 2.1x improvement in pull performance.
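Another way to read these numbers is as effective pull throughput. The sketch below assumes the compressed image size approximates the bytes transferred and that 1 GB equals 1000 MB. The reported times also include unpacking, so these are end-to-end rates rather than raw network speeds.

```python
COMPRESSED_IMAGE_GB = 19.32  # compressed image size reported for the benchmark


def effective_throughput_mb_s(size_gb: float, seconds: float) -> float:
    """End-to-end pull rate in MB/s, assuming 1 GB = 1000 MB."""
    return size_gb * 1000 / seconds


docker_seconds = 4 * 60 + 44.163  # standard Docker pull
soci_seconds = 2 * 60 + 12.846    # SOCI parallel pull

docker_rate = effective_throughput_mb_s(COMPRESSED_IMAGE_GB, docker_seconds)
soci_rate = effective_throughput_mb_s(COMPRESSED_IMAGE_GB, soci_seconds)

print(f"Docker pull:        {docker_rate:.1f} MB/s")
print(f"SOCI parallel pull: {soci_rate:.1f} MB/s")
print(f"Ratio:              {soci_rate / docker_rate:.2f}x")
```

Under these assumptions the effective rate rises from roughly 68 MB/s to roughly 145 MB/s, a ratio of about 2.14.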
Conclusion
SOCI snapshotter provides improvements for both container startup and image pull operations:
- Lazy loading mode: about a 20x improvement in container startup time (from 6 minutes 59 seconds to about 21 seconds)
- Parallel pull mode: about a 2.1x improvement in image pull time (from 4 minutes 44 seconds to 2 minutes 13 seconds)
Choose lazy loading mode when you need the fastest possible container startup, or parallel pull mode when you need the full image available before your workload starts.
Clean up
If you launched EC2 instances to test SOCI snapshotter, terminate them to avoid incurring ongoing charges. Delete any container images you pushed to Amazon Elastic Container Registry (Amazon ECR) during testing, and remove any SOCI indexes you no longer need.
Getting began with SOCI
DLAMI and Deep Learning Containers are publicly available today with SOCI snapshotter and SOCI indexes. To select the images that support SOCI, see SOCI Index DLAMI, and see the Deep Learning Container repository for more information on supported images with a SOCI index.
For detailed configuration guidance and best practices, refer to the SOCI documentation and the Deep Learning Container SOCI documentation.