Headquartered in São Paulo, Brazil, iFood is a national private company and the leader in food-tech in Latin America, processing millions of orders monthly. iFood has stood out for its strategy of incorporating cutting-edge technology into its operations. With the support of AWS, iFood has developed a robust machine learning (ML) inference infrastructure, using services such as Amazon SageMaker to efficiently create and deploy ML models. This partnership has allowed iFood not only to optimize its internal processes, but also to offer innovative solutions to its delivery partners and restaurants.
iFood’s ML platform comprises a set of tools, processes, and workflows developed with the following objectives:
- Accelerate the development and training of AI/ML models, making them more reliable and reproducible
- Make sure that deploying these models to production is reliable, scalable, and traceable
- Facilitate the testing, monitoring, and evaluation of models in production in a transparent, accessible, and standardized way
To achieve these objectives, iFood uses SageMaker, which simplifies the training and deployment of models. Additionally, the integration of SageMaker features in iFood’s infrastructure automates critical processes, such as generating training datasets, training models, deploying models to production, and continuously monitoring their performance.
In this post, we show how iFood uses SageMaker to revolutionize its ML operations. By harnessing the power of SageMaker, iFood streamlines the entire ML lifecycle, from model training to deployment. This integration not only simplifies complex processes but also automates critical tasks.
AI inference at iFood
iFood has harnessed the power of a robust AI/ML platform to elevate the customer experience across its various touchpoints. Using cutting-edge AI/ML capabilities, the company has developed a suite of transformative solutions to address a multitude of customer use cases:
- Personalized recommendations – At iFood, AI-powered recommendation models analyze a customer’s past order history, preferences, and contextual factors to suggest the most relevant restaurants and menu items. This personalized approach makes sure customers discover new cuisines and dishes tailored to their tastes, improving satisfaction and driving increased order volumes.
- Intelligent order tracking – iFood’s AI systems track orders in real time, predicting delivery times with a high degree of accuracy. By understanding factors like traffic patterns, restaurant preparation times, and courier locations, the AI can proactively notify customers of their order status and expected arrival, reducing uncertainty and anxiety during the delivery process.
- Automated customer service – To handle the thousands of daily customer inquiries, iFood has developed an AI-powered chatbot that can quickly resolve common issues and questions. This intelligent virtual agent understands natural language, accesses relevant data, and provides personalized responses, delivering fast and consistent support without overburdening the human customer service team.
- Grocery shopping assistance – Integrating advanced language models, iFood’s app lets customers simply speak or type their recipe needs or grocery list, and the AI automatically generates a detailed shopping list. This voice-enabled grocery planning feature saves customers time and effort, improving their overall shopping experience.
Through these various AI-powered initiatives, iFood is able to anticipate customer needs, streamline key processes, and deliver a consistently exceptional experience, further strengthening its position as the leading food-tech platform in Latin America.
Solution overview
The following diagram illustrates iFood’s legacy architecture, which had separate workflows for data science and engineering teams, creating challenges in efficiently deploying accurate, real-time machine learning models into production systems.
In the past, the data science and engineering teams at iFood operated independently. Data scientists would build models using notebooks, adjust weights, and publish them onto services. Engineering teams would then struggle to integrate these models into production systems. This disconnect between the two teams made it challenging to deploy accurate real-time ML models.
To overcome this challenge, iFood built an internal ML platform that bridges this gap. The platform streamlines the workflow, providing a seamless experience for creating, training, and delivering models for inference. It gives data scientists a centralized place to build, train, and deploy models within a single integrated approach that reflects the teams’ development workflow. Engineering teams can then consume these models and integrate them into applications for both online and offline use, enabling a more efficient and streamlined workflow.
By breaking down the barriers between data science and engineering, AWS AI platforms empowered iFood to use the full potential of their data and accelerate the development of AI applications. The automated deployment and scalable inference capabilities provided by SageMaker made sure that models were readily available to power intelligent applications and provide accurate predictions on demand. This centralization of ML services as a product has been a game changer for iFood, allowing them to focus on building high-performing models rather than the intricate details of inference.
One of the core capabilities of iFood’s ML platform is the ability to provide the infrastructure to serve predictions. Several use cases are supported by the inference made available through ML Go!, which is responsible for deploying SageMaker pipelines and endpoints. The former are used to schedule offline prediction jobs, and the latter are employed to create model services to be consumed by the application services. The following diagram illustrates iFood’s updated architecture, which incorporates an internal ML platform built to streamline workflows between data science and engineering teams, enabling efficient deployment of machine learning models into production systems.
Integrating model deployment into the service development process was a key initiative to enable data scientists and ML engineers to deploy and maintain those models. The ML platform empowers the building and evolution of ML systems. Several other integrations with important platforms, like the feature platform and data platform, were delivered to improve the experience for users as a whole. The process of consuming ML-based decisions was streamlined, but it doesn’t end there. iFood’s ML platform, ML Go!, is now focusing on new inference capabilities, supported by recent features whose ideation and development the iFood team helped drive. The following diagram illustrates the final architecture of iFood’s ML platform, showcasing how model deployment is integrated into the service development process, the platform’s connections with the feature and data platforms, and its focus on new inference capabilities.
One of the biggest changes is the creation of a single abstraction for connecting with SageMaker endpoints and jobs, called the ML Go! Gateway, along with the separation of concerns within endpoints through the Inference Components feature, making serving faster and more efficient. In this new inference structure, the endpoints are also managed by the ML Go! CI/CD, leaving the pipelines to handle only model promotions rather than the infrastructure itself. This reduces the lead time for changes and the change failure rate of deployments.
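To make the separation of concerns concrete, the following is a minimal sketch of how a model can be attached to an endpoint as an inference component through the `CreateInferenceComponent` API. The names (`ranker-v3`, `ml-go-endpoint`) are hypothetical, and the helper simply builds the request; the real call is shown in a comment because it requires AWS credentials.

```python
def inference_component_request(component_name, endpoint_name, model_name,
                                copies=1, min_memory_mb=1024):
    """Build a CreateInferenceComponent request. Each component declares its
    own compute slice and copy count, independent of the endpoint hosting it,
    so several models can share one endpoint's fleet."""
    return {
        "InferenceComponentName": component_name,
        "EndpointName": endpoint_name,
        "VariantName": "AllTraffic",
        "Specification": {
            "ModelName": model_name,
            "ComputeResourceRequirements": {
                "NumberOfCpuCoresRequired": 1.0,
                "MinMemoryRequiredInMb": min_memory_mb,
            },
        },
        "RuntimeConfig": {"CopyCount": copies},
    }

# With AWS credentials in place, this becomes a single API call:
# import boto3
# boto3.client("sagemaker").create_inference_component(
#     **inference_component_request("ranker-v3", "ml-go-endpoint", "ranker-model-v3"))
```

Because the component, not the endpoint, carries the model and its resource requirements, the CI/CD system can promote a new model version by swapping components without touching the endpoint infrastructure.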
Using SageMaker Inference model serving containers
One of the key features of modern machine learning platforms is the standardization of machine learning and AI services. By encapsulating models and dependencies as Docker containers, these platforms provide consistency and portability across different environments and stages of ML. Using SageMaker, data scientists and developers can use pre-built Docker containers, making it straightforward to deploy and manage ML services. As a project progresses, they can spin up new instances and configure them according to their specific requirements. SageMaker provides Docker containers that are designed to work seamlessly with the platform, offering a standardized and scalable environment for running ML workloads.
SageMaker provides a set of pre-built containers for popular ML frameworks and algorithms, such as TensorFlow, PyTorch, XGBoost, and many others. These containers are optimized for performance and include all the necessary dependencies and libraries pre-installed, making it straightforward to get started with your ML projects. In addition to the pre-built containers, SageMaker offers the option to bring your own custom containers, which include your specific ML code, dependencies, and libraries. This can be particularly useful if you’re using a less common framework or have specific requirements that aren’t met by the pre-built containers.
iFood focused heavily on using custom containers for training and deploying ML workloads, providing a consistent and reproducible environment for ML experiments and making it simple to track and replicate results. The first step in this journey was to standardize the custom ML code, which is the piece of code the data scientists should focus on. Moving away from notebooks, and with BruceML, the way teams create the code to train and serve models changed: it is encapsulated from the start as container images. BruceML is responsible for creating the scaffolding required to integrate seamlessly with the SageMaker platform, allowing the teams to take advantage of its various features, such as hyperparameter tuning, model deployment, and monitoring. By standardizing ML services and using containerization, modern platforms democratize ML, enabling iFood to rapidly build, deploy, and scale intelligent applications.
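Once a custom image is built and pushed to Amazon ECR, registering it with SageMaker is a small step. The following is a minimal sketch of the `CreateModel` request that pairs a custom container with a trained artifact in S3; all names, image URIs, and S3 paths here are illustrative placeholders, not iFood's actual values.

```python
def custom_model_request(model_name, image_uri, model_data_url, role_arn):
    """Build a CreateModel request: a SageMaker model is the pairing of a
    container image (how to serve) with a model artifact in S3 (what to serve)."""
    return {
        "ModelName": model_name,
        "PrimaryContainer": {
            "Image": image_uri,              # custom ECR image built by the scaffolding
            "ModelDataUrl": model_data_url,  # s3://.../model.tar.gz from the training job
        },
        "ExecutionRoleArn": role_arn,
    }

# Example registration (requires AWS credentials; values are hypothetical):
# import boto3
# boto3.client("sagemaker").create_model(**custom_model_request(
#     "churn-model-v1",
#     "123456789012.dkr.ecr.us-east-1.amazonaws.com/bruceml-churn:latest",
#     "s3://example-ml-artifacts/churn/model.tar.gz",
#     "arn:aws:iam::123456789012:role/SageMakerExecutionRole"))
```

The same image can then back a real-time endpoint, a batch transform job, or a pipeline step, which is what makes the container the unit of standardization.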
Automating model deployment and ML system retraining
When running ML models in production, it’s critical to have a robust and automated process for deploying and recalibrating those models across different use cases. This helps make sure the models remain accurate and performant over time. The team at iFood understood this challenge well: more than just the model is deployed. Instead, they rely on another concept to keep things running well: ML pipelines.
Using Amazon SageMaker Pipelines, they were able to build a CI/CD system for ML that delivers automated retraining and model deployment. They also integrated this system with the company’s existing CI/CD pipeline, making it efficient and maintaining the good DevOps practices used at iFood. The process starts with the ML Go! CI/CD pipeline pushing the latest code artifacts containing the model training and deployment logic. It includes the training process, which uses different containers to implement the entire pipeline. When training is complete, the inference pipeline can be executed to begin the model deployment. This can be an entirely new model, or the promotion of a new version to improve the performance of an existing one. Every model available for deployment is also secured and registered automatically by ML Go! in Amazon SageMaker Model Registry, providing versioning and tracking capabilities.
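Registration in the Model Registry is what makes promotion auditable. The sketch below shows the shape of a `CreateModelPackage` request that adds a new version to a model package group; the group name, image, and content types are illustrative assumptions, and a governance flow like ML Go!'s would typically leave new versions pending until promoted.

```python
def model_package_request(group_name, image_uri, model_data_url, approved=False):
    """Build a CreateModelPackage request. Each call adds a new version to the
    named model package group; promotion flips the approval status."""
    return {
        "ModelPackageGroupName": group_name,
        "InferenceSpecification": {
            "Containers": [{"Image": image_uri, "ModelDataUrl": model_data_url}],
            "SupportedContentTypes": ["application/json"],
            "SupportedResponseMIMETypes": ["application/json"],
        },
        "ModelApprovalStatus": "Approved" if approved else "PendingManualApproval",
    }

# import boto3
# boto3.client("sagemaker").create_model_package(
#     **model_package_request("ranker", "img:latest", "s3://bucket/model.tar.gz"))
```

Downstream deployment steps can then query the group for its latest approved version, so "what is in production" is always answerable from the registry.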
The final step depends on the intended inference requirements. For batch prediction use cases, the pipeline creates a SageMaker batch transform job to run large-scale predictions. For real-time inference, the pipeline deploys the model to a SageMaker endpoint, carefully selecting the appropriate container variant and instance type to handle the expected production traffic and latency needs. This end-to-end automation has been a game changer for iFood, allowing them to rapidly iterate on their ML models and deploy updates and recalibrations quickly and confidently across their various use cases. SageMaker Pipelines has provided a streamlined way to orchestrate these complex workflows, making sure model operationalization is efficient and reliable.
Running inference in different SLA formats
iFood uses the inference capabilities of SageMaker to power its intelligent applications and deliver accurate predictions to its customers. By integrating the robust inference options available in SageMaker, iFood has been able to seamlessly deploy ML models and make them available for real-time and batch predictions. For iFood’s online, real-time prediction use cases, the company uses SageMaker hosted endpoints to deploy their models. These endpoints are integrated into iFood’s customer-facing applications, enabling immediate inference on incoming data from users. SageMaker handles the scaling and management of these endpoints, making sure that iFood’s models are readily available to provide accurate predictions and enhance the user experience.
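From the application side, calling a hosted endpoint is a single request through the SageMaker runtime. Below is a minimal sketch: the endpoint name, payload fields, and response format are hypothetical (they depend entirely on what the model's container expects), and the AWS call itself is shown in a comment since it needs credentials and a live endpoint.

```python
import json

def recommendation_payload(customer_id, context):
    """Serialize a request the way a hypothetical recommendation
    container's inference code would expect it."""
    return json.dumps({"customer_id": customer_id, "context": context})

# Real-time call against a deployed endpoint (names are illustrative):
# import boto3
# runtime = boto3.client("sagemaker-runtime")
# resp = runtime.invoke_endpoint(
#     EndpointName="recommendations-endpoint",
#     ContentType="application/json",
#     Body=recommendation_payload("c-123", {"city": "sao-paulo", "hour": 20}),
# )
# predictions = json.loads(resp["Body"].read())
```

Because the contract is just bytes over HTTPS with a content type, application services stay decoupled from whichever framework the model was trained in.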
In addition to real-time predictions, iFood also uses SageMaker batch transform to perform large-scale, asynchronous inference on datasets. This is particularly useful for iFood’s data preprocessing and batch prediction requirements, such as generating recommendations or insights for their restaurant partners. SageMaker batch transform jobs enable iFood to efficiently process vast amounts of data, further enhancing their data-driven decision-making.
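A batch transform job reads a dataset from S3, runs it through the same registered model, and writes predictions back to S3. The following is a minimal sketch of the `CreateTransformJob` request; the S3 paths, content type, and instance type are illustrative assumptions.

```python
def transform_job_request(job_name, model_name, input_s3, output_s3,
                          instance_type="ml.m5.xlarge", instance_count=1):
    """Build a CreateTransformJob request: line-delimited records are read
    from the input prefix, scored by the model's container, and the
    predictions are assembled line by line under the output path."""
    return {
        "TransformJobName": job_name,
        "ModelName": model_name,
        "TransformInput": {
            "DataSource": {
                "S3DataSource": {"S3DataType": "S3Prefix", "S3Uri": input_s3},
            },
            "ContentType": "application/json",
            "SplitType": "Line",
        },
        "TransformOutput": {"S3OutputPath": output_s3, "AssembleWith": "Line"},
        "TransformResources": {
            "InstanceType": instance_type,
            "InstanceCount": instance_count,
        },
    }

# import boto3
# boto3.client("sagemaker").create_transform_job(**transform_job_request(
#     "partner-insights-2024-06", "ranker-model-v3",
#     "s3://example-bucket/inputs/", "s3://example-bucket/outputs/"))
```

The fleet exists only for the duration of the job, which is why this mode fits scheduled offline predictions better than a long-lived endpoint.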
Building on the success of standardizing on SageMaker Inference, iFood has been instrumental in partnering with the SageMaker Inference team to build and enhance key AI inference capabilities within the SageMaker platform. Since the early days of its ML adoption, iFood has provided the SageMaker Inference team with valuable input and expertise, enabling the introduction of several new features and optimizations:
- Cost and performance optimizations for generative AI inference – iFood helped the SageMaker Inference team develop innovative techniques to optimize the use of accelerators, enabling SageMaker Inference to reduce foundation model (FM) deployment costs by 50% on average and latency by 20% on average with inference components. This breakthrough delivers significant cost savings and performance improvements for customers running generative AI workloads on SageMaker.
- Scaling improvements for AI inference – iFood’s expertise in distributed systems and auto scaling has also helped the SageMaker team develop advanced capabilities to better handle the scaling requirements of generative AI models. These improvements reduce auto scaling times by up to 40% and speed up auto scaling detection by six times, making sure that customers can rapidly scale their inference workloads on SageMaker to meet spikes in demand without compromising performance.
- Streamlined generative AI model deployment for inference – Recognizing the need for simplified model deployment, iFood collaborated with AWS to introduce the ability to deploy open source large language models (LLMs) and FMs with just a few clicks. This user-friendly functionality removes the complexity traditionally associated with deploying these advanced models, empowering more customers to harness the power of AI.
- Scale-to-zero for inference endpoints – iFood played a crucial role in collaborating with SageMaker Inference to develop and launch the scale-to-zero feature for SageMaker inference endpoints. This capability allows inference endpoints to automatically shut down when not in use and rapidly spin up on demand when new requests arrive. It is particularly beneficial for dev/test environments, low-traffic applications, and use cases with varying inference demands, because it eliminates idle resource costs while maintaining the ability to quickly serve requests when needed. Scale-to-zero represents a major advancement in cost-efficiency for AI inference, making it more accessible and economically viable for a wider range of use cases.
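Scale-to-zero is driven through Application Auto Scaling on an inference component: registering the component's copy count as a scalable target with a minimum of zero lets it drop to no copies when idle. The sketch below builds such a target; the component name is hypothetical, and the exact resource/dimension strings should be checked against the current AWS documentation.

```python
def scale_to_zero_target(inference_component_name, max_copies=4):
    """Build a RegisterScalableTarget request for Application Auto Scaling
    that allows a SageMaker inference component to scale down to zero copies."""
    return {
        "ServiceNamespace": "sagemaker",
        "ResourceId": f"inference-component/{inference_component_name}",
        "ScalableDimension": "sagemaker:inference-component:DesiredCopyCount",
        "MinCapacity": 0,   # zero copies when there is no traffic
        "MaxCapacity": max_copies,
    }

# import boto3
# boto3.client("application-autoscaling").register_scalable_target(
#     **scale_to_zero_target("ranker-v3"))
```

A scaling policy on top of this target then decides when to add copies back as requests arrive; the first request after an idle period pays the spin-up latency.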
- Packaging AI model inference more efficiently – To further simplify the AI model lifecycle, iFood worked with AWS to enhance SageMaker’s capabilities for packaging LLMs and models for deployment. These improvements make it straightforward to prepare and deploy these AI models, accelerating their adoption and integration.
- Multi-model endpoints for GPU – iFood collaborated with the SageMaker Inference team to launch multi-model endpoints for GPU-based instances. This enhancement lets you deploy multiple AI models on a single GPU-enabled endpoint, significantly improving resource utilization and cost-efficiency. By taking advantage of iFood’s expertise in GPU optimization and model serving, SageMaker now offers a solution that can dynamically load and unload models on GPUs, reducing infrastructure costs by up to 75% for customers with multiple models and varying traffic patterns.
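A multi-model endpoint is configured by marking the container as `MultiModel` and pointing it at an S3 prefix that holds many model archives; each request then selects a model by its relative path. A minimal sketch, with hypothetical names and paths:

```python
def multi_model_container(image_uri, model_prefix_s3):
    """Container definition for a multi-model endpoint: instead of one
    model.tar.gz, ModelDataUrl is an S3 prefix, and SageMaker loads and
    unloads individual archives on the shared instance on demand."""
    return {
        "Image": image_uri,
        "Mode": "MultiModel",
        "ModelDataUrl": model_prefix_s3,  # s3:// prefix holding many model archives
    }

# At invocation time, TargetModel picks an archive relative to the prefix:
# import boto3
# runtime = boto3.client("sagemaker-runtime")
# resp = runtime.invoke_endpoint(
#     EndpointName="mme-gpu-endpoint",
#     TargetModel="restaurant-ranking/model-v7.tar.gz",
#     ContentType="application/json",
#     Body=b'{"features": [1, 2, 3]}',
# )
```

The trade-off is a cold-load delay the first time a rarely used model is requested, in exchange for packing many models onto one GPU instance.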
- Asynchronous inference – Recognizing the need to handle long-running inference requests, the team at iFood worked closely with the SageMaker Inference team to develop and launch asynchronous inference in SageMaker. This feature lets you process large payloads or time-consuming inference requests without the constraints of real-time API calls. iFood’s experience with large-scale distributed systems helped shape this solution, which now allows for better management of resource-intensive inference tasks and the ability to handle inference requests that can take several minutes to complete. This capability has opened up new use cases for AI inference, particularly in industries dealing with complex data processing tasks such as genomics, video analysis, and financial modeling.
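With asynchronous inference, the endpoint config carries an `AsyncInferenceConfig`, requests point at an input object in S3 rather than an inline body, and results are written back to S3, optionally with SNS notifications. The sketch below builds that config fragment; the bucket names and SNS topic ARNs are placeholders.

```python
def async_inference_config(output_s3, success_topic=None, error_topic=None):
    """Build the AsyncInferenceConfig fragment for CreateEndpointConfig:
    results land under the S3 output path, and optional SNS topics signal
    completion or failure so callers do not have to poll."""
    cfg = {"OutputConfig": {"S3OutputPath": output_s3}}
    notifications = {}
    if success_topic:
        notifications["SuccessTopic"] = success_topic
    if error_topic:
        notifications["ErrorTopic"] = error_topic
    if notifications:
        cfg["OutputConfig"]["NotificationConfig"] = notifications
    return cfg

# An async request references an input object instead of an inline payload:
# import boto3
# runtime = boto3.client("sagemaker-runtime")
# resp = runtime.invoke_endpoint_async(
#     EndpointName="async-endpoint",
#     InputLocation="s3://example-bucket/inputs/request-0001.json",
# )
# resp["OutputLocation"] tells you where the result will eventually be written.
```

Because the call returns immediately with the future output location, a minutes-long prediction never holds open an HTTP connection.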
By closely partnering with the SageMaker Inference team, iFood has played a pivotal role in driving the rapid evolution of AI inference and generative AI inference capabilities in SageMaker. The features and optimizations introduced through this collaboration are empowering AWS customers to unlock the transformative potential of inference with greater ease, cost-effectiveness, and performance.
“At iFood, we have been at the forefront of adopting transformative machine learning and AI technologies, and our partnership with the SageMaker Inference product team has been instrumental in shaping the future of AI applications. Together, we’ve developed strategies to efficiently manage inference workloads, allowing us to run models with speed and price-performance. The lessons we’ve learned supported us in the creation of our internal platform, which can serve as a blueprint for other organizations looking to harness the power of AI inference. We believe the features we have built in collaboration will broadly help other enterprises that run inference workloads on SageMaker, unlocking new frontiers of innovation and business transformation by solving recurring and critical problems in the universe of machine learning engineering.”
– Daniel Vieira, ML platform manager at iFood
Conclusion
Using the capabilities of SageMaker, iFood transformed its approach to ML and AI, unlocking new possibilities for enhancing the customer experience. By building a robust and centralized ML platform, iFood has bridged the gap between its data science and engineering teams, streamlining the model lifecycle from development to deployment. The integration of SageMaker features has enabled iFood to deploy ML models for both real-time and batch-oriented use cases. For real-time, customer-facing applications, iFood uses SageMaker hosted endpoints to provide immediate predictions and enhance the user experience. Additionally, the company uses SageMaker batch transform to efficiently process large datasets and generate insights for its restaurant partners. This flexibility in inference options has been key to iFood’s ability to power a diverse range of intelligent applications.
The automation of deployment and retraining through ML Go!, supported by SageMaker Pipelines and SageMaker Inference, has been a game changer for iFood. This has enabled the company to rapidly iterate on its ML models, deploy updates with confidence, and maintain the ongoing performance and reliability of its intelligent applications. Moreover, iFood’s strategic partnership with the SageMaker Inference team has been instrumental in driving the evolution of AI inference capabilities within the platform. Through this collaboration, iFood has helped shape cost and performance optimizations, scaling enhancements, and simplified model deployment features, all of which are now benefiting a wider range of AWS customers.
By taking advantage of the capabilities SageMaker offers, iFood has been able to unlock the transformative potential of AI and ML, delivering innovative solutions that enhance the customer experience and strengthen its position as the leading food-tech platform in Latin America. This journey serves as a testament to the power of cloud-based AI infrastructure and the value of strategic partnerships in driving technology-driven business transformation.
By following iFood’s example, you can unlock the full potential of SageMaker for your business, driving innovation and staying ahead in your industry.
About the Authors
Daniel Vieira is a seasoned machine learning engineering manager at iFood, with a strong academic background in computer science, holding both a bachelor’s and a master’s degree from the Federal University of Minas Gerais (UFMG). With over a decade of experience in software engineering and platform development, Daniel leads iFood’s ML platform, building a robust, scalable ecosystem that drives impactful ML solutions across the company. In his spare time, Daniel enjoys music, philosophy, and learning about new things while drinking a good cup of coffee.
Debora Fanin serves as a Senior Customer Solutions Manager at AWS for the Digital Native Business segment in Brazil. In this role, Debora manages customer transformations, creating cloud adoption strategies to support cost-effective, timely deployments. Her responsibilities include designing change management plans, guiding solution-focused decisions, and addressing potential risks to align with customer objectives. Debora’s academic path includes a master’s degree in Administration at FEI and certifications such as Amazon Solutions Architect Associate and Agile credentials. Her professional history spans IT and project management roles across diverse sectors, where she developed expertise in cloud technologies, data science, and customer relations.
Saurabh Trikande is a Senior Product Manager for Amazon Bedrock and Amazon SageMaker Inference. He is passionate about working with customers and partners, motivated by the goal of democratizing AI. He focuses on core challenges related to deploying complex AI applications, inference with multi-tenant models, cost optimizations, and making the deployment of generative AI models more accessible. In his spare time, Saurabh enjoys hiking, learning about innovative technologies, following TechCrunch, and spending time with his family.
Gopi Mudiyala is a Senior Technical Account Manager at AWS. He helps customers in the financial services industry with their operations in AWS. As a machine learning enthusiast, Gopi works to help customers succeed in their ML journey. In his spare time, he likes to play badminton, spend time with family, and travel.