Amazon Nova Sonic is a foundation model that creates natural, human-like speech-to-speech conversations for generative AI applications, allowing users to interact with AI through voice in real time, with capabilities for understanding tone, enabling natural flow, and performing actions.
Multi-agent architecture offers a modular, robust, and scalable design pattern for production-level voice assistants. This blog post explores Amazon Nova Sonic voice agent applications and demonstrates how they integrate with Strands Agents framework sub-agents while leveraging Amazon Bedrock AgentCore to create an effective multi-agent system.
Why multi-agent architecture?
Imagine developing a financial assistant application responsible for user onboarding, information collection, identity verification, account inquiries, exception handling, and handing off to human agents based on predefined conditions. As functional requirements expand, the voice agent continues to add new inquiry types. The system prompt grows enormous, and the underlying logic becomes increasingly complex. This illustrates a persistent challenge in software development: monolithic designs lead to systems that are difficult to maintain and enhance.
Think of multi-agent architecture as building a team of specialized AI assistants rather than relying on a single do-it-all helper. Just as companies divide responsibilities across different departments, this approach breaks complex tasks into smaller, manageable pieces. Each AI agent becomes an expert in a specific area, whether that’s fact-checking, data processing, or handling specialized requests. For the user, the experience feels seamless: there’s no delay, no change in voice, and no visible handoff. The system functions behind the scenes, directing each expert agent to step in at the right moment.
In addition to modular and robust benefits, multi-agent systems offer advantages similar to a microservice architecture, a popular enterprise software design pattern, providing scalability, distribution, and maintainability while allowing organizations to reuse agentic workflows already developed for their large language model (LLM)-powered applications.
Sample application
In this blog, we refer to the Amazon Nova Sonic workshop multi-agent lab code, which uses the banking voice assistant as a sample to demonstrate how to deploy specialized agents on Amazon Bedrock AgentCore. It uses Nova Sonic as the voice interface layer and acts as an orchestrator to delegate detailed inquiries to sub-agents written in Strands Agents hosted on AgentCore Runtime. You can find the sample source code on the GitHub repo.
In the banking voice agent sample, the conversation flow begins with a greeting and collecting the user’s name, and then it handles inquiries related to banking or mortgages. We use three secondary-level agents hosted on AgentCore to handle specialized logic:
- Authenticate sub-agent: Handles user authentication using the account ID and other information
- Banking sub-agent: Handles account balance checks, statements, and other banking-related inquiries
- Mortgage sub-agent: Handles mortgage-related inquiries, including refinancing, rates, and repayment options

Sub-agents are self-contained, handling their own logic such as input validation. For instance, the authentication agent validates account IDs and returns errors to Nova Sonic if needed. This simplifies the reasoning logic in Nova Sonic while keeping business logic encapsulated, similar to modular design patterns in software engineering.
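To make the encapsulation idea concrete, here is a minimal sketch of the kind of input validation an authentication sub-agent might own. The 8-digit account ID format, the `authenticate` function, and the shape of the returned dictionary are all assumptions made for illustration; the workshop code may validate differently.

```python
import re

# Hypothetical account ID format for illustration only: exactly 8 digits.
ACCOUNT_ID_PATTERN = re.compile(r"\d{8}")


def authenticate(account_id: str, known_accounts: set[str]) -> dict:
    """Validate an account ID and return a structured result for the orchestrator."""
    # Spoken digits are often transcribed with spaces or dashes between them.
    normalized = re.sub(r"[\s-]", "", account_id)
    if not ACCOUNT_ID_PATTERN.fullmatch(normalized):
        return {"authenticated": False, "error": "Account ID must be 8 digits."}
    if normalized not in known_accounts:
        return {"authenticated": False, "error": "Account ID not found."}
    return {"authenticated": True, "accountId": normalized}
```

Because the error message comes back as data, the voice layer only has to read it out; it does not need to know the validation rules.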
Integrate Nova Sonic with AgentCore through tool use events
Amazon Nova Sonic relies on tool use to integrate with agentic workflows. During the Nova Sonic event lifecycle, you can provide tool use configurations through the promptStart event, which is designed to initiate when Sonic receives specific types of input.
For example, in the following Sonic tool configuration sample, tool use is configured to initiate events based on Sonic’s built-in reasoning model, which classifies the inquiry for routing to the banking sub-agents.
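A minimal sketch of what such a tool configuration might look like, built as a Python dictionary. The field names reflect our reading of the Nova Sonic event format and should be checked against the User Guide. The tool name bankAgent comes from this post, while authAgent, mortgageAgent, the descriptions, and the single `query` input field are assumptions. Audio and text output settings are omitted for brevity.

```python
import json


def tool_spec(name: str, description: str) -> dict:
    """Build one tool entry whose input schema is serialized as a JSON string."""
    schema = {
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "The user's inquiry."}
        },
        "required": ["query"],
    }
    return {
        "toolSpec": {
            "name": name,
            "description": description,
            "inputSchema": {"json": json.dumps(schema)},
        }
    }


def prompt_start_event(prompt_name: str) -> dict:
    """Assemble a promptStart event that carries the tool configuration."""
    tools = [
        # authAgent and mortgageAgent are hypothetical names for this sketch.
        tool_spec("authAgent", "Authenticate the user with an account ID."),
        tool_spec("bankAgent", "Answer account balance and statement inquiries."),
        tool_spec("mortgageAgent", "Answer mortgage rate, refinancing, and repayment inquiries."),
    ]
    return {
        "event": {
            "promptStart": {
                "promptName": prompt_name,
                "toolUseOutputConfiguration": {"mediaType": "application/json"},
                "toolConfiguration": {"tools": tools},
            }
        }
    }
```

The tool descriptions do the routing work: the model decides which tool to call by matching the user's inquiry against them, so they are worth writing carefully.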
When a user asks Nova Sonic a question such as ‘What’s my account balance?’, Sonic sends a toolUse event to the client application with the specified toolName (for example, bankAgent) defined in the configuration. The application can then invoke the sub-agent hosted on AgentCore to handle the banking logic and return the response to Sonic, which in turn generates an audio reply for the user.
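The routing step described above can be sketched as a small dispatcher in the client application. This is an illustration under assumptions: the `toolName`, `toolUseId`, and `content` fields follow our reading of the toolUse event, the sub-agent identifiers are placeholders, and `invoke_sub_agent` is a hypothetical callable that, in a real deployment, would wrap the call to the agent hosted on AgentCore Runtime.

```python
import json
from typing import Callable

# Maps each toolName to a sub-agent identifier (placeholder values).
SUB_AGENTS = {
    "authAgent": "auth-sub-agent",
    "bankAgent": "bank-sub-agent",
    "mortgageAgent": "mortgage-sub-agent",
}


def handle_tool_use(tool_use: dict, invoke_sub_agent: Callable[[str, str], str]) -> dict:
    """Route a toolUse event to the matching sub-agent and build the tool result."""
    # The tool input arrives as a JSON string; tolerate an empty payload.
    query = json.loads(tool_use.get("content") or "{}").get("query", "")
    target = SUB_AGENTS.get(tool_use["toolName"])
    if target is None:
        answer = "Sorry, I can't help with that request."
    else:
        answer = invoke_sub_agent(target, query)
    # The result is keyed by toolUseId so it can be matched to the request.
    return {
        "toolUseId": tool_use["toolUseId"],
        "content": json.dumps({"result": answer}),
    }
```

Passing the invoker in as a parameter keeps the routing logic testable without network access and leaves the transport choice open.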
Sub-agent on AgentCore
The following sample showcases the banking sub-agent developed using the Strands Agents framework, specifically configured for deployment on Bedrock AgentCore. It leverages Nova Lite through Amazon Bedrock as its reasoning model, providing effective cognitive capabilities with minimal latency. The agent implementation includes a system prompt that defines its banking assistant responsibilities, complemented by two specialized tools: one for account balance inquiries and another for bank statement retrieval.
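A minimal sketch of how such a sub-agent might be structured, assuming the `strands-agents` and `bedrock-agentcore` Python packages. The model ID, system prompt, mock data, and tool bodies are illustrative assumptions, not the workshop's actual code. The SDK imports are deferred into `build_app` so the tool logic can be exercised without the SDKs installed, and the SDK calls shown should be verified against the current Strands Agents and AgentCore documentation.

```python
# Assumed Bedrock model ID for Nova Lite; some Regions require an inference profile ID.
MODEL_ID = "amazon.nova-lite-v1:0"

SYSTEM_PROMPT = (
    "You are a banking assistant. Answer account balance and statement "
    "questions briefly, using the provided tools."
)

# Mock data standing in for a real banking backend.
_MOCK_BALANCES = {"12345678": 2500.75}


def get_account_balance(account_id: str) -> str:
    """Return the current balance for the given account ID."""
    balance = _MOCK_BALANCES.get(account_id)
    if balance is None:
        return f"No account found for ID {account_id}."
    return f"The balance for account {account_id} is ${balance:,.2f}."


def get_statement(account_id: str, month: str) -> str:
    """Return a short statement summary for the given account and month."""
    if account_id not in _MOCK_BALANCES:
        return f"No account found for ID {account_id}."
    return f"Statement for account {account_id}, {month}: 3 transactions."


def build_app():
    """Wire the tools into a Strands agent and expose it as an AgentCore app."""
    from bedrock_agentcore.runtime import BedrockAgentCoreApp
    from strands import Agent, tool
    from strands.models import BedrockModel

    agent = Agent(
        model=BedrockModel(model_id=MODEL_ID),
        system_prompt=SYSTEM_PROMPT,
        tools=[tool(get_account_balance), tool(get_statement)],
    )
    app = BedrockAgentCoreApp()

    @app.entrypoint
    def invoke(payload):
        # The "prompt" payload key is an assumption of this sketch.
        result = agent(payload.get("prompt", ""))
        return {"result": str(result)}

    return app

# To serve the agent locally, call build_app().run().
```

Keeping the tools as plain functions with docstrings means the same code serves as both the tool description for the model and a unit-testable piece of business logic.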
Best practices for voice-based multi-agent systems
Multi-agent architecture provides exceptional flexibility and a modular design approach, allowing developers to structure voice assistants efficiently and potentially reuse existing specialized agent workflows. When implementing voice-first experiences, there are important best practices to consider that address the unique challenges of this modality.
- Balance flexibility and latency: Although the ability to invoke sub-agents using Nova Sonic tool use events creates powerful capabilities, it can introduce additional latency to voice responses. For use cases that require a synchronized experience, each agent handoff represents a potential delay point in the interaction flow. Therefore, it’s important to design with response time in mind.
- Optimize model selection for sub-agents: Starting with smaller, more efficient models like Nova Lite for sub-agents can significantly reduce latency while still handling specialized tasks effectively. Reserve larger, more capable models for complex reasoning or when sophisticated natural language understanding is essential.
- Craft voice-optimized responses: Voice assistants perform best with concise, focused responses that can be followed by additional details when needed. This approach not only improves latency but also creates a more natural conversational flow that aligns with human expectations for verbal communication.
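One way to design with response time in mind is to give each sub-agent call a latency budget and fall back to a short spoken holding message when the budget is exceeded. The sketch below illustrates that idea with the standard library only; `call_with_budget`, the budget value, and the fallback wording are assumptions, not part of the sample application.

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout
from typing import Callable

# A small shared pool so slow sub-agent calls do not block the voice loop.
_executor = ThreadPoolExecutor(max_workers=4)


def call_with_budget(fn: Callable[[], str], budget_seconds: float, fallback: str) -> str:
    """Run fn in a worker thread; return fallback if it exceeds the latency budget."""
    future = _executor.submit(fn)
    try:
        return future.result(timeout=budget_seconds)
    except FutureTimeout:
        # cancel() only prevents work that has not started; a running call
        # continues in the background and its result is discarded.
        future.cancel()
        return fallback
```

A call such as `call_with_budget(lambda: ask_bank_agent(query), 2.0, "One moment while I check that.")`, where `ask_bank_agent` is a hypothetical helper, keeps the conversation moving even when a sub-agent is slow.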
Consider stateless vs. stateful sub-agent design
Stateless sub-agents handle each request independently, without retaining memory of past interactions or session-level states. They’re simple to implement, easy to scale, and work well for straightforward, one-off tasks. However, they cannot provide context-aware responses unless external state management is introduced.
Stateful sub-agents, on the other hand, maintain memory across interactions to support context-aware responses and session-level states. This enables more personalized and cohesive user experiences, but comes with added complexity and resource requirements. They’re best suited to scenarios involving multi-turn interactions and user or session-level context caching.
Conclusion
Multi-agent architectures unlock flexibility, scalability, and accuracy for complex AI-driven workflows. By combining the Nova Sonic conversational capabilities with the orchestration power of Bedrock AgentCore, you can build intelligent, specialized agents that work together seamlessly. If you’re exploring ways to enhance your AI applications, multi-agent patterns with Nova Sonic and AgentCore are a powerful approach worth testing.
Learn more about Amazon Nova Sonic by visiting the User Guide, building your application with the sample applications, and exploring the Nova Sonic workshop to get started. You can also refer to the technical report and model card for additional benchmarks.
About the authors
Lana Zhang is a Senior Specialist Solutions Architect for Generative AI at AWS within the Worldwide Specialist Organization. She specializes in AI/ML, with a focus on use cases such as AI voice assistants and multimodal understanding. She works closely with customers across diverse industries, including media and entertainment, gaming, sports, advertising, financial services, and healthcare, to help them transform their business solutions through AI.

