Large language models (LLMs) now support a wide range of use cases, from content summarization to the ability to reason about complex tasks. One exciting new topic is taking generative AI to the physical world by applying it to robotics and physical hardware.
Inspired by this, we developed a game for the AWS re:Invent 2024 Builders Fair using Amazon Bedrock, Strands Agents, AWS IoT Core, AWS Lambda, and Amazon DynamoDB. Our goal was to demonstrate how LLMs can reason about game strategy and complex tasks, and control physical robots in real time.
RoboTic-Tac-Toe is an interactive game where two physical robots move around a tic-tac-toe board, with both the gameplay and the robots’ movements orchestrated by LLMs. Players can control the robots using natural language commands, directing them to place their markers on the game board. In this post, we explore the architecture and prompt engineering techniques used to reason about a tic-tac-toe game and decide the next best game strategy and movement plan for the current player.
An interactive experience
RoboTic-Tac-Toe demonstrates an intuitive interaction between humans, robots, and AI. Participants can access the game portal by scanning a QR code, and choose from multiple modes:
- Player vs. Player – Challenge a human opponent
- Player vs. LLM – Test your skills against an AI-powered LLM
- LLM vs. LLM – Watch two AI models strategize and compete autonomously
When a player chooses a target cell, the two robots, positioned beside a tic-tac-toe board, respond to commands by executing precise movements to place X or O markers. The following video shows this in action.
Solution overview
RoboTic-Tac-Toe features a seamless integration of AWS services, alleviating the need for pre-programmed sequences. Instead, AI dynamically generates descriptive instructions in real time. The following diagram describes the architecture built on AWS IoT Core, which enables communication between the Raspberry Pi controlled robots and the cloud.

The solution uses the following key services:
Hardware and software
- The project’s physical setup includes a tic-tac-toe board embedded with LED indicators to highlight placements for X and O.
- The two robots (modified toy models) operate through Raspberry Pi controllers equipped with infrared and RF modules.
- A mounted Raspberry Pi camera enables vision-based analysis, capturing the board’s state and transmitting data for further computer vision processing. Additionally, a dedicated hardware controller acts as an IoT device that connects to AWS IoT Core, which promotes smooth gameplay interactions.

- On the software side, AWS Lambda invokes the supervisor Strands Agent for the core game logic and orchestration.
- Computer vision capabilities, powered by OpenCV, analyze the board’s layout and drive precise robot movements. Amazon Bedrock agents orchestrate tasks to generate movement plans and game strategies.
Strands Agents in action
Strands Agents automate tasks for your application users by orchestrating interactions between the foundation model (FM), data sources, software applications, and user conversations.
Supervisor Agent
The Supervisor Agent acts as an orchestrator that manages both the Move Agent and the Game Agent, coordinating and streamlining decisions across the system. This process consists of the following steps:
- The agent receives high-level instructions or gameplay events (for example, “Player X moved to 2B, generate the robot’s response”) and determines which specialized agent (Move Agent or Game Agent) must be invoked.
- The Supervisor AWS Lambda function serves as the central controller. When triggered, it parses the incoming request, validates the context, and then routes the request to the appropriate Strands Agent. Tracing is enabled for the entire workflow to allow for monitoring and debugging.
- Depending on the request type:
  - If it involves updating or analyzing the game state, the Supervisor invokes the Game Agent, which retrieves the board status and generates the next AI-driven move.
  - If it involves physical robot navigation, the Supervisor invokes the Move Agent, which produces the movement instructions in Python code.
- The Supervisor Agent consolidates the responses from the underlying agents and structures them into a unified output format. This allows for consistency whether the result is a robot command, a game move, or a combination of both.
- The interactions, including decision paths and final outputs, are logged in an S3 bucket. This logging mechanism provides traceability across multiple agents and supports error handling by returning structured error messages when issues arise.
This module provides a governance layer over the AI-powered environment, enabling scalable orchestration across agents. By intelligently directing requests and unifying responses, the Supervisor Agent facilitates reliable execution, simplified monitoring, and an enhanced user experience.
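The routing steps above can be illustrated with a minimal sketch. It is not the code used in the project: the request fields (`type`, `input_text`), the response shape, and the two lambda functions standing in for the Move Agent and Game Agent are all assumptions made for illustration, and no real agent or AWS call is made.

```python
from typing import Callable, Dict

# Hypothetical stand-in type for an agent: takes input text, returns a reply.
AgentFn = Callable[[str], str]


def supervise(request: Dict[str, str], agents: Dict[str, AgentFn]) -> Dict[str, object]:
    """Validate a request, route it to one agent, and wrap the reply."""
    request_type = request.get("type")
    text = request.get("input_text", "")
    if request_type not in agents:
        return {"ok": False, "error": f"unknown request type: {request_type!r}"}
    if not text:
        return {"ok": False, "error": "input_text is required"}
    try:
        result = agents[request_type](text)
    except Exception as exc:
        # Return a structured error instead of letting the exception escape.
        return {"ok": False, "error": str(exc)}
    # Same envelope whether the result is a game move or a robot command.
    return {"ok": True, "agent": request_type, "result": result}


# Placeholder agents; the real ones would be LLM-backed.
agents = {
    "game": lambda text: "next move: 2B",
    "move": lambda text: "turn_left(90); forward(20)",
}
```

Calling `supervise({"type": "move", "input_text": "3A to 4B North"}, agents)` routes to the placeholder Move Agent, while an unknown type or an empty input yields a structured error rather than an exception.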
Move Agent
The Move Agent generates step-by-step movement instructions as Python code. This process consists of the following steps:
- The agent receives a start and destination position on a grid (for example, “3A to 4B North”), determines the required movements, and sends commands to the appropriate robot.
- The LLM Navigator AWS Lambda function generates movement instructions for robots using Strands Agents. When triggered, it receives a request containing a session ID and an input text specifying the robot’s starting position and destination. The function then invokes the Strands Agent, sending the request with tracing enabled to allow for debugging.
- The response from the agent consists of movement commands such as turning and moving forward in centimeters.
- These commands are processed and logged to a CSV file in an S3 bucket. If the log file exists, new entries are appended. Otherwise, a new file is created.
- The function returns a JSON response containing the generated instructions and the time taken to execute the request. If an error occurs, a structured error message is returned.
This module provides efficient and traceable navigation for robots by using AI-powered instruction generation while maintaining a robust logging mechanism for monitoring and debugging.
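The append-or-create logging behavior described above might look like the following sketch. To keep it runnable without AWS credentials, a plain dictionary stands in for the S3 bucket; the column names and the object key are hypothetical, and a real function would read and write the object through the S3 API instead.

```python
import csv
import io
from typing import Dict, List

# Assumed column layout; the actual log schema is not given in this post.
HEADER = ["session_id", "input_text", "instructions", "elapsed_seconds"]


def append_log_row(store: Dict[str, str], key: str, row: List[str]) -> None:
    """Append one row to a CSV object, creating it with a header if absent.

    `store` is a dict standing in for an S3 bucket (an assumption for this
    sketch), mapping object keys to their text content.
    """
    existing = store.get(key)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    if existing is None:
        writer.writerow(HEADER)   # new file: start with the header row
    else:
        buffer.write(existing)    # existing file: keep the prior rows
    writer.writerow(row)
    store[key] = buffer.getvalue()
```

Because S3 objects cannot be appended to in place, a read-modify-write cycle like this is one common way to get append semantics; with concurrent writers it would need some form of locking or one object per entry.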
Game Agent
The Game Agent functions as an opponent, capable of playing against human users. To enhance accessibility, players use a mobile-friendly web portal to interact with the game, which includes an admin panel for managing AI-driven matches. The LLM player is a serverless application that combines AWS Lambda, Amazon DynamoDB, and Strands Agents to manage and automate the moves. It tracks game progress by storing move history in an Amazon DynamoDB table, allowing it to reconstruct the current board state whenever requested. The gameplay process consists of the following steps:
- When a player makes a move, the supervisor Strands Agent retrieves the current board state and then calls the Strands Agent function to generate the next move. The agent selection depends on the player’s marker (‘X’ or ‘O’), making sure that the correct model is used for decision-making.
- The agent processes the current game board as input and returns the recommended next move through an event stream.
- The entire workflow is orchestrated by the supervisor Strands Agent. This agent receives API requests, validates inputs, retrieves the board state, invokes the LLM, and returns a structured response containing the updated game status.
This system allows for real-time, AI-driven gameplay, making it possible for players to compete against an intelligent opponent powered by LLMs.
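As a rough illustration of rebuilding the board from stored move history, the sketch below replays a list of moves onto an empty 3×3 grid. The record shape (`cell`, `marker`), the row/column labels, and the rule that X moves first are assumptions for this example; the actual DynamoDB item layout is not described in this post.

```python
from typing import Dict, List, Optional

ROWS = "123"  # assumed row labels
COLS = "ABC"  # assumed column labels


def rebuild_board(moves: List[Dict[str, str]]) -> List[List[Optional[str]]]:
    """Replay the move history, in order, onto an empty 3x3 board."""
    board: List[List[Optional[str]]] = [[None for _ in COLS] for _ in ROWS]
    for move in moves:
        cell, marker = move["cell"], move["marker"]
        row, col = ROWS.index(cell[0]), COLS.index(cell[1])
        if board[row][col] is not None:
            raise ValueError(f"cell {cell} is already taken")
        board[row][col] = marker
    return board


def next_marker(moves: List[Dict[str, str]]) -> str:
    """Return whose turn it is, assuming X always moves first."""
    return "X" if len(moves) % 2 == 0 else "O"
```

Storing moves rather than board snapshots keeps each write small and preserves the full game history, at the cost of replaying the list whenever the state is needed, which is negligible for a nine-cell board.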
Powering robot navigation with computer vision
In our RoboTic-Tac-Toe project, computer vision plays a crucial role in producing precise robot movements and gameplay accuracy. Let’s walk through how we implemented the solution using AWS services and advanced computer vision techniques. Our setup includes a Raspberry Pi camera mounted above the game board, continuously monitoring the robots’ positions and movements. The camera captures images that are automatically uploaded to Amazon S3, forming the foundation of our vision processing pipeline.
We use Principal Component Analysis (PCA) to accurately detect and track robot orientation and position on the game board. This technique helps reduce dimensionality while maintaining essential features for robot tracking. The orientation angle is calculated based on the principal components of the robot’s visual features.
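To make the PCA step concrete, here is a small sketch that estimates the direction of the first principal component from a set of 2D points, such as the pixel coordinates of a robot’s detected outline. It uses only the standard library and the closed-form solution for a 2×2 covariance matrix; the function name and input format are assumptions, and the project’s OpenCV-based implementation may differ.

```python
import math
from typing import List, Tuple


def principal_axis_angle(points: List[Tuple[float, float]]) -> float:
    """Return the first principal component's angle in degrees, in [0, 180)."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Entries of the 2x2 covariance matrix of the centered points.
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / n
    syy = sum((y - mean_y) ** 2 for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n
    # Closed-form direction of the leading eigenvector for a 2x2 matrix.
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return math.degrees(angle) % 180.0
```

The principal axis only fixes the robot’s heading up to a 180-degree ambiguity, so an additional cue (for example, a marker on the front of the robot) is needed to tell front from back.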
Our OpenCV module is containerized and deployed as an Amazon SageMaker endpoint. It processes images stored in Amazon S3 to determine the following:
- Precise robot positioning on the game board
- Current orientation angles
- Movement validation
A dedicated AWS Lambda function orchestrates the vision processing workflow. It handles the following:
- SageMaker endpoint invocation
- Processing of vision analysis results
- Real-time position and orientation updates
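The workflow above could be sketched as a function that sends an image reference to the endpoint and normalizes the reply. The request and response payload fields (`bucket`, `key`, `robot`, `cell`, `angle_degrees`) are hypothetical, since the endpoint’s contract is not described here. The client is passed in as a parameter; in a Lambda function it would typically be `boto3.client("sagemaker-runtime")`, whose `invoke_endpoint` call takes the keyword arguments shown. A small fake client is included so the sketch runs locally.

```python
import io
import json
from typing import Any, Dict


def analyze_board_image(runtime_client: Any, endpoint_name: str,
                        bucket: str, key: str) -> Dict[str, Any]:
    """Invoke the vision endpoint for one image and normalize its reply."""
    payload = json.dumps({"bucket": bucket, "key": key})  # assumed request shape
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=payload,
    )
    result = json.loads(response["Body"].read())
    return {
        "robot": result["robot"],
        "cell": result["cell"],
        "angle_degrees": float(result["angle_degrees"]),
    }


class FakeRuntime:
    """Local stand-in for the SageMaker runtime client (illustration only)."""

    def invoke_endpoint(self, EndpointName: str, ContentType: str, Body: str) -> Dict[str, Any]:
        reply = {"robot": "X", "cell": "2B", "angle_degrees": 90}
        return {"Body": io.BytesIO(json.dumps(reply).encode("utf-8"))}


update = analyze_board_image(FakeRuntime(), "vision-endpoint", "my-bucket", "board.jpg")
```

Injecting the client keeps the function testable without network access and leaves credentials and region configuration to the caller.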
This computer vision system facilitates accurate robot navigation and game state tracking, contributing to the seamless gameplay experience in RoboTic-Tac-Toe. The combination of PCA for orientation detection, OpenCV for image processing, and AWS services for deployment helps create a robust and scalable computer vision solution.

Conclusion
RoboTic-Tac-Toe showcases how AI, robotics, and cloud computing can converge to create interactive experiences. This project highlights the potential of AWS IoT, machine learning (ML), and generative AI in gaming, education, and beyond. As AI-driven robotics continues to evolve, RoboTic-Tac-Toe serves as a glimpse into the future of intelligent, interactive gaming.
Stay tuned for future enhancements, expanded gameplay modes, and even more engaging AI-powered interactions.
About the authors
Georges Hamieh is a Senior Technical Account Manager at Amazon Web Services, specialized in Data and AI. Passionate about innovation and technology, he partners with customers to accelerate their digital transformation and cloud adoption journeys. An experienced public speaker and mentor, Georges enjoys capturing life through photography and exploring new places on road trips with his family.
Mohamed Salah is a Senior Solutions Architect at Amazon Web Services, supporting customers across the Middle East and North Africa in building scalable and intelligent cloud solutions. He’s passionate about Generative AI, Digital Twins, and helping organizations turn innovation into impact. Outside work, Mohamed enjoys playing PlayStation, building LEGO sets, and watching movies with his family.
Saddam Hussain is a Senior Solutions Architect at Amazon Web Services, specializing in the Aerospace, Generative AI, and Innovation & Transformation practice areas. Drawing from Amazon.com’s pioneering journey in AI/ML and Generative AI, he helps organizations understand proven methodologies and best practices that have scaled across millions of customers. His primary focus is helping Public Sector customers across the UAE innovate on AWS, guiding them through the comprehensive Cloud Adoption Framework (CAF) to strategically adopt cutting-edge technologies while building sustainable capabilities.
Dr. Omer Dawelbeit is a Principal Solutions Architect at AWS. He’s passionate about tackling complex technology challenges and working closely with customers to design and implement scalable, high-impact solutions. Omer has over two decades of financial services, public sector, and telecoms experience across startups, enterprises, and large-scale technology transformations.

