The rise of artificial intelligence (AI) agents marks a shift in software development and in how applications make decisions and interact with users. While traditional systems follow predictable paths, AI agents engage in complex reasoning that remains hidden from view. This invisibility creates a challenge for organizations: how can they trust what they can't see? This is where agent observability enters the picture, offering deep insights into how agentic applications perform, interact, and execute tasks.
In this post, we explain how to integrate Langfuse observability with Amazon Bedrock AgentCore to gain deep visibility into an AI agent's performance, debug issues faster, and optimize costs. We walk through a complete implementation using Strands agents deployed on AgentCore Runtime, followed by step-by-step code examples.
Amazon Bedrock AgentCore is a comprehensive agentic platform for deploying and operating highly capable AI agents securely, at scale. It offers purpose-built infrastructure for dynamic agent workloads, powerful tools to enhance agents, and essential controls for real-world deployment. AgentCore comprises fully managed services that can be used together or independently. These services work with any framework, including CrewAI, LangGraph, LlamaIndex, and Strands Agents, and with any foundation model in or outside of Amazon Bedrock, offering flexibility and reliability. AgentCore emits telemetry data in a standardized OpenTelemetry (OTEL)-compatible format, enabling straightforward integration with an existing monitoring and observability stack. It offers detailed visualizations of each step in the agent workflow, so you can inspect an agent's execution path, audit intermediate outputs, and debug performance bottlenecks and failures.
How Langfuse tracing works
Langfuse uses OpenTelemetry to trace and monitor agents deployed on Amazon Bedrock AgentCore. OpenTelemetry is a Cloud Native Computing Foundation (CNCF) project that provides a set of specifications, APIs, and libraries defining a standard way to collect distributed traces and metrics from an application. You can track performance metrics including token usage, latency, and execution durations across different processing phases. The system creates hierarchical trace structures that capture both streaming and non-streaming responses, with detailed operation attributes and error states.
Through the /api/public/otel endpoint, Langfuse functions as an OpenTelemetry backend, mapping traces to its data model using generative AI semantic conventions. This is particularly valuable for complex large language model (LLM) applications using chains and agents with tools, where nested traces help developers quickly identify and resolve issues. The integration supports systematic debugging, performance monitoring, and audit trail maintenance, making it easier for teams to build and maintain reliable AI applications on Amazon Bedrock AgentCore.
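As a minimal sketch, assuming Langfuse Cloud and placeholder project keys, an application can point a standard OTEL exporter at this endpoint using HTTP Basic authentication:

```python
import base64
import os

# Placeholder keys; substitute your own Langfuse project API keys.
LANGFUSE_PUBLIC_KEY = "pk-lf-..."
LANGFUSE_SECRET_KEY = "sk-lf-..."

# Langfuse authenticates OTLP requests with HTTP Basic auth over
# base64("<public_key>:<secret_key>").
auth_token = base64.b64encode(
    f"{LANGFUSE_PUBLIC_KEY}:{LANGFUSE_SECRET_KEY}".encode()
).decode()

# Standard OTEL exporter variables, honored by most OpenTelemetry SDKs.
os.environ["OTEL_EXPORTER_OTLP_ENDPOINT"] = "https://cloud.langfuse.com/api/public/otel"
os.environ["OTEL_EXPORTER_OTLP_HEADERS"] = f"Authorization=Basic {auth_token}"
```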
In addition to agent observability, Langfuse offers a suite of built-in tools covering the full LLM application development lifecycle. This includes running automated LLM-as-a-judge evaluators (online and offline), organizing data labeling for root cause analysis and evaluator alignment, tracking experiments (locally and in CI), iterating on prompts interactively in a playground, and version-controlling prompts in the UI using prompt management.
Solution overview
This post shows how to deploy a Strands agent on Amazon Bedrock AgentCore Runtime with Langfuse observability. The implementation uses Anthropic Claude models through Amazon Bedrock. Telemetry data flows from the Strands agent through OTEL exporters to Langfuse for monitoring and debugging. To use Langfuse, set disable_otel=True in the AgentCore Runtime deployment configuration; this turns off AgentCore's default observability.
Figure 1: Architecture overview
Key components used in the solution are:
- Strands Agents: Python framework for building LLM-powered agents with built-in telemetry support
- Amazon Bedrock AgentCore Runtime: Managed runtime service for hosting and scaling agents on Amazon Web Services (AWS)
- Langfuse: Open source observability and evaluation platform for LLM applications that receives traces via OTEL
- OpenTelemetry: Industry-standard protocol for collecting and exporting telemetry data
Technical implementation guide
Now that we have covered how Langfuse tracing works, we can walk through how to implement it with Amazon Bedrock AgentCore.
Prerequisites
- An AWS account
- AWS credentials configured correctly before using Amazon Bedrock. They can be set up using the AWS CLI or by setting environment variables; for this notebook, we assume the credentials are already configured.
- Amazon Bedrock model access for Anthropic Claude 3.7 Sonnet in the us-west-2 Region
- Amazon Bedrock AgentCore permissions
- Python 3.10+
- Docker installed locally
- A Langfuse account, which is required to create a Langfuse API key.
- Sign up for Langfuse Cloud, create a project, and get the API keys
- Alternatively, you can self-host Langfuse in your own AWS account using the Terraform module.
Walkthrough
The following steps walk through how to use Langfuse to collect traces from agents created with the Strands SDK in AgentCore Runtime. You can also refer to this notebook on GitHub to get started immediately.
Clone this GitHub repo:
Once the repo is cloned, go to the Amazon Bedrock AgentCore samples directory, find the notebook runtime_with_strands_and_langfuse.ipynb, and run each cell.
Step 1: Install Python dependencies and required packages for our Strands agent
Execute the cell below to install the dependencies, which are defined in the requirements.txt file.
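In the notebook, this is a single cell along the following lines (a minimal sketch; the pinned versions live in the repo's requirements.txt):

```python
# Notebook cell: install the project dependencies
# (equivalent to `pip install -r requirements.txt` in a terminal).
%pip install -r requirements.txt
```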
Step 2: Agent implementation
The agent file (strands_claude.py) implements a travel agent with web search capabilities.
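The full agent code lives in the repo; the following is a minimal sketch of the pattern, assuming the Strands Agents SDK and the bedrock-agentcore package, with a hypothetical stub for the web_search tool body:

```python
from strands import Agent, tool
from strands.models import BedrockModel
from bedrock_agentcore.runtime import BedrockAgentCoreApp

app = BedrockAgentCoreApp()

@tool
def web_search(query: str) -> str:
    """Search the web for up-to-date travel information."""
    # Hypothetical stub; the sample notebook wires this to a real search API.
    return f"Search results for: {query}"

model = BedrockModel(model_id="us.anthropic.claude-3-7-sonnet-20250219-v1:0")
agent = Agent(
    model=model,
    tools=[web_search],
    system_prompt="You are a helpful travel assistant.",
)

@app.entrypoint
def invoke(payload):
    # AgentCore Runtime passes the request payload as a dict.
    result = agent(payload.get("prompt", ""))
    return {"result": str(result)}

if __name__ == "__main__":
    app.run()
```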
Step 3: Configure AgentCore Runtime deployment
Next, use our starter toolkit to configure the AgentCore Runtime deployment with an entry point, the execution role we created, and a requirements file. Additionally, configure the starter kit to auto-create the Amazon Elastic Container Registry (Amazon ECR) repository on launch.
During the configure step, a Dockerfile is generated based on the application code. When you use the bedrock_agentcore_starter_toolkit to configure the agent, it enables AgentCore observability by default. To use Langfuse instead, disable the default OTEL integration by setting the configuration flag to True, as shown in the following code block.
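A minimal sketch of the configure call, assuming the starter toolkit's Runtime helper; the execution role ARN and agent name below are placeholders:

```python
from bedrock_agentcore_starter_toolkit import Runtime

agentcore_runtime = Runtime()

response = agentcore_runtime.configure(
    entrypoint="strands_claude.py",
    execution_role="arn:aws:iam::123456789012:role/AgentCoreExecutionRole",  # placeholder
    auto_create_ecr=True,                  # create the Amazon ECR repository on launch
    requirements_file="requirements.txt",
    region="us-west-2",
    agent_name="strands_claude_langfuse",  # hypothetical name
    disable_otel=True,                     # turn off AgentCore's default observability
)
```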
Figure 2: Configure AgentCore Runtime
Step 4: Deploy to AgentCore Runtime
Now that a Dockerfile has been generated, launch the agent to AgentCore Runtime to create the Amazon ECR repository and the AgentCore Runtime.
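With the starter toolkit, this is a single call (a sketch, continuing the assumptions above):

```python
# Build the container image, push it to Amazon ECR, and create the
# AgentCore Runtime from the generated Dockerfile.
launch_result = agentcore_runtime.launch()
print(launch_result.agent_arn)  # ARN of the deployed agent runtime
```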
Now configure the Langfuse secret key, public key, and OTEL endpoint in AWS Systems Manager Parameter Store, which provides secure, hierarchical storage for configuration data and secrets management.
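A sketch using boto3 follows; the parameter names are hypothetical, so align them with what your notebook reads:

```python
import boto3

ssm = boto3.client("ssm", region_name="us-west-2")

# SecureString keeps the Langfuse keys encrypted at rest.
ssm.put_parameter(
    Name="/app/langfuse/public_key",
    Value="pk-lf-...",  # your Langfuse public key
    Type="SecureString",
    Overwrite=True,
)
ssm.put_parameter(
    Name="/app/langfuse/secret_key",
    Value="sk-lf-...",  # your Langfuse secret key
    Type="SecureString",
    Overwrite=True,
)
ssm.put_parameter(
    Name="/app/langfuse/otel_endpoint",
    Value="https://cloud.langfuse.com/api/public/otel/v1/traces",
    Type="String",
    Overwrite=True,
)
```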
The following table describes the configuration parameters being used.
| Parameter | Description | Default |
|---|---|---|
| `langfuse_public_key` | API key for the OTEL endpoint | Environment variable |
| `langfuse_secret_key` | Secret key for the OTEL endpoint | Environment variable |
| `OTEL_EXPORTER_OTLP_ENDPOINT` | Trace endpoint | https://cloud.langfuse.com/api/public/otel/v1/traces |
| `OTEL_EXPORTER_OTLP_HEADERS` | Authentication type | Basic |
| `DISABLE_ADOT_OBSERVABILITY` | Disables AgentCore's default AWS Distro for OpenTelemetry (ADOT) observability so Langfuse is used instead | True |
| `BEDROCK_MODEL_ID` | Amazon Bedrock model ID | us.anthropic.claude-3-7-sonnet-20250219-v1:0 |
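At runtime, the agent can read these parameters back and assemble the OTLP settings; a sketch, assuming the hypothetical parameter names used above:

```python
import base64
import os
import boto3

ssm = boto3.client("ssm", region_name="us-west-2")

def get_param(name: str) -> str:
    # WithDecryption is required for SecureString parameters.
    return ssm.get_parameter(Name=name, WithDecryption=True)["Parameter"]["Value"]

public_key = get_param("/app/langfuse/public_key")
secret_key = get_param("/app/langfuse/secret_key")

auth_token = base64.b64encode(f"{public_key}:{secret_key}".encode()).decode()
os.environ["OTEL_EXPORTER_OTLP_ENDPOINT"] = get_param("/app/langfuse/otel_endpoint")
os.environ["OTEL_EXPORTER_OTLP_HEADERS"] = f"Authorization=Basic {auth_token}"
os.environ["DISABLE_ADOT_OBSERVABILITY"] = "true"
```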
Step 5: Check deployment status
Wait for the runtime to be ready before invoking:
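A sketch of the polling loop, assuming the starter toolkit's status() helper exposes the endpoint status as in the AgentCore samples:

```python
import time

# Poll the runtime endpoint until it reaches a terminal state.
status = agentcore_runtime.status().endpoint["status"]
while status not in ("READY", "CREATE_FAILED", "UPDATE_FAILED"):
    time.sleep(10)
    status = agentcore_runtime.status().endpoint["status"]
print(status)  # "READY" on success
```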
A successful deployment shows a "Ready" state for the agent runtime.
Step 6: Invoke the AgentCore Runtime
Finally, invoke our AgentCore Runtime with a payload.
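With the starter toolkit, a test invocation looks like the following sketch (the prompt and payload shape are illustrative):

```python
# Invoke the deployed agent with a JSON payload.
invoke_response = agentcore_runtime.invoke(
    {"prompt": "Plan a weekend trip to Seattle, including food recommendations."}
)
print(invoke_response)
```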
Once the AgentCore Runtime has been invoked, you should be able to see the traces in the Langfuse dashboard.
Step 7: View traces in Langfuse
After running the agent, go to the Langfuse project to view the detailed traces. The traces include:
- Agent invocation details
- Tool calls (web search)
- Model interactions with latency and token usage
- Request/response payloads
Traces and hierarchy
Langfuse captures all interactions, from user requests down to individual model calls. Each trace captures the complete execution path, including API calls, function invocations, and model responses, creating a comprehensive timeline of agent actions. The nested structure of traces enables developers to drill down into specific interactions and identify performance bottlenecks or error patterns at any level of the execution chain. To further enhance observability, Langfuse provides tagging mechanisms that can be implemented in agent workflows, as sketched below.
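For example, Strands agents accept trace attributes that Langfuse maps to its session, user, and tag fields; the following is a minimal sketch with placeholder values:

```python
from strands import Agent
from strands.models import BedrockModel

agent = Agent(
    model=BedrockModel(model_id="us.anthropic.claude-3-7-sonnet-20250219-v1:0"),
    trace_attributes={
        # Langfuse-specific attributes for grouping and filtering traces.
        "session.id": "demo-session-001",   # placeholder session ID
        "user.id": "user@example.com",      # placeholder user ID
        "langfuse.tags": ["travel-agent", "bedrock-agentcore"],
    },
)
```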
Figure 3: Traces in Langfuse
Combining hierarchical traces with strategic tagging provides insight into agent operations, enabling data-driven optimization and better user experiences. As shown in the following image, developers can drill down into the precise timing of each operation within the agent's execution flow. In this example, the total request took 26.57s, with individual breakdowns for the event loop cycle, tool calls, and other components. Use this timing information to find performance bottlenecks and reduce response times. For instance, certain LLM operations might take longer than expected, or there may be opportunities to parallelize specific actions to reduce overall latency. These insights support data-driven decisions that improve agent performance and deliver a better customer experience.
Figure 4: Detailed trace hierarchy
Langfuse dashboard
The Langfuse dashboard features three different dashboards for monitoring: cost, latency, and usage management.
Figure 5: Langfuse dashboard
Cost tracking
Cost tracking helps monitor expenses at both the aggregate and individual request levels to maintain control over AI infrastructure spend. The platform provides detailed cost breakdowns per model, user, and function call, enabling teams to identify cost-intensive operations and optimize their implementation. This granular cost visibility supports data-driven decisions about model selection, prompt engineering, and resource allocation while staying within budget constraints. Dashboard cost data is provided for estimation purposes; actual charges should be verified through official billing statements.
Figure 6: Cost dashboard
Langfuse latency dashboard
Latency metrics can be monitored across traces and generations for performance optimization. The dashboard shows the following metrics by default, and you can create custom charts and dashboards depending on your needs:
- P95 Latency by Level (Observations)
- P95 Latency by Use Case
- Max Latency by User ID (Traces)
- Avg Time to First Token by Prompt Name (Observations)
- P95 Time to First Token by Model
- P95 Latency by Model
- Avg Output Tokens per Second by Model
Figure 7: Latency dashboard
Langfuse usage management
This dashboard shows metrics across traces, observations, and scores to help manage resource allocation.
Figure 8: Usage management dashboard
Conclusion
This post demonstrated how to integrate Langfuse with AgentCore for comprehensive observability of AI agents. You can now track performance, debug interactions, and optimize costs across workflows. We expect more Langfuse observability features and integration options in the future to help scale AI applications.
Start implementing Langfuse with AgentCore today to gain deeper insights into your agents' performance, track conversation flows, and optimize your AI applications. For more information, visit the following resources:
About the authors
Richa Gupta is a Senior Solutions Architect at Amazon Web Services, specializing in AI/ML, generative AI, and agentic AI. She is passionate about helping customers on their AI transformation journey, architecting end-to-end solutions from proof of concept to production deployment that drive business revenue. Beyond her professional pursuits, Richa likes to make latte art and is an adventure enthusiast.
Ishan Singh is a Senior Generative AI Data Scientist at Amazon Web Services, where he partners with customers to architect innovative and responsible generative AI solutions. With deep expertise in AI and machine learning, Ishan leads the development of production generative AI solutions at scale, with a focus on evaluations and observability. Outside of work, he enjoys playing volleyball, exploring local bike trails, and spending time with his wife, kid, and dog, Beau.
Yanyan Zhang is a Senior Generative AI Data Scientist at Amazon Web Services, where she has been working on cutting-edge AI/ML technologies as a generative AI specialist, helping customers use generative AI to achieve their desired outcomes. Yanyan graduated from Texas A&M University with a PhD in Electrical Engineering. Outside of work, she loves traveling, working out, and exploring new things.
Madhu Samhitha is a Specialist Solutions Architect at Amazon Web Services, focused on helping customers implement generative AI solutions. She combines her knowledge of large language models with strategic innovation to deliver business value. She has a master's in computer science from the University of Massachusetts Amherst and has worked in various industries. Beyond her technical role, Madhu is a trained classical dancer, an art enthusiast, and enjoys exploring national parks.
Marc Klingen is the co-founder and CEO of Langfuse, the open source LLM engineering platform. After building LLM agents in 2023 together with his co-founders, Marc and the team realized that new tooling is needed to bring agents into production and scale them reliably. With Langfuse, they have built the leading open source LLM engineering platform (observability, evaluation, prompt management), with over 18,000 GitHub stars, 14.8M+ SDK installs per month, and 6M+ Docker pulls. Langfuse is used by top engineering teams such as Khan Academy, Samsara, Twilio, and Merck.