GenAI models are good at a handful of tasks such as text summarization, question answering, and code generation. If you have a business process that can be broken down into a set of steps, and one or more of those steps involves one of these GenAI superpowers, then you will be able to partially automate your business process using GenAI. We call the software application that automates such a step an agent.
While agents use LLMs simply to process text and generate responses, this basic capability can provide quite advanced behavior, such as the ability to invoke backend services autonomously.
Let's say that you want to build an agent that is able to answer questions such as "Is it raining in Chicago?". You cannot answer a question like this using just an LLM, because it is not a task that can be carried out by memorizing patterns from large volumes of text. Instead, to answer this question, you will need to reach out to real-time sources of weather information.
There is an open and free API from the US National Weather Service (NWS) that provides the short-term weather forecast for a location. However, using this API to answer a question like "Is it raining in Chicago?" requires several additional steps (see Figure 1):
- We will need to set up an agentic framework to coordinate the rest of these steps.
- What location is the user interested in? The answer in our example sentence is "Chicago". It is not as simple as just extracting the last word of the sentence; if the user were to ask "Is Orca Island hot today?", the location of interest would be "Orca Island". Because extracting the location from a question requires being able to understand natural language, you can prompt an LLM to identify the location the user is asking about.
- The NWS API operates on latitudes and longitudes. If you want the weather in Chicago, you will have to convert the string "Chicago" into a point latitude and longitude and then invoke the API. This is called geocoding. Google Maps has a Geocoder API that, given a place name such as "Chicago", will respond with the latitude and longitude. Tell the agent to use this tool to get the coordinates of the location.
- Send the location coordinates to the NWS weather API. You will get back a JSON object containing weather data.
- Tell the LLM to extract the relevant weather forecast (for example, depending on whether the question is about now, tonight, or next Monday) and add it to the context of the question.
- Based on this enriched context, the agent is finally able to answer the user's question.
Let's go through these steps one by one.
First, we will use Autogen, an open-source agentic framework created by Microsoft. To follow along, clone my Git repository and get API keys following the directions provided by Google Cloud and OpenAI. Switch to the genai_agents folder, and update the keys.env file with your keys.
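Concretely, assuming the repository layout implied by the links at the end of this post, the setup is:
git clone https://github.com/lakshmanok/lakblogs.git
cd lakblogs/genai_agents
The keys.env file then contains entries of the form: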
GOOGLE_API_KEY=AI…
OPENAI_API_KEY=sk-…
Next, install the required Python modules using pip:
pip install -r requirements.txt
This will install the autogen module and client libraries for Google Maps and OpenAI.
Follow the discussion below by referring to ag_weather_agent.py.
Autogen treats agentic tasks as a conversation between agents. So, the first step in Autogen is to create the agents that will carry out the individual steps. One will be the proxy for the end user. It will initiate chats with the AI agent that we will refer to as the Assistant:
user_proxy = UserProxyAgent("user_proxy",
    code_execution_config={"work_dir": "coding", "use_docker": False},
    is_termination_msg=lambda x: autogen.code_utils.content_str(x.get("content")).find("TERMINATE") >= 0,
    human_input_mode="NEVER",
)
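This snippet presumes the usual Autogen imports near the top of the file, along these lines:

```python
import os

import autogen
from autogen import AssistantAgent, UserProxyAgent
```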
There are three things to note about the user proxy above:
- If the Assistant responds with code, the user proxy is capable of executing that code in a sandbox.
- The user proxy terminates the conversation if the Assistant's response contains the word TERMINATE. This is how the LLM tells us that the user's question has been fully answered. Making the LLM do this is part of the hidden system prompt that Autogen sends to the LLM.
- The user proxy never asks the end user any follow-up questions. If there were follow-ups, we would specify the condition under which the human is asked for more input (see the sketch below).
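For example, Autogen's human_input_mode also accepts "ALWAYS" and "TERMINATE" in addition to "NEVER". A sketch of a proxy that consults the human before ending the conversation:

```python
# A variant user proxy that asks the human for input whenever the
# conversation would otherwise terminate; "ALWAYS" would ask on every turn.
user_proxy_hitl = UserProxyAgent(
    "user_proxy",
    code_execution_config={"work_dir": "coding", "use_docker": False},
    human_input_mode="TERMINATE",
)
```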
Even though Autogen is from Microsoft, it is not limited to Azure OpenAI. The AI assistant can use OpenAI:
openai_config = {
"config_list": [
{
"model": "gpt-4",
"api_key": os.environ.get("OPENAI_API_KEY")
}
]
}
or Gemini:
gemini_config = {
"config_list": [
{
"model": "gemini-1.5-flash",
"api_key": os.environ.get("GOOGLE_API_KEY"),
"api_type": "google"
}
],
}
Anthropic and Ollama are supported as well.
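For instance, an Anthropic configuration presumably follows the same pattern; the model name and api_type string below are assumptions, so check the Autogen docs for the exact values:

```python
# Assumed Anthropic configuration, mirroring the OpenAI/Gemini configs above.
anthropic_config = {
    "config_list": [
        {
            "model": "claude-3-5-sonnet-20240620",  # assumed model name
            "api_key": os.environ.get("ANTHROPIC_API_KEY"),
            "api_type": "anthropic",  # assumed; verify against the Autogen docs
        }
    ],
}
```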
Supply the appropriate LLM configuration to create the Assistant:
assistant = AssistantAgent(
"Assistant",
llm_config=gemini_config,
max_consecutive_auto_reply=3
)
Before we wire up the rest of the agentic framework, let's ask the Assistant to answer our sample query:
response = user_proxy.initiate_chat(
assistant, message=f"Is it raining in Chicago?"
)
print(response)
The Assistant responds with this code, which reaches out to an existing Google web service and scrapes the response:
```python
# filename: weather.py
import requests
from bs4 import BeautifulSoup

url = "https://www.google.com/search?q=weather+chicago"
response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')
weather_info = soup.find('div', {'id': 'wob_tm'})
print(weather_info.text)
```
This gets at the power of an agentic framework when powered by a frontier foundation model: the Assistant has autonomously found a web service that provides the desired functionality and is using its code generation and execution capability to produce something close to the desired functionality! However, it is not quite what we wanted; we asked whether it was raining, and we got back the full website instead of the desired answer.
Secondly, the autonomous capability does not really meet our pedagogical needs. We are using this example as illustrative of enterprise use cases, and it is unlikely that the LLM will know about your internal APIs and tools well enough to be able to use them autonomously. So, let's proceed to build out the framework shown in Figure 1 to invoke the specific APIs we want to use.
Because extracting the location from the question is just text processing, you can simply prompt the LLM. Let's do this with a one-shot example:
SYSTEM_MESSAGE_1 = """
In the question below, what location is the user asking about?
Example:
Question: What's the weather in Kalamazoo, Michigan?
Answer: Kalamazoo, Michigan.
Question:
"""
Now, when we initiate the chat by asking whether it's raining in Chicago:
response1 = user_proxy.initiate_chat(
assistant, message=f"{SYSTEM_MESSAGE_1} Is it raining in Chicago?"
)
print(response1)
we get back:
Answer: Chicago.
TERMINATE
So, step 2 of Figure 1 is complete.
Step 3 is to get the latitude and longitude coordinates of the location that the user is interested in. Write a Python function that will call the Google Maps API and extract the required coordinates:
def geocoder(location: str) -> (float, float):
    geocode_result = gmaps.geocode(location)
    return (round(geocode_result[0]['geometry']['location']['lat'], 4),
            round(geocode_result[0]['geometry']['location']['lng'], 4))
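This function relies on a gmaps client created elsewhere in the script, presumably something like the following (googlemaps is the Google Maps Python client library):

```python
import googlemaps

# Create the Google Maps client from the key in keys.env.
gmaps = googlemaps.Client(key=os.environ.get("GOOGLE_API_KEY"))

# Example: geocoder("Chicago") would return roughly (41.8781, -87.6298).
```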
Next, register this function so that the Assistant can call it in its generated code, and the user proxy can execute it in its sandbox:
autogen.register_function(
    geocoder,
    caller=assistant,  # The assistant agent can suggest calls to the geocoder.
    executor=user_proxy,  # The user proxy agent can execute the geocoder calls.
    name="geocoder",  # By default, the function name is used as the tool name.
    description="Finds the latitude and longitude of a location or landmark",  # A description of the tool.
)
Note that, at the time of writing, function calling is supported by Autogen only for GPT-4 models.
We now expand the example in the prompt to include the geocoding step:
SYSTEM_MESSAGE_2 = """
In the question below, what latitude and longitude is the user asking about?
Example:
Question: What's the weather in Kalamazoo, Michigan?
Step 1: The user is asking about Kalamazoo, Michigan.
Step 2: Use the geocoder tool to get the latitude and longitude of Kalamazoo, Michigan.
Answer: (42.2917, -85.5872)
Question:
"""
Now, when we initiate the chat by asking whether it's raining in Chicago:
response2 = user_proxy.initiate_chat(
assistant, message=f"{SYSTEM_MESSAGE_2} Is it raining in Chicago?"
)
print(response2)
we get back:
Answer: (41.8781, -87.6298)
TERMINATE
Now that we have the latitude and longitude coordinates, we are ready to invoke the NWS API to get the weather data. Step 4, to get the weather data, is similar to geocoding, except that we are invoking a different API and extracting a different object from the web service response. Please look at the code on GitHub to see the full details.
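For orientation, here is a minimal sketch of what such a function could look like, assuming the NWS points-then-forecast flow; the function name matches the prompt below, but the body is illustrative rather than the exact code from the repository:

```python
import requests

def get_weather_from_nws(latitude: float, longitude: float) -> str:
    """Gets the short-term forecast from the National Weather Service."""
    headers = {"User-Agent": "weather-agent-demo"}  # NWS asks for a User-Agent
    # The points endpoint maps a lat/lon to the forecast URL of its grid cell.
    points = requests.get(
        f"https://api.weather.gov/points/{latitude},{longitude}",
        headers=headers).json()
    # The forecast itself is a list of named periods ("Tonight", "Monday", ...).
    forecast = requests.get(points["properties"]["forecast"],
                            headers=headers).json()
    return str(forecast["properties"]["periods"])
```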
The upshot is that the system prompt expands to encompass all the steps in the agentic application:
SYSTEM_MESSAGE_3 = """
Follow the steps in the example below to retrieve the weather information requested.
Example:
Question: What's the weather in Kalamazoo, Michigan?
Step 1: The user is asking about Kalamazoo, Michigan.
Step 2: Use the geocoder tool to get the latitude and longitude of Kalamazoo, Michigan.
Step 3: latitude, longitude is (42.2917, -85.5872)
Step 4: Use the get_weather_from_nws tool to get the weather from the National Weather Service at the latitude, longitude
Step 5: The detailed forecast for tonight reads 'Showers and thunderstorms before 8pm, then showers and thunderstorms likely. Some of the storms could produce heavy rain. Mostly cloudy. Low around 68, with temperatures rising to around 70 overnight. West southwest wind 5 to 8 mph. Chance of precipitation is 80%. New rainfall amounts between 1 and 2 inches possible.'
Answer: It will rain tonight. Temperature is around 70F.
Question:
"""
Based on this prompt, the response to the question about Chicago weather extracts the right information and answers the question correctly.
In this example, we allowed Autogen to select the next agent in the conversation autonomously. We can also specify a different next-speaker selection strategy: in particular, setting this to "manual" inserts a human in the loop and allows the human to select the next agent in the workflow.
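As a sketch, the relevant knob is the speaker_selection_method parameter of Autogen's GroupChat (the agents listed here are just the two we created above; the configuration details are illustrative):

```python
# A group chat in which a human manually picks the next speaker.
groupchat = autogen.GroupChat(
    agents=[user_proxy, assistant],
    messages=[],
    speaker_selection_method="manual",  # the default is "auto"
)
manager = autogen.GroupChatManager(groupchat=groupchat, llm_config=gemini_config)
```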
Where Autogen treats agentic workflows as conversations, LangGraph is an open-source framework that lets you build agents by treating a workflow as a graph. This is inspired by the long history of representing data processing pipelines as directed acyclic graphs (DAGs).
In the graph paradigm, our weather agent would look as shown in Figure 2.
There are a few key differences between Figures 1 (Autogen) and 2 (LangGraph):
- In Autogen, each of the agents is a conversational agent. Workflows are treated as conversations between agents that talk to each other. Agents jump into the conversation when they believe it is "their turn". In LangGraph, workflows are treated as a graph whose nodes the workflow cycles through based on rules that we specify.
- In Autogen, the AI assistant is not capable of executing code; instead, the Assistant generates code, and it is the user proxy that executes the code. In LangGraph, there is a special ToolsNode that consists of functions made available to the Assistant.
You can follow along with this section by referring to the file lg_weather_agent.py in my GitHub repository.
We set up LangGraph by creating the workflow graph. Our graph consists of two nodes: the Assistant Node and a ToolsNode. Communication within the workflow happens via a shared state.
workflow = StateGraph(MessagesState)
workflow.add_node("assistant", call_model)
workflow.add_node("tools", ToolNode(tools))
The tools are Python functions:
@tool
def latlon_geocoder(location: str) -> (float, float):
    """Converts a place name such as "Kalamazoo, Michigan" to latitude and longitude coordinates"""
    geocode_result = gmaps.geocode(location)
    return (round(geocode_result[0]['geometry']['location']['lat'], 4),
            round(geocode_result[0]['geometry']['location']['lng'], 4))

tools = [latlon_geocoder, get_weather_from_nws]
The Assistant calls the language model:
model = ChatOpenAI(model='gpt-3.5-turbo', temperature=0).bind_tools(tools)

def call_model(state: MessagesState):
    messages = state['messages']
    response = model.invoke(messages)
    # This message gets appended to the existing list
    return {"messages": [response]}
LangGraph uses langchain, and so changing the model provider is straightforward. To use Gemini, you can create the model using:
model = ChatGoogleGenerativeAI(model='gemini-1.5-flash',
                               temperature=0).bind_tools(tools)
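The two chat-model classes come from separate provider packages, which need to be installed and imported:

```python
from langchain_openai import ChatOpenAI
from langchain_google_genai import ChatGoogleGenerativeAI
```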
Next, we define the graph's edges:
workflow.set_entry_point("assistant")
workflow.add_conditional_edges("assistant", assistant_next_node)
workflow.add_edge("tools", "assistant")
The first and last lines above are self-explanatory: the workflow starts with a question being sent to the Assistant, and anytime a tool is called, the next node in the workflow is the Assistant, which will use the result of the tool. The middle line sets up a conditional edge in the workflow, since the node that follows the Assistant is not fixed. Instead, the Assistant calls a tool or ends the workflow based on the contents of the last message:
def assistant_next_node(state: MessagesState) -> Literal["tools", END]:
    messages = state['messages']
    last_message = messages[-1]
    # If the LLM makes a tool call, then we route to the "tools" node
    if last_message.tool_calls:
        return "tools"
    # Otherwise, we stop (reply to the user)
    return END
Once the workflow has been created, compile the graph and then run it by passing in questions:
app = workflow.compile()
final_state = app.invoke(
{"messages": [HumanMessage(content=f"{system_message} {question}")]}
)
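The final answer can then be read off the last message in the returned state (MessagesState stores the conversation as a list of messages):

```python
# The last message in the state holds the Assistant's final reply.
print(final_state["messages"][-1].content)
```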
The system message and question are exactly what we used in Autogen:
system_message = """
Follow the steps in the example below to retrieve the weather information requested.
Example:
Question: What's the weather in Kalamazoo, Michigan?
Step 1: The user is asking about Kalamazoo, Michigan.
Step 2: Use the latlon_geocoder tool to get the latitude and longitude of Kalamazoo, Michigan.
Step 3: latitude, longitude is (42.2917, -85.5872)
Step 4: Use the get_weather_from_nws tool to get the weather from the National Weather Service at the latitude, longitude
Step 5: The detailed forecast for tonight reads 'Showers and thunderstorms before 8pm, then showers and thunderstorms likely. Some of the storms could produce heavy rain. Mostly cloudy. Low around 68, with temperatures rising to around 70 overnight. West southwest wind 5 to 8 mph. Chance of precipitation is 80%. New rainfall amounts between 1 and 2 inches possible.'
Answer: It will rain tonight. Temperature is around 70F.
Question:
"""
query="Is it raining in Chicago?"
The result’s that the agent framework makes use of the steps to give you a solution to our query:
Step 1: The person is asking about Chicago.
Step 2: Use the latlon_geocoder instrument to get the latitude and longitude of Chicago.
[41.8781, -87.6298]
[{"number": 1, "name": "This Afternoon", "startTime": "2024–07–30T14:00:00–05:00", "endTime": "2024–07–30T18:00:00–05:00", "isDaytime": true, …]
There's a probability of showers and thunderstorms after 8pm tonight. The low will probably be round 73 levels.
Between Autogen and LangGraph, which one should you choose? A few considerations apply.
Of course, the level of Autogen support for non-OpenAI models and other tooling might improve by the time you are reading this. LangGraph might add autonomous capabilities, and Autogen might give you more fine-grained control. The agent space is moving fast!
- ag_weather_agent.py: https://github.com/lakshmanok/lakblogs/blob/main/genai_agents/ag_weather_agent.py
- lg_weather_agent.py: https://github.com/lakshmanok/lakblogs/blob/main/genai_agents/lg_weather_agent.py
This article is an excerpt from a forthcoming O'Reilly book, "Visualizing Generative AI", that I am writing with Priyanka Vergadia. All diagrams in this post were created by the author.