Intro to Agents
If you’re reading this, you’ve probably heard people say:
“2025 will be the year of AI Agents.”
But what are AI agents, and why is everyone talking about them? How are they different—or better—than the Large Language Models (LLMs) we already use?
Why LLMs Are Powerful (But Limited)
Over the past few years, LLMs like ChatGPT have changed how we work. They've given us a serious productivity boost on tasks like:
- Writing code
- Drafting emails
- Creating templates
- And much more
But here’s the catch: LLMs can’t take real-world actions. They can only give you text or images as replies. Let’s break this down.
What’s an “Action”?
An “action” means doing something that changes your current situation. For example, booking a flight or paying a bill—not just talking about it.
Tools Make LLMs Powerful Agents
What if LLMs could use tools, like humans use hammers or calculators? That’s exactly what AI agents do!
Real-Life Example: Booking Flights
Imagine asking ChatGPT: “What’s the total cost of the cheapest round-trip flights from New Delhi to Bali?” Today, it can’t answer this—it only knows data up to 2023.
But if we give it two tools:
- A Google Flights API (to check live prices)
- A calculator (to add numbers)
Now, here’s how ChatGPT becomes an AI agent:
- It uses the Google Flights tool to find all flight prices for your dates.
- Picks the cheapest options for both legs of the trip.
- Adds the two prices using the calculator tool and gives you the total cost.
Why use a calculator? LLMs like GPT-4 aren’t great at math. Tools let them focus on what they do best—analyzing data and making decisions—while specialized tools handle the rest.
What Tools Can AI Agents Use? (Spoiler: No Limits!)
You might wonder: "How many tools can we connect to an AI agent? Is there a limit?"
The answer is exciting: there's no limit. If a tool can help the LLM solve a problem, it can be added, and the more tools it has, the more capable the agent becomes.
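One way to picture this open-endedness: an agent's toolkit can be nothing more than a mapping from tool names to functions, extended at will. The tools below are toy stand-ins, not real APIs:

```python
# Sketch of an extensible toolkit: a name → function mapping.
# All tools here are toy stand-ins for illustration only.

def web_search(query: str) -> str:
    return f"results for {query!r}"

def send_email(to: str, body: str) -> str:
    return f"sent to {to}"

toolkit = {"web_search": web_search, "send_email": send_email}

# Adding a new capability is one line:
toolkit["calculator"] = lambda expr: eval(expr)

print(toolkit["calculator"]("18500 + 20300"))  # 38800
```

In real frameworks the registry also carries a description of each tool, which is what the LLM reads when deciding which one to pick.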
🔍 The RAG Agent
Tools: Web search APIs, company databases, Wikipedia access
Need a report on "AI trends in 2024"? Instead of giving outdated info, this agent scours the internet and internal data in real-time, summarizes findings, and even adds sources.
🔄 The Unit Converter Agent
Tools: Currency converters, measurement APIs
Ask, "How much is $50 in Euros today?" It checks live exchange rates and calculates the exact amount—no outdated approximations.
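The converter tool itself can be tiny. This sketch hard-codes a hypothetical exchange rate for illustration; a real agent would fetch the live rate from a rates API:

```python
# Hypothetical unit-converter tool. The 0.92 rate is made up for
# illustration; a real tool would query a live exchange-rate API.

def convert_usd_to_eur(amount_usd: float, rate: float = 0.92) -> float:
    """Converts US dollars to euros at the given rate."""
    return round(amount_usd * rate, 2)

print(convert_usd_to_eur(50))  # 46.0 at the illustrative 0.92 rate
```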
📱 The Social Media Agent
Tools: Canva API, Instagram/Twitter schedulers
Say, "Post a motivational quote every Friday at 5 PM." The agent designs the graphic, writes the caption, and schedules it automatically.
🛒 The Personal Shopper Agent
Tools: Amazon/Flipkart APIs, price trackers
Tell it, "Find a wireless keyboard under ₹2000 with 4+ star reviews." It hunts across shopping sites, compares prices, and shares the top 3 options.
💪 The Health Coach Agent
Tools: Fitness trackers, calorie databases
Ask, "Did I hit my fitness goals this week?" It analyzes your step count, sleep data, and meals, then gives a personalized report.
How Do AI Agents Actually Work? (Step-by-Step)
Let's break down the process of how an AI agent uses tools to solve problems. Imagine it like a chef in a kitchen—thinking, choosing tools, and combining ingredients to serve a dish:
1. Question: The user asks something, e.g., "What's the cheapest way to fly from Delhi to Bali next week?"
2. Thought: The LLM thinks: "I need live flight prices and a way to add numbers. Let me check which tools I have."
3. Action: The LLM picks a tool from its toolkit:
   - [Flights] → to search for flights
   - [Calculator] → to add prices
   - [None] → if no tool is needed
4. Action Input: The LLM tells the tool what to do: "Search Delhi→Bali flights for June 10-17, 2024."
5. Observation: The tool returns raw data: "Cheapest flights: ₹18,500 (outbound), ₹20,300 (return)."
6. Repeat: The LLM checks if it needs more steps. If yes, it loops back to Thought → Action.
7. Final Answer: The LLM combines all observations: "Total cost: ₹38,800. Here are the flight details…"
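The loop above can be sketched in a few lines of Python. Everything here is a stand-in: `fake_llm` hard-codes the decisions a real model would make, and the two tools return canned data instead of calling real APIs:

```python
# Minimal sketch of the Thought → Action → Observation loop.
# fake_llm and both tools are toy stand-ins, not real services.

def search_flights(query: str) -> str:
    """Stand-in for a live flight-search tool."""
    return "Cheapest flights: 18500 (outbound), 20300 (return)"

def calculator(expression: str) -> str:
    """Stand-in for a calculator tool (eval is fine in this toy)."""
    return str(eval(expression))

TOOLS = {"Flights": search_flights, "Calculator": calculator}

def fake_llm(question: str, observations: list[str]) -> tuple[str, str]:
    """Pretend LLM: picks the next action from what it has seen so far."""
    if not observations:
        return ("Flights", question)            # need live prices first
    if len(observations) == 1:
        return ("Calculator", "18500 + 20300")  # then add the two fares
    return ("FINAL", f"Total cost: ₹{observations[-1]}")

def run_agent(question: str) -> str:
    observations: list[str] = []
    while True:
        action, action_input = fake_llm(question, observations)
        if action == "FINAL":
            return action_input
        observations.append(TOOLS[action](action_input))  # Observation step

answer = run_agent("Cheapest Delhi to Bali round trip next week?")
print(answer)  # Total cost: ₹38800
```

The structure is the whole point: the model only decides *what* to do next; the loop's `TOOLS[action](...)` line is where anything actually happens.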
Your Questions, Answered
Q1: Does the LLM actually "call" the tools itself?
Not directly. While modern AI interfaces (like ChatGPT, Gemini, Claude) make it appear as if the LLM is directly executing tools, there's a technical distinction: the LLM generates the instructions for tool use, and a separate execution environment handles the actual API calls and tool operations.
Here's how it works under the hood:
- The LLM determines which tool to use and formats the required parameters
- An execution layer (built by OpenAI, Google, etc.) receives these instructions
- This execution layer makes the actual API calls and runs the necessary code
- Results are fed back to the LLM to continue the conversation
From a user perspective, this appears seamless: you simply ask the AI to perform a task and it happens. But architecturally, text generation and tool execution are separate components working together in an integrated system.
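That separation can be made concrete in a few lines. In this sketch the "LLM output" is just a hand-written JSON string, and the tool is a toy function standing in for a real API:

```python
import json

def get_weather(city: str) -> str:   # toy tool standing in for a real API
    return f"28°C and sunny in {city}"

TOOL_REGISTRY = {"get_weather": get_weather}

# What the model *generates* is just text; nothing has executed yet.
llm_output = json.dumps({"tool": "get_weather", "arguments": {"city": "Bali"}})

def execution_layer(raw: str) -> str:
    """Parses the model's tool call and performs the actual invocation."""
    call = json.loads(raw)
    func = TOOL_REGISTRY[call["tool"]]
    return func(**call["arguments"])  # the real side effect happens here

result = execution_layer(llm_output)  # then fed back to the model as context
print(result)
```

Notice that if `execution_layer` never ran, the "tool call" would remain an inert string, which is exactly the distinction the answer above is drawing.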
Q2: Will the LLM always choose the right tool?
Not always. It depends on two factors:
Model Strength
GPT-4 will make better decisions than smaller models.
Example: For "What's 25% of ₹40,000?"
- Weak model: Tries to calculate itself
- Strong model: Uses Calculator tool
Prompt Engineering
Clear instructions matter.
- Bad prompt: "Answer the user."
- Good prompt: "Use Calculator for math."
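To make "clear instructions matter" concrete, here's a hypothetical system prompt; the wording is illustrative and not taken from any particular framework:

```python
# Illustrative only: a system prompt nudging the model toward its tools.
SYSTEM_PROMPT = """You are an assistant with access to these tools:
- Calculator(expression): use this for ANY arithmetic; never do math yourself.
- Flights(query): use this for live flight prices; your training data is stale.

If a tool applies, respond with a tool call; otherwise answer directly."""

print(SYSTEM_PROMPT)
```

Spelling out *when* each tool applies is what turns a weak prompt into a good one: the model no longer has to guess whether "25% of ₹40,000" is something it should compute itself.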
Real-World Agent Frameworks
Companies are already building systems to automate this "thought → action" loop:
🤖 AutoGPT
Breaks down complex goals (e.g., "Plan a vacation") into smaller tasks, uses web search, booking tools, and calendars.
👶 BabyAGI
Creates and prioritizes task lists, like a virtual project manager.
Langflow
Langflow is a low-code tool for developers that makes it easier to build powerful AI agents and workflows that can use any API, model, or database.
LangGraph
Gain control with LangGraph to design agents that reliably handle complex tasks. Build and scale agentic applications with LangGraph Platform.
Phidata
An open-source platform to build, ship and monitor agentic systems.
CrewAI
Streamline workflows across industries with powerful AI agents. Build and deploy automated workflows using any LLM and cloud platform.
Microsoft AutoGen
AutoGen is an open-source programming framework for building AI agents and facilitating cooperation among multiple agents to solve tasks.
Understanding Function Execution in AI Agents
Anatomy of a Function Call
A flight-search function requires structured inputs to work reliably. For example, the LLM might request flight data like this:
```python
get_flight_prices(
    departure_city="DEL",   # Delhi
    arrival_city="DPS",     # Bali
    dates="2024-11-15 to 2024-11-22"
)
```
The function connects to a flight API, returning raw data about airlines, prices, and timings for both outbound and return flights.
The LLM's Decision Process:
- Identifies the cheapest outbound flight (e.g., ₹18,500 with Air India)
- Finds the cheapest return flight (e.g., ₹20,300 with IndiGo)
- Calls calculator_add(a=18500, b=20300) to sum the fares
- Returns the final answer: "Total cost: ₹38,800. Cheapest flights: Air India (Departure) + IndiGo (Return)."
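The calculator_add tool referenced here can be as small as this sketch, and that's the point: the LLM delegates arithmetic to exact code instead of guessing:

```python
# Trivial sketch of the calculator_add tool the agent calls.

def calculator_add(a: float, b: float) -> float:
    """Adds two numbers exactly; used to total the two fares."""
    return a + b

total = calculator_add(a=18500, b=20300)
print(total)  # 38800
```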
🔑 Key Point
The LLM never executes the function itself—it only triggers the workflow. Frameworks like LangChain handle the actual API calls, validate inputs, and pass results back to the model.
Function Code Example (Python)
```python
import requests

def get_google_flights(
    departure_code: str,    # IATA code (e.g., "DEL" for Delhi)
    arrival_code: str,      # IATA code (e.g., "DPS" for Bali)
    departure_date: str,    # Format: "YYYY-MM-DD"
    return_date: str,       # Format: "YYYY-MM-DD"
    cabin_class: str = "economy"
) -> list[dict]:
    """
    Fetches flight prices from a flight-search API.
    Returns a list of flights sorted by price.
    """
    response = requests.get(
        "https://api.google.com/flights",  # illustrative URL, not a real public API
        params={
            "departure_code": departure_code,
            "arrival_code": arrival_code,
            "departure_date": departure_date,
            "return_date": return_date,
            "cabin_class": cabin_class,
        },
    )
    # Process response
    flights = response.json()["flights"]
    return sorted(flights, key=lambda x: x["price"])  # Sort by cheapest
```
Sample Input (from LLM):
```python
get_google_flights(
    departure_code="DEL",
    arrival_code="DPS",
    departure_date="2024-11-15",
    return_date="2024-11-22"
)
```
Sample Output (API Response):
```json
[
  {
    "airline": "Air India",
    "price": 18500,
    "departure_time": "08:30",
    "flight_number": "AI-876"
  },
  {
    "airline": "IndiGo",
    "price": 20300,
    "departure_time": "14:15",
    "flight_number": "6E-132"
  }
]
```
Execution Flow
1. The LLM triggers get_google_flights with validated inputs.
2. The system calls the API and returns sorted flight data.
3. The LLM selects the cheapest option (₹18,500 Air India).
4. Repeats steps 1-3 for return flights.
5. Calls calculator_add(18500, 20300) to sum prices.
6. Returns final answer: "Total: ₹38,800. Cheapest flights: Air India (Depart) + IndiGo (Return)."
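Putting the flow together: given two lists in the shape of the sample output above, selecting the cheapest leg each way and totaling them is plain Python. The second flight in each list is invented filler so there's something to choose between:

```python
# Pick the cheapest leg each way and total them. Field names follow the
# sample API output; the non-cheapest entries are made up for illustration.

outbound = [
    {"airline": "Air India", "price": 18500, "flight_number": "AI-876"},
    {"airline": "Vistara", "price": 21200, "flight_number": "UK-141"},
]
inbound = [
    {"airline": "IndiGo", "price": 20300, "flight_number": "6E-132"},
    {"airline": "Air India", "price": 22800, "flight_number": "AI-881"},
]

cheapest_out = min(outbound, key=lambda f: f["price"])
cheapest_in = min(inbound, key=lambda f: f["price"])
total = cheapest_out["price"] + cheapest_in["price"]

print(f"Total: ₹{total}. Cheapest flights: "
      f"{cheapest_out['airline']} (Depart) + {cheapest_in['airline']} (Return)")
```

In a real agent, steps 3 and 5 are exactly this kind of selection and arithmetic, except the LLM decides to do them rather than a hard-coded script.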
Want to Learn More?
For a more detailed understanding, check out the LangChain tutorial that walks through building agents step by step.
Why This Changes Everything
Today's LLMs are like smart assistants who can only talk. AI agents, though, are like giving them hands, eyes, and access to the real world. They don't just answer questions—they act on them.
- Need to book a hotel? They'll compare prices, check your calendar, and reserve a room.
- Need to fix a bug in your code? They'll test solutions, debug, and even deploy the fix.
This is the future: AI agents won't just talk—they'll act, solve problems, and automate tasks we hate. And that's why 2025 might just be their year.
My Views and Conclusion
I personally feel agents are definitely an awesome way to boost the power of LLMs, and they hold the future of AI. But let's keep perspective: while they're leagues ahead of traditional language models at taking action, the hype about them "taking over jobs overnight" feels overblown.
Will they make our lives easier? Absolutely! But we should give them some time before coming to a final conclusion.
If you enjoyed this piece or have thoughts to share, I’d love to hear from you! Let’s chat →
Happy Learning ❤️