
AI Agents — What They Are and How They Work

In 2026, almost every AI product pitch includes the word "agent". Chatbots became copilots. Copilots became agents. The terminology moved faster than the understanding.

This post cuts through that. An AI agent is a specific thing — with a specific architecture and specific capabilities. Understanding what it actually is helps you evaluate AI tools honestly, design better systems, and have more grounded conversations about where AI automation makes sense.

🔗 Foundation for this post

This post builds on three earlier ones: What is a Large Language Model? (what LLMs can do), AI Hallucinations — Why They Happen (where they fail), and RAG — Retrieval Augmented Generation (how agents access knowledge).

Chatbot vs copilot vs agent — the actual differences

These three terms describe genuinely different things. The distinction is not about marketing; it is about what the system can do autonomously.

| | Chatbot | Copilot | Agent |
| --- | --- | --- | --- |
| What it does | Responds to a single message | Assists a human doing a task | Plans and executes a multi-step task autonomously |
| Autonomy | None: responds only | Low: suggests, human decides | High: acts independently across steps |
| Tool use | No: text only | Limited: may invoke one tool | Yes: calls multiple tools across a workflow |
| Memory | None: each message is fresh | Usually session-only | Can have persistent memory across sessions |
| Example | Customer service FAQ bot | GitHub Copilot suggesting code | An agent that reads your calendar, drafts emails and books meetings without prompting |
| SAP example | Simple FAQ bot on an SAP Help page | SAP Joule suggesting a field value | A Joule agent that analyses an exception, creates a task, assigns it and follows up |

What makes something an agent

The defining characteristic of an agent is the planning loop — the ability to take a goal, break it into steps, take actions, observe the results, and adjust the plan based on what happened.

A chatbot takes an input and produces an output. An agent takes a goal and pursues it across multiple steps, making decisions along the way.

The four components of an agent

| Component | What it does | Why it matters |
| --- | --- | --- |
| LLM (the brain) | The language model that plans, reasons and generates text | The intelligence layer: decides what to do and how to respond |
| Tools | External capabilities the agent can use: APIs, search, code execution, databases | Without tools, the agent can only produce text. Tools let it act in the world. |
| Memory | Storage of past interactions, retrieved context, intermediate results | Allows the agent to maintain state across steps and across sessions |
| Planning loop | The cycle of perceive → plan → act → observe → repeat | What turns a single LLM call into a multi-step autonomous workflow |

[Figure: AI agent planning loop. Four steps (Perceive, Plan, Act, Observe) form a cycle, with a decision exit once the goal is complete.]
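The loop can be sketched in a few lines of Python. This is a minimal illustration, not how any particular framework implements it: the `Agent` class, the action dictionary format and `scripted_planner` (a hard-coded stand-in for the LLM) are all assumptions made up for this example.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Agent:
    # plan stands in for the LLM: given the goal and memory, it returns the next action.
    plan: Callable[[str, list], dict]
    # tools maps a tool name to an ordinary Python function.
    tools: dict[str, Callable[..., object]]
    # memory holds (tool name, result) pairs observed so far.
    memory: list = field(default_factory=list)
    max_steps: int = 5

    def run(self, goal: str) -> str:
        for _ in range(self.max_steps):
            action = self.plan(goal, self.memory)                   # perceive + plan
            if action["type"] == "finish":                          # goal-complete exit
                return action["answer"]
            result = self.tools[action["tool"]](**action["args"])   # act
            self.memory.append((action["tool"], result))            # observe
        return "stopped: step limit reached"


def scripted_planner(goal: str, memory: list) -> dict:
    """Hypothetical stand-in for an LLM: look something up once, then finish."""
    if not memory:
        return {"type": "tool", "tool": "lookup", "args": {"key": "status"}}
    _, last_result = memory[-1]
    return {"type": "finish", "answer": f"{goal}: {last_result}"}


agent = Agent(
    plan=scripted_planner,
    tools={"lookup": lambda key: {"status": "blocked"}[key]},
)
print(agent.run("order status"))
```

The `max_steps` limit is the simplest guard against a loop that never reaches its goal: the agent gives up after a fixed number of iterations instead of running forever.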

Tools — what agents actually use to act

A tool is any external capability the agent can invoke. The LLM decides which tool to call, what inputs to pass, and how to interpret the result. The tool itself is just a function — it does not need to be AI.

| Tool type | What it does | Example |
| --- | --- | --- |
| Web search | Retrieves current information from the internet | Look up today’s stock price, find recent news |
| Code execution | Runs Python, JavaScript or other code | Calculate a formula, process a spreadsheet, generate a chart |
| Database query | Reads from or writes to a database | Fetch customer data, create a record, update a status |
| API call | Calls any REST API | Create a Jira ticket, send a Slack message, book a calendar slot |
| File operations | Read, write or process files | Summarise a PDF, extract data from a CSV, update a document |
| RAG / search | Retrieves from a knowledge base | Find the relevant policy, look up a product specification |
| Another agent | Delegates to a specialised sub-agent | An orchestrator agent calling a specialist coding agent |

How does the LLM actually decide to call a tool? Through a mechanism called function calling (also called tool use). When an agent is set up, each available tool is described to the LLM as a function: its name, what it does, and what parameters it takes.

When the LLM determines that a tool is needed, it generates a structured tool call instead of a text response: a JSON object naming the function and the parameters to pass. The agent framework intercepts this, executes the actual tool (the real API call or database query), and feeds the result back to the LLM as context. The LLM then decides whether the goal is complete or whether another tool call is needed.

This loop is how a single LLM becomes an agent that acts in the world. Every major LLM provider, including those powering SAP Joule, supports function calling natively.
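To make the mechanics concrete, here is a rough sketch of the framework side of function calling. Everything in it is illustrative: the `get_order_status` tool, the shape of `TOOL_SPECS` and the hard-coded `model_output` string are assumptions, and each provider uses its own exact format for tool descriptions and tool calls.

```python
import json


# Hypothetical tool: in a real system this would call an API or a database.
def get_order_status(order_id: str) -> dict:
    return {"order_id": order_id, "status": "blocked"}


# How the tool might be described to the LLM: name, purpose, parameters.
TOOL_SPECS = [
    {
        "name": "get_order_status",
        "description": "Look up the current status of a purchase order.",
        "parameters": {
            "type": "object",
            "properties": {"order_id": {"type": "string"}},
            "required": ["order_id"],
        },
    }
]

# The framework keeps the real functions; the LLM only ever sees the specs.
TOOL_IMPLEMENTATIONS = {"get_order_status": get_order_status}


def execute_tool_call(raw_call: str) -> str:
    """Parse the model's structured tool call, run the tool, return the result as JSON text."""
    call = json.loads(raw_call)
    func = TOOL_IMPLEMENTATIONS[call["name"]]
    result = func(**call["arguments"])
    return json.dumps(result)


# What a model might emit instead of a text reply (made up for this example).
model_output = '{"name": "get_order_status", "arguments": {"order_id": "4500001234"}}'

# The result string is what gets fed back to the LLM as context for its next decision.
tool_result = execute_tool_call(model_output)
print(tool_result)
```

Note that the LLM never runs anything itself. It only produces the JSON; the framework decides whether and how to execute it, which is where validation and approval checks can be added.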

💡 MCP is how tools connect

Model Context Protocol (MCP) is the open standard that defines how agents discover and call tools. Instead of each tool requiring custom integration code, MCP provides a standardised interface. The next post in this series covers MCP in full — it is the infrastructure layer that makes the tool ecosystem in the table above practical at scale.
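As a small preview, the sketch below builds the two request messages an MCP client would send to discover and call tools. It assumes the JSON-RPC 2.0 message shapes and the `tools/list` and `tools/call` method names from the public MCP specification; the tool name and arguments are hypothetical, and the transport and the initialisation handshake are left out entirely.

```python
import json
from itertools import count
from typing import Optional

_request_ids = count(1)  # JSON-RPC requests carry an id so responses can be matched


def jsonrpc_request(method: str, params: Optional[dict] = None) -> str:
    """Build a JSON-RPC 2.0 request as a JSON string."""
    message = {"jsonrpc": "2.0", "id": next(_request_ids), "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message)


# Discovery: ask the server which tools it offers.
list_tools = jsonrpc_request("tools/list")

# Invocation: call one tool by name (hypothetical tool and arguments).
call_tool = jsonrpc_request(
    "tools/call",
    {"name": "get_order_status", "arguments": {"order_id": "4500001234"}},
)

print(list_tools)
print(call_tool)
```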

Where agents work well

| Scenario | Why agents suit it |
| --- | --- |
| Multi-step research tasks | Agents can search, read, synthesise and summarise across multiple sources in one workflow |
| Process automation with decisions | Where a workflow has conditional steps (if X then do Y else do Z), an agent handles this naturally |
| Exception handling | Review an exception, look up related data, propose a resolution and create a task, all in one agent run |
| Data aggregation | Pull data from multiple systems, combine and format: agents can call multiple APIs in one session |
| Document-heavy workflows | Read a contract, extract key terms, cross-check against a database and flag discrepancies |

Where agents fail — and why

This is the part most agent articles skip. Agents are powerful but they have well-documented failure modes. Knowing them is essential before deploying agents in production.

| Failure mode | What happens | Mitigation |
| --- | --- | --- |
| Hallucination in tool use | The agent calls a non-existent API endpoint or fabricates a tool parameter | Constrain tool schemas strictly and validate inputs before execution |
| Infinite loops | The agent gets stuck in a planning loop, repeatedly trying variations that all fail | Set maximum iteration limits: exit if the goal is not achieved within N steps |
| Prompt injection | Malicious content in a tool result hijacks the agent’s next action | Treat all tool results as untrusted input and sanitise them before feeding back to the LLM |
| Compounding errors | An early wrong decision causes every subsequent step to be wrong | Human checkpoints at critical decision points, especially for irreversible actions |
| Scope creep | The agent takes actions beyond what was intended | Define clear tool scope: only expose the tools needed for the specific task |
| Slow and expensive | Multi-step planning with multiple LLM calls is slower and more costly than a single call | Agents suit complex multi-step tasks, not simple Q&A where a direct LLM call is enough |
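Two of the mitigations in the table, strict tool schemas and a narrow tool scope, can be enforced with a check that runs before any tool call is executed. The sketch below is one simple way to do it under stated assumptions: `ALLOWED_TOOLS`, `ToolCallRejected` and the `create_task` tool are hypothetical names, and a production system would more likely validate against a full JSON Schema.

```python
# Only the tools listed here are exposed for this task (limits scope creep).
# Each entry maps a required parameter name to its expected Python type.
ALLOWED_TOOLS = {
    "create_task": {"title": str, "assignee": str},
}


class ToolCallRejected(Exception):
    """Raised when the model proposes a call that must not be executed."""


def validate_tool_call(call: dict) -> dict:
    """Return the call unchanged if it is safe to execute, otherwise raise."""
    name = call.get("name")
    if name not in ALLOWED_TOOLS:
        # Catches hallucinated tools as well as real tools that are out of scope.
        raise ToolCallRejected(f"unknown or out-of-scope tool: {name!r}")

    args = call.get("arguments")
    if not isinstance(args, dict):
        raise ToolCallRejected("arguments must be a JSON object")

    expected = ALLOWED_TOOLS[name]
    unexpected = set(args) - set(expected)
    if unexpected:
        # Catches fabricated parameters the tool does not accept.
        raise ToolCallRejected(f"unexpected parameters: {sorted(unexpected)}")

    for param, param_type in expected.items():
        if not isinstance(args.get(param), param_type):
            raise ToolCallRejected(f"missing or mistyped parameter: {param}")
    return call


good = {"name": "create_task", "arguments": {"title": "Review PO", "assignee": "jdoe"}}
validate_tool_call(good)  # passes silently

try:
    validate_tool_call({"name": "delete_all_records", "arguments": {}})
except ToolCallRejected as err:
    print("rejected:", err)
```

Rejecting the call is only half of the pattern. Feeding the rejection message back to the LLM as a tool result gives it a chance to correct itself, within the iteration limit.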

⚠️ Never give agents irreversible actions without a human checkpoint

An agent that can send emails, delete records, or make financial transactions without human approval can cause serious damage if it goes wrong. The rule: reversible actions can be automated. Irreversible ones need a human in the loop — at least until you have built and verified confidence in the agent’s reliability for that specific task.
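In code, this rule usually becomes an approval gate that sits between the agent's decision and the tool's execution. The following is a minimal sketch, not a recommended implementation: the `IRREVERSIBLE_TOOLS` set, the `run_with_checkpoint` function and the `always_deny` callback (a stand-in for a real approval step such as a workflow task or UI prompt) are all hypothetical.

```python
from typing import Callable

# Hypothetical classification: tools whose effects cannot be undone.
IRREVERSIBLE_TOOLS = {"send_email", "delete_record", "make_payment"}


def run_with_checkpoint(
    tool_name: str,
    args: dict,
    tools: dict[str, Callable[..., object]],
    approve: Callable[[str, dict], bool],
) -> dict:
    """Run a tool, asking for human approval first if the action is irreversible."""
    if tool_name in IRREVERSIBLE_TOOLS and not approve(tool_name, args):
        return {"status": "rejected", "tool": tool_name}
    return {"status": "done", "tool": tool_name, "result": tools[tool_name](**args)}


# Toy tools for illustration only.
tools = {
    "draft_email": lambda to, body: f"draft for {to}",
    "send_email": lambda to, body: f"sent to {to}",
}


def always_deny(tool_name: str, args: dict) -> bool:
    """Stand-in for a human reviewer who rejects every request."""
    return False


email = {"to": "a@example.com", "body": "hi"}
print(run_with_checkpoint("draft_email", email, tools, always_deny))  # runs without asking
print(run_with_checkpoint("send_email", email, tools, always_deny))   # blocked by the reviewer
```

Drafting is reversible, so it runs without asking. Sending is not, so it only runs if the `approve` callback returns true.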

Agents in the SAP context — 2026

| SAP scenario | Agent approach |
| --- | --- |
| SAP Joule with agentic capabilities | Joule in S/4HANA and SuccessFactors can now initiate multi-step actions: not just answering questions but creating records, triggering workflows and following up |
| SAP Build Process Automation | The low-code automation platform supports AI-assisted decision steps within workflows, with human approval before irreversible actions |
| SAP AI Core agents on BTP | Custom agents built on BTP using SAP AI Core: define tools, memory and the planning loop using SAP’s orchestration framework |
| Integration exception handling | An agent monitors CPI integration exceptions, looks up related master data, classifies the error and creates an incident with context, reducing manual triage |
| Purchase order exception agent | Reads blocked purchase orders, checks reason codes, looks up supplier history and proposes a resolution; a human approves before unblocking |

The mental model — in one view

| Concept | One-line summary |
| --- | --- |
| AI Agent | A system that uses an LLM to plan and execute multi-step tasks autonomously using tools |
| Planning loop | The cycle of perceive, plan, act, observe, repeated until the goal is complete or the attempt limit is reached |
| Tools | External capabilities the agent can call: APIs, search, code, databases |
| Memory | Context that persists across steps: working memory in the session, long-term memory across sessions |
| Copilot | Assists a human: suggests actions, human decides. Joule is largely a copilot. |
| Agent | Acts independently: plans and executes without human prompting each step |
| MCP | The open standard for connecting agents to tools, covered in the next post |
| Key risk | Irreversible actions without human checkpoints. Always build approval gates. |

What to take away

An AI agent is not a chatbot with a bigger brain. It is a fundamentally different architecture — one that plans, acts, observes and adjusts. That architecture makes agents genuinely powerful for complex multi-step tasks and genuinely risky without proper guardrails.

In the SAP and enterprise world, agents are moving from pilots to production in 2026. Understanding what they actually are — and where they fail — is what separates the people who deploy them well from the people who deploy them and then deal with the consequences.

🔗 Related posts on this site

What is a Large Language Model?: the brain inside every agent.

RAG — Retrieval Augmented Generation: how agents access knowledge from documents and databases.

AI Hallucinations — Why They Happen: essential reading, because hallucination compounds in multi-step agent workflows.

Coming next: MCP — Model Context Protocol, the open standard that defines how agents discover and connect to tools.

rakeshnarayan.com/articles/

Published on rakeshnarayan.com — Articles

URL: https://rakeshnarayan.com/articles/ai-agents/