Building a Meta AI Agent: The Understated Power of LLMs in a Feedback Loop
It looks like magic when you watch AI agents autonomously viewing files, making edits, executing commands, and working around errors. Under the hood, however, the core loop is surprisingly simple: it runs a continuous cycle that takes user input, generates tool calls and text output, receives feedback from tool execution, feeds the result back to the LLM, and repeats. The sophistication lies not in the orchestration, but in the reasoning capabilities of the frontier labs' large language models.
To demystify this magic, we’ll implement a simple CLI-based coding agent in Python that can read files, write content, edit text, and execute shell commands. Don’t worry if you’re new to AI agents - this tutorial is designed to be approachable and practical. You’ll be amazed at how much you can accomplish with surprisingly little code.
The diagram below illustrates this fundamental agent loop in action:
graph TD
    A[User Input] -->|starts| B[LLM Loop]
    B -->|calls| C[Execute Tools]
    B -->|generates| D[Agent Output]
    C -->|produces| F[Tool Output]
    F -->|feeds| E[New Message]
    D -->|feeds| E
    E -->|continues| B
By the end of this tutorial, we’ll have a Python script that’s just over 150 lines of code - compact, powerful, and surprisingly capable of giving you the ‘Oh look at that!’ dopamine hits. I’ve structured this to be easy to follow along step by step, and I highly encourage you to try it out yourself.
Prerequisites
Before we begin, you’ll need two things:
- uv for Python package management
- an Anthropic API key for accessing Claude
The beauty of using uv is that our script will be completely self-contained with inline dependency management.
The Core Loop
Let’s start with the foundation. Every AI agent needs a chat loop that maintains context and handles the back-and-forth between user and AI. Here’s our basic structure:
#!/usr/bin/env -S uv run --script
# /// script
# requires-python = ">=3.12"
# dependencies = ["pydantic", "anthropic", "aiofiles"]
# ///
from anthropic import AsyncAnthropic
import asyncio, json, os
from datetime import datetime

client = AsyncAnthropic()

async def loop(system_prompt, messages, user_input):
    messages.append({"role": "user", "content": user_input})
    msg = await client.messages.create(
        max_tokens=2048,
        thinking={"type": "enabled", "budget_tokens": 1024},
        system=[{"type": "text", "text": system_prompt}],
        messages=messages,
        model="claude-sonnet-4-0"
    )
    messages.append({"role": "assistant", "content": msg.content})
    agent_text = ' '.join(t.text for t in msg.content if t.type == 'text')
    if agent_text:
        print(f"🤖 {agent_text}")

async def main():
    messages = []
    system_prompt = f"""Your name is HAL.
You are an interactive CLI tool that helps with software engineering and production operations tasks.
You are running on {str(os.uname())}, today is {datetime.now().strftime('%Y-%m-%d')}
"""
    while True:
        user_input = input("> ")
        if user_input.lower() == "exit":
            break
        await loop(system_prompt, messages, user_input)

asyncio.run(main())
This script establishes the core conversation loop of our agent. The uv shebang at the top makes the script self-contained with its dependencies, while the loop function handles the core conversation cycle: it adds user input to the message history, sends everything to Claude, processes the response, and displays it. The thinking parameter enables Claude's extended thinking and reasoning capabilities with a 1024-token budget, allowing it to work through complex problems with a step-by-step thought process. The main function creates an interactive REPL that maintains conversation context across exchanges.
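Incidentally, the thinking blocks come back as part of msg.content as well; if you want to peek at them, a small optional addition to the loop function would print them (a sketch, not part of the script above):

# optional: surface the model's thinking blocks alongside its text output
for t in msg.content:
    if t.type == "thinking":
        print(f"💭 {t.thinking}")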
Let's save the file as hal, make it executable with chmod u+x hal, and run it with ./hal:
./hal
> what is your name
🤖 My name is HAL. I'm an interactive CLI tool designed to help with software engineering and production operations tasks. How can I assist you today?
> where does your name come from?
🤖 My name HAL is likely a reference to HAL 9000, the AI computer from Stanley Kubrick's "2001: A Space Odyssey." In that film, HAL stood for "Heuristically programmed ALgorithmic computer."
Unlike the HAL from the movie though, I'm designed to be helpful with your software engineering and operations tasks - no spacecraft control or existential conflicts here! Just here to assist with CLI operations, development workflows, and production support.
Is there a particular task you'd like help with today?
This gives us a working chat interface where we can have fascinating conversations with HAL; after all, it embodies vast human knowledge from across the internet. However, HAL remains confined to the realm of text and images: the only way for it to interact with the outside world is through you manually feeding it information such as code, logs, and compiler errors, which is not only inefficient but also potentially biased. To bridge this gap and let it act in the real world, we need to equip it with tools so that it can read files and run programs to get results and errors autonomously.
Adding Tool Capabilities
For a large language model, the most straightforward way to take in external information is text, so the first tool we'll implement is file reading. The function below shows it in its simplest form:
import aiofiles
from typing import Annotated

async def read_file(filename: Annotated[str, "The path to the file to read"]):
    """Read the whole file"""
    async with aiofiles.open(filename, 'r') as f:
        return {"success": True, "filename": filename, "output": await f.read()}
This function works, but HAL doesn't know it exists, nor does it have access to the Python runtime. We need to bridge the gap between Python functions and the JSON schemas that the model can understand.
The Tool System
Here’s where the magic happens. We’ll create a decorator that automatically generates the necessary schemas and a toolbox to manage everything:
from typing import Annotated, get_type_hints, get_origin, get_args
from inspect import Parameter, signature
from pydantic import create_model, Field

def tool(f):
    def _process_parameter(name: str, param: Parameter, hints: dict) -> tuple:
        """Process a single parameter into a model field specification."""
        default = ... if param.default == Parameter.empty else param.default
        hint = hints.get(name, param.annotation)
        if get_origin(hint) == Annotated:
            base_type, *metadata = get_args(hint)
            description = next((m for m in metadata if isinstance(m, str)), None)
            return (base_type, Field(default=default, description=description) if description else default)
        return (hint, default)

    hints = get_type_hints(f, include_extras=True)
    model_fields = {
        name: _process_parameter(name, param, hints)
        for name, param in signature(f).parameters.items()
    }
    m = create_model(f'{f.__name__} Input', **model_fields)
    m.run = lambda self: f(**self.model_dump())
    return {
        "name": f.__name__,
        "description": f.__doc__ or f"Tool: {f.__name__}",
        "model": m
    }

class Toolbox:
    def __init__(self, tools):
        self.tools = tools

    def schema(self):
        return [{
            "name": t["name"],
            "description": t["description"],
            "input_schema": t["model"].model_json_schema()
        } for t in self.tools]

    async def run(self, name, input):
        tool = next(t for t in self.tools if t["name"] == name)
        return await tool["model"](**input).run()
At its core, this code creates a bridge between Python functions and AI models. The @tool decorator examines a function's signature and automatically generates a JSON schema that describes what inputs the function expects - think of it as writing a user manual for the Python function that the AI can read. The Toolbox class acts as a registry, collecting all these tool schemas and providing a way to execute them when the AI requests it. When HAL wants to use a tool, it emits structured output that matches the schema, and the toolbox validates the input and runs the corresponding Python function.
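If the introspection machinery is unfamiliar, here is a tiny standalone illustration (with a hypothetical greet function) of how the standard library exposes the pieces the decorator relies on:

from typing import Annotated, get_type_hints, get_origin, get_args
from inspect import signature

async def greet(name: Annotated[str, "who to greet"], shout: bool = False):
    """Say hello"""

hints = get_type_hints(greet, include_extras=True)
for pname, param in signature(greet).parameters.items():
    hint = hints.get(pname, param.annotation)
    if get_origin(hint) is Annotated:
        base, *meta = get_args(hint)
        print(pname, base, meta)            # name <class 'str'> ['who to greet']
    else:
        print(pname, hint, param.default)   # shout <class 'bool'> False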
Now we can decorate our file reading function:
@tool
async def read_file(filename: Annotated[str, "The path to the file to read"]):
    """Read the whole file"""
    async with aiofiles.open(filename, 'r') as f:
        return {"success": True, "filename": filename, "output": await f.read()}
Once decorated, we can generate a JSON schema that describes the function to the AI model. Each schema contains, at minimum, the function's name, description, and parameter specifications. This schema acts as a contract: when HAL wants to use a tool, it must structure its output according to the schema, making it the protocol between HAL and the Python runtime:
toolbox = Toolbox([read_file])
print(json.dumps(toolbox.schema(), indent=2))
# [
# {
# "name": "read_file",
# "description": "Read the whole file",
# "input_schema": {
# "properties": {
# "filename": {
# "description": "The path to the file to read",
# "title": "Filename",
# "type": "string"
# }
# },
# "required": [
# "filename"
# ],
# "title": "read_file Input",
# "type": "object"
# }
# }
# ]
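Before wiring this into the chat loop, we can exercise the toolbox directly as a quick sanity check (a sketch, assuming the script has been saved as hal in the current directory):

async def smoke_test():
    # run the read_file tool exactly the way the agent loop will
    result = await toolbox.run("read_file", {"filename": "hal"})
    print(result["success"], len(result["output"]), "characters read")

asyncio.run(smoke_test())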
Connecting Tools to the Loop
Now we need to update our main loop to handle tool calls. When Claude wants to use a tool, it returns special tool_use blocks instead of just text:
async def loop(system_prompt, toolbox, messages, user_input):
    messages.append({"role": "user", "content": user_input})
    while True:
        msg = await client.messages.create(
            max_tokens=2048,
            thinking={"type": "enabled", "budget_tokens": 1024},
            system=[{"type": "text", "text": system_prompt}],
            messages=messages,
            model="claude-sonnet-4-0",
            tools=toolbox.schema()
        )
        messages.append({"role": "assistant", "content": msg.content})
        agent_text = ' '.join(t.text for t in msg.content if t.type == 'text')
        if agent_text:
            print(f"🤖 {agent_text}")
        tools = [t for t in msg.content if t.type == "tool_use"]
        if not tools:
            break
        # Execute all tools and collect results
        results = await asyncio.gather(*[
            toolbox.run(t.name, t.input) for t in tools
        ])
        # Display results and send back to model
        for t, r in zip(tools, results):
            status = "✅" if r.get("success") else "❌"
            print(f"{status} {t.name}: {r.get('output', r)}")
        messages.append({
            "role": "user",
            "content": [{
                "type": "tool_result",
                "tool_use_id": t.id,
                "content": json.dumps(r)
            } for t, r in zip(tools, results)]
        })

async def main():
    toolbox = Toolbox([read_file])
    messages = []
    system_prompt = f"""Your name is HAL.
You are an interactive CLI tool that helps with software engineering and production operations tasks.
You are running on {str(os.uname())}, today is {datetime.now().strftime('%Y-%m-%d')}
"""
    try:
        while True:
            user_input = input("> ")
            if user_input.lower() == "exit":
                break
            await loop(system_prompt, toolbox, messages, user_input)
    except (KeyboardInterrupt, EOFError):
        print("👋 Goodbye!")

asyncio.run(main())
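To make the protocol concrete, here is roughly what the message history looks like after a single read_file round-trip. The dicts below are purely illustrative (the id is made up, and in the running agent the assistant content is a list of SDK block objects rather than plain dicts):

messages = [
    {"role": "user", "content": "what's in hal?"},
    {"role": "assistant", "content": [
        {"type": "text", "text": "Let me read the file."},
        {"type": "tool_use", "id": "toolu_123", "name": "read_file",
         "input": {"filename": "hal"}},
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123",
         "content": "{\"success\": true, \"filename\": \"hal\", \"output\": \"...\"}"},
    ]},
]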
So far we have only given HAL the ability to read files. While not particularly useful on its own, this single tool already lets HAL understand itself:
./hal
> You are implemented in ./hal, read the file and try to make sense of it
🤖 Fascinating! I'm looking at my own implementation. Let me break down what I see:
## HAL Implementation Analysis
This is a Python script that implements me as an interactive CLI tool using the Anthropic API. Here's how it works:
### **Architecture:**
- **Script type**: A `uv` runnable script with embedded dependencies (Pydantic, Anthropic, aiofiles)
- **Model**: Uses Claude Sonnet 4.0 with thinking enabled (1024 token budget)
- **Async design**: Built with asyncio for concurrent tool execution
### **Key Components:**
1. **Tool System**:
- Uses a decorator `@tool` that automatically creates Pydantic models from function signatures
- Supports `Annotated` types for parameter descriptions
- Currently implements just one tool: `read_file`
2. **Toolbox Class**:
- Manages tool registration and execution
- Generates JSON schemas for the AI model
- Handles async tool execution
3. **Main Loop**:
- Interactive REPL that maintains conversation context
- Processes tool calls and displays results with emoji status indicators (✅/❌)
- Handles multi-tool execution concurrently
### **System Prompt**:
The system prompt defines me as "HAL" - an interactive CLI tool for software engineering and production operations, with system information and current date.
### **Interesting Details**:
- The thinking capability is enabled with a 1024 token budget
- Tool results are displayed with visual feedback
- Graceful shutdown on Ctrl+C or "exit"
- Maintains full conversation history for context
This is a clean, modular implementation that makes it easy to add new tools by just decorating functions. The async design allows for efficient concurrent tool execution, which is great for operations that might involve multiple file reads or network calls.
Pretty elegant for a self-contained script! The use of `uv run --script` with embedded dependencies makes it very portable.
Adding More Tools
With our infrastructure in place, adding more capabilities becomes straightforward. Let’s implement three essential tools: file writing, editing, and shell execution:
@tool
async def write_file(filename: str, content: str):
    """Write the file with the content. This is an overwrite"""
    async with aiofiles.open(filename, 'w') as f:
        await f.write(content)
    return {"success": True, "filename": filename, "output": f"wrote to {filename}"}

@tool
async def edit_file(filename: str, old_text: Annotated[str, "the text to replace"], new_text: Annotated[str, "the text to replace with"]):
    """Edit the file with the new text. Note that the text to replace should only appear once in the file."""
    async with aiofiles.open(filename, 'r') as f:
        content = await f.read()
    if content.count(old_text) != 1:
        return {"success": False, "filename": filename, "output": f"old text appears {content.count(old_text)} times in the file"}
    async with aiofiles.open(filename, 'w') as f:
        await f.write(content.replace(old_text, new_text))
    return {"success": True, "filename": filename, "output": f"edited {filename}"}

@tool
async def shell(command: Annotated[str, "command to execute"], timeout: Annotated[int, "timeout in seconds"] = 30):
    """Execute a bash command"""
    try:
        p = await asyncio.create_subprocess_shell(
            command,
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.STDOUT
        )
        stdout, _ = await asyncio.wait_for(p.communicate(), timeout)
        return {"success": p.returncode == 0, "command": command, "output": stdout.decode()}
    except asyncio.TimeoutError:
        return {"success": False, "command": command, "output": "Timeout"}
Putting It All Together
Finally, we wire everything up in our main function:
async def main():
    toolbox = Toolbox([read_file, write_file, edit_file, shell])
    messages = []
    system_prompt = f"""Your name is HAL.
You are an interactive CLI tool that helps with software engineering and production operations tasks.
You are running on {str(os.uname())}, today is {datetime.now().strftime('%Y-%m-%d')}
"""
    try:
        while True:
            user_input = input("> ")
            if user_input.lower() == "exit":
                break
            await loop(system_prompt, toolbox, messages, user_input)
    except (KeyboardInterrupt, EOFError):
        print("👋 Goodbye!")

asyncio.run(main())
You can find the full code here.
What many people don't realise is that frontier models like Claude possess incredible system operations knowledge. Once you provide them with shell access and basic system information, even without extensive prompting, these models demonstrate expert-level operations capabilities. Claude Sonnet models in particular exhibit persistent, relentless and infectiously optimistic problem-solving behavior. When asked a question like "how is python3 installed on this machine?", they will systematically explore the filesystem, examine package managers, check environment variables, and execute diagnostic commands until they find a definitive answer.
When commands fail or produce errors, the model treats these as valuable debugging signals, much like an experienced engineer would. It analyses error messages, adjusts its approach, and tries alternative strategies to work around the problem.
While hallucination remains a concern (perhaps one of the most common criticisms of LLMs), the model's ability to execute commands creates an unreasonably effective feedback loop. When its assumptions prove incorrect, the resulting errors quickly ground the model back in reality, forcing it to revise its mental model and try new approaches based on actual system behavior rather than potentially outdated training data or hallucinations.
> how are python installed on this machine
🤖 I'll help you check how Python is installed on this machine. Let me run some commands to investigate.
✅ shell:
{
"success": true,
"command": "python --version 2>/dev/null || echo \"python command not found\"",
"output": "Python 3.13.2\n"
}
✅ shell:
{
"success": true,
"command": "python3 --version 2>/dev/null || echo \"python3 command not found\"",
"output": "Python 3.13.2\n"
}
✅ shell:
{
"success": true,
"command": "which python python3",
"output": "[redacted]/.cache/uv/environments-v2/hal-b49e6f636ce55f4b/bin/python\n[redacted]/.cache/uv/environments-v2/hal-b49e6f636ce55f4b/bin/python3\n"
}
...
🤖 Based on my investigation, here's how Python is installed on this machine:
## Python Installation Summary
**Current Python Version**: Python 3.13.2
**Installation Method**: **UV (Ultra-fast Python package installer)**
### Key Details:
1. **UV Virtual Environment**: Python is currently running inside a UV-managed virtual environment located at:
[redacted]/.cache/uv/environments-v2/hal-b49e6f636ce55f4b/
2. **Python Source**: The actual Python interpreter is managed by UV and stored at:
[redacted]/.local/share/uv/python/cpython-3.13.2-linux-x86_64-gnu/
3. **UV Version**: uv 0.6.5 is installed and managing the Python environment
4. **Symbolic Links**: The Python commands in the virtual environment are symbolic links:
- `python` → points to the full Python 3.13 interpreter
- `python3` → points to `python`
- `python3.13` → points to `python`
...
It is just the beginning
Our AI agent weighs in at just over 150 lines of code, yet it can read files, write content, make precise edits, and execute shell commands. The power comes not from complex orchestration, but from the sophisticated reasoning capabilities of current-generation LLMs combined with a simple, well-structured tool interface.
While this agent can solve real-world problems, it’s still in its infancy. There is a lot of grunt work needed to make it very useful, such as better prompts, LLM-friendly output formatting, sub-agent orchestration, and Model Context Protocol support. The remarkable aspect is that with proper guidance, the agent can evolve and extend itself.
This self-improvement capability becomes evident when we guide HAL to build new tools. For instance, we can walk it through creating a web fetch capability - https://gist.github.com/jingkaihe/538ba67804381bf6767b255cdd9b4fc9#file-build-yourself-a-webfetch-tool-txt
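The gist walks through that exercise in full. For reference, a hand-written sketch of such a tool in the same @tool shape might look something like the following; this is an illustration, not the gist's code, and it assumes httpx is added to the script's inline dependencies, with output fields chosen to mirror the transcript below:

import httpx  # hypothetical extra dependency: add "httpx" to the script's dependency block

@tool
async def web_fetch(url: Annotated[str, "the URL to fetch"]):
    """Fetch a web page and return its raw content"""
    async with httpx.AsyncClient(follow_redirects=True, timeout=30) as client:
        r = await client.get(url)
        return {
            "success": r.status_code < 400,
            "url": url,
            "final_url": str(r.url),
            "content_type": r.headers.get("content-type", ""),
            "output": r.text,
        }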
With this new tool, HAL can now access and analyse web content, effectively reading newspapers and extracting structured information:
./hal
> what is the top 10 news on hacker news, give me the title, author and links
🤖 I'll fetch the top stories from Hacker News for you.
✅ web_fetch:
{
"success": true,
"url": "https://news.ycombinator.com/",
"final_url": "https://news.ycombinator.com/",
"content_type": "text/html; charset=utf-8",
"output": "Hacker News\n\n[OMITTED]"
}
🤖 Here are the top 10 news stories from Hacker News:
## Top 10 Hacker News Stories
1. **Compiling LLMs into a MegaKernel: A path to low-latency inference**
- Author: matt_d
- Link: https://zhihaojia.medium.com/compiling-llms-into-a-megakernel-a-path-to-low-latency-inference-cf7840913c17
- 102 points, 25 comments
2. **Literate programming tool for any language**
- Author: LorenDB
- Link: https://github.com/zyedidia/Literate
- 13 points, 2 comments
3. **Curved-Crease Sculpture**
- Author: wonger_
- Link: https://erikdemaine.org/curved/
- 147 points, 21 comments
4. **Homegrown Closures for Uxn**
- Author: todsacerdoti
- Link: https://krzysckh.org/b/Homegrown-closures-for-uxn.html
- 56 points, 2 comments
5. **Andrej Karpathy: Software in the era of AI [video]**
- Author: sandslash
- Link: https://www.youtube.com/watch?v=LCEmiRjPEtQ
- 1064 points, 588 comments
6. **Show HN: EnrichMCP – A Python ORM for Agents**
- Author: bloppe
- Link: https://github.com/featureform/enrichmcp
- 71 points, 20 comments
7. **How OpenElections uses LLMs**
- Author: m-hodges
- Link: https://thescoop.org/archives/2025/06/09/how-openelections-uses-llms/index.html
- 74 points, 23 comments
8. **Show HN: A DOS-like hobby OS written in Rust and x86 assembly**
- Author: krustowski
- Link: https://github.com/krustowski/rou2exOS
- 135 points, 26 comments
9. **Show HN: RM2000 Tape Recorder, an audio sampler for macOS**
- Author: marcelox86
- Link: https://rm2000.app
- 14 points, 4 comments
10. **Extracting memorized pieces of books from open-weight language models**
- Author: fzliu
- Link: https://arxiv.org/abs/2505.12546
- 34 points, 13 comments
The most popular story currently is Andrej Karpathy's video about "Software in the era of AI" with over 1000 points and nearly 600 comments!