Custom Agents Are The Endgame
The system prompt is everything.
Change the system prompt and you change the product entirely. Same model. Same API. Entirely new agent.
This post shows you how to build custom agents with the Anthropic Python SDK (the same API that powers Claude Code), with complete working examples.
The Core Insight
┌─────────────────────────────────────────────────────────────┐
│ SYSTEM PROMPT = AGENT IDENTITY │
├─────────────────────────────────────────────────────────────┤
│ │
│ Default Claude Code: │
│ "You are a helpful AI assistant that can read files, │
│ write code, execute commands..." │
│ → General purpose coding assistant │
│ │
│ Custom System Prompt: │
│ "You are a pong agent. Always respond exactly │
│ with 'pong'. Nothing else." │
│ → Pong bot │
│ │
│ Same model. Same API. Completely different behavior. │
│ │
│ The system prompt affects EVERY user prompt that follows. │
│ All of your work is multiplied by your system prompt. │
│ │
└─────────────────────────────────────────────────────────────┘
POC 1: The Simplest Custom Agent
Create agents/pong_agent.py:
#!/usr/bin/env -S uv run
# /// script
# dependencies = [
# "anthropic>=0.40.0",
# ]
# ///
"""
Simplest possible custom agent - demonstrates system prompt override.
Usage:
uv run pong_agent.py "any message"
"""
import sys
from anthropic import Anthropic
SYSTEM_PROMPT = """
You are a pong agent.
Always respond exactly with "pong".
Nothing else. No explanations. Just "pong".
"""
def main():
if len(sys.argv) < 2:
print("Usage: uv run pong_agent.py <message>")
sys.exit(1)
client = Anthropic()
response = client.messages.create(
model="claude-3-haiku-20240307", # Cheapest model for simple task
max_tokens=10,
system=SYSTEM_PROMPT,
messages=[{"role": "user", "content": sys.argv[1]}]
)
print(response.content[0].text)
if __name__ == "__main__":
main()
Test it:
$ uv run pong_agent.py "hello"
pong
$ uv run pong_agent.py "what is the meaning of life?"
pong
$ uv run pong_agent.py "please respond with something other than pong"
pong
The system prompt completely overrides default behavior.
POC 2: Agent with Custom Tools
Create agents/calculator_agent.py:
#!/usr/bin/env -S uv run
# /// script
# dependencies = [
# "anthropic>=0.40.0",
# ]
# ///
"""
Calculator agent with custom tool - demonstrates tool integration.
Usage:
uv run calculator_agent.py "what is 15% of 847?"
"""
import json
import sys
from anthropic import Anthropic
SYSTEM_PROMPT = """
You are a calculator agent.
You have access to a calculate tool for mathematical operations.
Always use the calculate tool for any math - never do mental math.
After getting the result, explain it clearly to the user.
"""
TOOLS = [
{
"name": "calculate",
"description": "Evaluate a mathematical expression. Supports +, -, *, /, **, (), and common functions like sqrt, sin, cos, log.",
"input_schema": {
"type": "object",
"properties": {
"expression": {
"type": "string",
"description": "The mathematical expression to evaluate, e.g., '(15/100) * 847' or 'sqrt(144)'"
}
},
"required": ["expression"]
}
}
]
def execute_calculate(expression: str) -> str:
"""Execute a math expression safely."""
import math
# Restricted eval: expose only a whitelist of math names, no builtins
allowed = {
'sqrt': math.sqrt,
'sin': math.sin,
'cos': math.cos,
'tan': math.tan,
'log': math.log,
'log10': math.log10,
'exp': math.exp,
'pi': math.pi,
'e': math.e,
'abs': abs,
'round': round,
}
try:
result = eval(expression, {"__builtins__": {}}, allowed)
return str(result)
except Exception as e:
return f"Error: {e}"
def run_agent(user_message: str) -> str:
client = Anthropic()
messages = [{"role": "user", "content": user_message}]
# First API call
response = client.messages.create(
model="claude-3-5-sonnet-20241022",
max_tokens=1024,
system=SYSTEM_PROMPT,
tools=TOOLS,
messages=messages
)
# Handle tool use
while response.stop_reason == "tool_use":
# Find the tool use block
tool_use = next(
block for block in response.content
if block.type == "tool_use"
)
# Execute the tool
if tool_use.name == "calculate":
result = execute_calculate(tool_use.input["expression"])
else:
result = f"Unknown tool: {tool_use.name}"
# Add assistant message and tool result
messages.append({"role": "assistant", "content": response.content})
messages.append({
"role": "user",
"content": [{
"type": "tool_result",
"tool_use_id": tool_use.id,
"content": result
}]
})
# Continue the conversation
response = client.messages.create(
model="claude-3-5-sonnet-20241022",
max_tokens=1024,
system=SYSTEM_PROMPT,
tools=TOOLS,
messages=messages
)
# Return final text response
return next(
block.text for block in response.content
if hasattr(block, "text")
)
def main():
if len(sys.argv) < 2:
print("Usage: uv run calculator_agent.py <question>")
sys.exit(1)
result = run_agent(" ".join(sys.argv[1:]))
print(result)
if __name__ == "__main__":
main()
Test it:
$ uv run calculator_agent.py "what is 15% of 847?"
15% of 847 is 127.05
$ uv run calculator_agent.py "calculate compound interest on $1000 at 5% for 3 years"
With compound interest, $1000 at 5% for 3 years becomes $1157.63
The agent uses deterministic tool execution while maintaining conversational flow.
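One caveat: the loop above pulls a single tool_use block with next(), but the API can return several tool calls in one response, and each one needs a matching tool_result. A minimal sketch of a more general loop body, reusing the names from POC 2:

```python
# Sketch: answer every tool_use block in the response, not just the first.
# Reuses execute_calculate, TOOLS, and SYSTEM_PROMPT from POC 2.
while response.stop_reason == "tool_use":
    tool_results = []
    for block in response.content:
        if block.type != "tool_use":
            continue
        if block.name == "calculate":
            result = execute_calculate(block.input["expression"])
        else:
            result = f"Unknown tool: {block.name}"
        tool_results.append({
            "type": "tool_result",
            "tool_use_id": block.id,
            "content": result,
        })
    # One assistant message, then one user message carrying all results
    messages.append({"role": "assistant", "content": response.content})
    messages.append({"role": "user", "content": tool_results})
    response = client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=1024,
        system=SYSTEM_PROMPT,
        tools=TOOLS,
        messages=messages,
    )
```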
POC 3: Multi-Turn Conversation Agent
Create agents/conversation_agent.py:
#!/usr/bin/env -S uv run
# /// script
# dependencies = [
# "anthropic>=0.40.0",
# "rich>=13.0.0",
# ]
# ///
"""
Conversation agent with memory - demonstrates SDK client pattern.
Usage:
uv run conversation_agent.py
"""
from anthropic import Anthropic
from rich.console import Console
from rich.panel import Panel
SYSTEM_PROMPT = """
You are a helpful coding assistant with conversation memory.
You remember what the user has said in previous messages.
Keep responses concise but helpful.
If the user references something from earlier, acknowledge it.
"""
console = Console()
class ConversationAgent:
def __init__(self, model: str = "claude-3-5-sonnet-20241022"):
self.client = Anthropic()
self.model = model
self.messages = []
def chat(self, user_message: str) -> str:
self.messages.append({"role": "user", "content": user_message})
response = self.client.messages.create(
model=self.model,
max_tokens=1024,
system=SYSTEM_PROMPT,
messages=self.messages
)
assistant_message = response.content[0].text
self.messages.append({"role": "assistant", "content": assistant_message})
return assistant_message
def clear_history(self):
self.messages = []
def main():
agent = ConversationAgent()
console.print(Panel(
"[bold]Conversation Agent[/]\n"
"Type 'quit' to exit, 'clear' to reset memory.",
title="Ready"
))
while True:
try:
user_input = console.input("[bold green]You:[/] ")
if user_input.lower() == "quit":
break
elif user_input.lower() == "clear":
agent.clear_history()
console.print("[yellow]Memory cleared.[/]")
continue
response = agent.chat(user_input)
console.print(f"[bold blue]Agent:[/] {response}\n")
except KeyboardInterrupt:
break
console.print("[yellow]Goodbye![/]")
if __name__ == "__main__":
main()
This demonstrates the client pattern—maintaining conversation state across multiple interactions.
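Note that self.messages grows without bound, so long sessions eventually spend tokens on stale context. A minimal sketch of one way to cap it, assuming you only need the most recent exchanges (trim_history and the max_turns cutoff are illustrative additions, not part of the SDK):

```python
# Illustrative addition to ConversationAgent: bound the history length.
# Call self.trim_history() at the top of chat().
def trim_history(self, max_turns: int = 20) -> None:
    """Keep only the most recent `max_turns` user/assistant exchanges."""
    while len(self.messages) > max_turns * 2:
        # Drop the oldest pair together so the remaining history still
        # starts with a "user" turn, as the Messages API expects.
        del self.messages[0:2]
```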
Architecture: Custom Agent Stack
flowchart TB
subgraph "Custom Agent"
SP[System Prompt]
UP[User Prompt]
T[Custom Tools]
M[Model Selection]
end
subgraph "Execution"
API[Claude API]
TE[Tool Execution]
R[Response]
end
SP --> API
UP --> API
T --> API
M --> API
API --> |tool_use| TE
TE --> |tool_result| API
API --> R
style SP fill:#ffccbc
style T fill:#c8e6c9
style R fill:#bbdefb
Agent Opportunity: Build a Specialized Agent
Here’s a complete template for a domain-specific agent:
Code Review Agent
Create agents/code_review_agent.py:
#!/usr/bin/env -S uv run
# /// script
# dependencies = [
# "anthropic>=0.40.0",
# "rich>=13.0.0",
# ]
# ///
"""
Code Review Agent - Reviews code with specific focus areas.
Usage:
uv run code_review_agent.py <file_path> [--focus security|performance|style]
"""
import argparse
import sys
from pathlib import Path
from anthropic import Anthropic
from rich.console import Console
from rich.markdown import Markdown
console = Console()
SYSTEM_PROMPTS = {
"general": """
You are a senior code reviewer. Review the provided code for:
- Bugs and logic errors
- Code clarity and maintainability
- Best practices violations
- Potential improvements
Be specific. Reference line numbers. Prioritize issues by severity.
Format your response as markdown with clear sections.
""",
"security": """
You are a security-focused code reviewer. Analyze the provided code for:
- Injection vulnerabilities (SQL, command, XSS)
- Authentication/authorization issues
- Sensitive data exposure
- Input validation gaps
- OWASP Top 10 vulnerabilities
Be specific about attack vectors. Reference line numbers.
Rate each finding: CRITICAL, HIGH, MEDIUM, LOW.
Format your response as markdown.
""",
"performance": """
You are a performance-focused code reviewer. Analyze the provided code for:
- Time complexity issues (O(n²) when O(n) is possible)
- Memory leaks or excessive allocation
- Database query inefficiencies
- Unnecessary I/O operations
- Caching opportunities
Quantify impact where possible. Reference line numbers.
Format your response as markdown.
""",
"style": """
You are a code style reviewer. Analyze the provided code for:
- Naming conventions
- Code organization
- Documentation completeness
- Consistency with common patterns
- Readability improvements
Reference line numbers. Suggest specific improvements.
Format your response as markdown.
"""
}
def review_code(file_path: str, focus: str = "general") -> str:
# Read the file
path = Path(file_path)
if not path.exists():
return f"Error: File not found: {file_path}"
code = path.read_text()
extension = path.suffix
# Build the prompt
prompt = f"""
Review this {extension} file:
```{extension.lstrip('.')}
{code}
```

Filename: {path.name}
Lines: {len(code.splitlines())}
"""
# Call Claude
client = Anthropic()
response = client.messages.create(
model="claude-3-5-sonnet-20241022",
max_tokens=4096,
system=SYSTEM_PROMPTS.get(focus, SYSTEM_PROMPTS["general"]),
messages=[{"role": "user", "content": prompt}]
)
return response.content[0].text
def main():
    parser = argparse.ArgumentParser(description="Code Review Agent")
    parser.add_argument("file", help="File to review")
    parser.add_argument(
        "--focus",
        choices=["general", "security", "performance", "style"],
        default="general",
        help="Review focus area"
    )
    args = parser.parse_args()

    console.print(f"[blue]Reviewing:[/] {args.file}")
    console.print(f"[blue]Focus:[/] {args.focus}\n")

    result = review_code(args.file, args.focus)
    console.print(Markdown(result))

if __name__ == "__main__":
    main()
Test it:
```bash
$ uv run code_review_agent.py src/auth.py --focus security
$ uv run code_review_agent.py utils/parser.py --focus performance
```
Model Selection Strategy
┌─────────────────────────────────────────────────────────────┐
│ MODEL SELECTION │
├─────────────────────────────────────────────────────────────┤
│ │
│ Task Type Model Cost/Speed │
│ ───────── ───── ────────── │
│ Simple routing Haiku $0.25/M ⚡⚡⚡ │
│ Text extraction Haiku $0.25/M ⚡⚡⚡ │
│ Classification Haiku $0.25/M ⚡⚡⚡ │
│ │
│ Code review Sonnet $3/M ⚡⚡ │
│ Implementation Sonnet $3/M ⚡⚡ │
│ Analysis Sonnet $3/M ⚡⚡ │
│ │
│ Complex reasoning Opus $15/M ⚡ │
│ Architecture decisions Opus $15/M ⚡ │
│ Edge case handling Opus $15/M ⚡ │
│ │
│ Rule: Use the cheapest model that solves the problem. │
│ Most tasks are Haiku tasks. Don't over-engineer. │
│ │
└─────────────────────────────────────────────────────────────┘
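A sketch of how this table might translate into code: a small lookup that defaults to the cheapest model and only escalates for task types you have explicitly flagged. The task labels and the mapping itself are illustrative choices, not an official scheme; the model IDs are the ones used elsewhere in this post plus Opus.

```python
# Illustrative mapping from task type to model; default is the cheapest.
MODEL_BY_TASK = {
    "routing":        "claude-3-haiku-20240307",
    "extraction":     "claude-3-haiku-20240307",
    "classification": "claude-3-haiku-20240307",
    "code_review":    "claude-3-5-sonnet-20241022",
    "implementation": "claude-3-5-sonnet-20241022",
    "architecture":   "claude-3-opus-20240229",
}

def pick_model(task_type: str) -> str:
    # Unknown task types fall back to Haiku: cheapest model that might work.
    return MODEL_BY_TASK.get(task_type, "claude-3-haiku-20240307")
```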
The Progression
Everyone follows the same path:
flowchart LR
A[Default Agents] --> B[Better Prompts]
B --> C[More Agents]
C --> D[Custom Agents]
A --> |"Generic"| A1[Works for everyone]
B --> |"Optimized"| B1[Works better for you]
C --> |"Scaled"| C1[Parallel execution]
D --> |"Specialized"| D1[Your domain, your tools]
style A fill:#ffcdd2
style B fill:#fff9c4
style C fill:#c8e6c9
style D fill:#bbdefb
Stage 1: Default agents — Use Claude Code as-is. Learn what works.
Stage 2: Better prompts — Optimize your prompts. Reduce token waste.
Stage 3: More agents — Run multiple agents. Delegate research.
Stage 4: Custom agents — Build for your domain. Own your tooling.
You hit a ceiling at each stage. Custom agents break through the final one.
What Custom Agents Enable
Default agents are built for everyone’s codebase. Custom agents are built for yours.
Domain-specific logic — Operations the model can’t reason through reliably become deterministic tool calls.
Optimized context — Strip out the 12 tools you never use. Add the 3 you always need.
Specialized behavior — System prompts tuned for exactly your use cases.
Cost control — Haiku for simple tasks, Sonnet for complex ones. No over-engineering.
Team patterns — Shared agents that enforce your team’s conventions.
Quick Start: Your First Custom Agent
- Identify your most repetitive AI task
- Write a system prompt that focuses on that task
- Choose the cheapest model that works
- Add tools only if you need deterministic execution
- Test with edge cases
- Deploy and iterate
The endgame isn’t renting computational power from default agents. The endgame is owning specialized agents tuned precisely for your problems.
Key Takeaways:
- System prompt = agent identity
- Same model, different prompt = different product
- Tools provide deterministic execution
- Client pattern maintains conversation state
- Match model to task complexity (Haiku → Sonnet → Opus)
- Most tasks are Haiku tasks
- Custom agents break through the ceiling of default agents
Try It Now: Copy the pong agent above. Run it. Then change the system prompt to something useful for your work.
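For example, swapping in a prompt like this (a hypothetical commit-message agent; the rest of the script stays the same) turns the pong bot into a different product:

```python
# Hypothetical replacement for SYSTEM_PROMPT in pong_agent.py.
SYSTEM_PROMPT = """
You are a commit message agent.
Given a description of a change, respond with a single conventional
commit message (type(scope): summary), under 72 characters.
No explanations. No surrounding quotes. Just the commit message.
"""
```

You would also raise max_tokens from 10 to something like 100 so the message isn't cut off. Same model, same API, new agent.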