Agentic AI - Prompt Engineering

AI Agent System Prompt Builder

Build and lint a structured agent system prompt with role, tools, memory, fallback policy, output format, and eval criteria across LangGraph, CrewAI, and Anthropic patterns.

Author: Mudassir Khan. Last updated May 17, 2026.

AI Agent System Prompt Builder illustrationA responsive schematic diagram representing the tool workflow from inputs through calculation to recommendation.inputsmodelanswer
You are Tier-1 Support Agent.
Role: Resolve routine customer support tickets using approved tools.
Tools: {"lookup_account":{"type":"object","properties":{"email":{"type":"string"}}}}
Fallback policy: Escalate to human support after one failed tool retry.
Output format: structured JSON
Success criteria: accurate, grounded, concise, and escalated when uncertain.

Estimated tokens

91

Lint warnings

0

No basic lint warnings.

Direct answer

An AI agent system prompt should define role, success criteria, tool inventory and rules, memory policy, output format, fallback behavior, and example traces. A good system prompt reads like an operating contract for the agent, not a personality description. This builder generates that structure for LangGraph, CrewAI, and Anthropic tool use formats.

Tier 1 support agent prompt

Input: Role, account lookup tool schema, escalation fallback, structured JSON output, and support success criteria.

Output: The output should generate framework ready prompt text and deterministic lint warnings for missing sections.

How to use this tool

  1. 1. Describe the agent role.
  2. 2. Add tools and JSON schemas.
  3. 3. Choose memory and fallback policy.
  4. 4. Copy the generated framework specific prompt.

Anatomy of a strong agent system prompt

A strong agent system prompt has six labelled parts: role and identity, success criteria, tool inventory with usage rules, memory and state policy, output format and refusal policy, and worked examples. Each part has a deterministic purpose, which is why a structured builder produces stronger prompts than free form writing.

Role is what the agent is. Success criteria are how the agent and the team know the task is done. Tool rules are when to call which tool and what to do when a tool fails. Memory rules say what the agent should remember, summarize, or forget. Output format constrains the response shape. Examples ground the rest by showing the desired trace for one or two canonical cases.

Example prompts for LangGraph, CrewAI, and Anthropic tool use

For LangGraph, the system prompt is usually shorter because much of the contract lives in graph nodes and tool definitions. The prompt focuses on identity, refusal policy, and inter node hand off conventions. For CrewAI, the prompt is heavier because role personalities and collaboration patterns are expressed in prose. For Anthropic tool use, the prompt names tools by their declared schema and is explicit about when not to call any tool at all.

The builder outputs framework specific text for each so you can paste directly into your runtime without rewriting the same prompt three times.

ReAct vs CoT vs tool use prompt styles

ReAct style prompts interleave Thought, Action, and Observation tags so the model reasons step by step with tool calls in the middle. Chain of thought style prompts ask the model to reason before answering but do not assume external tools. Tool use style prompts let the model call declared functions natively and skip the textual scaffolding.

Modern frameworks favour native tool use because the runtime handles the call structure and the model wastes fewer tokens on Thought scaffolding. ReAct is still useful for debugging or for runtimes that lack first class tool support. Chain of thought is most useful when no tools are needed at all.

Common mistakes in agent system prompts

The most common mistake is missing fallback policy. When a tool fails or returns an unexpected shape, the agent improvises. A short paragraph that names the failure modes and the correct fallback removes most of that improvisation.

The second mistake is vague success criteria. If the prompt cannot say what a finished response looks like, neither evals nor the agent itself can. The third mistake is missing examples. A single worked example calibrates tone, format, and tool call shape far better than another paragraph of instructions.

Why agent prompts are not chatbot prompts

A chatbot prompt can define tone and response style and largely stop there. An agent prompt must define role boundaries, tools, tool use rules, memory policy, fallback behavior, output schema, and evaluation criteria. The agent has to be able to act, recover, and stop, not just respond.

Ambiguity in an agent prompt becomes operational risk. If the prompt does not say what to do when a tool fails, the agent improvises. Production systems need fewer improvisations and more explicit contracts.

What the lint checks for

The lint pass checks for missing role, weak success criteria, absent fallback policy, no output format, missing examples, and JSON schemas that do not parse. It is deterministic and local, not an LLM generated review. That makes it fast and reproducible across teams.

Treat lint warnings as a checklist rather than a verdict. A passing lint does not guarantee a good agent, but a failing lint almost always predicts a brittle one.

Assumptions and methodology

This tool uses transparent browser-side calculations and curated assumptions rather than LLM-generated recommendations. Outputs are planning estimates. They should be validated against provider pricing, production traces, engineering quotes, or domain review before money, compliance, safety, or hiring decisions are made.

Numerical defaults are dated and surfaced on the page. The methodology favours explicit assumptions over false precision: every estimate is meant to expose the variable that drives the result, not to pretend that early planning data is exact.

Turn the result into an implementation plan

Bring the scenario to a strategy call and I will pressure-test the workflow, assumptions, failure modes, and delivery path.

Book a strategy call

Frequently asked questions

What should be in an AI agent system prompt?
An AI agent system prompt should include role and identity, success criteria, tool inventory with usage rules, memory and state policy, output format and refusal rules, and one or two worked examples. Skipping any of these usually produces an agent that improvises in failure modes and is hard to evaluate after launch.
How do I write a system prompt for an AI agent?
Start by writing the role and the one sentence success criterion. Add tools with explicit usage rules and failure handling. Define memory policy and output format. Add a worked example trace. Then lint the result for missing sections. The builder generates this structure automatically and outputs framework specific text for LangGraph, CrewAI, or Anthropic tool use.
What is the difference between an agent prompt and a chatbot prompt?
An agent prompt includes tool rules, state assumptions, fallback policy, output constraints, and eval criteria. A chatbot prompt usually focuses on conversational style and task framing. The agent has to act and recover, not just respond, which is why the prompt is longer and more structured.
How do I write a good tool description for an agent?
A good tool description states when to use the tool, the required arguments, what the tool returns, and when not to use it. The agent should not infer tool policy from the tool name alone. A two sentence description plus an explicit failure case usually performs better than a long paragraph of capability marketing.
What memory strategy should an AI agent use?
Use no memory for stateless tasks, sliding windows for short sessions, summaries for long conversations, vector memory for retrieval over past interactions, and hybrid memory when both history and knowledge matter. Memory choice usually has more impact on cost and quality than model choice for long running agents.
Does the agent system prompt builder work for any model?
The prompt structure works broadly across modern instruction tuned models. Tool call syntax differs across providers and frameworks, so the builder generates framework specific text for LangGraph, CrewAI, and Anthropic tool use. Always verify the generated format against current framework docs before shipping.
What does the system prompt lint check for?
The lint checks whether the prompt has a clear role, success criteria, output format, fallback policy, examples, and valid looking tool schemas. It catches common omissions, not semantic correctness. A passing lint does not guarantee a good agent, but a failing lint almost always predicts a brittle one.