[Ch 5] Tools Integration, Guardrails & Safety Patterns
In Ch 4 we built a working multi-turn agent. But it has a critical gap: the tools have no idea who is calling them. In production, you need tools to know the current user, apply safety filters to inputs and outputs, and pause for human approval before irreversible actions. This chapter implements all three.
The Three Safety Layers
This chapter adds three layers, one per part: context injection (tools know who is calling), guardrails (inputs and outputs are filtered before and after the LLM), and human-in-the-loop (irreversible actions require explicit approval).
Part 1: AgentContext — Safe User Context Injection
The Problem
You need your tools to know user_id, session_id, and any per-request metadata. The naive approach is to embed them in the system prompt or the user message:
# ❌ Bad: pollutes the conversation history
user_message = f"[USER_ID: {user_id}] {actual_query}"
This is bad for three reasons:
- It wastes precious context window tokens on every turn
- The LLM may paraphrase or drop the metadata in its reasoning
- It leaks internal identifiers into the conversation transcript
The Solution: config["configurable"]
LangGraph passes a RunnableConfig through every node and tool call. Use it to carry context that’s external to the conversation:
# agent/context.py
from pydantic import BaseModel
class AgentContext(BaseModel):
    """Typed context passed through every agent call without touching messages."""
    user_id: str
    session_id: str
    permissions: list[str] = []  # e.g. ["read", "write", "delete"]
    org_id: str = ""
# agent/tools.py — accessing context inside a tool
from langchain_core.runnables import RunnableConfig
from langchain_core.tools import tool
from .context import AgentContext
@tool
def delete_task(task_id: str, config: RunnableConfig) -> str:
    """Permanently delete a task. Requires 'delete' permission."""
    # LangGraph automatically injects `config` — just add it as a parameter
    ctx_data = config.get("configurable", {}).get("agent_context")
    if ctx_data is None:
        return "Error: No agent context provided."
    ctx = AgentContext(**ctx_data) if isinstance(ctx_data, dict) else ctx_data
    # Permission check using context — no need to pass user_id as a tool arg
    if "delete" not in ctx.permissions:
        return f"Error: User {ctx.user_id} does not have delete permission."
    if task_id not in _tasks:
        return f"Error: Task '{task_id}' not found."
    task_title = _tasks.pop(task_id)["title"]
    return f"Deleted task '{task_title}' (ID: {task_id})."
# agent/main.py — how to pass context at call time
from .context import AgentContext
ctx = AgentContext(
    user_id="user-42",
    session_id="sess-abc",
    permissions=["read", "write"],  # no "delete" for this user
    org_id="acme-corp",
)

config = {
    "configurable": {
        "thread_id": "user-42-session-1",
        "agent_context": ctx.model_dump(),  # serialize for LangGraph
    }
}
app.invoke({"messages": [HumanMessage(content="Delete task task-001")]}, config=config)
# → Tool returns: "Error: User user-42 does not have delete permission."
The key insight: config["configurable"] is available in every tool function — just add a config: RunnableConfig parameter, and LangGraph injects it automatically. No message pollution, no token waste.
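To see the lookup in isolation, here is a framework-free sketch of what a context-aware tool does with the injected config. The `_tasks` store and `list_my_tasks` function are hypothetical stand-ins for illustration, not part of LangGraph or the chapter's agent:

```python
# Hypothetical in-memory store keyed by task ID; owners are user IDs.
_tasks = {
    "task-001": {"title": "Write report", "owner": "user-42"},
    "task-002": {"title": "Review PR", "owner": "user-99"},
}

def list_my_tasks(config: dict) -> str:
    """Return only the tasks owned by the user in the injected context."""
    ctx = config.get("configurable", {}).get("agent_context")
    if ctx is None:
        return "Error: No agent context provided."
    mine = [t["title"] for t in _tasks.values() if t["owner"] == ctx["user_id"]]
    return f"Your tasks: {', '.join(mine)}" if mine else "You have no tasks."

config = {"configurable": {"agent_context": {"user_id": "user-42"}}}
print(list_my_tasks(config))  # → Your tasks: Write report
```

The caller decides what goes into `agent_context`; the tool never sees a user ID in the message stream.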
Part 2: NeMo Guardrails
NeMo Guardrails is NVIDIA’s open-source library for adding programmable safety rails to LLM applications. It works as a layer around your agent: inputs go through rails before reaching the LLM, and outputs go through rails before being returned.
Installation
pip install nemoguardrails
Guardrail Configuration
NeMo uses Colang (a domain-specific language) for defining rails. Create a config directory:
guardrails/
├── config.yml
└── main.co # Colang flow definitions
# guardrails/config.yml
models:
  - type: main
    engine: openai
    model: gpt-4o-mini

rails:
  input:
    flows:
      - check harmful input
  output:
    flows:
      - check harmful output
# guardrails/main.co
# ── Input rail: block harmful requests ───────────────────────────────────────
define flow check harmful input
  $is_harmful = execute check_input_for_harm
  if $is_harmful
    bot refuse harmful input
    stop

define bot refuse harmful input
  "I'm sorry, I can't help with that request."

# ── Output rail: block harmful responses ─────────────────────────────────────
define flow check harmful output
  $is_harmful = execute check_output_for_harm
  if $is_harmful
    bot provide safe response
    stop

define bot provide safe response
  "I'm not able to provide that information."
Integrating Guardrails with Your Agent
The cleanest pattern is to wrap your agent call — not embed NeMo inside the LangGraph graph itself. This keeps the graph simple and makes the rails easy to swap:
# agent/guardrail.py
import os
from nemoguardrails import RailsConfig, LLMRails
def load_rails(config_path: str = "guardrails/") -> LLMRails:
    """Load NeMo guardrails from config directory."""
    config = RailsConfig.from_path(config_path)
    return LLMRails(config)
rails = load_rails()
# agent/main.py — wrapped agent call with guardrails
async def safe_chat(app, thread_id: str, user_input: str) -> str:
    """Run user input through guardrails before and after the agent."""
    # ── 1. Input guardrail ────────────────────────────────────────────────────
    input_check = await rails.generate_async(
        messages=[{"role": "user", "content": user_input}]
    )
    # NeMo returns the guardrail response; if it's a refusal, return immediately
    if _is_refusal(input_check):
        return input_check["content"]
    # ── 2. Run the agent ──────────────────────────────────────────────────────
    config = {"configurable": {"thread_id": thread_id}}
    final_state = await app.ainvoke(  # async variant — don't block the event loop
        {"messages": [HumanMessage(content=user_input)]},
        config=config,
    )
    agent_response = final_state["messages"][-1].content
    # ── 3. Output guardrail ───────────────────────────────────────────────────
    output_check = await rails.generate_async(
        messages=[
            {"role": "user", "content": user_input},
            {"role": "assistant", "content": agent_response},
        ]
    )
    if _is_refusal(output_check):
        return "I can't provide that response."
    return agent_response

def _is_refusal(response: dict) -> bool:
    """Detect if NeMo triggered a refusal rail."""
    content = response.get("content", "")
    refusal_phrases = ["I'm sorry, I can't", "I'm not able to"]
    return any(phrase in content for phrase in refusal_phrases)
⚠️ Guardrails add latency. Each rail check is an additional LLM call (~500ms–1s). Profile your p95 latency before adding rails to every endpoint. For low-risk internal tools, input rails alone may be sufficient.
Part 3: Human-in-the-Loop (HITL)
Some actions are irreversible: deleting records, sending emails, making payments. For these, you want the agent to pause and ask the human before proceeding. LangGraph has first-class support for this via interrupt().
How interrupt() Works
When the agent reaches a destructive tool, the tool calls interrupt(): the graph state (including the pending tool call) is saved to the checkpointer, and the interrupt payload is surfaced to the user, e.g. "⚠️ About to delete 5 tasks. Confirm? (yes/no)". When your code resumes the graph with {"approved": true}, execution picks up inside the tool at the interrupt point, the deletion runs, and the agent replies "Deleted 5 completed tasks."
Implementation
# agent/tools.py — a tool that interrupts for confirmation
from pydantic import BaseModel, Field
from langgraph.types import interrupt

class DeleteCompletedInput(BaseModel):
    confirm_message: str = Field(
        description="Message to show the user when asking for confirmation"
    )

@tool("delete_completed_tasks", args_schema=DeleteCompletedInput)
def delete_completed_tasks(confirm_message: str) -> str:
    """Delete all tasks marked as 'done'. Requires human confirmation."""
    done_tasks = [t for t in _tasks.values() if t["status"] == "done"]
    if not done_tasks:
        return "No completed tasks to delete."
    # ── INTERRUPT: pause and ask the human ───────────────────────────────────
    approval = interrupt({
        "question": f"About to permanently delete {len(done_tasks)} completed task(s). Continue?",
        "tasks_to_delete": [t["title"] for t in done_tasks],
        "action": "delete_completed_tasks",
    })
    # Execution resumes here after the human responds
    if not approval.get("approved", False):
        return "Deletion cancelled by user."
    # Execute the deletion
    for task in done_tasks:
        del _tasks[task["id"]]
    return f"Successfully deleted {len(done_tasks)} completed task(s)."
# agent/main.py — handling the interrupt in your API/CLI layer
from langgraph.types import Command
def chat_with_hitl(app, thread_id: str, user_input: str):
    config = {"configurable": {"thread_id": thread_id}}
    # First invoke — may pause at an interrupt
    for event in app.stream(
        {"messages": [HumanMessage(content=user_input)]},
        config=config,
        stream_mode="values",
    ):
        if "__interrupt__" in event:
            interrupt_data = event["__interrupt__"][0].value
            print("\n⚠️ CONFIRMATION REQUIRED")
            print(f"  {interrupt_data['question']}")
            print(f"  Tasks: {', '.join(interrupt_data['tasks_to_delete'])}")
            answer = input("  Approve? (yes/no): ").strip().lower()
            approved = answer in ("yes", "y")
            # Resume the graph with the human's decision
            for resume_event in app.stream(
                Command(resume={"approved": approved}),
                config=config,
                stream_mode="values",
            ):
                last = resume_event.get("messages", [])
                if last:
                    print(f"\nAssistant: {last[-1].content}")
            return
    # No interrupt — normal completion
    state = app.get_state(config)
    if state.values["messages"]:
        print(f"\nAssistant: {state.values['messages'][-1].content}")
What happens under the hood:
- interrupt() raises a special exception that LangGraph catches
- The graph state (including the pending tool call) is saved to the checkpointer
- The graph returns control to your code with an __interrupt__ event
- When you call app.stream(Command(resume=...), config=config), LangGraph restores the state and continues from exactly where it paused
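The control flow can be mimicked in a framework-free toy. This is only an illustration of the pause/resume dance, not LangGraph's actual internals; every name here is hypothetical:

```python
class _Interrupt(Exception):
    """Toy stand-in for the exception interrupt() raises."""
    def __init__(self, payload):
        self.payload = payload

_resume_value = None  # toy stand-in for the checkpointed resume value

def interrupt_sketch(payload):
    if _resume_value is None:
        raise _Interrupt(payload)  # first pass: pause the "graph"
    return _resume_value           # resumed pass: hand back the human's answer

def delete_step():
    approval = interrupt_sketch({"question": "About to delete 5 tasks. Continue?"})
    return "deleted" if approval["approved"] else "cancelled"

try:
    delete_step()
except _Interrupt as pause:
    print(pause.payload["question"])    # surface the question to the human
    _resume_value = {"approved": True}  # human says yes

print(delete_step())  # → deleted
```

The real mechanism is the same shape: the first pass stops at interrupt(), and the rerun after Command(resume=...) gets the resume value back from that same call site.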
Tool Design Checklist
Before shipping any tool to production:
| Check | Why |
|---|---|
| ✅ Pydantic args_schema with Field(description=...) | The LLM reads descriptions to know how to call the tool |
| ✅ Returns str error messages (not exceptions) | Prevents tool errors from crashing the agent loop |
| ✅ Validates inputs explicitly | Don’t trust the LLM to pass well-formed data |
| ✅ Uses config: RunnableConfig for user context | No user IDs or secrets in the message stream |
| ✅ interrupt() for irreversible actions | Delete, send, pay, deploy — always ask first |
| ✅ Logs tool calls with input/output (see Ch 7) | Essential for debugging production issues |
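The "errors as strings" row is worth a sketch: catch exceptions inside the tool body and return a readable message, so a bad tool call becomes feedback the LLM can react to instead of a crash. get_task and its tiny _tasks store here are hypothetical examples:

```python
# Hypothetical store; in the real agent this comes from Ch 4's task tools.
_tasks = {"task-001": {"title": "Write report", "status": "todo"}}

def get_task(task_id: str) -> str:
    """Look up a task; return an error string rather than raising."""
    try:
        task = _tasks[task_id]
    except KeyError:
        return f"Error: Task '{task_id}' not found."  # string, never raise
    return f"{task['title']} ({task['status']})"

print(get_task("task-999"))  # → Error: Task 'task-999' not found.
```

The LLM sees the error text in the tool message and can retry with a valid ID or tell the user; an uncaught exception would instead abort the whole graph run.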
.env.example
# .env.example
OPENAI_API_KEY=your-api-key-here
Summary
| Pattern | Problem Solved |
|---|---|
| AgentContext via config["configurable"] | Pass user context to tools without message pollution |
| NeMo Guardrails (input rail) | Block harmful or off-topic requests before the LLM sees them |
| NeMo Guardrails (output rail) | Prevent the agent from returning harmful content |
| interrupt() + Command(resume=...) | Pause for human approval before irreversible actions |
| Tool error strings (not exceptions) | Tools fail gracefully without crashing the agent loop |
In the next chapter, we add proper memory management — giving the agent persistent long-term knowledge via vector stores and controlling the growth of conversation history.
