## What Gets Tracked Automatically

- **LLM Calls**: Every call to GPT-4, Claude, Gemini, etc., with token usage and cost.
- **Tool Executions**: All tool calls with inputs, outputs, and timing.
- **Agent Reasoning**: Chain-of-thought and decision reasoning.
- **Errors**: Tool failures and LLM errors with context.
## Installation

```bash
pip install sentrial langchain-core
```
## Basic Usage

Add `SentrialCallbackHandler` to your agent’s callbacks:

```python
from sentrial import SentrialClient
from sentrial.langchain import SentrialCallbackHandler
from langchain.agents import AgentExecutor

# Initialize Sentrial
client = SentrialClient(api_key="sentrial_live_xxx")

# Create a session for this agent run
session_id = client.create_session(
    name="Customer Support Request",
    agent_name="support-agent",
    user_id="user_123"
)

# Create the callback handler
handler = SentrialCallbackHandler(client, session_id)
handler.set_input("Help me reset my password")

# Add it to your agent - that's it!
result = agent_executor.invoke(
    {"input": "Help me reset my password"},
    {"callbacks": [handler]}
)

# Finish the session with accumulated metrics
handler.finish(success=True)

# Access usage stats
print(f"Total cost: ${handler.total_cost:.4f}")
print(f"Total tokens: {handler.total_tokens}")
print(f"LLM calls: {handler.llm_calls}")
```
## Handler Options

```python
handler = SentrialCallbackHandler(
    client,                # SentrialClient instance
    session_id,            # Active session ID
    track_llm_calls=True,  # Track individual LLM calls (default: True)
    verbose=False          # Print tracking info (default: False)
)
```
### handler.set_input()

Set the user’s original query before running the agent. This is stored as the session’s input.

```python
handler.set_input("What's the weather in San Francisco?")
```
### handler.finish()

Complete the session with all accumulated metrics. This is the recommended way to finalize a session: it automatically includes token counts, cost, and duration.

```python
handler.finish(
    success=True,                    # Whether the agent succeeded
    failure_reason=None,             # Reason if success=False
    custom_metrics={"quality": 4.5}  # Optional custom metrics
)
```

This replaces manually calling `client.complete_session()` with handler stats: `finish()` does it all in one call.
## Accessing Usage Stats

After your agent runs, access real metrics from the handler:

```python
# Token counts
handler.total_prompt_tokens      # Input tokens across all LLM calls
handler.total_completion_tokens  # Output tokens across all LLM calls
handler.total_tokens             # Total tokens used

# Cost (calculated from actual usage)
handler.total_cost               # Total cost in USD

# Call counts
handler.llm_calls                # Number of LLM API calls

# Duration
handler.duration_ms              # Total duration in milliseconds

# Model info
handler.model_name               # Model used (e.g., "gpt-4o")

# Get everything as a dict
summary = handler.get_usage_summary()
# {
#     "llm_calls": 3,
#     "total_prompt_tokens": 2500,
#     "total_completion_tokens": 800,
#     "total_tokens": 3300,
#     "total_cost": 0.0425,
#     "model": "gpt-4o",
#     "duration_ms": 4521
# }
```
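These totals also make it easy to derive per-call metrics for dashboards or logs. A quick sketch using the illustrative numbers from the summary above:

```python
# Illustrative values, shaped like handler.get_usage_summary() output
summary = {
    "llm_calls": 3,
    "total_prompt_tokens": 2500,
    "total_completion_tokens": 800,
    "total_tokens": 3300,
    "total_cost": 0.0425,
    "model": "gpt-4o",
    "duration_ms": 4521,
}

# Average tokens and latency per LLM call
avg_tokens_per_call = summary["total_tokens"] / summary["llm_calls"]    # 1100.0
avg_latency_ms = summary["duration_ms"] / summary["llm_calls"]          # 1507.0

# Effective cost per 1K tokens for this run
cost_per_1k_tokens = summary["total_cost"] / summary["total_tokens"] * 1000

print(f"{avg_tokens_per_call:.0f} tokens/call, ${cost_per_1k_tokens:.4f}/1K tokens")
```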
## Completing the Session

Use `handler.finish()` to complete the session with all accumulated metrics:

```python
handler.set_input(user_query)

try:
    result = agent_executor.invoke(
        {"input": user_query},
        {"callbacks": [handler]}
    )
    handler.finish(success=True)
except Exception as e:
    handler.finish(success=False, failure_reason=str(e))
```

`finish()` automatically includes token counts, cost, duration, user input, and assistant output, so there is no need to pass them manually.
## Quick Setup with create_agent_with_sentrial()

For the simplest possible integration, use `create_agent_with_sentrial()`. It handles all LangChain version differences (including LangChain 1.0+ with LangGraph) automatically.

```python
from sentrial import SentrialClient
from sentrial.langchain import create_agent_with_sentrial
from langchain_openai import ChatOpenAI

client = SentrialClient(api_key="sentrial_live_xxx")
llm = ChatOpenAI(model="gpt-4o")

agent = create_agent_with_sentrial(
    llm=llm,
    tools=[search_tool, calculator_tool],
    client=client,
    agent_name="my-agent",
    user_id="user-123",
)

# One call: session creation, tracking, and completion are all automatic
result = agent("What's the weather in San Francisco?")
```

`create_agent_with_sentrial()` auto-detects your LangChain version. On LangChain 1.0+ it uses LangGraph’s `create_react_agent`; on older versions it uses `AgentExecutor`.
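If you need the same branch in your own code, the decision reduces to a major-version check on the installed `langchain` package. The helper below is a hypothetical sketch (not Sentrial’s actual implementation):

```python
def is_langgraph_era(version_string: str) -> bool:
    """Hypothetical helper: True for LangChain 1.0+, where AgentExecutor
    is deprecated and create_react_agent comes from LangGraph."""
    major = int(version_string.split(".")[0])
    return major >= 1

# Example: read the installed version, then branch accordingly
# from importlib.metadata import version
# if is_langgraph_era(version("langchain")):
#     ...  # use langgraph.prebuilt.create_react_agent
# else:
#     ...  # use langchain.agents.AgentExecutor
```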
## LangChain 1.0+ / LangGraph

LangChain 1.0+ deprecated `AgentExecutor` in favor of LangGraph. The Sentrial callback handler works with both. For LangGraph agents, pass the handler via `config`:

```python
from langgraph.prebuilt import create_react_agent

# Create a LangGraph agent
agent = create_react_agent(llm, tools)

# Track with Sentrial
handler = SentrialCallbackHandler(client, session_id)
handler.set_input(user_query)

result = agent.invoke(
    {"messages": [("user", user_query)]},
    config={"callbacks": [handler]},
)

handler.finish(success=True)
```

LangGraph’s `create_react_agent` never fires `on_agent_finish`, so `handler.finish()` uses the last LLM output as the assistant response automatically.
## Full Production Example

```python
from sentrial import SentrialClient
from sentrial.langchain import SentrialCallbackHandler
from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, create_react_agent
from langchain.tools import Tool

# Initialize
client = SentrialClient(api_key="sentrial_live_xxx")

# Define tools
tools = [
    Tool(
        name="search_kb",
        description="Search the knowledge base for answers",
        func=lambda q: search_knowledge_base(q)
    ),
    Tool(
        name="create_ticket",
        description="Create a support ticket",
        func=lambda data: create_support_ticket(data)
    )
]

# Create the agent
llm = ChatOpenAI(model="gpt-4o", temperature=0)
agent = create_react_agent(llm, tools, prompt_template)
executor = AgentExecutor(agent=agent, tools=tools)

def handle_support_request(user_input: str, user_id: str) -> str:
    """Handle a customer support request with full tracking."""
    # Create a session
    session_id = client.create_session(
        name=f"Support: {user_input[:40]}...",
        agent_name="customer-support",
        user_id=user_id,
        metadata={"channel": "web_chat"}
    )

    # Create the handler, with verbose logging in dev
    handler = SentrialCallbackHandler(
        client,
        session_id,
        verbose=True  # Set to False in production
    )
    handler.set_input(user_input)

    try:
        # Run the agent
        result = executor.invoke(
            {"input": user_input},
            {"callbacks": [handler]}
        )
        output = result.get("output", "")

        # Complete the session; finish() auto-includes tokens, cost, duration
        handler.finish(
            success=True,
            custom_metrics={
                "llm_calls": handler.llm_calls,
                "response_length": len(output)
            }
        )
        return output
    except Exception as e:
        handler.finish(success=False, failure_reason=str(e))
        raise

# Usage
response = handle_support_request(
    "I can't log into my account",
    user_id="user_12345"
)
```
## Supported Models

Cost calculation is built-in for popular models:

| Provider | Models |
|---|---|
| OpenAI | gpt-5, gpt-4.1, gpt-4o, gpt-4o-mini, o3, o4-mini |
| Anthropic | claude-opus-4, claude-sonnet-4, claude-haiku-3.5, claude-3.5-sonnet |
| Google | gemini-2.5-pro, gemini-2.5-flash, gemini-2.0-flash, gemini-1.5-pro |
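The underlying arithmetic is a per-million-token rate applied separately to input and output tokens. A minimal sketch, with placeholder prices (not Sentrial’s actual pricing table; real prices vary by model and provider):

```python
# Placeholder per-million-token prices in USD, for illustration only
PRICES = {
    "gpt-4o": {"input": 2.50, "output": 10.00},
}

def estimate_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Sketch of the usage -> USD arithmetic a cost tracker performs."""
    p = PRICES[model]
    return (prompt_tokens * p["input"] + completion_tokens * p["output"]) / 1_000_000

print(estimate_cost("gpt-4o", 10_000, 2_000))  # 0.045
```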
## Next Steps