What Gets Tracked Automatically

LLM Calls

Every call to GPT-4, Claude, Gemini, etc. with token usage and cost.

Tool Executions

All tool calls with inputs, outputs, and timing.

Agent Reasoning

Chain-of-thought and decision reasoning.

Errors

Tool failures and LLM errors with context.

Installation

pip install sentrial langchain-core

Basic Usage

Add SentrialCallbackHandler to your agent’s callbacks:
from sentrial import SentrialClient
from sentrial.langchain import SentrialCallbackHandler
from langchain.agents import AgentExecutor

# Initialize Sentrial
client = SentrialClient(api_key="sentrial_live_xxx")

# Create session for this agent run
session_id = client.create_session(
    name="Customer Support Request",
    agent_name="support-agent",
    user_id="user_123"
)

# Create callback handler
handler = SentrialCallbackHandler(client, session_id)
handler.set_input("Help me reset my password")

# Add to your agent - that's it!
result = agent_executor.invoke(
    {"input": "Help me reset my password"},
    {"callbacks": [handler]}
)

# Finish session with accumulated metrics
handler.finish(success=True)

# Access usage stats
print(f"Total cost: ${handler.total_cost:.4f}")
print(f"Total tokens: {handler.total_tokens}")
print(f"LLM calls: {handler.llm_calls}")

Handler Options

handler = SentrialCallbackHandler(
    client,                  # SentrialClient instance
    session_id,              # Active session ID
    track_llm_calls=True,    # Track individual LLM calls (default: True)
    verbose=False            # Print tracking info (default: False)
)

handler.set_input()

Set the user’s original query before running the agent. This is stored as the session’s input.
handler.set_input("What's the weather in San Francisco?")

handler.finish()

Complete the session with all accumulated metrics. This is the recommended way to finalize a session — it automatically includes token counts, cost, and duration.
handler.finish(
    success=True,                    # Whether the agent succeeded
    failure_reason=None,             # Reason if success=False
    custom_metrics={"quality": 4.5}  # Optional custom metrics
)
This replaces manually calling client.complete_session() with handler stats — finish() does it all in one call.

Accessing Usage Stats

After your agent runs, access real metrics from the handler:
# Token counts
handler.total_prompt_tokens      # Input tokens across all LLM calls
handler.total_completion_tokens  # Output tokens across all LLM calls
handler.total_tokens             # Total tokens used

# Cost (calculated from actual usage)
handler.total_cost               # Total cost in USD

# Call counts
handler.llm_calls                # Number of LLM API calls

# Duration
handler.duration_ms              # Total duration in milliseconds

# Model info
handler.model_name               # Model used (e.g., "gpt-4o")

# Get everything as a dict
summary = handler.get_usage_summary()
# {
#   "llm_calls": 3,
#   "total_prompt_tokens": 2500,
#   "total_completion_tokens": 800,
#   "total_tokens": 3300,
#   "total_cost": 0.0425,
#   "model": "gpt-4o",
#   "duration_ms": 4521
# }
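Assuming the summary dict has the shape shown above, you can derive per-call figures from it. The helper below is an illustrative sketch, not part of the SDK:

```python
def per_call_metrics(summary: dict) -> dict:
    """Derive average per-call figures from a Sentrial usage summary dict."""
    calls = summary.get("llm_calls", 0) or 1  # guard against division by zero
    return {
        "avg_tokens_per_call": summary["total_tokens"] / calls,
        "avg_cost_per_call": summary["total_cost"] / calls,
        "cost_per_1k_tokens": 1000 * summary["total_cost"] / max(summary["total_tokens"], 1),
    }

# Using the example summary from above
summary = {"llm_calls": 3, "total_tokens": 3300, "total_cost": 0.0425}
metrics = per_call_metrics(summary)
print(f"{metrics['avg_tokens_per_call']:.0f} tokens/call")  # 1100 tokens/call
```

Numbers like cost per 1k tokens are handy for comparing runs across models or prompt versions.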

Completing the Session

Use handler.finish() to complete the session with all accumulated metrics:
handler.set_input(user_query)

try:
    result = agent_executor.invoke(
        {"input": user_query},
        {"callbacks": [handler]}
    )
    handler.finish(success=True)

except Exception as e:
    handler.finish(success=False, failure_reason=str(e))
finish() automatically includes token counts, cost, duration, user input, and assistant output — no need to pass them manually.

Quick Setup with create_agent_with_sentrial()

For the simplest possible integration, use create_agent_with_sentrial(). It handles all LangChain version differences (including LangChain 1.0+ with LangGraph) automatically.
from sentrial import SentrialClient
from sentrial.langchain import create_agent_with_sentrial
from langchain_openai import ChatOpenAI

client = SentrialClient(api_key="sentrial_live_xxx")
llm = ChatOpenAI(model="gpt-4o")

agent = create_agent_with_sentrial(
    llm=llm,
    tools=[search_tool, calculator_tool],
    client=client,
    agent_name="my-agent",
    user_id="user-123",
)

# One call — session creation, tracking, and completion are all automatic
result = agent("What's the weather in San Francisco?")
create_agent_with_sentrial() auto-detects your LangChain version. On LangChain 1.0+ it uses LangGraph’s create_react_agent. On older versions it uses AgentExecutor.

LangChain 1.0+ / LangGraph

LangChain 1.0+ deprecated AgentExecutor in favor of LangGraph. The Sentrial callback handler works with both. For LangGraph agents, pass the handler via config:
from langgraph.prebuilt import create_react_agent

# Create LangGraph agent
agent = create_react_agent(llm, tools)

# Track with Sentrial
handler = SentrialCallbackHandler(client, session_id)
handler.set_input(user_query)

result = agent.invoke(
    {"messages": [("user", user_query)]},
    config={"callbacks": [handler]},
)

handler.finish(success=True)
LangGraph’s create_react_agent never fires on_agent_finish, so handler.finish() uses the last LLM output as the assistant response automatically.

Full Production Example

from sentrial import SentrialClient
from sentrial.langchain import SentrialCallbackHandler
from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, create_react_agent
from langchain.tools import Tool

# Initialize
client = SentrialClient(api_key="sentrial_live_xxx")

# Define tools
tools = [
    Tool(
        name="search_kb",
        description="Search the knowledge base for answers",
        func=lambda q: search_knowledge_base(q)
    ),
    Tool(
        name="create_ticket",
        description="Create a support ticket",
        func=lambda data: create_support_ticket(data)
    )
]

# Create agent (prompt_template is your ReAct prompt, e.g. pulled from the LangChain hub)
llm = ChatOpenAI(model="gpt-4o", temperature=0)
agent = create_react_agent(llm, tools, prompt_template)
executor = AgentExecutor(agent=agent, tools=tools)


def handle_support_request(user_input: str, user_id: str) -> str:
    """Handle a customer support request with full tracking."""

    # Create session
    session_id = client.create_session(
        name=f"Support: {user_input[:40]}...",
        agent_name="customer-support",
        user_id=user_id,
        metadata={"channel": "web_chat"}
    )

    # Create handler with verbose logging in dev
    handler = SentrialCallbackHandler(
        client,
        session_id,
        verbose=True  # Set to False in production
    )
    handler.set_input(user_input)

    try:
        # Run agent
        result = executor.invoke(
            {"input": user_input},
            {"callbacks": [handler]}
        )

        output = result.get("output", "")

        # Complete session — finish() auto-includes tokens, cost, duration
        handler.finish(
            success=True,
            custom_metrics={
                "llm_calls": handler.llm_calls,
                "response_length": len(output)
            }
        )

        return output

    except Exception as e:
        handler.finish(success=False, failure_reason=str(e))
        raise


# Usage
response = handle_support_request(
    "I can't log into my account",
    user_id="user_12345"
)

Supported Models

Cost calculation is built-in for popular models:
| Provider  | Models |
| --------- | ------ |
| OpenAI    | gpt-5, gpt-4.1, gpt-4o, gpt-4o-mini, o3, o4-mini |
| Anthropic | claude-opus-4, claude-sonnet-4, claude-haiku-3.5, claude-3.5-sonnet |
| Google    | gemini-2.5-pro, gemini-2.5-flash, gemini-2.0-flash, gemini-1.5-pro |
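Per-token cost calculation generally follows the pattern sketched below: prompt tokens times the input rate plus completion tokens times the output rate, with rates quoted per million tokens. The rates in this sketch are placeholders for illustration, not Sentrial's actual pricing table:

```python
# Hypothetical per-million-token rates in USD -- illustrative only,
# not the SDK's real pricing data.
RATES = {
    "gpt-4o": {"input": 2.50, "output": 10.00},
    "gpt-4o-mini": {"input": 0.15, "output": 0.60},
}

def estimate_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate USD cost from token counts using per-million-token rates."""
    r = RATES[model]
    return (prompt_tokens * r["input"] + completion_tokens * r["output"]) / 1_000_000

# 2500 input + 800 output tokens on the hypothetical gpt-4o rates
cost = estimate_cost("gpt-4o", 2500, 800)
```

For models the SDK does not recognize, check the reported `handler.total_cost` before relying on it in billing or alerting logic.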

Next Steps