Debugging: Message Audit & Logs

Learn how to debug your AI agent using the Message Audit tool and CLI logs to identify and fix issues quickly.

What is Message Audit?

Message Audit is your debugging console for AI agents. It shows:

  • Every conversation with your agent
  • Full message history for each thread
  • Debug information: tokens, reasoning, timing, errors
  • Trace IDs to follow requests through the system
  • Model responses and tool calls

Think of it as your agent's black box recorder: when something goes wrong, this is where you start.

Accessing Message Audit

  1. Log in to console.flutch.ai
  2. Navigate to Agents → Select your agent
  3. Go to Message Audit tab

URL: https://console.flutch.ai/agents/{agentId}/message-audit

Understanding the Interface

Conversation List View

The main view shows all conversations:

bash
┌─────────────────────────────────────────────────┐
│ Thread ID              User    Time    Messages │
├─────────────────────────────────────────────────┤
│ thread-abc123         john    2m ago    5       │
│ thread-def456         sarah   5m ago    12      │
│ thread-ghi789         mike    10m ago   3       │
└─────────────────────────────────────────────────┘

Columns:

  • Thread ID: Unique conversation identifier
  • User: User who started the conversation (if authenticated)
  • Time: When the last message was sent
  • Messages: Total message count in this thread

Filters:

  • By date range
  • By user ID
  • By error status (show only failed)
  • By thread ID (search)

Conversation Details View

Click any thread to see the full conversation:

bash
┌─────────────────────────────────────────────────┐
│ Thread: thread-abc123                           │
├─────────────────────────────────────────────────┤
│ 👤 User: What are your pricing plans?          │
│    ⏱️  Sent: 2025-01-20 14:30:15               │
├─────────────────────────────────────────────────┤
│ 🤖 Agent: We offer three pricing plans:        │
│    - Basic: $9/month                            │
│    - Pro: $49/month                             │
│    - Enterprise: Custom pricing                 │
│                                                 │
│    ⏱️  Generated: 2025-01-20 14:30:17          │
│    🔍 Trace ID: trace-xyz789                   │
│    💰 Tokens: 245 (prompt: 120, completion: 125)│
│    ⚡ Duration: 1.8s                            │
└─────────────────────────────────────────────────┘

Debug Information Per Message:

  • Trace ID: Unique identifier for this request
  • Token usage: Input and output tokens
  • Duration: Time to generate response
  • Model used: Which LLM model processed this
  • Temperature: Model temperature setting
  • Error details: If message failed, why?

Debugging Common Issues

Issue 1: Agent Not Responding

Symptoms:

  • User sends message, no response
  • The user message appears in the audit, but the agent's response is missing
  • Error indicator on conversation

How to debug:

  1. Open conversation in Message Audit
  2. Check for error message on agent response
  3. Look at trace ID and error details

Common causes:

bash
"API key invalid or expired"
→ Fix: Update API key in agent settings

"Model rate limit exceeded"
→ Fix: Wait or upgrade plan with LLM provider

"Timeout after 30 seconds"
→ Fix: Optimize system prompt, reduce context length

"Tool execution failed: [tool_name]"
→ Fix: Check tool configuration, ensure service is accessible

Issue 2: Wrong Responses

Symptoms:

  • Agent gives incorrect information
  • Agent doesn't follow system prompt
  • Agent ignores context or tools

How to debug:

  1. Check the system prompt being used:

    • Settings → System Prompt
    • Verify it's what you expect
  2. Check model settings:

    • Temperature too high? (> 0.9 = creative but unpredictable)
    • Wrong model? (gpt-3.5 vs gpt-4)
  3. Check conversation context:

    • Is previous context being sent correctly?
    • Are tool results being passed to model?
  4. Look at reasoning chains (if available):

    • What did the model "think" before responding?
    • Did it consider using a tool but decided not to?

Example debug session:

bash
User asked: "What's the status of order #12345?"

Agent responded: "I don't have access to order information."

Debug steps:
1. Check Message Audit → See trace ID: trace-abc123
2. Check tool calls → No tool was called
3. Check system prompt → Missing instruction to use order lookup tool
4. Fix: Update system prompt to mention order tool
5. Test again → Now works correctly

Issue 3: Slow Responses

Symptoms:

  • Users complain about wait time
  • Message Audit shows high duration (> 10s)

How to debug:

  1. Check duration in Message Audit for slow messages
  2. Identify bottleneck:

If model generation is slow (5-10s+):

  • Large context (too many previous messages)
  • Complex system prompt
  • Using a slower model (e.g., gpt-4 instead of gpt-3.5-turbo)

If tool execution is slow:

  • External API timeout
  • Database query taking too long
  • Network issues

Solutions:

  • Reduce the context window (limit message history)
  • Simplify the system prompt
  • Switch to a faster model for simple queries
  • Cache expensive tool calls (see the sketch after this list)
  • Optimize external service calls
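
If an expensive tool call is the bottleneck and its results change rarely, caching it avoids repeating the slow work on every message. Below is a minimal sketch of an in-memory cache with a time-to-live; fetch_order_status is a hypothetical stand-in for your own slow external call, so adapt the names and TTL to your tools.

python
import time

# Minimal in-memory cache with a time-to-live (TTL), keyed by order ID.
# fetch_order_status is a hypothetical stand-in for a slow external API call.
_cache: dict[str, tuple[float, dict]] = {}
CACHE_TTL_SECONDS = 60


def fetch_order_status(order_id: str) -> dict:
    # Stand-in for the expensive call (e.g. an HTTP request to your order system).
    time.sleep(2)
    return {"order_id": order_id, "status": "shipped"}


def cached_order_status(order_id: str) -> dict:
    now = time.time()
    hit = _cache.get(order_id)
    if hit and now - hit[0] < CACHE_TTL_SECONDS:
        return hit[1]  # fresh cached result, skip the slow call
    result = fetch_order_status(order_id)
    _cache[order_id] = (now, result)
    return result

The first lookup still pays the full cost; repeated lookups within the TTL return immediately, which shows up directly as lower tool-execution time in Message Audit.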

Issue 4: Token Usage Too High

Symptoms:

  • Bills are higher than expected
  • Message Audit shows high token counts per message

How to debug:

  1. Open Message Audit
  2. Sort by token usage (highest first)
  3. Identify patterns:
bash
Message with 5000 tokens (cost: $0.05):
- Prompt tokens: 4500
- Completion tokens: 500

Why so high?
→ Full conversation history sent (100 messages)
→ Large system prompt (1000 tokens)
→ Tool descriptions (500 tokens each, 5 tools)

Solutions:

  • Limit conversation history (e.g., last 10 messages only; see the sketch after this list)
  • Shorten the system prompt
  • Remove unused tools from the configuration
  • Use a smaller model for simple queries
  • Implement token-based conversation summarization
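
One way to implement the first and last solutions is to trim (or summarize) the oldest messages before each model call once a rough token budget is exceeded. The sketch below uses a crude four-characters-per-token estimate; it is an illustration, not built-in platform behavior, and exact counts depend on your model's tokenizer.

python
# Rough trimming of conversation history to stay under a token budget.
# The 4-characters-per-token ratio is only an approximation; use your model's
# tokenizer (e.g. tiktoken for OpenAI models) when you need exact counts.
MAX_PROMPT_TOKENS = 2000


def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)


def trim_history(messages: list[dict]) -> list[dict]:
    """Drop the oldest messages until the estimated total fits the budget."""
    trimmed = list(messages)
    while len(trimmed) > 1 and sum(
        estimate_tokens(m["content"]) for m in trimmed
    ) > MAX_PROMPT_TOKENS:
        trimmed.pop(0)  # drop the oldest message first
    return trimmed

Instead of dropping the oldest messages outright, you can replace them with a short model-generated summary to keep long-range context at a fraction of the token cost.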

Issue 5: Agent Stopped Working After Update

Symptoms:

  • Agent worked before, now broken
  • All conversations failing
  • Error messages in Message Audit

How to debug:

  1. Check recent deployments:

    bash
    flutch info <agent-id>
  2. Look at deployment history:

    • What version is currently active?
    • When was it deployed?
    • What changed?
  3. Check Message Audit for first failing message:

    • Compare with last successful message
    • What's different?
  4. Rollback if needed:

    bash
    flutch rollback <agent-id> --to-version 1.0.0
  5. Fix issue in code, redeploy:

    bash
    flutch deploy

Using CLI Logs

For real-time debugging, use CLI logs:

Stream Live Logs

bash
# Follow logs in real-time
flutch logs <agent-id> --follow

# Output:
# [2025-01-20 14:30:15] [INFO] New message received: thread-abc123
# [2025-01-20 14:30:16] [DEBUG] Invoking model: gpt-4
# [2025-01-20 14:30:17] [INFO] Response generated (245 tokens)
# [2025-01-20 14:30:17] [INFO] Message sent to user

View Recent Logs

bash
# Last 100 lines
flutch logs <agent-id> --lines 100

# Last 24 hours
flutch logs <agent-id> --since 24h

# Only errors
flutch logs <agent-id> --level error

Search Logs

bash
# Find specific trace ID
flutch logs <agent-id> --grep "trace-xyz789"

# Find errors related to a tool
flutch logs <agent-id> --grep "weather_tool" --level error

Save Logs for Analysis

bash
# Save to file
flutch logs <agent-id> --lines 1000 > debug.log

# Share with team
cat debug.log | grep ERROR

Advanced Debugging Techniques

Debug with Trace IDs

Every message has a trace ID. Use it to follow a request through the entire system:

  1. User reports issue: "My message at 2:30 PM didn't work"
  2. Find message in Message Audit around that time
  3. Copy trace ID: trace-xyz789
  4. Search backend logs:
    bash
    flutch logs <agent-id> --grep "trace-xyz789"
  5. See full request lifecycle:
    [14:30:15] Request received: trace-xyz789
    [14:30:16] Model invoked: gpt-4, temperature: 0.7
    [14:30:16] Tool called: search_docs, query: "pricing"
    [14:30:16] Tool result: 3 documents found
    [14:30:17] Response generated: 245 tokens
    [14:30:17] Response sent to user
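
If you have already saved logs to a file (see "Save Logs for Analysis" above), you can pull out a single request's lifecycle offline. A minimal sketch, assuming the trace ID appears verbatim on every log line that belongs to the request, as in the excerpt above:

python
import sys


def extract_trace(path: str, trace_id: str) -> list[str]:
    """Return every line of a saved log file that mentions the given trace ID."""
    with open(path, encoding="utf-8") as fh:
        return [line.rstrip() for line in fh if trace_id in line]


if __name__ == "__main__":
    # Usage: python extract_trace.py trace-xyz789
    for line in extract_trace("debug.log", sys.argv[1]):
        print(line)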
    

Compare Working vs Broken

When something breaks:

  1. Find last working conversation in Message Audit
  2. Find first broken conversation
  3. Compare side-by-side:
    • System prompt (same?)
    • Model settings (same?)
    • Tools enabled (same?)
    • Input format (same?)
    • Error messages (what's new?)

Test Locally with Same Input

Reproduce the issue locally by running your agent in development mode and sending the exact message that failed in production, as sketched below.
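
For example, if your agent is a LangGraph-based Python agent (as in the code samples later in this guide), a minimal reproduction script can invoke your compiled graph directly with the failing message. The module path my_agent.graph and the graph object name are assumptions; substitute your own.

python
# Minimal local reproduction sketch, assuming a LangGraph-based Python agent.
# Replace the import with the module where your compiled graph actually lives.
from langchain_core.messages import HumanMessage

from my_agent.graph import graph  # hypothetical module and object name

failing_input = "What's the status of order #12345?"  # copy verbatim from Message Audit

result = graph.invoke(
    {"messages": [HumanMessage(content=failing_input)]},
    config={"configurable": {"thread_id": "local-debug-1"}},
)

print(result["messages"][-1].content)  # compare with the production response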

Check External Services

If using tools that call external APIs:

  1. Verify API keys are valid
  2. Check service status pages
  3. Test API directly:
    bash
    curl -X GET "https://api.external-service.com/status" \
      -H "Authorization: Bearer YOUR_API_KEY"
  4. Check rate limits
  5. Verify network connectivity

Performance Monitoring

Token Usage Dashboard

Message Audit shows token statistics:

Per conversation:

  • Total tokens used
  • Cost estimate
  • Average tokens per message

Per time period:

  • Daily token usage
  • Cost trends
  • Most expensive conversations

Track response times over time:

bash
Average response time:
- Last hour: 1.8s ✅
- Last 24h: 2.1s ✅
- Last 7d: 2.5s ⚠️ (trending up)

If trending up:

  • Check if context window is growing
  • Verify external tool performance
  • Consider model optimization

Debugging Checklist

When something goes wrong, follow this checklist:

  • Check Message Audit for error messages
  • Look at trace ID and full request details
  • Verify API keys are valid
  • Check model settings (correct model, temperature)
  • Review system prompt for issues
  • Verify tools are configured correctly
  • Test external services manually
  • Check token usage for context bloat
  • Compare with last working version
  • Search CLI logs for trace ID
  • Test locally with same input
  • Check recent deployments

Common Error Codes

| Error Code | Meaning | Solution |
|---|---|---|
| AUTH_FAILED | Invalid API key | Update key in settings |
| RATE_LIMIT | Too many requests | Wait or upgrade plan |
| TIMEOUT | Response took > 30s | Optimize prompt or context |
| TOOL_ERROR | Tool execution failed | Check tool configuration |
| INVALID_INPUT | Malformed message | Validate input format |
| MODEL_ERROR | LLM service issue | Check provider status |
| CONTEXT_TOO_LARGE | Too many tokens | Reduce context window |

Best Practices

1. Use Verbose Logging Locally

When developing locally, enable verbose logging in your agent code to see:

  • Every tool call
  • Model reasoning (if available)
  • State changes
  • Timing for each operation
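
How you enable this depends on your framework, but in a Python agent the simplest approach is to raise the root logger to debug level for local runs only. A minimal sketch:

python
import logging

# Turn on DEBUG-level output for local development runs only.
# Keep the level at INFO or higher in production to reduce noise.
logging.basicConfig(
    level=logging.DEBUG,
    format="%(asctime)s [%(levelname)s] %(name)s: %(message)s",
)

logging.getLogger(__name__).debug("Verbose logging enabled")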

2. Add Custom Logging

In your graph nodes:

Python:

python
import logging

logger = logging.getLogger(__name__)

async def my_node(state, config):
    logger.info(f"Processing message: {state['messages'][-1].content}")
    # ... node logic
    logger.debug(f"Generated response: {response}")
    return {"messages": [response]}

TypeScript:

typescript
import {Logger} from '@nestjs/common';

export class MyNode {
    private readonly logger = new Logger(MyNode.name);

    async execute(state: State, config: Config) {
        this.logger.log(`Processing message: ${state.messages[state.messages.length - 1].content}`);
        // ... node logic
        this.logger.debug(`Generated response: ${response}`);
        return {messages: [response]};
    }
}

These logs appear in CLI output when using flutch logs.

3. Use Structured Logging

Log important data in structured format:

python
logger.info("Tool executed", extra={
    "tool_name": "search_docs",
    "query": query,
    "results_count": len(results),
    "duration_ms": duration
})

This makes logs easier to search and analyze. Note that fields passed via `extra` are attached to the log record but only appear in the output if your formatter or handler includes them (for example, a JSON formatter).

4. Monitor Key Metrics

Set up alerts for:

  • Error rate > 5%
  • Average response time > 10s
  • Token usage spike (> 2x normal)
  • Rate limit errors
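
If your monitoring stack does not provide these alerts out of the box, you can approximate the error-rate check with a small script over exported message records. The record shape below (status and duration_s keys) is an illustrative assumption, not a documented export format.

python
# Illustrative error-rate check over a list of message records.
# The record shape (status / duration_s keys) is an assumption, not a documented format.
ERROR_RATE_THRESHOLD = 0.05  # alert above 5%


def error_rate(records: list[dict]) -> float:
    if not records:
        return 0.0
    failed = sum(1 for r in records if r.get("status") == "error")
    return failed / len(records)


records = [
    {"status": "ok", "duration_s": 1.8},
    {"status": "error", "duration_s": 0.4},
    {"status": "ok", "duration_s": 2.1},
]

rate = error_rate(records)
if rate > ERROR_RATE_THRESHOLD:
    print(f"ALERT: error rate {rate:.1%} exceeds {ERROR_RATE_THRESHOLD:.0%}")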

5. Document Known Issues

Keep a runbook of common issues:

markdown
## Issue: Agent forgets context

**Symptoms:** Agent doesn't remember previous messages

**Cause:** State not being passed correctly between nodes

**Fix:** Verify state definition includes messages as `Annotated[list, add_messages]`

**How to test:** Send 3 messages in same thread, verify 3rd references 1st
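
For reference, here is a minimal version of the state definition mentioned in the fix above, assuming a LangGraph-based Python agent:

python
from typing import Annotated

from typing_extensions import TypedDict

from langgraph.graph.message import add_messages


class State(TypedDict):
    # add_messages appends new messages to the existing list instead of
    # replacing it, so earlier turns survive between node executions.
    messages: Annotated[list, add_messages]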

Troubleshooting Tips

"I don't see my conversation in Message Audit"

  • Check if you're looking at the right agent
  • Verify time filter (might be filtered out)
  • Try searching by thread ID
  • Check if conversation was actually created (vs just page loaded)

"Trace ID not found in logs"

  • Logs might be delayed (wait 1-2 minutes)
  • Check that the log level is verbose enough (debug messages won't appear if the level is set to error)
  • Verify you're searching the right agent
  • Try --follow mode to see live logs

"Can't reproduce issue locally"

  • Using different model version?
  • Different API keys? (dev vs prod)
  • Different system prompt? (hardcoded vs from UI)

"Too many logs, can't find issue"

  • Use --grep to filter: flutch logs --grep "ERROR"
  • Search for trace ID: flutch logs --grep "trace-xyz789"
  • Filter by time: flutch logs --since 10m
  • Save to file and use text editor search

Next Steps


Remember: Every bug is an opportunity to add a test case to your acceptance test suite!

Screenshots Needed

TODO: Add screenshots for:

  • Message Audit list view
  • Conversation details modal
  • Debug information panel
  • Error message example
  • CLI logs output