Middleware: Summarization

Overview

The Summarization middleware automatically compresses conversation history when the token count exceeds a configured threshold. This helps maintain context continuity in long conversations while staying within the model’s token limits.

💡 This middleware was introduced in v0.8.0.Beta.

Quick Start

import (
    "context"

    "github.com/cloudwego/eino/adk"
    "github.com/cloudwego/eino/adk/middlewares/summarization"
)

// Create middleware with minimal configuration
mw, err := summarization.New(ctx, &summarization.Config{
    Model: yourChatModel,  // Required: model used for generating summaries
})
if err != nil {
    // Handle error
}

// Use with ChatModelAgent
agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{
    Model:       yourChatModel,
    Middlewares: []adk.ChatModelAgentMiddleware{mw},
})
if err != nil {
    // Handle error
}

Configuration Options

Field	Type	Required	Default	Description
Model	model.BaseChatModel	Yes		Chat model used for generating summaries
ModelOptions	[]model.Option	No		Options passed to the model when generating summaries
TokenCounter	TokenCounterFunc	No	~4 chars/token	Custom token counting function
Trigger	*TriggerCondition	No	190,000 tokens	Condition to trigger summarization
Instruction	string	No	Built-in prompt	Custom summarization instruction
TranscriptFilePath	string	No		Full conversation transcript file path
Prepare	PrepareFunc	No		Custom preprocessing function before summary generation
Finalize	FinalizeFunc	No		Custom post-processing function for final messages
Callback	CallbackFunc	No		Called after Finalize to observe state changes (read-only)
EmitInternalEvents	bool	No	false	Whether to emit internal events
PreserveUserMessages	*PreserveUserMessages	No	Enabled: true	Whether to preserve original user messages in summary

TriggerCondition Structure

type TriggerCondition struct {
    // ContextTokens triggers summarization when total token count exceeds this threshold
    ContextTokens int
}

PreserveUserMessages Structure

type PreserveUserMessages struct {
    // Enabled controls whether original user messages are preserved
    Enabled bool

    // MaxTokens is the token budget for preserved user messages.
    // Only the most recent user messages are kept, until this limit is reached.
    // Defaults to 1/3 of TriggerCondition.ContextTokens.
    MaxTokens int
}
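The selection rule described in the MaxTokens comment can be sketched as follows. This is an illustration only, not the middleware's actual implementation: the `message` type, `approxTokens`, and `recentUserMessages` are hypothetical names, and the sketch assumes the ~4 chars/token estimate listed as the default TokenCounter.

```go
package main

import "fmt"

// message is a hypothetical stand-in for a chat message; it is not an eino type.
type message struct {
	Role    string
	Content string
}

// approxTokens estimates tokens as ceil(runes / 4), mirroring the documented
// "~4 chars/token" default. The real default counter may differ in detail.
func approxTokens(m message) int {
	return (len([]rune(m.Content)) + 3) / 4
}

// recentUserMessages walks the history from newest to oldest and keeps user
// messages until the next one would exceed maxTokens. The result is returned
// in chronological order.
func recentUserMessages(history []message, maxTokens int) []message {
	var kept []message
	used := 0
	for i := len(history) - 1; i >= 0; i-- {
		m := history[i]
		if m.Role != "user" {
			continue
		}
		cost := approxTokens(m)
		if used+cost > maxTokens {
			break
		}
		used += cost
		kept = append(kept, m)
	}
	// Reverse so the oldest preserved message comes first.
	for l, r := 0, len(kept)-1; l < r; l, r = l+1, r-1 {
		kept[l], kept[r] = kept[r], kept[l]
	}
	return kept
}

func main() {
	contextTokens := 190000        // documented default trigger
	maxTokens := contextTokens / 3 // documented default budget
	history := []message{
		{Role: "user", Content: "What is Eino?"},
		{Role: "assistant", Content: "A Go framework for LLM applications."},
		{Role: "user", Content: "Show me the summarization middleware."},
	}
	fmt.Println(maxTokens, len(recentUserMessages(history, maxTokens)))
}
```

With the default trigger of 190,000 tokens, the default budget works out to 63,333 tokens of user messages.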

Configuration Examples

Custom Token Threshold

mw, err := summarization.New(ctx, &summarization.Config{
    Model: yourChatModel,
    Trigger: &summarization.TriggerCondition{
        ContextTokens: 100000,  // Trigger at 100k tokens
    },
})

Custom Token Counter

mw, err := summarization.New(ctx, &summarization.Config{
    Model: yourChatModel,
    TokenCounter: func(ctx context.Context, input *summarization.TokenCounterInput) (int, error) {
        // Use your tokenizer
        return yourTokenizer.Count(input.Messages)
    },
})
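For comparison, the default behavior listed in the options table (~4 chars/token) can be approximated with a few lines of Go. The sketch below is an assumption about what such a heuristic looks like, not the library's code; `estimateTokens` is a hypothetical helper that works on plain strings rather than on `TokenCounterInput`.

```go
package main

import "fmt"

// estimateTokens is a hypothetical helper, not part of the eino API.
// It approximates the token count as ceil(total runes / 4).
func estimateTokens(texts ...string) int {
	total := 0
	for _, t := range texts {
		// Count runes rather than bytes so multi-byte characters count once.
		total += len([]rune(t))
	}
	return (total + 3) / 4
}

func main() {
	// 11 characters, so the estimate is ceil(11 / 4) = 3.
	fmt.Println(estimateTokens("hello world"))
}
```

A heuristic like this is cheap but can drift noticeably from the real tokenizer, especially for non-English text or code, which is why a model-specific counter is preferable when accuracy matters.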

Set Transcript File Path

mw, err := summarization.New(ctx, &summarization.Config{
    Model:              yourChatModel,
    TranscriptFilePath: "/path/to/transcript.txt",
})

Custom Finalize Function

mw, err := summarization.New(ctx, &summarization.Config{
    Model: yourChatModel,
    Finalize: func(ctx context.Context, originalMessages []adk.Message, summary adk.Message) ([]adk.Message, error) {
        // Custom logic to build final messages
        return []adk.Message{
            schema.SystemMessage("Your system prompt"),
            summary,
        }, nil
    },
})

Use Callback to Observe or Store State Changes

mw, err := summarization.New(ctx, &summarization.Config{
    Model: yourChatModel,
    Callback: func(ctx context.Context, before, after adk.ChatModelAgentState) error {
        log.Printf("Summarization completed: %d messages -> %d messages", 
            len(before.Messages), len(after.Messages))
        return nil
    },
})

Control User Message Preservation

mw, err := summarization.New(ctx, &summarization.Config{
    Model: yourChatModel,
    PreserveUserMessages: &summarization.PreserveUserMessages{
        Enabled:   true,
        MaxTokens: 50000, // Preserve up to 50k tokens of user messages
    },
})

How It Works

flowchart TD
    A[BeforeModelRewriteState] --> B{Token count exceeds threshold?}
    B -->|No| C[Return original state]
    B -->|Yes| D[Emit BeforeSummary event]
    D --> E{Has custom Prepare?}
    E -->|Yes| F[Call Prepare]
    E -->|No| G[Call model to generate summary]
    F --> G
    G --> H{Has custom Finalize?}
    H -->|Yes| I[Call Finalize]
    H -->|No| L{Has custom Callback?}
    I --> L
    L -->|Yes| M[Call Callback]
    L -->|No| J[Emit AfterSummary event]
    M --> J
    J --> K[Return new state]

    style A fill:#e3f2fd
    style G fill:#fff3e0
    style D fill:#e8f5e9
    style J fill:#e8f5e9
    style K fill:#c8e6c9
    style C fill:#f5f5f5
    style M fill:#fce4ec
    style F fill:#fff3e0
    style I fill:#fff3e0
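The control flow in the diagram can be read as the following Go sketch. All types and names here (`state`, `hooks`, `rewrite`) are hypothetical stand-ins rather than eino types, the hook signatures are simplified, and what Prepare receives and returns is an assumption made for illustration.

```go
package main

import "fmt"

// state is a simplified stand-in for the agent state.
type state struct{ Messages []string }

// hooks bundles the configurable steps shown in the flowchart.
type hooks struct {
	Threshold int
	Count     func(state) int                        // token counter
	Prepare   func(state) state                      // optional preprocessing
	Summarize func(state) string                     // stands in for the model call
	Finalize  func(orig state, summary string) state // optional post-processing
	Callback  func(before, after state)              // optional, read-only observer
	Emit      func(event string)                     // optional event sink
}

// rewrite follows the diagram: check the threshold, emit BeforeSummary,
// run Prepare, summarize, run Finalize, run Callback, emit AfterSummary.
func rewrite(s state, h hooks) state {
	if h.Count(s) <= h.Threshold {
		return s // below the threshold: return the original state
	}
	emit := func(e string) {
		if h.Emit != nil {
			h.Emit(e)
		}
	}
	emit("BeforeSummary")
	input := s
	if h.Prepare != nil {
		input = h.Prepare(s)
	}
	summary := h.Summarize(input)
	after := state{Messages: []string{summary}}
	if h.Finalize != nil {
		after = h.Finalize(s, summary)
	}
	if h.Callback != nil {
		h.Callback(s, after)
	}
	emit("AfterSummary")
	return after
}

func main() {
	h := hooks{
		Threshold: 2,
		Count:     func(s state) int { return len(s.Messages) },
		Summarize: func(s state) string { return "summary of earlier turns" },
		Emit:      func(e string) { fmt.Println("event:", e) },
	}
	out := rewrite(state{Messages: []string{"a", "b", "c"}}, h)
	fmt.Println(out.Messages)
}
```

Note that Callback runs after Finalize and before the AfterSummary event, which matches the ordering in the diagram.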

Internal Events

When EmitInternalEvents is set to true, the middleware emits events at key points:

Event Type	Trigger Timing	Carried Data
ActionTypeBeforeSummary	Before generating summary	Original message list
ActionTypeAfterSummary	After completing summary	Final message list

Usage Example

mw, err := summarization.New(ctx, &summarization.Config{
    Model:              yourChatModel,
    EmitInternalEvents: true,
})

// Listen for events in your event handler

Best Practices

Set TranscriptFilePath: Always provide a conversation transcript file path so the model can reference the original conversation when needed.
Adjust Token Threshold: Set Trigger.ContextTokens according to the model's context window size. A value of 80-90% of the model's limit is generally recommended.
Custom Token Counter: In production environments, implement a custom TokenCounter that matches the model's tokenizer so that counts are accurate.
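As a worked example of the threshold recommendation, the snippet below derives a trigger value from a context window size. `triggerThreshold` is a hypothetical helper, and the 128,000-token window is an assumed figure used only for illustration; substitute the limit of the model you actually use.

```go
package main

import "fmt"

// triggerThreshold returns the given percentage of a context window,
// using integer arithmetic to avoid floating-point rounding.
func triggerThreshold(contextWindow, percent int) int {
	return contextWindow * percent / 100
}

func main() {
	// Assumed 128,000-token window at 85%: 128000 * 85 / 100 = 108800.
	fmt.Println(triggerThreshold(128000, 85))
}
```

The result would then be passed as ContextTokens in the Trigger configuration.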

Last modified March 2, 2026: feat: sync English translations for middleware and release notes (49774c94)