Learn how to fix "context length exceeded" errors in n8n AI workflows by using text chunking, summarization, recursive processing, and token management to handle large inputs efficiently.
When encountering "context length exceeded" errors in n8n AI workflows, you need to implement strategies to manage large amounts of text that exceed the token limits of AI models. These errors typically occur when you're sending too much text to AI nodes like OpenAI or similar services. The most effective solutions include chunking your text into smaller segments, summarizing content before processing, using specific node combinations to handle large inputs, and implementing recursive workflows for processing extensive documents.
Understanding Context Length Exceeded Errors in n8n
Context length exceeded errors occur when you attempt to send more tokens to an AI model than it can process at once. Every AI model has a maximum context window (for example, GPT-3.5 has a 4,096-token limit, while GPT-4 can handle 8,192 or 32,768 tokens depending on the version), and that window covers both your input and the model's response. When the combined total exceeds the limit, the AI node in n8n throws an error and fails to execute.
Step 1: Analyze Your Workflow to Identify the Problem
Before implementing any solution, pinpoint where and why the limit is being exceeded: note which AI node throws the error, how large the text flowing into it is, and what the token limit of the model you're calling is.
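As a quick diagnostic, a Function node placed just before the failing AI node can report the input size and a rough token estimate. This is a minimal sketch assuming your text arrives in a text field; adjust the field name to match your data:
// Report the size and a rough token estimate for each incoming item
return items.map(item => {
  const text = item.json.text || ''; // assumes a 'text' field - adjust as needed
  return {
    json: {
      ...item.json,
      characterCount: text.length,
      estimatedTokens: Math.ceil(text.length / 4) // ~4 characters per token (rough)
    }
  };
});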
Step 2: Implement Text Chunking
The most common solution is to break your text into smaller chunks that fit within the token limits:
// Basic text chunking function
const inputText = items[0].json.text; // Replace with your actual input field
const maxChunkSize = 3000; // Set this lower than your model's token limit (measured in characters, not tokens)
const chunks = [];

// Split by paragraphs first
const paragraphs = inputText.split('\n\n');
let currentChunk = '';

for (const paragraph of paragraphs) {
  // If adding this paragraph would exceed our limit, save the current chunk and start a new one
  if (currentChunk.length + paragraph.length > maxChunkSize && currentChunk.length > 0) {
    chunks.push(currentChunk);
    currentChunk = paragraph;
  } else {
    // Add the paragraph to the current chunk (with a paragraph break if not the first)
    currentChunk = currentChunk.length === 0 ? paragraph : currentChunk + '\n\n' + paragraph;
  }
}

// Add the last chunk if it's not empty
if (currentChunk.length > 0) {
  chunks.push(currentChunk);
}

// Return an array of items, one for each chunk
// (use the map index rather than indexOf, which is slow and wrong for duplicate chunks)
return chunks.map((chunk, index) => ({
  json: {
    text: chunk,
    chunkInfo: `Part ${index + 1} of ${chunks.length}`
  }
}));
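Place this in a Function node directly before your AI node. Because each chunk is returned as its own item, n8n runs the AI node once per chunk automatically. One caveat: a single paragraph longer than maxChunkSize will still come through as an oversized chunk, so split on sentences as well if your data contains very long paragraphs.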
Step 3: Process Chunks Sequentially
After chunking, n8n runs the AI node once for every chunk item, so each chunk is processed individually. Use a Function node after the AI node to merge the responses back together:
// In a Function node after the AI node, combine the per-chunk results
const results = items.map(item => item.json.aiResponse); // Adjust the field name as needed
const combinedResult = results.join('\n\n');

return [{
  json: {
    combinedResponse: combinedResult
  }
}];
Step 4: Implement Smart Text Summarization
If your text is extremely long, consider summarizing it first:
To implement this in n8n, chunk the document first (Step 2), ask the model for a short summary of each chunk, and then run your real analysis on the combined summaries.
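One way to set this up is a Function node that builds a per-chunk summarization prompt before the AI call. This is a minimal sketch assuming chunk items carry a text field; the 150-word target is an arbitrary choice:
// Build a summarization prompt for each chunk item
return items.map(item => ({
  json: {
    ...item.json,
    prompt: `Summarize the following text in at most 150 words, keeping key facts, names, and figures:\n\n${item.json.text}`
  }
}));
Feed the prompt field into your AI node, then merge the summaries with the Function node from Step 3 before making the final analysis call.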
Step 5: Create a Recursive Document Processing Workflow
For very large documents, implement a recursive approach:
The main workflow structure: chunk the document, summarize each chunk, combine the summaries, and then check whether the combined text still exceeds the limit. If it does, feed it back into the chunking step for another pass; if not, continue to the final processing node.
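The piece that makes the loop work is a decision node that measures the combined output and flags whether another pass is needed. A minimal sketch, assuming the merged text arrives in a combinedResponse field and using the rough 4-characters-per-token estimate:
// Decide whether the combined summary needs another pass
const text = items[0].json.combinedResponse; // assumes the field produced in Step 3
const TARGET_TOKENS = 3000; // keep below your model's context limit
const estimatedTokens = Math.ceil(text.length / 4); // rough ~4 chars per token

return [{
  json: {
    text: text,
    estimatedTokens: estimatedTokens,
    needsAnotherPass: estimatedTokens > TARGET_TOKENS
  }
}];
Route its output with an IF node: items where needsAnotherPass is true go back to the chunking node; the rest continue to the final processing step.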
Step 6: Implement Token-Aware Chunking
For more precise handling, implement token counting instead of character counting:
// More accurate token-aware chunking using a simple estimation
// Note: this is an estimation, not exact token counting
const estimateTokens = (text) => {
  // GPT models use roughly 4 characters per token (rough estimate)
  return Math.ceil(text.length / 4);
};

const inputText = items[0].json.text;
const maxTokensPerChunk = 3000; // Set below your model's limit
const chunks = [];

// Split by paragraphs
const paragraphs = inputText.split('\n\n');
let currentChunk = '';
let currentTokenCount = 0;

for (const paragraph of paragraphs) {
  const paragraphTokens = estimateTokens(paragraph);
  // If adding this paragraph would exceed our token limit
  if (currentTokenCount + paragraphTokens > maxTokensPerChunk && currentTokenCount > 0) {
    chunks.push(currentChunk);
    currentChunk = paragraph;
    currentTokenCount = paragraphTokens;
  } else {
    // Add the paragraph to the current chunk
    if (currentChunk.length === 0) {
      currentChunk = paragraph;
    } else {
      currentChunk += '\n\n' + paragraph;
    }
    currentTokenCount += paragraphTokens;
  }
}

// Add the last chunk
if (currentChunk.length > 0) {
  chunks.push(currentChunk);
}

// Return an array of items with chunk info
return chunks.map((chunk, index) => ({
  json: {
    text: chunk,
    chunkNumber: index + 1,
    totalChunks: chunks.length,
    estimatedTokens: estimateTokens(chunk)
  }
}));
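Keep in mind that the 4-characters-per-token ratio is only a rule of thumb for English prose; it drifts noticeably for code, non-English text, and unusual formatting, so leave a 10-20% safety margin below the model's real limit. If your n8n instance permits external npm modules in Function nodes (controlled by the NODE_FUNCTION_ALLOW_EXTERNAL environment variable), a tokenizer package such as js-tiktoken can give exact counts instead of estimates.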
Step 7: Implement Sliding Window Processing
For texts where context between chunks is important, use a sliding window approach:
// Sliding window chunking with overlap
const inputText = items[0].json.text;
const maxChunkSize = 3000;
const overlapSize = 500; // Characters to overlap between chunks
const chunks = [];
let startPos = 0;

while (startPos < inputText.length) {
  let endPos = Math.min(startPos + maxChunkSize, inputText.length);

  // If we're not at the end of the text, try to find a good break point
  if (endPos < inputText.length) {
    // Look for a paragraph break in the overlap region
    const possibleBreak = inputText.lastIndexOf('\n\n', endPos);
    if (possibleBreak > startPos && possibleBreak > endPos - overlapSize) {
      endPos = possibleBreak;
    } else {
      // If no paragraph break, look for a sentence break
      const sentenceBreak = inputText.lastIndexOf('. ', endPos);
      if (sentenceBreak > startPos && sentenceBreak > endPos - overlapSize) {
        endPos = sentenceBreak + 1; // Include the period
      }
    }
  }

  chunks.push(inputText.substring(startPos, endPos));
  // Move the start position for the next chunk, overlapping unless we've reached the end
  startPos = endPos - (endPos < inputText.length ? overlapSize : 0);
}

return chunks.map((chunk, index) => ({
  json: {
    text: chunk,
    chunkNumber: index + 1,
    totalChunks: chunks.length,
    hasOverlap: index > 0
  }
}));
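Because consecutive chunks share up to overlapSize characters, the per-chunk AI responses may repeat themselves at the seams. It usually helps to tell the model about the overlap in your prompt (for example, "this excerpt may repeat the end of the previous one") or to deduplicate when you combine the results.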
Step 8: Create a Progressive Summarization Workflow
For extremely large documents, implement progressive summarization:
Implementation in n8n: run the chunk-and-summarize pass from Steps 2-4 repeatedly, each time summarizing the previous round's summaries, until the whole document fits in a single context window.
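The core of the loop is a collapse step that groups the current round's summaries into batches that each fit a context window, then sends every batch back for another summarization pass. A minimal sketch, assuming each incoming item carries a summary field (the batch size is illustrative):
// Group this round's summaries into batches that each fit one context window
const MAX_BATCH_CHARS = 12000; // ~3000 tokens at ~4 chars per token
const summaries = items.map(item => item.json.summary); // Adjust the field name as needed
const batches = [];
let current = '';

for (const summary of summaries) {
  if (current.length + summary.length > MAX_BATCH_CHARS && current.length > 0) {
    batches.push(current);
    current = summary;
  } else {
    current = current ? current + '\n\n' + summary : summary;
  }
}
if (current.length > 0) {
  batches.push(current);
}

// Each batch becomes one item to summarize again
return batches.map((batch, index) => ({
  json: { text: batch, batchNumber: index + 1, totalBatches: batches.length }
}));
Repeat until only one batch remains; that final batch is a summary of the entire document.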
Step 9: Optimize Prompts to Reduce Token Usage
Minimize the prompt size to leave more tokens for content:
Bad prompt example:
// Too verbose, wastes tokens
"You are an AI assistant tasked with analyzing the following text. I want you to carefully read through every word and provide a detailed analysis covering the main themes, key points, emotional tone, writing style, and potential implications. Please be extremely thorough and make sure to cover every aspect of the text. Do not leave anything out. The text is as follows: {{$input.json.text}}"
Optimized prompt example:
// Concise, preserves tokens for content
"Analyze this text concisely: {{$input.json.text}}"
Step 10: Implement Document Sectioning Based on Content Type
For documents with mixed content types (text, tables, code), process each type separately:
// Separate different content types for specialized processing
const inputText = items[0].json.text;

// Regular expressions to identify different content types
const codeBlockRegex = /```[\s\S]*?```/g; // fenced Markdown code blocks
const tableRegex = /(?:^\|.*\|[ \t]*\r?\n?)+/gm; // consecutive Markdown table rows

// Extract code blocks
const codeBlocks = inputText.match(codeBlockRegex) || [];
let textWithoutCode = inputText.replace(codeBlockRegex, '[CODE_BLOCK_PLACEHOLDER]');

// Extract tables
const tables = textWithoutCode.match(tableRegex) || [];
let textWithoutCodeAndTables = textWithoutCode.replace(tableRegex, '[TABLE_PLACEHOLDER]');

// Split the remaining text into paragraphs
const paragraphs = textWithoutCodeAndTables.split('\n\n').filter(p => p.trim().length > 0);

// Return different content types as separate items
return [
  ...paragraphs.map(p => ({ json: { contentType: 'text', content: p } })),
  ...codeBlocks.map(c => ({ json: { contentType: 'code', content: c } })),
  ...tables.map(t => ({ json: { contentType: 'table', content: t } }))
];
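The placeholder markers stay embedded in the text items, so once each content type has been processed you can stitch the results back together by replacing the [CODE_BLOCK_PLACEHOLDER] and [TABLE_PLACEHOLDER] markers in order.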
Step 11: Build a Multi-Stage Processing Pipeline
Create a multi-stage pipeline for handling large documents:
Implementation in n8n: a routing Function node inspects each document and sends it down the cheapest path that will work, with an IF or Switch node doing the actual branching, as in the sketch below.
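As a sketch of that routing logic, the Function node below tags each document with a route value for a Switch node to branch on. The field names and size thresholds are assumptions to tune for your own model and data:
// Classify each document by estimated size so a Switch node can route it
const SMALL_LIMIT = 3000;   // fits in a single call (estimated tokens)
const MEDIUM_LIMIT = 15000; // simple chunking is enough

return items.map(item => {
  const estimatedTokens = Math.ceil((item.json.text || '').length / 4);
  let route;
  if (estimatedTokens <= SMALL_LIMIT) {
    route = 'direct'; // send straight to the AI node
  } else if (estimatedTokens <= MEDIUM_LIMIT) {
    route = 'chunked'; // Step 2 chunking, then combine
  } else {
    route = 'progressive'; // Step 8 progressive summarization
  }
  return { json: { ...item.json, estimatedTokens, route } };
});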
Step 12: Implement Automated Retry with Reduced Content
Add error handling that automatically reduces content size on failure:
// Function node to handle context length errors with automatic retry
const MAX_RETRIES = 3;
const REDUCTION_FACTOR = 0.75; // Reduce by 25% on each retry

// Check if we're in a retry situation
const input = items[0].json;
const retryCount = input.retryCount || 0;
const originalText = input.originalText || input.text;
let textToProcess = input.text;

// If this is a retry after an error
if (input.error && input.error.includes('context length')) {
  if (retryCount >= MAX_RETRIES) {
    throw new Error('Maximum retries exceeded. Text is still too long.');
  }

  // Calculate the new length and truncate
  const newLength = Math.floor(textToProcess.length * REDUCTION_FACTOR);
  textToProcess = textToProcess.substring(0, newLength);

  return [{
    json: {
      text: textToProcess,
      originalText: originalText,
      retryCount: retryCount + 1,
      reductionApplied: true,
      message: `Retry ${retryCount + 1}: Reduced text from ${originalText.length} to ${textToProcess.length} characters`
    }
  }];
}

// Normal processing (no error)
return [{
  json: {
    text: textToProcess,
    originalText: originalText,
    retryCount: retryCount
  }
}];
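To wire this up, configure the AI node to continue on failure with an error output (n8n's "Continue On Fail" / error-output setting) and route that error branch back into this Function node, so the error message arrives alongside the item and triggers a truncated retry.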
Step 13: Implement Query-Focused Extraction
For query-based workflows, extract only relevant sections before processing:
// Extract only the relevant sections based on the query
const document = items[0].json.document;
const query = items[0].json.query;

// First, split the document into sections
const sections = document.split('\n## ').map((section, index) =>
  index === 0 ? section : '## ' + section
);

// Define a simple relevance checking function (can be enhanced)
const isRelevant = (text, query) => {
  const keywords = query.toLowerCase().split(' ');
  const textLower = text.toLowerCase();
  return keywords.some(keyword => textLower.includes(keyword));
};

// Filter for relevant sections
const relevantSections = sections.filter(section => isRelevant(section, query));

// If we found relevant sections, use them; otherwise fall back to an overview
if (relevantSections.length > 0) {
  return [{
    json: {
      extractedContent: relevantSections.join('\n\n'),
      query: query,
      sectionsExtracted: relevantSections.length,
      totalSections: sections.length
    }
  }];
} else {
  // No relevant sections found - provide a short overview instead
  const firstParagraphs = sections.map(section => {
    const paragraphs = section.split('\n\n');
    return paragraphs[0]; // Just take the first paragraph of each section
  }).join('\n\n');

  return [{
    json: {
      extractedContent: firstParagraphs,
      query: query,
      extractionMethod: 'summary',
      message: 'No directly relevant sections found, providing document overview'
    }
  }];
}
Step 14: Create a Dynamic Token Budget System
Implement a token budget system that allocates tokens between prompt and content:
// Dynamic token budget allocation
const MODEL_MAX_TOKENS = 4096; // Change based on your model
const COMPLETION_TOKENS = 1000; // Reserve this many tokens for the AI response
const PROMPT_TOKENS = 500; // Reserve for your instructions/prompt

// Calculate the tokens available for content
const AVAILABLE_CONTENT_TOKENS = MODEL_MAX_TOKENS - PROMPT_TOKENS - COMPLETION_TOKENS;

// Estimate tokens in the content (rough approximation)
const content = items[0].json.content;
const estimatedTokens = Math.ceil(content.length / 4); // ~4 chars per token

// Check if we need to truncate
if (estimatedTokens > AVAILABLE_CONTENT_TOKENS) {
  // Calculate how much text we can include (with some margin)
  const allowedCharacters = Math.floor(AVAILABLE_CONTENT_TOKENS * 3.8); // Slightly less than 4 to be safe

  // Truncate the content
  const truncatedContent = content.substring(0, allowedCharacters);

  return [{
    json: {
      content: truncatedContent,
      truncated: true,
      originalLength: content.length,
      truncatedLength: truncatedContent.length,
      estimatedOriginalTokens: estimatedTokens,
      tokenBudget: AVAILABLE_CONTENT_TOKENS,
      message: "Content was truncated to fit token limits"
    }
  }];
} else {
  // Content fits within the token budget
  return [{
    json: {
      content: content,
      truncated: false,
      estimatedTokens: estimatedTokens,
      tokenBudget: AVAILABLE_CONTENT_TOKENS,
      tokenUtilization: Math.round((estimatedTokens / AVAILABLE_CONTENT_TOKENS) * 100) + '%'
    }
  }];
}
Step 15: Implement a Comprehensive Document Processing System
Combine multiple techniques for an advanced document processing system:
This advanced approach requires multiple nodes and careful workflow design, but allows processing documents of virtually any length while maintaining context relationships.
Troubleshooting Common Issues
If errors persist after chunking, remember that the limit covers your prompt template and the reserved completion tokens as well as the content, so budget for all three (Step 14). If chunked results read as disjointed, add overlap between chunks (Step 7) or include a one-line recap of the previous chunk in each prompt. If combined summaries are still too long, apply another summarization round (Step 8).
Best Practices for n8n AI Workflows
Keep prompts concise (Step 9), stay well below the model's advertised limit rather than right at it, carry estimated token counts on items as they flow through the workflow so problems surface early, and build in error handling (Step 12) so a single oversized input doesn't fail the whole execution.
Conclusion
Handling context length exceeded errors in n8n AI workflows requires a thoughtful approach to text processing. By implementing techniques like chunking, summarization, sliding windows, and multi-stage processing, you can effectively work with documents of any size. Remember to monitor your token usage, optimize your prompts, and implement proper error handling to build robust AI workflows that can reliably process large volumes of text.