Learn how to fix "context length exceeded" errors in n8n AI workflows by using text chunking, summarization, recursive processing, and token management to handle large inputs efficiently.
When encountering "context length exceeded" errors in n8n AI workflows, you need to implement strategies to manage large amounts of text that exceed the token limits of AI models. These errors typically occur when you're sending too much text to AI nodes like OpenAI or similar services. The most effective solutions include chunking your text into smaller segments, summarizing content before processing, using specific node combinations to handle large inputs, and implementing recursive workflows for processing extensive documents.
Understanding Context Length Exceeded Errors in n8n
Context length exceeded errors occur when you attempt to send more tokens to an AI model than it can process at once. Every AI model has a maximum context window (for example, GPT-3.5 has a 4,096-token limit, while GPT-4 can handle 8,192 or 32,768 tokens depending on the version), and that window covers both your input and the model's response. When the combined total exceeds the limit, the AI node in n8n throws an error and fails to execute.
Step 1: Analyze Your Workflow to Identify the Problem
Before implementing any solution, pinpoint where and why the limit is being exceeded: note which AI node throws the error, how large the text flowing into it is, and what the token limit of the model you're calling is.
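As a quick diagnostic, a Function node placed just before the failing AI node can report the input size and a rough token estimate. This is a minimal sketch assuming your text arrives in a text field; adjust the field name to match your data:
// Report the size and a rough token estimate for each incoming item
return items.map(item => {
  const text = item.json.text || ''; // assumes a 'text' field - adjust as needed
  return {
    json: {
      ...item.json,
      characterCount: text.length,
      estimatedTokens: Math.ceil(text.length / 4) // ~4 characters per token (rough)
    }
  };
});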
Step 2: Implement Text Chunking
The most common solution is to break your text into smaller chunks that fit within the token limits:
// Basic text chunking function
const inputText = items[0].json.text; // Replace with your actual input field
const maxChunkSize = 3000; // Set this lower than your model's token limit (measured in characters, not tokens)
const chunks = [];

// Split by paragraphs first
const paragraphs = inputText.split('\n\n');
let currentChunk = '';

for (const paragraph of paragraphs) {
  // If adding this paragraph would exceed our limit, save the current chunk and start a new one
  if (currentChunk.length + paragraph.length > maxChunkSize && currentChunk.length > 0) {
    chunks.push(currentChunk);
    currentChunk = paragraph;
  } else {
    // Add the paragraph to the current chunk (with a paragraph break if not the first)
    currentChunk = currentChunk.length === 0 ? paragraph : currentChunk + '\n\n' + paragraph;
  }
}

// Add the last chunk if it's not empty
if (currentChunk.length > 0) {
  chunks.push(currentChunk);
}

// Return an array of items, one for each chunk
// (use the map index rather than indexOf, which is slow and wrong for duplicate chunks)
return chunks.map((chunk, index) => ({
  json: {
    text: chunk,
    chunkInfo: `Part ${index + 1} of ${chunks.length}`
  }
}));
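Place this in a Function node directly before your AI node. Because each chunk is returned as its own item, n8n runs the AI node once per chunk automatically. One caveat: a single paragraph longer than maxChunkSize will still come through as an oversized chunk, so split on sentences as well if your data contains very long paragraphs.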
Step 3: Process Chunks Sequentially
After chunking, n8n runs the AI node once for every chunk item, so each chunk is processed individually. Use a Function node after the AI node to merge the responses back together:
// In a Function node after the AI node, combine the per-chunk results
const results = items.map(item => item.json.aiResponse); // Adjust the field name as needed
const combinedResult = results.join('\n\n');

return [{
  json: {
    combinedResponse: combinedResult
  }
}];
Step 4: Implement Smart Text Summarization
If your text is extremely long, consider summarizing it first:
To implement this in n8n, chunk the document first (Step 2), ask the model for a short summary of each chunk, and then run your real analysis on the combined summaries.
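One way to set this up is a Function node that builds a per-chunk summarization prompt before the AI call. This is a minimal sketch assuming chunk items carry a text field; the 150-word target is an arbitrary choice:
// Build a summarization prompt for each chunk item
return items.map(item => ({
  json: {
    ...item.json,
    prompt: `Summarize the following text in at most 150 words, keeping key facts, names, and figures:\n\n${item.json.text}`
  }
}));
Feed the prompt field into your AI node, then merge the summaries with the Function node from Step 3 before making the final analysis call.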
Step 5: Create a Recursive Document Processing Workflow
For very large documents, implement a recursive approach:
The main workflow structure: chunk the document, summarize each chunk, combine the summaries, and then check whether the combined text still exceeds the limit. If it does, feed it back into the chunking step for another pass; if not, continue to the final processing node.
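The piece that makes the loop work is a decision node that measures the combined output and flags whether another pass is needed. A minimal sketch, assuming the merged text arrives in a combinedResponse field and using the rough 4-characters-per-token estimate:
// Decide whether the combined summary needs another pass
const text = items[0].json.combinedResponse; // assumes the field produced in Step 3
const TARGET_TOKENS = 3000; // keep below your model's context limit
const estimatedTokens = Math.ceil(text.length / 4); // rough ~4 chars per token

return [{
  json: {
    text: text,
    estimatedTokens: estimatedTokens,
    needsAnotherPass: estimatedTokens > TARGET_TOKENS
  }
}];
Route its output with an IF node: items where needsAnotherPass is true go back to the chunking node; the rest continue to the final processing step.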
Step 6: Implement Token-Aware Chunking
For more precise handling, implement token counting instead of character counting:
// More accurate token-aware chunking using a simple estimation
// Note: this is an estimation, not exact token counting
const estimateTokens = (text) => {
  // GPT models use roughly 4 characters per token (rough estimate)
  return Math.ceil(text.length / 4);
};

const inputText = items[0].json.text;
const maxTokensPerChunk = 3000; // Set below your model's limit
const chunks = [];

// Split by paragraphs
const paragraphs = inputText.split('\n\n');
let currentChunk = '';
let currentTokenCount = 0;

for (const paragraph of paragraphs) {
  const paragraphTokens = estimateTokens(paragraph);
  // If adding this paragraph would exceed our token limit
  if (currentTokenCount + paragraphTokens > maxTokensPerChunk && currentTokenCount > 0) {
    chunks.push(currentChunk);
    currentChunk = paragraph;
    currentTokenCount = paragraphTokens;
  } else {
    // Add the paragraph to the current chunk
    if (currentChunk.length === 0) {
      currentChunk = paragraph;
    } else {
      currentChunk += '\n\n' + paragraph;
    }
    currentTokenCount += paragraphTokens;
  }
}

// Add the last chunk
if (currentChunk.length > 0) {
  chunks.push(currentChunk);
}

// Return an array of items with chunk info
return chunks.map((chunk, index) => ({
  json: {
    text: chunk,
    chunkNumber: index + 1,
    totalChunks: chunks.length,
    estimatedTokens: estimateTokens(chunk)
  }
}));
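Keep in mind that the 4-characters-per-token ratio is only a rule of thumb for English prose; it drifts noticeably for code, non-English text, and unusual formatting, so leave a 10-20% safety margin below the model's real limit. If your n8n instance permits external npm modules in Function nodes (controlled by the NODE_FUNCTION_ALLOW_EXTERNAL environment variable), a tokenizer package such as js-tiktoken can give exact counts instead of estimates.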
Step 7: Implement Sliding Window Processing
For texts where context between chunks is important, use a sliding window approach:
// Sliding window chunking with overlap
const inputText = items[0].json.text;
const maxChunkSize = 3000;
const overlapSize = 500; // Characters to overlap between chunks
const chunks = [];
let startPos = 0;

while (startPos < inputText.length) {
  let endPos = Math.min(startPos + maxChunkSize, inputText.length);

  // If we're not at the end of the text, try to find a good break point
  if (endPos < inputText.length) {
    // Look for a paragraph break in the overlap region
    const possibleBreak = inputText.lastIndexOf('\n\n', endPos);
    if (possibleBreak > startPos && possibleBreak > endPos - overlapSize) {
      endPos = possibleBreak;
    } else {
      // If no paragraph break, look for a sentence break
      const sentenceBreak = inputText.lastIndexOf('. ', endPos);
      if (sentenceBreak > startPos && sentenceBreak > endPos - overlapSize) {
        endPos = sentenceBreak + 1; // Include the period
      }
    }
  }

  chunks.push(inputText.substring(startPos, endPos));
  // Move the start position for the next chunk, overlapping unless we've reached the end
  startPos = endPos - (endPos < inputText.length ? overlapSize : 0);
}

return chunks.map((chunk, index) => ({
  json: {
    text: chunk,
    chunkNumber: index + 1,
    totalChunks: chunks.length,
    hasOverlap: index > 0
  }
}));
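Because consecutive chunks share up to overlapSize characters, the per-chunk AI responses may repeat themselves at the seams. It usually helps to tell the model about the overlap in your prompt (for example, "this excerpt may repeat the end of the previous one") or to deduplicate when you combine the results.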
Step 8: Create a Progressive Summarization Workflow
For extremely large documents, implement progressive summarization:
Implementation in n8n: run the chunk-and-summarize pass from Steps 2-4 repeatedly, each time summarizing the previous round's summaries, until the whole document fits in a single context window.
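The core of the loop is a collapse step that groups the current round's summaries into batches that each fit a context window, then sends every batch back for another summarization pass. A minimal sketch, assuming each incoming item carries a summary field (the batch size is illustrative):
// Group this round's summaries into batches that each fit one context window
const MAX_BATCH_CHARS = 12000; // ~3000 tokens at ~4 chars per token
const summaries = items.map(item => item.json.summary); // Adjust the field name as needed
const batches = [];
let current = '';

for (const summary of summaries) {
  if (current.length + summary.length > MAX_BATCH_CHARS && current.length > 0) {
    batches.push(current);
    current = summary;
  } else {
    current = current ? current + '\n\n' + summary : summary;
  }
}
if (current.length > 0) {
  batches.push(current);
}

// Each batch becomes one item to summarize again
return batches.map((batch, index) => ({
  json: { text: batch, batchNumber: index + 1, totalBatches: batches.length }
}));
Repeat until only one batch remains; that final batch is a summary of the entire document.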
Step 9: Optimize Prompts to Reduce Token Usage
Minimize the prompt size to leave more tokens for content:
Bad prompt example:
// Too verbose, wastes tokens
"You are an AI assistant tasked with analyzing the following text. I want you to carefully read through every word and provide a detailed analysis covering the main themes, key points, emotional tone, writing style, and potential implications. Please be extremely thorough and make sure to cover every aspect of the text. Do not leave anything out. The text is as follows: {{$input.json.text}}"
Optimized prompt example:
// Concise, preserves tokens for content
"Analyze this text concisely: {{$input.json.text}}"
Step 10: Implement Document Sectioning Based on Content Type
For documents with mixed content types (text, tables, code), process each type separately:
// Separate different content types for specialized processing
const inputText = items[0].json.text;

// Regular expressions to identify different content types
const codeBlockRegex = /```[\s\S]*?```/g; // fenced Markdown code blocks
const tableRegex = /(?:^\|.*\|[ \t]*\r?\n?)+/gm; // consecutive Markdown table rows

// Extract code blocks
const codeBlocks = inputText.match(codeBlockRegex) || [];
let textWithoutCode = inputText.replace(codeBlockRegex, '[CODE_BLOCK_PLACEHOLDER]');

// Extract tables
const tables = textWithoutCode.match(tableRegex) || [];
let textWithoutCodeAndTables = textWithoutCode.replace(tableRegex, '[TABLE_PLACEHOLDER]');

// Split the remaining text into paragraphs
const paragraphs = textWithoutCodeAndTables.split('\n\n').filter(p => p.trim().length > 0);

// Return different content types as separate items
return [
  ...paragraphs.map(p => ({ json: { contentType: 'text', content: p } })),
  ...codeBlocks.map(c => ({ json: { contentType: 'code', content: c } })),
  ...tables.map(t => ({ json: { contentType: 'table', content: t } }))
];
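The placeholder markers stay embedded in the text items, so once each content type has been processed you can stitch the results back together by replacing the [CODE_BLOCK_PLACEHOLDER] and [TABLE_PLACEHOLDER] markers in order.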
Step 11: Build a Multi-Stage Processing Pipeline
Create a multi-stage pipeline for handling large documents:
Implementation in n8n: a routing Function node inspects each document and sends it down the cheapest path that will work, with an IF or Switch node doing the actual branching, as in the sketch below.
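As a sketch of that routing logic, the Function node below tags each document with a route value for a Switch node to branch on. The field names and size thresholds are assumptions to tune for your own model and data:
// Classify each document by estimated size so a Switch node can route it
const SMALL_LIMIT = 3000;   // fits in a single call (estimated tokens)
const MEDIUM_LIMIT = 15000; // simple chunking is enough

return items.map(item => {
  const estimatedTokens = Math.ceil((item.json.text || '').length / 4);
  let route;
  if (estimatedTokens <= SMALL_LIMIT) {
    route = 'direct'; // send straight to the AI node
  } else if (estimatedTokens <= MEDIUM_LIMIT) {
    route = 'chunked'; // Step 2 chunking, then combine
  } else {
    route = 'progressive'; // Step 8 progressive summarization
  }
  return { json: { ...item.json, estimatedTokens, route } };
});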
Step 12: Implement Automated Retry with Reduced Content
Add error handling that automatically reduces content size on failure:
// Function node to handle context length errors with automatic retry
const MAX_RETRIES = 3;
const REDUCTION_FACTOR = 0.75; // Reduce by 25% on each retry

// Check if we're in a retry situation
const input = items[0].json;
const retryCount = input.retryCount || 0;
const originalText = input.originalText || input.text;
let textToProcess = input.text;

// If this is a retry after an error
if (input.error && input.error.includes('context length')) {
  if (retryCount >= MAX_RETRIES) {
    throw new Error('Maximum retries exceeded. Text is still too long.');
  }

  // Calculate the new length and truncate
  const newLength = Math.floor(textToProcess.length * REDUCTION_FACTOR);
  textToProcess = textToProcess.substring(0, newLength);

  return [{
    json: {
      text: textToProcess,
      originalText: originalText,
      retryCount: retryCount + 1,
      reductionApplied: true,
      message: `Retry ${retryCount + 1}: Reduced text from ${originalText.length} to ${textToProcess.length} characters`
    }
  }];
}

// Normal processing (no error)
return [{
  json: {
    text: textToProcess,
    originalText: originalText,
    retryCount: retryCount
  }
}];
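To wire this up, configure the AI node to continue on failure with an error output (n8n's "Continue On Fail" / error-output setting) and route that error branch back into this Function node, so the error message arrives alongside the item and triggers a truncated retry.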
Step 13: Implement Query-Focused Extraction
For query-based workflows, extract only relevant sections before processing:
// Extract only the relevant sections based on the query
const document = items[0].json.document;
const query = items[0].json.query;

// First, split the document into sections
const sections = document.split('\n## ').map((section, index) =>
  index === 0 ? section : '## ' + section
);

// Define a simple relevance checking function (can be enhanced)
const isRelevant = (text, query) => {
  const keywords = query.toLowerCase().split(' ');
  const textLower = text.toLowerCase();
  return keywords.some(keyword => textLower.includes(keyword));
};

// Filter for relevant sections
const relevantSections = sections.filter(section => isRelevant(section, query));

// If we found relevant sections, use them; otherwise fall back to an overview
if (relevantSections.length > 0) {
  return [{
    json: {
      extractedContent: relevantSections.join('\n\n'),
      query: query,
      sectionsExtracted: relevantSections.length,
      totalSections: sections.length
    }
  }];
} else {
  // No relevant sections found - provide a short overview instead
  const firstParagraphs = sections.map(section => {
    const paragraphs = section.split('\n\n');
    return paragraphs[0]; // Just take the first paragraph of each section
  }).join('\n\n');

  return [{
    json: {
      extractedContent: firstParagraphs,
      query: query,
      extractionMethod: 'summary',
      message: 'No directly relevant sections found, providing document overview'
    }
  }];
}
Step 14: Create a Dynamic Token Budget System
Implement a token budget system that allocates tokens between prompt and content:
// Dynamic token budget allocation
const MODEL_MAX_TOKENS = 4096; // Change based on your model
const COMPLETION_TOKENS = 1000; // Reserve this many tokens for the AI response
const PROMPT_TOKENS = 500; // Reserve for your instructions/prompt

// Calculate the tokens available for content
const AVAILABLE_CONTENT_TOKENS = MODEL_MAX_TOKENS - PROMPT_TOKENS - COMPLETION_TOKENS;

// Estimate tokens in the content (rough approximation)
const content = items[0].json.content;
const estimatedTokens = Math.ceil(content.length / 4); // ~4 chars per token

// Check if we need to truncate
if (estimatedTokens > AVAILABLE_CONTENT_TOKENS) {
  // Calculate how much text we can include (with some margin)
  const allowedCharacters = Math.floor(AVAILABLE_CONTENT_TOKENS * 3.8); // Slightly less than 4 to be safe

  // Truncate the content
  const truncatedContent = content.substring(0, allowedCharacters);

  return [{
    json: {
      content: truncatedContent,
      truncated: true,
      originalLength: content.length,
      truncatedLength: truncatedContent.length,
      estimatedOriginalTokens: estimatedTokens,
      tokenBudget: AVAILABLE_CONTENT_TOKENS,
      message: "Content was truncated to fit token limits"
    }
  }];
} else {
  // Content fits within the token budget
  return [{
    json: {
      content: content,
      truncated: false,
      estimatedTokens: estimatedTokens,
      tokenBudget: AVAILABLE_CONTENT_TOKENS,
      tokenUtilization: Math.round((estimatedTokens / AVAILABLE_CONTENT_TOKENS) * 100) + '%'
    }
  }];
}
Step 15: Implement a Comprehensive Document Processing System
Combine multiple techniques for an advanced document processing system:
This advanced approach requires multiple nodes and careful workflow design, but allows processing documents of virtually any length while maintaining context relationships.
Troubleshooting Common Issues
If errors persist after chunking, remember that the limit covers your prompt template and the reserved completion tokens as well as the content, so budget for all three (Step 14). If chunked results read as disjointed, add overlap between chunks (Step 7) or include a one-line recap of the previous chunk in each prompt. If combined summaries are still too long, apply another summarization round (Step 8).
Best Practices for n8n AI Workflows
Keep prompts concise (Step 9), stay well below the model's advertised limit rather than right at it, carry estimated token counts on items as they flow through the workflow so problems surface early, and build in error handling (Step 12) so a single oversized input doesn't fail the whole execution.
Conclusion
Handling context length exceeded errors in n8n AI workflows requires a thoughtful approach to text processing. By implementing techniques like chunking, summarization, sliding windows, and multi-stage processing, you can effectively work with documents of any size. Remember to monitor your token usage, optimize your prompts, and implement proper error handling to build robust AI workflows that can reliably process large volumes of text.