How to handle “context length exceeded” errors in n8n AI workflows?

Learn how to fix "context length exceeded" errors in n8n AI workflows by using text chunking, summarization, recursive processing, and token management to handle large inputs efficiently.

Matt Graham, CEO of Rapid Developers


When encountering "context length exceeded" errors in n8n AI workflows, you need to implement strategies to manage large amounts of text that exceed the token limits of AI models. These errors typically occur when you're sending too much text to AI nodes like OpenAI or similar services. The most effective solutions include chunking your text into smaller segments, summarizing content before processing, using specific node combinations to handle large inputs, and implementing recursive workflows for processing extensive documents.

 

Understanding Context Length Exceeded Errors in n8n

 

Context length exceeded errors occur when you attempt to send more tokens to an AI model than it can process at once. Every AI model has a maximum token limit (for example, GPT-3.5 has a 4,096 token limit, while GPT-4 can handle up to 8,192 or 32,768 tokens depending on the version). When your input exceeds this limit, the AI node in n8n will throw an error and fail to execute.

 

Step 1: Analyze Your Workflow to Identify the Problem

 

Before implementing any solution, you need to understand where and why the context length is being exceeded:

  • Open your n8n workflow and identify the AI node where the error occurs.
  • Check the input data being passed to this node by adding a "Debug" node before it (a minimal inspection sketch using a Function node follows this list).
  • Look at the error message details, which usually indicate the token count and the maximum allowed.
  • Determine if the issue is with the prompt, the content being processed, or the combination of both.
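
For a quick look at the data flowing into the AI node, a simple pass-through Function node like the sketch below can attach a character count and a rough token estimate to each item. It assumes the text lives in a "text" field; adjust the field name to your workflow:

// Input inspection sketch: pass items through unchanged while attaching
// size metadata, so you can see which items are likely to exceed the limit.
// Assumes the text to analyze is in the "text" field.
return items.map(item => {
  const text = item.json.text || '';
  return {
    json: {
      ...item.json,
      characterCount: text.length,
      estimatedTokens: Math.ceil(text.length / 4) // ~4 characters per token (rough estimate)
    }
  };
});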

 

Step 2: Implement Text Chunking

 

The most common solution is to break your text into smaller chunks that fit within the token limits:

  • Add a "Function" node before your AI node.
  • Use JavaScript to split the text into manageable chunks.

// Basic text chunking function
const inputText = items[0].json.text; // Replace with your actual input field
const maxChunkSize = 3000; // Set this lower than your model's token limit (in characters, not exactly tokens)
const chunks = [];

// Split by paragraphs first
const paragraphs = inputText.split('\n\n');
let currentChunk = '';

for (const paragraph of paragraphs) {
  // If adding this paragraph would exceed our limit, save current chunk and start a new one
  if (currentChunk.length + paragraph.length > maxChunkSize && currentChunk.length > 0) {
    chunks.push(currentChunk);
    currentChunk = paragraph;
  } else {
    // Add paragraph to current chunk (with paragraph break if not the first paragraph)
    currentChunk = currentChunk.length === 0 ? paragraph : currentChunk + '\n\n' + paragraph;
  }
}

// Add the last chunk if it's not empty
if (currentChunk.length > 0) {
  chunks.push(currentChunk);
}

// Return an array of items, one for each chunk
return chunks.map((chunk, index) => ({
  json: {
    text: chunk,
    chunkInfo: `Part ${index + 1} of ${chunks.length}`
  }
}));

 

Step 3: Process Chunks Sequentially

 

After chunking, you'll need to process each chunk individually:

  • Connect the Function node to your AI node.
  • Add a "Split In Batches" (Loop Over Items) node between the Function node and the AI node if you want to send chunks one at a time or in small batches; by default, n8n runs the AI node once for each incoming item.
  • Use a "Merge" node after the AI node to combine results from all chunks.

// In a Function node after the AI node to combine results
const results = items.map(item => item.json.aiResponse); // Adjust field name as needed
const combinedResult = results.join('\n\n');

return [{
  json: {
    combinedResponse: combinedResult
  }
}];

 

Step 4: Implement Smart Text Summarization

 

If your text is extremely long, consider summarizing it first:

  • Add an AI node specifically for summarization before your main AI processing.
  • Configure it to create a concise summary of the larger text.

To implement this in n8n:

  1. Add an OpenAI node before your main processing.
  2. Set the prompt to request summarization: "Summarize the following text in a concise manner while preserving key information: {{ $json.text }}"
  3. Use this summary as input for your main AI processing (a small prompt-preparation sketch follows this list).
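
If you prefer to keep the instruction text out of the OpenAI node itself, a Function node can assemble the summarization prompt per item so the AI node only needs to reference {{ $json.prompt }}. This is a minimal sketch that assumes each incoming item carries its text in a "text" field (for example, the chunks from Step 2):

// Prompt-preparation sketch: build a per-item summarization prompt.
// Assumes each item has a "text" field; the downstream OpenAI node
// can then use {{ $json.prompt }} as its input.
return items.map(item => ({
  json: {
    ...item.json,
    prompt: 'Summarize the following text in a concise manner while preserving key information:\n\n' + item.json.text
  }
}));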

 

Step 5: Create a Recursive Document Processing Workflow

 

For very large documents, implement a recursive approach:

  • Create a subworkflow that processes a single chunk.
  • Use the "Execute Workflow" node to call this subworkflow for each chunk.
  • Compile the results at the end.

The main workflow structure:

  1. Start node → Load document → Chunk document → Loop through chunks
  2. For each chunk: Execute Subworkflow → Store result
  3. After all chunks: Merge results → Final processing (an ordering-aware merge sketch follows this outline)
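
Because results can come back from the subworkflow executions in a different order than the chunks were created, the final merge step should sort before joining. This is a minimal sketch that assumes each result item still carries the chunkNumber assigned during chunking and that the subworkflow stored its output in an "aiResponse" field; adjust the names to your workflow:

// Ordering-aware merge of subworkflow results.
// Assumes each item has "chunkNumber" (from the chunking step) and
// "aiResponse" (written by the subworkflow); adjust field names as needed.
const sorted = items
  .slice()
  .sort((a, b) => (a.json.chunkNumber || 0) - (b.json.chunkNumber || 0));

return [{
  json: {
    combinedResponse: sorted.map(item => item.json.aiResponse).join('\n\n'),
    chunksProcessed: sorted.length
  }
}];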

 

Step 6: Implement Token-Aware Chunking

 

For more precise handling, implement token counting instead of character counting:


// More accurate token-aware chunking using a simple estimation
// Note: This is an estimation, not exact token counting
const estimateTokens = (text) => {
  // GPT models use roughly 4 characters per token (rough estimate)
  return Math.ceil(text.length / 4);
};

const inputText = items[0].json.text;
const maxTokensPerChunk = 3000; // Set below your model's limit
const chunks = [];

// Split by paragraphs
const paragraphs = inputText.split('\n\n');
let currentChunk = '';
let currentTokenCount = 0;

for (const paragraph of paragraphs) {
  const paragraphTokens = estimateTokens(paragraph);
  
  // If adding this paragraph would exceed our token limit
  if (currentTokenCount + paragraphTokens > maxTokensPerChunk && currentTokenCount > 0) {
    chunks.push(currentChunk);
    currentChunk = paragraph;
    currentTokenCount = paragraphTokens;
  } else {
    // Add paragraph to current chunk
    if (currentChunk.length === 0) {
      currentChunk = paragraph;
    } else {
      currentChunk += '\n\n' + paragraph;
    }
    currentTokenCount += paragraphTokens;
  }
}

// Add the last chunk
if (currentChunk.length > 0) {
  chunks.push(currentChunk);
}

// Return an array of items with chunk info
return chunks.map((chunk, index) => ({
  json: {
    text: chunk,
    chunkNumber: index + 1,
    totalChunks: chunks.length,
    estimatedTokens: estimateTokens(chunk)
  }
}));

 

Step 7: Implement Sliding Window Processing

 

For texts where context between chunks is important, use a sliding window approach:


// Sliding window chunking with overlap
const inputText = items[0].json.text;
const maxChunkSize = 3000;
const overlapSize = 500; // Characters to overlap between chunks
const chunks = [];

let startPos = 0;
while (startPos < inputText.length) {
  let endPos = Math.min(startPos + maxChunkSize, inputText.length);
  
  // If we're not at the end of the text, try to find a good break point
  if (endPos < inputText.length) {
    // Look for a paragraph break in the overlap region
    const possibleBreak = inputText.lastIndexOf('\n\n', endPos);
    if (possibleBreak > startPos && possibleBreak > endPos - overlapSize) {
      endPos = possibleBreak;
    } else {
      // If no paragraph break, look for a sentence break
      const sentenceBreak = inputText.lastIndexOf('. ', endPos);
      if (sentenceBreak > startPos && sentenceBreak > endPos - overlapSize) {
        endPos = sentenceBreak + 1; // Include the period
      }
    }
  }
  
  chunks.push(inputText.substring(startPos, endPos));
  
  // Move start position for next chunk, with potential overlap
  startPos = endPos - (endPos < inputText.length ? overlapSize : 0);
}

return chunks.map((chunk, index) => ({
  json: {
    text: chunk,
    chunkNumber: index + 1,
    totalChunks: chunks.length,
    hasOverlap: index > 0
  }
}));

 

Step 8: Create a Progressive Summarization Workflow

 

For extremely large documents, implement progressive summarization:

  1. Split the document into major sections.
  2. Summarize each section independently.
  3. Combine the summaries and create a final meta-summary.

Implementation in n8n:

  1. Split the document into logical sections (chapters, sections, etc.); a heading-based splitting sketch follows this list
  2. For each section: Send to AI for summarization
  3. Combine all summaries
  4. Send the combined summaries to AI for a meta-summary
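
For step 1, a Function node can split on headings so each section keeps its title. This sketch assumes Markdown-style "## " headings and a "text" field; adapt the pattern to however your documents mark section boundaries:

// Heading-based section splitting sketch.
// Assumes Markdown-style "## " headings in a "text" field; each section
// keeps its heading so the summarization step has context.
const inputText = items[0].json.text;

const sections = inputText
  .split(/\n(?=## )/)          // split before each level-2 heading
  .map(section => section.trim())
  .filter(section => section.length > 0);

return sections.map((section, index) => ({
  json: {
    text: section,
    sectionNumber: index + 1,
    totalSections: sections.length
  }
}));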

 

Step 9: Optimize Prompts to Reduce Token Usage

 

Minimize the prompt size to leave more tokens for content:

  • Use concise, clear instructions.
  • Avoid repetitive examples in the prompt.
  • Consider storing parts of complex prompts in variables and injecting only what's needed.

Bad prompt example:


// Too verbose, wastes tokens
"You are an AI assistant tasked with analyzing the following text. I want you to carefully read through every word and provide a detailed analysis covering the main themes, key points, emotional tone, writing style, and potential implications. Please be extremely thorough and make sure to cover every aspect of the text. Do not leave anything out. The text is as follows: {{ $json.text }}"

Optimized prompt example:


// Concise, preserves tokens for content
"Analyze this text concisely: {{ $json.text }}"

 

Step 10: Implement Document Sectioning Based on Content Type

 

For documents with mixed content types (text, tables, code), process each type separately:


// Separate different content types for specialized processing
const inputText = items[0].json.text;

// Regular expressions to identify different content types
const codeBlockRegex = /```[\s\S]*?```/g; // fenced code blocks
const tableRegex = /(?:^\|.*\|[ \t]*\r?\n?)+/gm; // consecutive Markdown table rows

// Extract code blocks
const codeBlocks = inputText.match(codeBlockRegex) || [];
let textWithoutCode = inputText.replace(codeBlockRegex, '[CODE_BLOCK_PLACEHOLDER]');

// Extract tables
const tables = textWithoutCode.match(tableRegex) || [];
let textWithoutCodeAndTables = textWithoutCode.replace(tableRegex, '[TABLE_PLACEHOLDER]');

// Split remaining text into paragraphs
const paragraphs = textWithoutCodeAndTables.split('\n\n').filter(p => p.trim().length > 0);

// Return different content types as separate items
return [
  ...paragraphs.map(p => ({ json: { contentType: 'text', content: p } })),
  ...codeBlocks.map(c => ({ json: { contentType: 'code', content: c } })),
  ...tables.map(t => ({ json: { contentType: 'table', content: t } }))
];

 

Step 11: Build a Multi-Stage Processing Pipeline

 

Create a multi-stage pipeline for handling large documents:

  1. Extract - Pull out the essential parts of the document
  2. Process - Handle each part with appropriate AI instructions
  3. Combine - Merge the results into a coherent output

Implementation in n8n:

  • Add a "Switch" node to route different content types to specialized processing nodes
  • Configure separate AI nodes optimized for each content type
  • Use a "Merge" node to combine all processed content (a reassembly sketch follows this list)
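
As a sketch of the "Combine" stage, a Function node placed after the Merge node can group the processed pieces back together by content type. It assumes the "contentType" field from Step 10 is still present on each item and that the AI nodes wrote their output to an "aiResponse" field; both names are illustrative:

// Reassembly sketch for the combine stage.
// Assumes items carry "contentType" (from Step 10) and "aiResponse"
// (written by the specialized AI nodes); adjust field names as needed.
const grouped = { text: [], code: [], table: [] };

for (const item of items) {
  // Fall back to 'text' for any unrecognized content type
  const type = grouped[item.json.contentType] ? item.json.contentType : 'text';
  grouped[type].push(item.json.aiResponse || item.json.content);
}

return [{
  json: {
    textResults: grouped.text.join('\n\n'),
    codeResults: grouped.code.join('\n\n'),
    tableResults: grouped.table.join('\n\n')
  }
}];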

 

Step 12: Implement Automated Retry with Reduced Content

 

Add error handling that automatically reduces content size on failure:


// Function node to handle context length errors with automatic retry
const MAX_RETRIES = 3;
const REDUCTION_FACTOR = 0.75; // Reduce by 25% each retry

// Check if we're in a retry situation
const retryCount = items[0].json.retryCount || 0;
const originalText = items[0].json.originalText || items[0].json.text;
let textToProcess = items[0].json.text;

// If this is a retry after an error
if (items[0].json.error && items[0].json.error.includes('context length')) {
  if (retryCount >= MAX_RETRIES) {
    throw new Error('Maximum retries exceeded. Text is still too long.');
  }

  // Calculate new length and truncate
  const newLength = Math.floor(textToProcess.length * REDUCTION_FACTOR);
  textToProcess = textToProcess.substring(0, newLength);
  
  return [{
    json: {
      text: textToProcess,
      originalText: originalText,
      retryCount: retryCount + 1,
      reductionApplied: true,
      message: `Retry ${retryCount + 1}: Reduced text from ${originalText.length} to ${textToProcess.length} characters`
    }
  }];
}

// Normal processing (no error)
return [{
  json: {
    text: textToProcess,
    originalText: originalText,
    retryCount: retryCount
  }
}];

 

Step 13: Implement Query-Focused Extraction

 

For query-based workflows, extract only relevant sections before processing:


// Extract only relevant sections based on query
const document = items[0].json.document;
const query = items[0].json.query;

// First, split the document into sections
const sections = document.split('\n## ').map((section, index) => 
  index === 0 ? section : '## ' + section
);

// Define a simple relevance checking function (can be enhanced)
const isRelevant = (text, query) => {
  const keywords = query.toLowerCase().split(' ');
  const textLower = text.toLowerCase();
  return keywords.some(keyword => textLower.includes(keyword));
};

// Filter for relevant sections
const relevantSections = sections.filter(section => isRelevant(section, query));

// If we found relevant sections, use them; otherwise use a summary
if (relevantSections.length > 0) {
  return [{
    json: {
      extractedContent: relevantSections.join('\n\n'),
      query: query,
      sectionsExtracted: relevantSections.length,
      totalSections: sections.length
    }
  }];
} else {
  // No relevant sections found - create a short summary instead
  const firstParagraphs = sections.map(section => {
    const paragraphs = section.split('\n\n');
    return paragraphs[0]; // Just take the first paragraph of each section
  }).join('\n\n');
  
  return [{
    json: {
      extractedContent: firstParagraphs,
      query: query,
      extractionMethod: 'summary',
      message: 'No directly relevant sections found, providing document overview'
    }
  }];
}

 

Step 14: Create a Dynamic Token Budget System

 

Implement a token budget system that allocates tokens between prompt and content:


// Dynamic token budget allocation
const MODEL_MAX_TOKENS = 4096; // Change based on your model
const COMPLETION_TOKENS = 1000; // Reserve this many tokens for AI response
const PROMPT_TOKENS = 500; // Reserve for your instructions/prompt

// Calculate available tokens for content
const AVAILABLE_CONTENT_TOKENS = MODEL_MAX_TOKENS - PROMPT_TOKENS - COMPLETION_TOKENS;

// Estimate tokens in content (rough approximation)
const content = items[0].json.content;
const estimatedTokens = Math.ceil(content.length / 4); // ~4 chars per token

// Check if we need to truncate
if (estimatedTokens > AVAILABLE_CONTENT_TOKENS) {
  // Calculate how much text we can include (with some margin)
  const allowedCharacters = AVAILABLE_CONTENT_TOKENS * 3.8; // Slightly less than 4 to be safe
  
  // Truncate content
  const truncatedContent = content.substring(0, allowedCharacters);
  
  return [{
    json: {
      content: truncatedContent,
      truncated: true,
      originalLength: content.length,
      truncatedLength: truncatedContent.length,
      estimatedOriginalTokens: estimatedTokens,
      tokenBudget: AVAILABLE_CONTENT_TOKENS,
      message: "Content was truncated to fit token limits"
    }
  }];
} else {
  // Content fits within token budget
  return [{
    json: {
      content: content,
      truncated: false,
      estimatedTokens: estimatedTokens,
      tokenBudget: AVAILABLE_CONTENT_TOKENS,
      tokenUtilization: Math.round((estimatedTokens / AVAILABLE_CONTENT_TOKENS) * 100) + '%'
    }
  }];
}

 

Step 15: Implement a Comprehensive Document Processing System

 

Combine multiple techniques for an advanced document processing system:

  1. Extract document structure (headings, sections, paragraphs)
  2. Create a document map/table of contents
  3. Process each section independently
  4. Store results in a database or structured format
  5. Create a consolidated view with cross-references

This advanced approach requires multiple nodes and careful workflow design, but allows processing documents of virtually any length while maintaining context relationships.
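
As a starting point for steps 1 and 2, a Function node can extract the heading structure into a simple document map. This sketch assumes Markdown-style headings in a "text" field; the storage and cross-referencing steps depend on the database or format you choose:

// Document map sketch: extract headings and their positions so later
// steps can process and cross-reference sections individually.
// Assumes Markdown-style headings ("#" to "######") in a "text" field.
const inputText = items[0].json.text;
const headingRegex = /^(#{1,6})\s+(.+)$/gm;

const documentMap = [];
let match;
while ((match = headingRegex.exec(inputText)) !== null) {
  documentMap.push({
    level: match[1].length,   // heading depth (1 = "#", 2 = "##", ...)
    title: match[2].trim(),
    position: match.index     // character offset within the document
  });
}

return [{
  json: {
    documentMap: documentMap,
    totalSections: documentMap.length
  }
}];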

 

Troubleshooting Common Issues

 

  • Inconsistent chunk sizes: If your chunks vary widely in size, improve your chunking algorithm to produce more consistent chunks.
  • Lost context between chunks: Use overlapping chunks or pass summary information between processing steps.
  • Slow processing: Implement parallel processing where possible, but be mindful of API rate limits.
  • Incomplete processing: Implement validation to ensure all text is processed and add error recovery mechanisms.
  • Token estimation errors: For more accurate token counting, consider integrating a tokenizer library or using the AI provider's token counting endpoint if available (a hedged tokenizer sketch follows this list).
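
For example, if the gpt-tokenizer npm package is installed on your n8n instance and external modules are allowed for Function/Code nodes (typically via the NODE_FUNCTION_ALLOW_EXTERNAL environment variable), exact counting might look like the sketch below. The package name and its encode export are assumptions to verify against your own setup:

// Hedged sketch: exact token counting with an external tokenizer package.
// Assumes "gpt-tokenizer" is installed and whitelisted via
// NODE_FUNCTION_ALLOW_EXTERNAL; verify the package and its API before relying on it.
const { encode } = require('gpt-tokenizer');

return items.map(item => {
  const text = item.json.text || '';
  return {
    json: {
      ...item.json,
      exactTokens: encode(text).length,
      estimatedTokens: Math.ceil(text.length / 4) // keep the rough estimate for comparison
    }
  };
});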

 

Best Practices for n8n AI Workflows

 

  • Always implement proper error handling in AI workflows.
  • Use debug nodes liberally to monitor token usage and content size.
  • Start with small test documents before processing large ones.
  • Consider using more capable models (like GPT-4 32K) for larger contexts when appropriate.
  • Store intermediate results to avoid reprocessing if errors occur later in the workflow.
  • Implement logging to track processing time, token usage, and error rates (a minimal logging sketch follows this list).
  • Regularly review and optimize your chunking and processing logic.

 

Conclusion

 

Handling context length exceeded errors in n8n AI workflows requires a thoughtful approach to text processing. By implementing techniques like chunking, summarization, sliding windows, and multi-stage processing, you can effectively work with documents of any size. Remember to monitor your token usage, optimize your prompts, and implement proper error handling to build robust AI workflows that can reliably process large volumes of text.
