Learn how to prevent duplicate LLM calls in n8n workflows by implementing caching with Function nodes, workflow data, or external databases to save time and reduce API costs.
To prevent duplicate LLM calls in n8n workflows, you can implement a caching layer using Function nodes, n8n's workflow static data, or an external database. The cache stores previously processed inputs and their corresponding outputs, so the workflow can check for an existing result before making a new API call to an LLM like GPT or Claude, saving both time and API costs.
Comprehensive Guide: Preventing Duplicate LLM Calls in n8n Workflows
Step 1: Understanding the Problem
When working with Large Language Models (LLMs) like OpenAI's GPT or Anthropic's Claude in n8n workflows, the same input can end up being sent to the LLM multiple times, resulting in unnecessary API costs, slower workflow executions, and a higher risk of hitting rate limits.
This guide presents multiple approaches to implementing caching mechanisms that prevent duplicate LLM calls by storing previous inputs and their corresponding outputs.
Step 2: Setting Up a Basic Caching System with Function Nodes
The simplest approach is to use a Function node with a plain key-value cache stored in the workflow's static data, keyed by the raw input text:
// Load the workflow's static data to use as a simple cache
const workflowData = await $getWorkflowStaticData("global");
if (!workflowData.llmCache) {
  workflowData.llmCache = {};
}
// Get the input text that would be sent to the LLM
const inputText = items[0].json.inputText;
// Check if we already have a cached result for this exact input
if (workflowData.llmCache[inputText]) {
  // Return the cached result
  items[0].json.llmResponse = workflowData.llmCache[inputText];
  return items;
}
// If not in cache, let the workflow continue to the LLM
return items;
After the LLM node, add another Function node to save the result:
// Load the same cache from the workflow's static data
const workflowData = await $getWorkflowStaticData("global");
if (!workflowData.llmCache) {
  workflowData.llmCache = {};
}
// Save the result to the cache
const inputText = items[0].json.inputText;
const llmResponse = items[0].json.llmResponse;
workflowData.llmCache[inputText] = llmResponse;
return items;
Step 3: Implementing a More Robust Cache with Workflow Data
For a more robust cache that persists between workflow runs, build on n8n's workflow static data (accessed with $getWorkflowStaticData), adding hashed keys, an explicit cache-hit flag, and a size limit:
Step 3.1: Make sure the workflow is saved and activated. Static data is only persisted for production executions; manual test runs do not save it.
Step 3.2: Create a Function node to check and retrieve from cache:
// Get input that would be sent to LLM
const inputText = items[0].json.inputText;
// Create a hash of the input to use as a key
// This handles large inputs better than using the raw text
function createHash(text) {
let hash = 0;
for (let i = 0; i < text.length; i++) {
const char = text.charCodeAt(i);
hash = ((hash << 5) - hash) + char;
hash = hash & hash; // Convert to 32bit integer
}
return hash.toString();
}
const inputHash = createHash(inputText);
// Load the cache from workflow data
const workflowData = await $getWorkflowStaticData("global");
if (!workflowData.llmCache) {
workflowData.llmCache = {};
}
// Check if we have a cached result
if (workflowData.llmCache[inputHash]) {
items[0].json.llmResponse = workflowData.llmCache[inputHash].response;
items[0].json.fromCache = true;
return items;
}
// If not cached, add a flag to process it
items[0].json.fromCache = false;
items[0].json.inputHash = inputHash;
return items;
Step 3.3: Add an IF node that checks the fromCache flag (a Boolean condition on {{ $json.fromCache }}); route the true branch around the LLM node and the false branch into it.
Step 3.4: After the LLM node, add a Function node to save to cache:
// Load the current cache
const workflowData = await $getWorkflowStaticData("global");
if (!workflowData.llmCache) {
workflowData.llmCache = {};
}
// Get the input hash and LLM response
const inputHash = items[0].json.inputHash;
const llmResponse = items[0].json.llmResponse;
// Save to cache with timestamp
workflowData.llmCache[inputHash] = {
response: llmResponse,
timestamp: Date.now()
};
// Optional: Limit cache size (keep only the last 100 entries)
const cacheKeys = Object.keys(workflowData.llmCache);
if (cacheKeys.length > 100) {
// Sort by timestamp (oldest first)
cacheKeys.sort((a, b) => workflowData.llmCache[a].timestamp - workflowData.llmCache[b].timestamp);
// Remove oldest entries
for (let i = 0; i < cacheKeys.length - 100; i++) {
delete workflowData.llmCache[cacheKeys[i]];
}
}
return items;
Step 4: Advanced Caching with External Database
For more persistent and scalable caching, use an external database:
Step 4.1: Add a MongoDB node (or your preferred database) to your workflow.
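For reference, a cached document in the collection could look like the object below. This is a suggested layout rather than a required schema, and the field names are assumptions reused through the rest of this step.
// Example shape of a cached document in the MongoDB collection
const exampleCacheDocument = {
  inputHash: "1837462912",                     // hash of the LLM input, used as the lookup key
  inputText: "Summarize the quarterly report", // optional: keep the raw input for debugging
  response: "The report shows steady growth.", // the cached LLM output
  timestamp: Date.now()                        // when the entry was cached
};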
Step 4.2: Create a Function node to check the cache:
// Get the input for the LLM
const inputText = items[0].json.inputText;
// Hash the input to use as the cache key (same simple hash as in Step 3)
function createHash(text) {
let hash = 0;
for (let i = 0; i < text.length; i++) {
const char = text.charCodeAt(i);
hash = ((hash << 5) - hash) + char;
hash = hash & hash;
}
return hash.toString();
}
const inputHash = createHash(inputText);
// Pass along the original input and the hash
items[0].json.inputHash = inputHash;
items[0].json.dbQuery = { inputHash: inputHash };
return items;
Step 4.3: Add a MongoDB node configured with the Find operation, querying your cache collection by the inputHash produced above:
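As a rough guide, the Find step could be configured along these lines; the field labels and collection name are assumptions, so adapt them to your MongoDB node version and setup.
// MongoDB node (Find operation), assumed configuration:
//   Collection:   llm_cache
//   Query (JSON): { "inputHash": "{{ $json.inputHash }}" }
// The expression pulls in the hash produced by the previous Function node.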
Step 4.4: Add an IF node to check whether the Find operation returned a cached document.
Step 4.5: If a result was found, add a Function node to extract it:
// Extract the cached response
// The Find operation returns the matching document as the item's json,
// so copy the stored response into the field the rest of the workflow expects
items[0].json.llmResponse = items[0].json.response;
items[0].json.fromCache = true;
return items;
Step 4.6: If no result was found, proceed to the LLM node, then add a MongoDB node (Insert operation) to save the new result to the cache collection.
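A small Function node placed just before the insert can flatten the fields to store. This is a sketch that assumes the document layout shown in Step 4.1 and that the MongoDB node is configured to insert the listed fields.
// Flatten the fields the MongoDB Insert operation should write
items[0].json.response = items[0].json.llmResponse;
items[0].json.timestamp = Date.now();
// inputHash (and optionally inputText) are already on the item from Step 4.2
return items;
In the MongoDB node, list inputHash, inputText, response, and timestamp as the fields to insert.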
Step 5: Implementing Cache Expiration
For time-sensitive applications, add cache expiration:
// In your cache checking function
const workflowData = await $getWorkflowStaticData("global");
if (!workflowData.llmCache) {
workflowData.llmCache = {};
}
const inputHash = createHash(items[0].json.inputText);
// Check if we have a cached result AND it's not expired
const CACHE_TTL = 24 * 60 * 60 * 1000; // 24 hours in milliseconds
const now = Date.now();
if (
workflowData.llmCache[inputHash] &&
(now - workflowData.llmCache[inputHash].timestamp) < CACHE_TTL
) {
// Use cached result
items[0].json.llmResponse = workflowData.llmCache[inputHash].response;
items[0].json.fromCache = true;
} else {
// If expired or not in cache, mark for processing
items[0].json.fromCache = false;
// Optionally clean up expired cache entries
if (workflowData.llmCache[inputHash]) {
delete workflowData.llmCache[inputHash];
}
}
return items;
Step 6: Implementing Semantic Caching
For more intelligent caching that can match semantically similar inputs:
Step 6.1: Create an embedding for each input using an embedding model (e.g., OpenAI's embeddings API).
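One way to fetch the embedding directly from a Code node is sketched below. It assumes the Code node's this.helpers.httpRequest helper, OpenAI's /v1/embeddings endpoint, and an API key available as the OPENAI_API_KEY environment variable; in practice you may prefer a dedicated OpenAI or HTTP Request node with stored credentials.
// Request an embedding for the input text (sketch; see assumptions above)
const inputText = items[0].json.inputText;
const response = await this.helpers.httpRequest({
  method: "POST",
  url: "https://api.openai.com/v1/embeddings",
  headers: {
    Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
    "Content-Type": "application/json"
  },
  body: {
    model: "text-embedding-3-small",
    input: inputText
  },
  json: true
});
// Attach the embedding vector so the following caching nodes can use it
items[0].json.embedding = response.data[0].embedding;
return items;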
Step 6.2: Store both the input text and its embedding in your cache:
// After getting an embedding from OpenAI or another service
const workflowData = await $getWorkflowStaticData("global");
if (!workflowData.semanticCache) {
workflowData.semanticCache = [];
}
// Store the input, embedding, and response
workflowData.semanticCache.push({
inputText: items[0].json.inputText,
embedding: items[0].json.embedding, // This comes from an embedding API
response: items[0].json.llmResponse,
timestamp: Date.now()
});
return items;
Step 6.3: To check for similar inputs, calculate cosine similarity:
// Function to calculate cosine similarity between embeddings
function cosineSimilarity(a, b) {
let dotProduct = 0;
let normA = 0;
let normB = 0;
for (let i = 0; i < a.length; i++) {
dotProduct += a[i] * b[i];
normA += a[i] * a[i];
normB += b[i] * b[i];
}
normA = Math.sqrt(normA);
normB = Math.sqrt(normB);
return dotProduct / (normA * normB);
}
// Current input embedding (from an embedding API)
const currentEmbedding = items[0].json.embedding;
// Load the cache
const workflowData = await $getWorkflowStaticData("global");
if (!workflowData.semanticCache) {
workflowData.semanticCache = [];
items[0].json.fromCache = false;
return items;
}
// Find the most similar cached entry
let bestMatch = null;
let highestSimilarity = 0;
const SIMILARITY_THRESHOLD = 0.92; // Adjust as needed
for (const entry of workflowData.semanticCache) {
const similarity = cosineSimilarity(currentEmbedding, entry.embedding);
if (similarity > highestSimilarity) {
highestSimilarity = similarity;
bestMatch = entry;
}
}
// If we found a sufficiently similar entry, use it
if (bestMatch && highestSimilarity >= SIMILARITY_THRESHOLD) {
items[0].json.llmResponse = bestMatch.response;
items[0].json.similarity = highestSimilarity;
items[0].json.fromCache = true;
} else {
items[0].json.fromCache = false;
}
return items;
Step 7: Implementing a Cache Pruning Strategy
For long-running workflows, implement a strategy to prevent the cache from growing too large:
// Load the cache
const workflowData = await $getWorkflowStaticData("global");
if (!workflowData.llmCache) {
workflowData.llmCache = {};
}
// Maximum cache size
const MAX_CACHE_SIZE = 500;
// If the cache is too large, prune it
const cacheKeys = Object.keys(workflowData.llmCache);
if (cacheKeys.length > MAX_CACHE_SIZE) {
console.log(`Cache too large (${cacheKeys.length}), pruning...`);
// Strategy 1: Remove oldest entries
const sortedEntries = cacheKeys
.map(key => ({
key,
timestamp: workflowData.llmCache[key].timestamp || 0
}))
.sort((a, b) => a.timestamp - b.timestamp);
// Keep only the newest MAX_CACHE_SIZE entries
const keysToRemove = sortedEntries
.slice(0, sortedEntries.length - MAX_CACHE_SIZE)
.map(entry => entry.key);
// Remove the oldest entries
for (const key of keysToRemove) {
delete workflowData.llmCache[key];
}
console.log(`Removed ${keysToRemove.length} old cache entries`);
}
// Continue with normal cache operations...
Step 8: Creating a Reusable Caching Subworkflow
To make your caching system reusable across multiple workflows:
Step 8.1: Create a new workflow dedicated to cache management.
Step 8.2: Add an "Execute Workflow" trigger to make it callable from other workflows.
Step 8.3: Add input parameters to the "Execute Workflow" trigger, for example operation (get or set), cacheKey, and response.
Step 8.4: Implement the cache logic in this workflow with a Switch node based on the operation (a sketch of the underlying logic follows this list).
Step 8.5: In your main workflow, use the "Execute Workflow" node to call this caching workflow before and after your LLM node.
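Here is a minimal sketch of the cache logic behind the Switch branches, collapsed into a single Function node for brevity; it assumes the calling workflow passes operation ("get" or "set"), cacheKey, and, for writes, response:
// Cache-manager subworkflow: core logic (sketch)
const workflowData = await $getWorkflowStaticData("global");
if (!workflowData.llmCache) {
  workflowData.llmCache = {};
}
const { operation, cacheKey, response } = items[0].json;
if (operation === "get") {
  // Look up an existing entry and report whether it was found
  const entry = workflowData.llmCache[cacheKey];
  items[0].json.found = !!entry;
  items[0].json.response = entry ? entry.response : null;
} else if (operation === "set") {
  // Store the response under the provided key
  workflowData.llmCache[cacheKey] = {
    response: response,
    timestamp: Date.now()
  };
  items[0].json.saved = true;
}
return items;
Because the static data belongs to the caching workflow itself, every workflow that calls it shares a single cache.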
Step 9: Monitoring Cache Performance
To track how effective your cache is:
// At the end of your workflow, add this to a Function node
const workflowData = await $getWorkflowStaticData("global");
// Initialize stats if needed
if (!workflowData.cacheStats) {
workflowData.cacheStats = {
totalRequests: 0,
cacheHits: 0,
cacheMisses: 0,
lastReset: Date.now()
};
}
// Update stats
workflowData.cacheStats.totalRequests++;
if (items[0].json.fromCache) {
workflowData.cacheStats.cacheHits++;
} else {
workflowData.cacheStats.cacheMisses++;
}
// Calculate hit rate
const hitRate = (workflowData.cacheStats.cacheHits / workflowData.cacheStats.totalRequests) * 100;
// Add stats to the output
items[0].json.cacheStats = {
totalRequests: workflowData.cacheStats.totalRequests,
cacheHits: workflowData.cacheStats.cacheHits,
cacheMisses: workflowData.cacheStats.cacheMisses,
hitRate: `${hitRate.toFixed(2)}%`,
runningFor: `${Math.floor((Date.now() - workflowData.cacheStats.lastReset) / (1000 * 60 * 60))} hours`
};
return items;
Step 10: Handling Multiple Input Parameters
If your LLM calls include multiple parameters beyond just the input text:
// Create a composite key that includes all relevant parameters
function createCacheKey(params) {
// Sort keys to ensure consistent order
const sortedKeys = Object.keys(params).sort();
// Create a string representation of all parameters
const paramsString = sortedKeys
.map(key => `${key}:${JSON.stringify(params[key])}`)
.join('|');
// Create a hash of the parameters string
function hashString(str) {
let hash = 0;
for (let i = 0; i < str.length; i++) {
const char = str.charCodeAt(i);
hash = ((hash << 5) - hash) + char;
hash = hash & hash;
}
return hash.toString();
}
return hashString(paramsString);
}
// Example usage
const cacheKey = createCacheKey({
inputText: items[0].json.inputText,
temperature: items[0].json.temperature || 0.7,
maxTokens: items[0].json.maxTokens || 1000,
model: items[0].json.model || "gpt-3.5-turbo"
});
// Use this key in your cache lookups
items[0].json.cacheKey = cacheKey;
return items;
Step 11: Troubleshooting Common Issues
If you encounter issues with your caching implementation:
Issue 1: Cache not persisting between workflow runs. Workflow static data is only saved for production executions of a saved, active workflow; manual test runs do not persist it.
Issue 2: Cache always misses despite identical inputs. Make sure the cache key is built the same way on both the check and save sides (same fields, same order, same hashing) and that the prompt itself is not changing between runs, for example through embedded timestamps.
Issue 3: Cache grows too large. Apply the expiration (Step 5) or pruning (Step 7) strategies, or move the cache to an external database (Step 4). A quick way to reset the cache while debugging is shown below.
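When you need to start over while troubleshooting, a throwaway Function node can wipe the cached data; this sketch clears only the keys used in this guide.
// One-off Function node to reset the cache and its statistics
const workflowData = await $getWorkflowStaticData("global");
delete workflowData.llmCache;
delete workflowData.semanticCache;
delete workflowData.cacheStats;
items[0].json.cacheCleared = true;
return items;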
Step 12: Optimizing for Production Use
For production workflows:
// Add error handling to your cache operations
try {
const workflowData = await $getWorkflowStaticData("global");
if (!workflowData.llmCache) {
workflowData.llmCache = {};
}
// Cache operations...
} catch (error) {
console.error("Cache operation failed:", error.message);
// Provide a fallback behavior
items[0].json.fromCache = false;
items[0].json.cacheError = error.message;
}
return items;
Add logging for debugging:
// Add verbose logging for cache operations
console.log(`Cache operation: ${operation}`);
console.log(`Input hash: ${inputHash}`);
console.log(`Cache hit: ${!!workflowData.llmCache[inputHash]}`);
if (workflowData.llmCache[inputHash]) {
console.log(`Cached at: ${new Date(workflowData.llmCache[inputHash].timestamp).toISOString()}`);
}
// Continue with cache operations...
Conclusion
By implementing a caching mechanism for your LLM calls in n8n, you can significantly reduce API costs, improve workflow execution times, and avoid rate limiting issues. Choose the approach that best fits your specific needs, considering factors like persistence requirements, cache size, and the complexity of your inputs.
Remember that caching works best for deterministic queries where the same input should always produce the same output. For applications requiring fresh, non-deterministic, or context-dependent responses, you may need to modify these caching strategies or disable them entirely.
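If some executions must always get a fresh answer, one lightweight option is to honor a bypass flag in the cache-check Function node; the flag name used here (bypassCache) is just an example, not an n8n feature.
// In the cache-check Function node: skip the lookup when a bypass flag is set
const workflowData = await $getWorkflowStaticData("global");
const inputHash = items[0].json.inputHash;
if (items[0].json.bypassCache === true) {
  // Force a fresh LLM call regardless of what is cached
  items[0].json.fromCache = false;
  return items;
}
if (workflowData.llmCache && workflowData.llmCache[inputHash]) {
  items[0].json.llmResponse = workflowData.llmCache[inputHash].response;
  items[0].json.fromCache = true;
} else {
  items[0].json.fromCache = false;
}
return items;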