
How to handle large JSON in n8n?

Learn how to efficiently handle large JSON in n8n with techniques like splitting, pagination, batch processing, external storage, streaming, and error handling to optimize performance and avoid memory issues.

Matt Graham, CEO of Rapid Developers


To handle large JSON in n8n, you can use several approaches including splitting JSON data, using pagination, implementing batch processing, leveraging external storage, and optimizing memory usage through streaming. These techniques help prevent memory issues when working with substantial amounts of data while maintaining efficient workflow execution.

 

Step 1: Understanding the Challenges with Large JSON in n8n

 

Before diving into solutions, it's important to understand why large JSON files can be problematic in n8n:

  • Memory limitations: n8n has memory constraints based on your server configuration
  • Performance issues: Processing large files can slow down workflow execution
  • Timeout errors: Operations that take too long may time out
  • Data loss: If a process fails due to memory issues, you might lose data

Let's explore various methods to handle large JSON data effectively in n8n.

 

Step 2: Splitting Large JSON Files

 

One effective approach is to split large JSON files into smaller chunks that can be processed individually.

Method 1: Using the Function Node to Split JSON Arrays


// Assumes the previous node delivered the whole array on a single item, in items[0].json.
// If the array is nested under a property (e.g. items[0].json.data), adjust accordingly.
const jsonArray = items[0].json;
const chunkSize = 100; // Adjust based on your needs
const result = [];

// Split the array into chunks. n8n expects each item's `json` to be an object,
// so each chunk is wrapped under an `items` property.
for (let i = 0; i < jsonArray.length; i += chunkSize) {
    const chunk = jsonArray.slice(i, i + chunkSize);
    result.push({ json: { items: chunk } });
}

return result;

Method 2: Using Split In Batches Node

If you have the Split In Batches node (renamed Loop Over Items in recent n8n versions):

  1. Connect your JSON source to the Split In Batches node
  2. Configure the batch size (e.g., 100 items)
  3. Connect your processing nodes to the node's loop output, and route the last of them back into Split In Batches so the next batch is requested
  4. Connect the node's done output to whatever should run once every batch has been processed (a sketch of an in-loop Code node follows below)
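
Inside the loop, downstream nodes only ever see the current batch, which is what keeps memory usage bounded. A minimal sketch of a Code node placed on the loop output (the field names here are illustrative, not from the original example):

// Runs once per batch: `items` holds only the current batch emitted by Split In Batches
return items.map(item => ({
    json: {
        id: item.json.id,
        // Transform just this batch; earlier batches have already moved on
        name: String(item.json.name ?? '').toUpperCase()
    }
}));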

 

Step 3: Implementing Pagination for API Requests

 

When fetching large datasets from APIs, implement pagination instead of retrieving all data at once. Recent versions of the HTTP Request node include a built-in Pagination option; the example below shows the equivalent done manually in a Code node.

Example: Paginated API Requests in a Code Node


// Runs in a Code node, which exposes this.helpers.httpRequest for making HTTP calls.
// The endpoint below (https://api.example.com) is a placeholder that is assumed to
// return { items: [...] } for each page.
const pageSize = 100;
const maxPages = 10; // Safety limit so the loop always terminates
let page = 1;
let hasMoreData = true;
let allData = [];

while (hasMoreData && page <= maxPages) {
    // Request one page at a time instead of the whole dataset
    const response = await this.helpers.httpRequest({
        url: `https://api.example.com/data?page=${page}&limit=${pageSize}`,
        method: 'GET',
        json: true,
    });

    // Add this page's items to our collection
    allData = allData.concat(response.items);

    // A full page usually means there is another page to fetch
    hasMoreData = response.items.length === pageSize;
    page++;
}

// Return everything as a single item (or hand it to the chunking logic from Step 2)
return [{ json: { items: allData } }];

 

Step 4: Using Batch Processing

 

Process large datasets in batches to maintain performance and avoid memory issues.

Method 1: Implementing Manual Batching with Function Node


// Assuming items is an array of data from previous node
const batchSize = 50;
const totalItems = items.length;
const results = [];

// Process in batches
for (let i = 0; i < totalItems; i += batchSize) {
    const batch = items.slice(i, i + batchSize);
    
    // Process each item in the batch
    for (const item of batch) {
        // Perform your operations here
        // For example, transform the data
        const processedItem = {
            id: item.json.id,
            name: item.json.name.toUpperCase(),
            // Other processing...
        };
        
        results.push({json: processedItem});
    }
    
    // Optional: Add a small delay between batches
    // await new Promise(resolve => setTimeout(resolve, 100));
}

return results;

Method 2: Using the Loop Over Items Node

  1. Add a Loop Over Items (Split In Batches) node after your JSON source
  2. Set the batch size to control how many items are handled per iteration
  3. Add your processing logic to the node's loop output and route it back into the node
  4. Consolidate the results from the node's done output (for example with a Code or Merge node), as sketched below
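
Since the per-batch results arrive back as ordinary items, consolidation is just a matter of collecting them. A minimal sketch of such a consolidation Code node, using the Code node's $input.all() helper and assuming the processed items end up as its input (exact wiring depends on your n8n version):

// Collect every item that reached this node (e.g. via the loop's "done" output)
const allResults = $input.all().map(item => item.json);

return [{ json: { results: allResults, count: allResults.length } }];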

 

Step 5: Using External Storage Solutions

 

For extremely large JSON files, consider using external storage services.

Method 1: Using Temporary File Storage


// Function/Code node 1: write the large JSON to a temporary file.
// This uses Node.js built-in modules (fs, path, os); depending on your setup you may
// need to allow them with the NODE_FUNCTION_ALLOW_BUILTIN environment variable.
const fs = require('fs');
const path = require('path');
const os = require('os');

// Create a temporary file path
const tempFilePath = path.join(os.tmpdir(), `large-json-${Date.now()}.json`);

// Write the large JSON to the file
fs.writeFileSync(tempFilePath, JSON.stringify(items[0].json));

// Pass only the file path to the next node, keeping the workflow payload small
return [{
    json: {
        filePath: tempFilePath
    }
}];

// Function/Code node 2 (a separate node later in the workflow): read and process the file
const fs = require('fs');
const filePath = items[0].json.filePath;

// For very large files, prefer a streaming parser (see Step 6) over readFileSync
const data = JSON.parse(fs.readFileSync(filePath, 'utf8'));

// Process the data as needed
// ...

// Clean up the temporary file
fs.unlinkSync(filePath);

return [{json: { result: 'Processing completed' }}];

Method 2: Using Cloud Storage (AWS S3, Google Cloud Storage, etc.)

  1. Upload your large JSON to a cloud storage service
  2. Use the respective n8n nodes (S3, Google Cloud Storage) to access the file
  3. Process the data in manageable chunks (see the sketch after this list)
  4. Write results back to storage if needed
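
As a rough sketch of the chunked-processing step, the Code node below assumes the storage node put the downloaded file into the item's binary property named data and that binary data is kept in memory as base64 (the property name and structure are assumptions; with filesystem binary storage you would read the file from disk instead):

// Parse a JSON file downloaded by a storage node and re-emit it in manageable chunks
const base64 = items[0].binary.data.data;          // assumed binary property name: "data"
const parsed = JSON.parse(Buffer.from(base64, 'base64').toString('utf8'));

const chunkSize = 500; // Adjust to your memory budget
const out = [];
for (let i = 0; i < parsed.length; i += chunkSize) {
    out.push({ json: { items: parsed.slice(i, i + chunkSize) } });
}

return out;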

 

Step 6: Optimizing Memory Usage with Streaming

 

Use streaming techniques to process large JSON files without loading the entire content into memory.

Example: Using a Streaming Approach with Function Node


// Example using a streaming JSON parser, 'stream-json', to read the file object by object
// instead of loading it all into memory.

// Note: the module must be installed in your n8n environment (e.g. in your Docker image)
// and allowed for use in Function/Code nodes via NODE_FUNCTION_ALLOW_EXTERNAL=stream-json.

const StreamArray = require('stream-json/streamers/StreamArray');
const fs = require('fs');

// Assuming we have a file path from a previous node (see Step 5)
const filePath = items[0].json.filePath;

// Create a function that returns a promise for the streaming operation
function processJsonStream() {
    return new Promise((resolve, reject) => {
        const results = [];
        
        // Create a readable stream from the JSON data
        const stream = fs.createReadStream(filePath)
            .pipe(StreamArray.withParser());
        
        // Process each JSON object as it's parsed
        stream.on('data', ({key, value}) => {
            // Process the value (a single object from the JSON array)
            // For example, transform it or filter it
            if (value.someProperty > 100) {
                results.push(value);
            }
        });
        
        stream.on('end', () => {
            resolve(results);
        });
        
        stream.on('error', (err) => {
            reject(err);
        });
    });
}

// Execute the streaming process
const processedData = await processJsonStream();

// Return the filtered results (wrapped in an object, since each item's `json` must be an object)
return [{ json: { items: processedData } }];

 

Step 7: Using JavaScript Code for Advanced Processing

 

Leverage JavaScript in Function nodes for more advanced processing strategies.

Example: Memory-Efficient Processing with Generator Functions


// Using a generator function to process items one at a time
function* processItems(items) {
    for (const item of items) {
        // Do your processing here
        const processed = {
            id: item.id,
        transformedValue: item.value * 2,
            // other transformations...
        };
        
        yield processed;
    }
}

// Get the large JSON array
const largeArray = items[0].json;
const results = [];

// Use the generator to process items without keeping everything in memory
const processor = processItems(largeArray);
let count = 0;
const batchSize = 100;

// Process in batches of 100
for (const processedItem of processor) {
    results.push(processedItem);
    count++;
    
    // If we've reached our batch size, return current results
    if (count >= batchSize) {
        break;
    }
}

// Store the state for the next execution if needed
// This could be implemented with workflow data or external storage

return [{json: {
    processedItems: results,
    processedCount: count,
    totalItems: largeArray.length
}}];

 

Step 8: Managing Workflow Execution with Queue Mode

 

Use n8n's queue mode to prevent timeouts and memory issues with long-running workflows.

Setting Up Queue Mode:

  1. Run n8n in queue mode by setting EXECUTIONS_MODE=queue, pointing it at a Redis instance (for example via QUEUE_BULL_REDIS_HOST), and starting one or more workers with n8n worker
  2. Tune execution settings with variables like EXECUTIONS_TIMEOUT and EXECUTIONS_DATA_PRUNE_MAX_COUNT
  3. For workflows processing large JSON, increase the timeout value to prevent premature termination

 

Step 9: Implementing Error Handling for Large Data Processing

 

Add robust error handling to manage failures when processing large JSON data.


// In a Function/Code node, implement try-catch with resumable state.
// Workflow-level state that survives executions is stored via $getWorkflowStaticData
// (note: static data is only persisted for production executions, not manual test runs).
const staticData = $getWorkflowStaticData('global');
let processedCount = 0;
const inputArray = items[0].json;
const results = [];

try {
    // Get the starting point (in case we're resuming after an error)
    const startIndex = staticData.lastProcessedIndex || 0;
    const batchSize = 100;
    
    // Process a batch of items
    for (let i = startIndex; i < Math.min(startIndex + batchSize, inputArray.length); i++) {
        const item = inputArray[i];
        
        // Your processing logic here
        const processedItem = {
            id: item.id,
            // other transformed properties...
        };
        
        results.push(processedItem);
        processedCount++;
        
        // Save our progress periodically
        if (processedCount % 10 === 0) {
            staticData.lastProcessedIndex = startIndex + processedCount;
        }
    }
    
    // Update the final position
    staticData.lastProcessedIndex = startIndex + processedCount;
    
    // Check if we've finished
    const isComplete = (startIndex + processedCount) >= inputArray.length;
    
    return [{
        json: {
            results,
            processedCount,
            startIndex,
            currentIndex: startIndex + processedCount,
            isComplete,
            totalItems: inputArray.length
        }
    }];
} catch (error) {
    // Keep whatever progress was already saved so a later run can resume
    staticData.lastProcessedIndex = staticData.lastProcessedIndex || 0;
    
    // Return error information instead of failing the whole execution
    return [{
        json: {
            error: error.message,
            lastProcessedIndex: staticData.lastProcessedIndex,
            totalItems: inputArray.length
        }
    }];
}

 

Step 10: Using n8n's Data Transformation Nodes Efficiently

 

Leverage built-in nodes for efficient data transformation of large JSON.

Efficient Use of Item Lists Node:

  1. Use the Item Lists node to filter only necessary data before processing
  2. Apply "Limit" operation to process only a specific number of items at once
  3. Use "Aggregate" to combine results after processing

Optimizing with Set Node:

  1. Use the Set node to keep only needed fields from large JSON objects
  2. Implement the "Keep Only Set" option to discard unnecessary data
  3. Apply transformations directly in the Set node rather than Function nodes when possible (a code-based fallback is sketched below)
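
If you do end up doing the same pruning in code, here is a minimal Code node sketch that keeps only the fields later nodes need (the field names are illustrative):

// Keep only the fields required downstream; everything else is dropped to save memory
return items.map(item => ({
    json: {
        id: item.json.id,
        name: item.json.name
        // add only the properties later nodes actually use
    }
}));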

 

Step 11: Implementing a Complete Workflow for Large JSON Processing

 

Let's put everything together in a comprehensive workflow example.

Example: End-to-End Large JSON Processing Workflow

  1. Start with your JSON source (HTTP Request, Read Binary File, etc.)
  2. Add a Function node to analyze the data size:

// Analyze JSON size and determine processing strategy
const jsonData = items[0].json;
const dataSize = JSON.stringify(jsonData).length;
const itemCount = Array.isArray(jsonData) ? jsonData.length : 1;

// Determine processing strategy based on size
let strategy = 'direct';
if (dataSize > 10 * 1024 * 1024) { // 10MB
    strategy = 'external-storage';
} else if (itemCount > 1000) {
    strategy = 'batch-processing';
} else {
    strategy = 'direct';
}

return [{
    json: {
        originalData: jsonData,
        metadata: {
            strategy,
            dataSize,
            itemCount
        }
    }
}];
  3. Add an IF node to route based on the determined strategy
  4. For the batch processing path, add a Function node:

// Batch processing implementation
const data = items[0].json.originalData;
const batchSize = 100;
const results = [];

// Process in batches
for (let i = 0; i < data.length; i += batchSize) {
    const batch = data.slice(i, i + batchSize);
    
    // Process each batch item
    const processedBatch = batch.map(item => {
        // Your transformation logic here
        return {
            id: item.id,
            name: item.name,
            // Other processed properties...
            processedAt: new Date().toISOString()
        };
    });
    
    results.push(...processedBatch);
}

// Return one n8n item per processed record (each item's `json` must be an object)
return results.map(item => ({ json: item }));
  5. For the external storage path, add nodes to:
  • Save data to temporary storage or cloud storage
  • Process in chunks from storage
  • Clean up after processing
  6. Join the paths back together with a Merge node
  7. Add final processing and output nodes

 

Step 12: Monitoring and Debugging Large JSON Processing

 

Set up monitoring and debugging to track large JSON processing:

Method 1: Using Console Logs for Progress Tracking


// Add to your Function nodes to track progress
const data = items[0].json;
const totalItems = data.length;
let processedCount = 0;
const results = [];

// Process items with progress tracking
for (const item of data) {
    // Process the item
    const processed = {
        id: item.id,
        // other processing...
    };
    
    results.push(processed);
    processedCount++;
    
    // Log progress every 100 items
    if (processedCount % 100 === 0 || processedCount === totalItems) {
        console.log(`Processed ${processedCount}/${totalItems} items (${Math.round((processedCount/totalItems)*100)}%)`);
    }
}

// Emit one n8n item per processed record
return results.map(item => ({ json: item }));

Method 2: Implementing Custom Progress Storage

  1. Create a workflow data store using $getWorkflowStaticData('global'), n8n's mechanism for persisting data across executions (see the sketch after this list)
  2. Update progress information as items are processed
  3. Add nodes to notify via Slack, email, or other channels when processing is complete
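
A minimal sketch of the first two steps, using $getWorkflowStaticData to persist progress between executions (the progress fields are illustrative; static data is only saved for production executions, not manual test runs):

// Persist simple progress counters in the workflow's static data
const progress = $getWorkflowStaticData('global');

progress.processedCount = (progress.processedCount || 0) + items.length;
progress.lastRunAt = new Date().toISOString();

return [{ json: { processedSoFar: progress.processedCount } }];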

 

Step 13: Optimizing Performance for Large JSON Processing

 

Fine-tune your workflow for better performance:

  • Use the most efficient data structures for your specific task
  • Consider using Map and Set objects for faster lookups in large datasets
  • Implement indexing for frequently accessed data
  • Pre-filter data as early as possible in the workflow
  • Use asynchronous processing where appropriate

Example: Optimized Data Lookup Implementation


// Using Map for efficient lookups
const largeData = items[0].json;

// Create an index for faster lookups
const idToDataMap = new Map();
for (const item of largeData) {
    idToDataMap.set(item.id, item);
}

// Now lookups are O(1) instead of O(n)
function findById(id) {
    return idToDataMap.get(id);
}

// Process data that requires lookups
// (assumes the IDs to look up arrive as a second input item, e.g. after a Merge node)
const lookupIds = items[1].json.requiredIds;
const results = lookupIds.map(id => {
    const item = findById(id);
    return item ? {
        id: item.id,
        name: item.name,
        found: true
    } : {
        id,
        found: false
    };
});

// Emit one n8n item per lookup result
return results.map(item => ({ json: item }));

 

Step 14: Handling Nested JSON Structures

 

Process complex nested JSON structures efficiently:


// Function to flatten deeply nested JSON for easier processing
function flattenJSON(data, prefix = '', result = {}) {
    if (typeof data !== 'object' || data === null) {
        result[prefix] = data;
        return result;
    }
    
    if (Array.isArray(data)) {
        data.forEach((item, index) => {
            const newPrefix = prefix ? `${prefix}.${index}` : `${index}`;
            flattenJSON(item, newPrefix, result);
        });
        return result;
    }
    
    Object.keys(data).forEach(key => {
        const newPrefix = prefix ? `${prefix}.${key}` : key;
        flattenJSON(data[key], newPrefix, result);
    });
    
    return result;
}

// Example usage on large nested JSON
const nestedData = items[0].json;
const flattenedData = flattenJSON(nestedData);

// Now you can work with a flat structure
const processedData = {};
for (const [key, value] of Object.entries(flattenedData)) {
    // Process each field
    if (typeof value === 'string') {
        processedData[key] = value.toUpperCase();
    } else {
        processedData[key] = value;
    }
}

return [{json: processedData}];

 

Step 15: Summary and Best Practices

 

To effectively handle large JSON in n8n, remember these key principles:

  • Split large JSON into smaller, manageable chunks
  • Use pagination for API requests rather than fetching all data at once
  • Process data in batches to avoid memory issues
  • Leverage external storage for very large datasets
  • Implement streaming techniques when possible
  • Use queue mode for long-running workflows
  • Implement robust error handling and state tracking
  • Monitor progress and implement debugging mechanisms
  • Optimize data structures and algorithms for your specific use case
  • Pre-filter and transform data as early as possible in the workflow

By following these approaches, you can process large JSON data efficiently in n8n without encountering memory limitations or performance issues, allowing your workflows to handle substantial amounts of data reliably.
