
How to handle large JSON in n8n?

Learn how to efficiently handle large JSON in n8n with techniques like splitting, pagination, batch processing, external storage, streaming, and error handling to optimize performance and avoid memory issues.

Matt Graham, CEO of Rapid Developers


To handle large JSON in n8n, you can use several approaches including splitting JSON data, using pagination, implementing batch processing, leveraging external storage, and optimizing memory usage through streaming. These techniques help prevent memory issues when working with substantial amounts of data while maintaining efficient workflow execution.

 

Step 1: Understanding the Challenges with Large JSON in n8n

 

Before diving into solutions, it's important to understand why large JSON files can be problematic in n8n:

  • Memory limitations: n8n has memory constraints based on your server configuration
  • Performance issues: Processing large files can slow down workflow execution
  • Timeout errors: Operations that take too long may time out
  • Data loss: If a process fails due to memory issues, you might lose data

Let's explore various methods to handle large JSON data effectively in n8n.

 

Step 2: Splitting Large JSON Files

 

One effective approach is to split large JSON files into smaller chunks that can be processed individually.

Method 1: Using the Function Node to Split JSON Arrays


// Assumes the previous node delivered the whole array on a single item, in items[0].json.
// If the array is nested under a property (e.g. items[0].json.data), adjust accordingly.
const jsonArray = items[0].json;
const chunkSize = 100; // Adjust based on your needs
const result = [];

// Split the array into chunks. n8n expects each item's `json` to be an object,
// so each chunk is wrapped under an `items` property.
for (let i = 0; i < jsonArray.length; i += chunkSize) {
    const chunk = jsonArray.slice(i, i + chunkSize);
    result.push({ json: { items: chunk } });
}

return result;

Method 2: Using Split In Batches Node

If you have the Split In Batches node (renamed Loop Over Items in recent n8n versions):

  1. Connect your JSON source to the Split In Batches node
  2. Configure the batch size (e.g., 100 items)
  3. Connect your processing nodes to the node's loop output, and route the last of them back into Split In Batches so the next batch is requested
  4. Connect the node's done output to whatever should run once every batch has been processed (a sketch of an in-loop Code node follows below)
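
Inside the loop, downstream nodes only ever see the current batch, which is what keeps memory usage bounded. A minimal sketch of a Code node placed on the loop output (the field names here are illustrative, not from the original example):

// Runs once per batch: `items` holds only the current batch emitted by Split In Batches
return items.map(item => ({
    json: {
        id: item.json.id,
        // Transform just this batch; earlier batches have already moved on
        name: String(item.json.name ?? '').toUpperCase()
    }
}));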

 

Step 3: Implementing Pagination for API Requests

 

When fetching large datasets from APIs, implement pagination instead of retrieving all data at once. Recent versions of the HTTP Request node include a built-in Pagination option; the example below shows the equivalent done manually in a Code node.

Example: Paginated API Requests in a Code Node


// Runs in a Code node, which exposes this.helpers.httpRequest for making HTTP calls.
// The endpoint below (https://api.example.com) is a placeholder that is assumed to
// return { items: [...] } for each page.
const pageSize = 100;
const maxPages = 10; // Safety limit so the loop always terminates
let page = 1;
let hasMoreData = true;
let allData = [];

while (hasMoreData && page <= maxPages) {
    // Request one page at a time instead of the whole dataset
    const response = await this.helpers.httpRequest({
        url: `https://api.example.com/data?page=${page}&limit=${pageSize}`,
        method: 'GET',
        json: true,
    });

    // Add this page's items to our collection
    allData = allData.concat(response.items);

    // A full page usually means there is another page to fetch
    hasMoreData = response.items.length === pageSize;
    page++;
}

// Return everything as a single item (or hand it to the chunking logic from Step 2)
return [{ json: { items: allData } }];

 

Step 4: Using Batch Processing

 

Process large datasets in batches to maintain performance and avoid memory issues.

Method 1: Implementing Manual Batching with Function Node


// Assuming items is an array of data from previous node
const batchSize = 50;
const totalItems = items.length;
const results = [];

// Process in batches
for (let i = 0; i < totalItems; i += batchSize) {
    const batch = items.slice(i, i + batchSize);
    
    // Process each item in the batch
    for (const item of batch) {
        // Perform your operations here
        // For example, transform the data
        const processedItem = {
            id: item.json.id,
            name: item.json.name.toUpperCase(),
            // Other processing...
        };
        
        results.push({json: processedItem});
    }
    
    // Optional: Add a small delay between batches
    // await new Promise(resolve => setTimeout(resolve, 100));
}

return results;

Method 2: Using the Loop Over Items Node

  1. Add a Loop Over Items (Split In Batches) node after your JSON source
  2. Set the batch size to control how many items are handled per iteration
  3. Add your processing logic to the node's loop output and route it back into the node
  4. Consolidate the results from the node's done output (for example with a Code or Merge node), as sketched below
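
Since the per-batch results arrive back as ordinary items, consolidation is just a matter of collecting them. A minimal sketch of such a consolidation Code node, using the Code node's $input.all() helper and assuming the processed items end up as its input (exact wiring depends on your n8n version):

// Collect every item that reached this node (e.g. via the loop's "done" output)
const allResults = $input.all().map(item => item.json);

return [{ json: { results: allResults, count: allResults.length } }];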

 

Step 5: Using External Storage Solutions

 

For extremely large JSON files, consider using external storage services.

Method 1: Using Temporary File Storage


// Function/Code node 1: write the large JSON to a temporary file.
// This uses Node.js built-in modules (fs, path, os); depending on your setup you may
// need to allow them with the NODE_FUNCTION_ALLOW_BUILTIN environment variable.
const fs = require('fs');
const path = require('path');
const os = require('os');

// Create a temporary file path
const tempFilePath = path.join(os.tmpdir(), `large-json-${Date.now()}.json`);

// Write the large JSON to the file
fs.writeFileSync(tempFilePath, JSON.stringify(items[0].json));

// Pass only the file path to the next node, keeping the workflow payload small
return [{
    json: {
        filePath: tempFilePath
    }
}];

// Function/Code node 2 (a separate node later in the workflow): read and process the file
const fs = require('fs');
const filePath = items[0].json.filePath;

// For very large files, prefer a streaming parser (see Step 6) over readFileSync
const data = JSON.parse(fs.readFileSync(filePath, 'utf8'));

// Process the data as needed
// ...

// Clean up the temporary file
fs.unlinkSync(filePath);

return [{json: { result: 'Processing completed' }}];

Method 2: Using Cloud Storage (AWS S3, Google Cloud Storage, etc.)

  1. Upload your large JSON to a cloud storage service
  2. Use the respective n8n nodes (S3, Google Cloud Storage) to access the file
  3. Process the data in manageable chunks (see the sketch after this list)
  4. Write results back to storage if needed
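
As a rough sketch of the chunked-processing step, the Code node below assumes the storage node put the downloaded file into the item's binary property named data and that binary data is kept in memory as base64 (the property name and structure are assumptions; with filesystem binary storage you would read the file from disk instead):

// Parse a JSON file downloaded by a storage node and re-emit it in manageable chunks
const base64 = items[0].binary.data.data;          // assumed binary property name: "data"
const parsed = JSON.parse(Buffer.from(base64, 'base64').toString('utf8'));

const chunkSize = 500; // Adjust to your memory budget
const out = [];
for (let i = 0; i < parsed.length; i += chunkSize) {
    out.push({ json: { items: parsed.slice(i, i + chunkSize) } });
}

return out;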

 

Step 6: Optimizing Memory Usage with Streaming

 

Use streaming techniques to process large JSON files without loading the entire content into memory.

Example: Using a Streaming Approach with Function Node


// Example using a streaming JSON parser, 'stream-json', to read the file object by object
// instead of loading it all into memory.

// Note: the module must be installed in your n8n environment (e.g. in your Docker image)
// and allowed for use in Function/Code nodes via NODE_FUNCTION_ALLOW_EXTERNAL=stream-json.

const StreamArray = require('stream-json/streamers/StreamArray');
const fs = require('fs');

// Assuming we have a file path from a previous node (see Step 5)
const filePath = items[0].json.filePath;

// Create a function that returns a promise for the streaming operation
function processJsonStream() {
    return new Promise((resolve, reject) => {
        const results = [];
        
        // Create a readable stream from the JSON data
        const stream = fs.createReadStream(filePath)
            .pipe(StreamArray.withParser());
        
        // Process each JSON object as it's parsed
        stream.on('data', ({key, value}) => {
            // Process the value (a single object from the JSON array)
            // For example, transform it or filter it
            if (value.someProperty > 100) {
                results.push(value);
            }
        });
        
        stream.on('end', () => {
            resolve(results);
        });
        
        stream.on('error', (err) => {
            reject(err);
        });
    });
}

// Execute the streaming process
const processedData = await processJsonStream();

// Return the filtered results (wrapped in an object, since each item's `json` must be an object)
return [{ json: { items: processedData } }];

 

Step 7: Using JavaScript Code for Advanced Processing

 

Leverage JavaScript in Function nodes for more advanced processing strategies.

Example: Memory-Efficient Processing with Generator Functions


// Using a generator function to process items one at a time
function* processItems(items) {
    for (const item of items) {
        // Do your processing here
        const processed = {
            id: item.id,
        transformedValue: item.value * 2,
            // other transformations...
        };
        
        yield processed;
    }
}

// Get the large JSON array
const largeArray = items[0].json;
const results = [];

// Use the generator to process items without keeping everything in memory
const processor = processItems(largeArray);
let count = 0;
const batchSize = 100;

// Process in batches of 100
for (const processedItem of processor) {
    results.push(processedItem);
    count++;
    
    // If we've reached our batch size, return current results
    if (count >= batchSize) {
        break;
    }
}

// Store the state for the next execution if needed
// This could be implemented with workflow data or external storage

return [{json: {
    processedItems: results,
    processedCount: count,
    totalItems: largeArray.length
}}];

 

Step 8: Managing Workflow Execution with Queue Mode

 

Use n8n's queue mode to prevent timeouts and memory issues with long-running workflows.

Setting Up Queue Mode:

  1. Run n8n in queue mode by setting EXECUTIONS_MODE=queue, pointing it at a Redis instance (for example via QUEUE_BULL_REDIS_HOST), and starting one or more workers with n8n worker
  2. Tune execution settings with variables like EXECUTIONS_TIMEOUT and EXECUTIONS_DATA_PRUNE_MAX_COUNT
  3. For workflows processing large JSON, increase the timeout value to prevent premature termination

 

Step 9: Implementing Error Handling for Large Data Processing

 

Add robust error handling to manage failures when processing large JSON data.


// In a Function/Code node, implement try-catch with resumable state.
// Workflow-level state that survives executions is stored via $getWorkflowStaticData
// (note: static data is only persisted for production executions, not manual test runs).
const staticData = $getWorkflowStaticData('global');
let processedCount = 0;
const inputArray = items[0].json;
const results = [];

try {
    // Get the starting point (in case we're resuming after an error)
    const startIndex = staticData.lastProcessedIndex || 0;
    const batchSize = 100;
    
    // Process a batch of items
    for (let i = startIndex; i < Math.min(startIndex + batchSize, inputArray.length); i++) {
        const item = inputArray[i];
        
        // Your processing logic here
        const processedItem = {
            id: item.id,
            // other transformed properties...
        };
        
        results.push(processedItem);
        processedCount++;
        
        // Save our progress periodically
        if (processedCount % 10 === 0) {
            staticData.lastProcessedIndex = startIndex + processedCount;
        }
    }
    
    // Update the final position
    staticData.lastProcessedIndex = startIndex + processedCount;
    
    // Check if we've finished
    const isComplete = (startIndex + processedCount) >= inputArray.length;
    
    return [{
        json: {
            results,
            processedCount,
            startIndex,
            currentIndex: startIndex + processedCount,
            isComplete,
            totalItems: inputArray.length
        }
    }];
} catch (error) {
    // Keep whatever progress was already saved so a later run can resume
    staticData.lastProcessedIndex = staticData.lastProcessedIndex || 0;
    
    // Return error information instead of failing the whole execution
    return [{
        json: {
            error: error.message,
            lastProcessedIndex: staticData.lastProcessedIndex,
            totalItems: inputArray.length
        }
    }];
}

 

Step 10: Using n8n's Data Transformation Nodes Efficiently

 

Leverage built-in nodes for efficient data transformation of large JSON.

Efficient Use of Item Lists Node:

  1. Use the Item Lists node to filter only necessary data before processing
  2. Apply "Limit" operation to process only a specific number of items at once
  3. Use "Aggregate" to combine results after processing

Optimizing with Set Node:

  1. Use the Set node to keep only needed fields from large JSON objects
  2. Implement the "Keep Only Set" option to discard unnecessary data
  3. Apply transformations directly in the Set node rather than Function nodes when possible (a code-based fallback is sketched below)
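
If you do end up doing the same pruning in code, here is a minimal Code node sketch that keeps only the fields later nodes need (the field names are illustrative):

// Keep only the fields required downstream; everything else is dropped to save memory
return items.map(item => ({
    json: {
        id: item.json.id,
        name: item.json.name
        // add only the properties later nodes actually use
    }
}));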

 

Step 11: Implementing a Complete Workflow for Large JSON Processing

 

Let's put everything together in a comprehensive workflow example.

Example: End-to-End Large JSON Processing Workflow

  1. Start with your JSON source (HTTP Request, Read Binary File, etc.)
  2. Add a Function node to analyze the data size:

// Analyze JSON size and determine processing strategy
const jsonData = items[0].json;
const dataSize = JSON.stringify(jsonData).length;
const itemCount = Array.isArray(jsonData) ? jsonData.length : 1;

// Determine processing strategy based on size
let strategy = 'direct';
if (dataSize > 10 * 1024 * 1024) { // 10MB
    strategy = 'external-storage';
} else if (itemCount > 1000) {
    strategy = 'batch-processing';
} else {
    strategy = 'direct';
}

return [{
    json: {
        originalData: jsonData,
        metadata: {
            strategy,
            dataSize,
            itemCount
        }
    }
}];
  3. Add an IF node to route based on the determined strategy
  4. For the batch processing path, add a Function node:

// Batch processing implementation
const data = items[0].json.originalData;
const batchSize = 100;
const results = [];

// Process in batches
for (let i = 0; i < data.length; i += batchSize) {
    const batch = data.slice(i, i + batchSize);
    
    // Process each batch item
    const processedBatch = batch.map(item => {
        // Your transformation logic here
        return {
            id: item.id,
            name: item.name,
            // Other processed properties...
            processedAt: new Date().toISOString()
        };
    });
    
    results.push(...processedBatch);
}

// Return one n8n item per processed record (each item's `json` must be an object)
return results.map(item => ({ json: item }));
  5. For the external storage path, add nodes to:
  • Save data to temporary storage or cloud storage
  • Process in chunks from storage
  • Clean up after processing
  6. Join the paths back together with a Merge node
  7. Add final processing and output nodes

 

Step 12: Monitoring and Debugging Large JSON Processing

 

Set up monitoring and debugging to track large JSON processing:

Method 1: Using Console Logs for Progress Tracking


// Add to your Function nodes to track progress
const data = items[0].json;
const totalItems = data.length;
let processedCount = 0;
const results = [];

// Process items with progress tracking
for (const item of data) {
    // Process the item
    const processed = {
        id: item.id,
        // other processing...
    };
    
    results.push(processed);
    processedCount++;
    
    // Log progress every 100 items
    if (processedCount % 100 === 0 || processedCount === totalItems) {
        console.log(`Processed ${processedCount}/${totalItems} items (${Math.round((processedCount/totalItems)*100)}%)`);
    }
}

// Emit one n8n item per processed record
return results.map(item => ({ json: item }));

Method 2: Implementing Custom Progress Storage

  1. Create a workflow data store using $getWorkflowStaticData('global'), n8n's mechanism for persisting data across executions (see the sketch after this list)
  2. Update progress information as items are processed
  3. Add nodes to notify via Slack, email, or other channels when processing is complete
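
A minimal sketch of the first two steps, using $getWorkflowStaticData to persist progress between executions (the progress fields are illustrative; static data is only saved for production executions, not manual test runs):

// Persist simple progress counters in the workflow's static data
const progress = $getWorkflowStaticData('global');

progress.processedCount = (progress.processedCount || 0) + items.length;
progress.lastRunAt = new Date().toISOString();

return [{ json: { processedSoFar: progress.processedCount } }];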

 

Step 13: Optimizing Performance for Large JSON Processing

 

Fine-tune your workflow for better performance:

  • Use the most efficient data structures for your specific task
  • Consider using Map and Set objects for faster lookups in large datasets
  • Implement indexing for frequently accessed data
  • Pre-filter data as early as possible in the workflow
  • Use asynchronous processing where appropriate

Example: Optimized Data Lookup Implementation


// Using Map for efficient lookups
const largeData = items[0].json;

// Create an index for faster lookups
const idToDataMap = new Map();
for (const item of largeData) {
    idToDataMap.set(item.id, item);
}

// Now lookups are O(1) instead of O(n)
function findById(id) {
    return idToDataMap.get(id);
}

// Process data that requires lookups
// (assumes the IDs to look up arrive as a second input item, e.g. after a Merge node)
const lookupIds = items[1].json.requiredIds;
const results = lookupIds.map(id => {
    const item = findById(id);
    return item ? {
        id: item.id,
        name: item.name,
        found: true
    } : {
        id,
        found: false
    };
});

// Emit one n8n item per lookup result
return results.map(item => ({ json: item }));

 

Step 14: Handling Nested JSON Structures

 

Process complex nested JSON structures efficiently:


// Function to flatten deeply nested JSON for easier processing
function flattenJSON(data, prefix = '', result = {}) {
    if (typeof data !== 'object' || data === null) {
        result[prefix] = data;
        return result;
    }
    
    if (Array.isArray(data)) {
        data.forEach((item, index) => {
            const newPrefix = prefix ? `${prefix}.${index}` : `${index}`;
            flattenJSON(item, newPrefix, result);
        });
        return result;
    }
    
    Object.keys(data).forEach(key => {
        const newPrefix = prefix ? `${prefix}.${key}` : key;
        flattenJSON(data[key], newPrefix, result);
    });
    
    return result;
}

// Example usage on large nested JSON
const nestedData = items[0].json;
const flattenedData = flattenJSON(nestedData);

// Now you can work with a flat structure
const processedData = {};
for (const [key, value] of Object.entries(flattenedData)) {
    // Process each field
    if (typeof value === 'string') {
        processedData[key] = value.toUpperCase();
    } else {
        processedData[key] = value;
    }
}

return [{json: processedData}];

 

Step 15: Summary and Best Practices

 

To effectively handle large JSON in n8n, remember these key principles:

  • Split large JSON into smaller, manageable chunks
  • Use pagination for API requests rather than fetching all data at once
  • Process data in batches to avoid memory issues
  • Leverage external storage for very large datasets
  • Implement streaming techniques when possible
  • Use queue mode for long-running workflows
  • Implement robust error handling and state tracking
  • Monitor progress and implement debugging mechanisms
  • Optimize data structures and algorithms for your specific use case
  • Pre-filter and transform data as early as possible in the workflow

By following these approaches, you can process large JSON data efficiently in n8n without encountering memory limitations or performance issues, allowing your workflows to handle substantial amounts of data reliably.
