How to monitor success rates of language model calls in n8n?

Learn how to monitor success rates of language model calls in n8n with error handling, logging, alerts, dashboards, and detailed analysis for optimal performance.

Matt Graham, CEO of Rapid Developers

To monitor the success rates of language model calls in n8n, you can implement a tracking system using n8n's built-in functionality along with external monitoring tools. This involves tracking API responses, logging failures, calculating success metrics, and visualizing the data for analysis. By following this guide, you'll be able to set up a comprehensive monitoring system that helps you understand your language model's performance and address any issues promptly.

 

Step 1: Set Up Basic Error Handling in n8n Workflows

 

The first step is to implement proper error handling in your n8n workflows that make language model API calls. This allows you to capture when calls fail and why.


// Example of implementing error handling in Function node
const errorHandler = (error, itemIndex) => {
  // Log the error
  console.error(`Error in LLM call for item ${itemIndex}:`, error.message);
  
  // Return a standardized error object
  return {
    success: false,
    error: error.message,
    timestamp: new Date().toISOString(),
    itemIndex: itemIndex
  };
};

// In a Function node placed after the HTTP Request node that calls the language model
const itemIndex = 0; // index of the item being processed

try {
  // Read the language model response returned by the HTTP Request node
  const response = $items("HTTP Request")[itemIndex].json;
  return [{
    json: {
      success: true,
      data: response,
      timestamp: new Date().toISOString(),
      itemIndex: itemIndex
    }
  }];
} catch (error) {
  return [{ json: errorHandler(error, itemIndex) }];
}

 

Step 2: Create a Database for Storing Call Results

 

Set up a database to store the results of your language model calls. This will allow you to analyze the success rates over time.

  • Create a database table with the following fields:
    • id (primary key)
    • timestamp
    • success (boolean)
    • model_name
    • error_message (if applicable)
    • response_time_ms
    • workflow_id
    • execution_id

You can use n8n's PostgreSQL, MySQL, or MongoDB nodes to connect to your database.
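
As a minimal sketch, assuming a PostgreSQL database and a placeholder table name of llm_call_log, a Function node could hand the one-time DDL to a downstream Postgres node (or you can simply run the statement against the database yourself):


// Minimal sketch, assuming PostgreSQL and a placeholder table name "llm_call_log"
const createTableSql = `
  CREATE TABLE IF NOT EXISTS llm_call_log (
    id               SERIAL PRIMARY KEY,
    timestamp        TIMESTAMPTZ NOT NULL,
    success          BOOLEAN NOT NULL,
    model_name       TEXT,
    error_message    TEXT,
    response_time_ms INTEGER,
    workflow_id      TEXT,
    execution_id     TEXT
  );
`;

// Return the statement as an item so a downstream database node can reference it as an expression
return [{ json: { query: createTableSql } }];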

 

Step 3: Record Each Language Model Call

 

After each language model call, record the results in your database:


// In a Function node after your LLM API call
// Note: to measure latency accurately, capture startTime immediately before the
// actual API call; if the call happens in a separate HTTP Request node, record
// the duration there and pass it through instead.
const startTime = new Date();

// Read the LLM response from the previous HTTP Request node
const llmResponse = $node["HTTP Request"].json;

const endTime = new Date();
const responseTime = endTime - startTime;

// Prepare data for database
const logData = {
  timestamp: new Date().toISOString(),
  success: llmResponse.error ? false : true,
  model_name: $json.modelName,
  error_message: llmResponse.error || null,
  response_time_ms: responseTime,
  workflow_id: $workflow.id,
  execution_id: $execution.id
};

return [{ json: logData }];

// Then use a database node to insert this data

 

Step 4: Create Success Rate Calculation Workflow

 

Create a separate workflow that calculates success rates at regular intervals:


// In a Function node after retrieving data from your database
function calculateSuccessRate(records, timeframe) {
  const totalCalls = records.length;
  const successfulCalls = records.filter(record => record.success === true).length;
  
  return {
    timeframe,
    total_calls: totalCalls,
    successful_calls: successfulCalls,
    success_rate: totalCalls > 0 ? (successfulCalls / totalCalls) * 100 : 0,
    average_response_time: totalCalls > 0
      ? records.reduce((sum, record) => sum + record.response_time_ms, 0) / totalCalls
      : 0
  };
}

// Assumes the database node returned one item whose json contains a "records" array
const records = items[0].json.records;

// Calculate for different time periods
const last24Hours = records.filter(r =>
  new Date(r.timestamp) > new Date(Date.now() - 24 * 60 * 60 * 1000)
);

const last7Days = records.filter(r =>
  new Date(r.timestamp) > new Date(Date.now() - 7 * 24 * 60 * 60 * 1000)
);

// Return one item per timeframe
return [
  { json: calculateSuccessRate(last24Hours, "24 hours") },
  { json: calculateSuccessRate(last7Days, "7 days") }
];

 

Step 5: Set Up Alerting for Low Success Rates

 

Implement an alerting system that notifies you when success rates drop below a certain threshold:


// In a Function node after calculating success rates
const threshold = 95; // 95% success rate threshold

if ($json.success_rate < threshold) {
  return {
    alert: true,
    message: `Language model success rate has dropped to ${$json.success_rate.toFixed(2)}% in the last ${$json.timeframe}`,
    severity: $json.success_rate < 90 ? "high" : "medium",
    timestamp: new Date().toISOString()
  };
} else {
  return {
    alert: false
  };
}

Then use n8n's Slack, Email, or other notification nodes to send alerts when needed.
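
For example, a Function node placed just before the Slack node could shape the alert into a message; the channel name and message layout below are assumptions, so adjust them to your workspace:


// Hypothetical sketch: format the alert for a downstream Slack (or Email) node
if ($json.alert) {
  return [{
    json: {
      channel: "#llm-monitoring", // placeholder channel
      text: `:warning: ${$json.message} (severity: ${$json.severity}, at ${$json.timestamp})`
    }
  }];
}

// No alert: return no items so the notification branch stays idle
return [];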

 

Step 6: Create a Visual Dashboard

 

Set up a visual dashboard to monitor your success rates. You can use:

  • n8n's integration with Grafana
  • Export data to a BI tool like Metabase or Power BI
  • Use n8n's HTTP Request node to send data to a dashboard service

Example of sending data to Grafana:


// In a Function node to format data for Grafana
// (the exact payload shape depends on the Grafana datasource you use)
const grafanaData = items.map(item => ({
  targets: [
    {
      target: "llm.success_rate",
      datapoints: [
        [item.json.success_rate, new Date().getTime()]
      ]
    },
    {
      target: "llm.response_time",
      datapoints: [
        [item.json.average_response_time, new Date().getTime()]
      ]
    }
  ]
}));

return [{ json: { grafanaData } }];

 

Step 7: Implement Detailed Error Analysis

 

Create a workflow that analyzes errors to identify patterns:


// In a Function node after retrieving error data
function analyzeErrors(errorRecords) {
  // Group errors by type
  const errorTypes = {};
  
  errorRecords.forEach(record => {
    const errorMsg = record.error_message;
    errorTypes[errorMsg] = (errorTypes[errorMsg] || 0) + 1;
  });
  
  // Sort by frequency
  const sortedErrors = Object.entries(errorTypes)
    .sort((a, b) => b[1] - a[1])
    .map(([error, count]) => ({ error, count }));
  
  return {
    total_errors: errorRecords.length,
    error_breakdown: sortedErrors,
    most_common_error: sortedErrors.length > 0 ? sortedErrors[0] : null
  };
}

const failedCalls = items
  .filter(item => item.json.success === false)
  .map(item => item.json);

return [{ json: analyzeErrors(failedCalls) }];

 

Step 8: Track Success Rates by Model and Prompt Type

 

To get more granular insights, track success rates by different models and prompt types:


// In a Function node for segmented analysis
function calculateSegmentedSuccessRates(records) {
  // Group by model (the same pattern works for a prompt_type field if you log one)
  const modelGroups = {};
  records.forEach(record => {
    if (!modelGroups[record.model_name]) {
      modelGroups[record.model_name] = [];
    }
    modelGroups[record.model_name].push(record);
  });
  
  // Calculate success rates for each model
  const modelStats = {};
  for (const [model, modelRecords] of Object.entries(modelGroups)) {
    const total = modelRecords.length;
    const successful = modelRecords.filter(r => r.success).length;
    modelStats[model] = {
      total_calls: total,
      success_rate: (successful / total) * 100,
      avg_response_time: modelRecords.reduce((sum, r) => sum + r.response_time_ms, 0) / total
    };
  }
  
  return modelStats;
}

return [{ json: calculateSegmentedSuccessRates(items.map(item => item.json)) }];

 

Step 9: Monitor Cost and Usage Together with Success Rates

 

Track the cost implications of your language model usage alongside success rates:


// In a Function node to calculate costs
function calculateCosts(records, costPerSuccessfulCall, costPerFailedCall) {
  const successfulCalls = records.filter(r => r.success).length;
  const failedCalls = records.length - successfulCalls;
  
  const successCost = successfulCalls * costPerSuccessfulCall;
  const failureCost = failedCalls * costPerFailedCall;
  
  return {
    total_cost: successCost + failureCost,
    success_cost: successCost,
    failure_cost: failureCost,
    cost_per_call: records.length > 0 ? (successCost + failureCost) / records.length : 0
  };
}

// Adjust these values based on your LLM provider's pricing
const costPerSuccessfulCall = 0.02; // $0.02 per successful call
const costPerFailedCall = 0.005; // $0.005 per failed call (some providers charge less for failures)

const records = items.map(item => item.json);
return [{ json: calculateCosts(records, costPerSuccessfulCall, costPerFailedCall) }];

 

Step 10: Implement Scheduled Monitoring Reports

 

Set up a scheduled workflow to generate and send regular monitoring reports:


// In a Function node to generate a report
function generateReport(successRates, errorAnalysis, costData) {
  const now = new Date();
  const last24h = successRates.find(r => r.timeframe === "24 hours");
  
  return {
    report_title: `LLM Performance Report - ${now.toISOString().split('T')[0]}`,
    report_timestamp: now.toISOString(),
    summary: {
      overall_success_rate: last24h.success_rate,
      total_calls_24h: last24h.total_calls,
      avg_response_time_ms: last24h.average_response_time,
      total_cost_24h: costData.total_cost
    },
    success_rates: successRates,
    error_analysis: errorAnalysis,
    cost_analysis: costData,
    recommendations: generateRecommendations(successRates, errorAnalysis)
  };
}

function generateRecommendations(successRates, errorAnalysis) {
  const recommendations = [];
  
  if (successRates.find(r => r.timeframe === "24 hours").success_rate < 95) {
    recommendations.push("Investigate recent failures - success rate is below 95%");
  }
  
  if (errorAnalysis.most_common_error) {
    recommendations.push(`Address the most common error: "${errorAnalysis.most_common_error.error}"`);
  }
  
  return recommendations;
}

// "Success Rates", "Error Analysis", and "Cost Analysis" are the names of the
// nodes that produced the earlier results; adjust them to match your workflow.
const successRates = $items("Success Rates").map(item => item.json);

return [{
  json: generateReport(successRates, $node["Error Analysis"].json, $node["Cost Analysis"].json)
}];

Use an Email node to send this report to stakeholders on a daily or weekly basis.
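
As a rough sketch, a Function node placed before the Email node could flatten the report generated above into a plain-text body (the subject and body field names here are just conventions for this example):


// Minimal sketch: turn the report from the previous node into an email subject and body
const report = $json;

const body = [
  report.report_title,
  "",
  `Success rate (24h): ${report.summary.overall_success_rate.toFixed(2)}%`,
  `Total calls (24h): ${report.summary.total_calls_24h}`,
  `Avg response time: ${Math.round(report.summary.avg_response_time_ms)} ms`,
  `Cost (24h): $${report.summary.total_cost_24h.toFixed(2)}`,
  "",
  "Recommendations:",
  ...report.recommendations.map(r => `- ${r}`)
].join("\n");

return [{ json: { subject: report.report_title, body } }];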

 

Step 11: Implement A/B Testing for Different Prompts

 

Set up A/B testing to compare success rates between different prompt formats:


// In a Function node for A/B testing
function setupABTest(item, testId) {
  // Randomly assign variant A or B
  const variant = Math.random() < 0.5 ? 'A' : 'B';
  
  // Define different prompt templates for each variant
  const promptTemplates = {
    'A': `Standard prompt: ${item.json.promptBase}`,
    'B': `Enhanced prompt with examples: ${item.json.promptBase}\n\nFor example:\n${item.json.examples}`
  };
  
  return {
    test_id: testId,
    variant,
    prompt: promptTemplates[variant],
    original_item: item.json
  };
}

// Assign each item to a test group
return items.map(item => ({ json: setupABTest(item, 'prompt-formatting-test') }));

 

Step 12: Create a Real-time Monitoring Dashboard with n8n

 

Build a real-time dashboard using n8n's webhook functionality:


// In a Function node to prepare dashboard data
function prepareDashboardData(successRates, currentStatus) {
  return {
    dashboard: {
      current_status: {
        status: currentStatus.success_rate > 98 ? "healthy" : "degraded",
        success_rate_current: currentStatus.success_rate,
        response_time_current: currentStatus.average_response_time
      },
      historical: {
        daily: successRates.filter(r => r.timeframe === "24 hours"),
        weekly: successRates.filter(r => r.timeframe === "7 days")
      },
      last_updated: new Date().toISOString()
    }
  };
}

// Use this with a Set node to update a static value
// Then create a webhook endpoint to expose this data

Then create a simple HTML dashboard that fetches this data via the webhook.
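
As a rough sketch, the page could poll the webhook with a small script; the URL below is a placeholder for your Webhook node's production URL, and the element IDs are whatever your HTML uses:


// Hypothetical client-side script for the dashboard page
const WEBHOOK_URL = "https://your-n8n-instance/webhook/llm-dashboard"; // placeholder

async function refreshDashboard() {
  const res = await fetch(WEBHOOK_URL);
  const { dashboard } = await res.json();

  document.getElementById("status").textContent = dashboard.current_status.status;
  document.getElementById("success-rate").textContent =
    `${dashboard.current_status.success_rate_current.toFixed(2)}%`;
}

// Refresh every 30 seconds
setInterval(refreshDashboard, 30000);
refreshDashboard();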

 

Step 13: Implement Automatic Retries for Failed Calls

 

Set up automatic retries when language model calls fail:


// In a Code node (the HTTP helper used below is available there)
const maxRetries = 3;
const retryDelay = 1000; // ms

// Keep a reference to the node's HTTP helper so it can be used inside functions
const helpers = this.helpers;

async function callLLMWithRetry(prompt, modelName, retryCount = 0) {
  try {
    // Your LLM API call
    const response = await makeAPICall(prompt, modelName);
    return {
      success: true,
      data: response,
      retries_needed: retryCount
    };
  } catch (error) {
    if (retryCount < maxRetries) {
      // Wait before retrying (simple linear backoff)
      await new Promise(resolve => setTimeout(resolve, retryDelay * (retryCount + 1)));
      return callLLMWithRetry(prompt, modelName, retryCount + 1);
    } else {
      return {
        success: false,
        error: error.message,
        retries_attempted: retryCount
      };
    }
  }
}

// Helper function for making the actual API call
async function makeAPICall(prompt, modelName) {
  // Implementation depends on your LLM provider
  // Example for OpenAI; the API key is read from an environment variable here
  // because stored n8n credentials are not directly readable from code
  const response = await helpers.httpRequest({
    method: "POST",
    url: "https://api.openai.com/v1/chat/completions",
    headers: {
      "Authorization": `Bearer ${$env.OPENAI_API_KEY}`,
      "Content-Type": "application/json"
    },
    body: {
      model: modelName,
      messages: [{ role: "user", content: prompt }]
    },
    json: true
  });
  
  // httpRequest returns the parsed response body
  return response;
}

// "Run Once for All Items" mode: take the prompt and model from the first input item
const { prompt, modelName } = $input.first().json;
const result = await callLLMWithRetry(prompt, modelName);
return [{ json: result }];

 

Step 14: Implement Performance Benchmarking

 

Create a workflow that periodically runs benchmark tests against your language models:


// In a Function node to set up benchmark tests
function createBenchmarkTests() {
  const standardPrompts = [
    { id: "simple_qa", prompt: "What is the capital of France?", expected_type: "factual" },
    { id: "code_generation", prompt: "Write a function to calculate fibonacci numbers", expected_type: "code" },
    { id: "creative", prompt: "Write a short poem about artificial intelligence", expected\_type: "creative" }
  ];
  
  const models = [
    "gpt-3.5-turbo",
    "gpt-4",
    "claude-2"
    // Add other models you use
  ];
  
  const benchmarkTests = [];
  
  // Create all combinations of prompts and models
  for (const prompt of standardPrompts) {
    for (const model of models) {
      benchmarkTests.push({
        test_id: `${prompt.id}_${model}`,
        prompt: prompt.prompt,
        model: model,
        expected_type: prompt.expected_type,
        timestamp: new Date().toISOString()
      });
    }
  }
  
  return benchmarkTests;
}

// Return one n8n item per benchmark test
return createBenchmarkTests().map(test => ({ json: test }));

 

Step 15: Create a Comprehensive Monitoring System

 

Finally, tie all the previous components together into a comprehensive monitoring system:

  1. Create a master workflow that orchestrates all monitoring activities
  2. Set up a database schema that stores all monitoring data
  3. Implement a dashboard that shows real-time and historical data
  4. Configure alerts for critical failures

Master workflow example:


// In a Function node at the start of your master monitoring workflow
const monitoringTasks = [
  {
    task: "collect_recent_data",
    description: "Retrieve recent LLM call data from database",
    next: "calculate_success_rates"
  },
  {
    task: "calculate_success_rates",
    description: "Calculate success rates for different time periods",
    next: "analyze\_errors" 
  },
  {
    task: "analyze\_errors",
    description: "Analyze error patterns",
    next: "generate_cost_analysis"
  },
  {
    task: "generate_cost_analysis",
    description: "Calculate cost metrics",
    next: "run\_benchmarks"
  },
  {
    task: "run\_benchmarks",
    description: "Run standard benchmark tests",
    next: "generate\_report"
  },
  {
    task: "generate\_report",
    description: "Generate comprehensive monitoring report",
    next: "update\_dashboard"
  },
  {
    task: "update\_dashboard",
    description: "Update real-time monitoring dashboard",
    next: "send\_notifications"
  },
  {
    task: "send\_notifications",
    description: "Send alerts for any issues detected",
    next: null
  }
];

return { monitoringTasks };

By implementing this comprehensive monitoring system, you'll gain full visibility into the success rates of your language model calls in n8n, enabling you to quickly identify and address any issues that arise.
