Learn how to monitor success rates of language model calls in n8n with error handling, logging, alerts, dashboards, and detailed analysis for optimal performance.
To monitor the success rates of language model calls in n8n, you can implement a tracking system using n8n's built-in functionality along with external monitoring tools. This involves tracking API responses, logging failures, calculating success metrics, and visualizing the data for analysis. By following this guide, you'll be able to set up a comprehensive monitoring system that helps you understand your language model's performance and address any issues promptly.
Step 1: Set Up Basic Error Handling in n8n Workflows
The first step is to implement proper error handling in your n8n workflows that make language model API calls. This allows you to capture when calls fail and why.
// Example of implementing error handling in Function node
const errorHandler = (error, itemIndex) => {
  // Log the error
  console.error(`Error in LLM call for item ${itemIndex}:`, error.message);
  // Return a standardized error object
  return {
    success: false,
    error: error.message,
    timestamp: new Date().toISOString(),
    itemIndex: itemIndex
  };
};
// In a Function node placed after the HTTP Request node that calls the language model
// (enable "Continue On Fail" on the HTTP Request node so failed calls still reach this node)
const itemIndex = 0; // index of the current item; loop over items if you process several
try {
  // Read the response returned for this item
  const response = $items("HTTP Request")[itemIndex].json;
  if (response.error) {
    throw new Error(typeof response.error === "string" ? response.error : JSON.stringify(response.error));
  }
  return {
    success: true,
    data: response,
    timestamp: new Date().toISOString(),
    itemIndex: itemIndex
  };
} catch (error) {
  return errorHandler(error, itemIndex);
}
Step 2: Create a Database for Storing Call Results
Set up a database to store the results of your language model calls. This will allow you to analyze the success rates over time.
You can use n8n's PostgreSQL, MySQL, or MongoDB nodes to connect to your database.
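If you go with PostgreSQL, a minimal sketch of a table covering the fields logged in the next step could look like this. The table and column names are suggestions, not a required schema; you can run the statement once through the Postgres node's Execute Query operation:
// In a Function node feeding a Postgres node set to "Execute Query"
// (llm_call_log and its columns are suggested names - adjust them to your setup)
const createTableSql = `
  CREATE TABLE IF NOT EXISTS llm_call_log (
    id               SERIAL PRIMARY KEY,
    timestamp        TIMESTAMPTZ NOT NULL,
    success          BOOLEAN NOT NULL,
    model_name       TEXT,
    error_message    TEXT,
    response_time_ms INTEGER,
    workflow_id      TEXT,
    execution_id     TEXT
  );
`;
return [{ json: { query: createTableSql } }];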
Step 3: Record Each Language Model Call
After each language model call, record the results in your database:
// In a Function node after your LLM API call
const startTime = new Date();
// Read the result of the LLM call (here taken from an HTTP Request node;
// if you need a precise duration, measure it around the node that makes the call)
const llmResponse = $node["HTTP Request"].json;
const endTime = new Date();
const responseTime = endTime - startTime;
// Prepare data for the database
const logData = {
  timestamp: new Date().toISOString(),
  success: !llmResponse.error,
  model_name: $json.modelName,
  error_message: llmResponse.error || null,
  response_time_ms: responseTime,
  workflow_id: $workflow.id,
  execution_id: $execution.id
};
return { logData };
// Then use a database node to insert this data
Step 4: Create Success Rate Calculation Workflow
Create a separate workflow that calculates success rates at regular intervals:
// In a Function node after retrieving data from your database
function calculateSuccessRate(records, timeframe) {
  const totalCalls = records.length;
  const successfulCalls = records.filter(record => record.success === true).length;
  return {
    timeframe,
    total_calls: totalCalls,
    successful_calls: successfulCalls,
    success_rate: totalCalls > 0 ? (successfulCalls / totalCalls) * 100 : 0,
    average_response_time: totalCalls > 0
      ? records.reduce((sum, record) => sum + record.response_time_ms, 0) / totalCalls
      : 0
  };
}
// Each database row arrives as one input item
const records = items.map(item => item.json);
// Calculate for different time periods
const last24Hours = records.filter(r =>
  new Date(r.timestamp) > new Date(Date.now() - 24 * 60 * 60 * 1000)
);
const last7Days = records.filter(r =>
  new Date(r.timestamp) > new Date(Date.now() - 7 * 24 * 60 * 60 * 1000)
);
return [
  calculateSuccessRate(last24Hours, "24 hours"),
  calculateSuccessRate(last7Days, "7 days")
];
Step 5: Set Up Alerting for Low Success Rates
Implement an alerting system that notifies you when success rates drop below a certain threshold:
// In a Function node after calculating success rates
const threshold = 95; // 95% success rate threshold
if ($json.success_rate < threshold) {
  return {
    alert: true,
    message: `Language model success rate has dropped to ${$json.success_rate.toFixed(2)}% in the last ${$json.timeframe}`,
    severity: $json.success_rate < 90 ? "high" : "medium",
    timestamp: new Date().toISOString()
  };
} else {
  return {
    alert: false
  };
}
Then use n8n's Slack, Email, or other notification nodes to send alerts when needed.
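For example, a Function node in front of the Slack node could turn the alert object into a message text. This is a minimal sketch; the field names match the alert object produced in the snippet above:
// In a Function node before a Slack (or Email) node
// Only forward items that actually raised an alert
const alerts = items.filter(item => item.json.alert === true);
return alerts.map(item => ({
  json: {
    // The Slack node can reference this field as the message text
    text: `:warning: [${item.json.severity.toUpperCase()}] ${item.json.message} (at ${item.json.timestamp})`
  }
}));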
Step 6: Create a Visual Dashboard
Set up a visual dashboard to monitor your success rates. You can use an external tool such as Grafana, or a simple web page fed by an n8n webhook (see Step 12).
Example of sending data to Grafana:
// In a Function node to format data for Grafana
// (the target/datapoints shape below follows the series format used by
// Grafana's SimpleJSON-style datasources; adapt it to how your dashboard ingests data)
const grafanaData = items.map(item => ({
  targets: [
    {
      target: "llm.success_rate",
      datapoints: [
        [item.json.success_rate, new Date().getTime()]
      ]
    },
    {
      target: "llm.response_time",
      datapoints: [
        [item.json.average_response_time, new Date().getTime()]
      ]
    }
  ]
}));
return { grafanaData };
Step 7: Implement Detailed Error Analysis
Create a workflow that analyzes errors to identify patterns:
// In a Function node after retrieving error data
function analyzeErrors(errorRecords) {
  // Group errors by type
  const errorTypes = {};
  errorRecords.forEach(record => {
    const errorMsg = record.error_message;
    errorTypes[errorMsg] = (errorTypes[errorMsg] || 0) + 1;
  });
  // Sort by frequency
  const sortedErrors = Object.entries(errorTypes)
    .sort((a, b) => b[1] - a[1])
    .map(([error, count]) => ({ error, count }));
  return {
    total_errors: errorRecords.length,
    error_breakdown: sortedErrors,
    most_common_error: sortedErrors.length > 0 ? sortedErrors[0] : null
  };
}
const failedCalls = items.filter(item => item.json.success === false).map(item => item.json);
return analyzeErrors(failedCalls);
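Raw error messages often differ only in details such as request IDs, so grouping on the exact string can fragment the counts. A minimal sketch of normalizing messages before grouping, intended to run in the same Function node as analyzeErrors above; the patterns are assumptions, so extend them for your provider's error formats:
// Collapse variable parts of error messages so similar errors group together
function normalizeError(message) {
  if (!message) return "unknown error";
  return message
    .replace(/[0-9a-f]{8}-[0-9a-f-]{27,}/gi, "<id>") // UUID-style request IDs
    .replace(/\d+(\.\d+)? ?(ms|s|seconds?)/gi, "<duration>") // timings
    .replace(/\d{3,}/g, "<num>"); // long numbers (status payloads, token counts)
}
// Pre-process the failed records before passing them to analyzeErrors()
const normalizedFailures = items
  .filter(item => item.json.success === false)
  .map(item => ({ ...item.json, error_message: normalizeError(item.json.error_message) }));
return analyzeErrors(normalizedFailures);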
Step 8: Track Success Rates by Model and Prompt Type
To get more granular insights, track success rates separately for each model and prompt type. The example below segments by model; a prompt-type variant follows it:
// In a Function node for segmented analysis
function calculateSegmentedSuccessRates(records) {
  // Group by model
  const modelGroups = {};
  records.forEach(record => {
    if (!modelGroups[record.model_name]) {
      modelGroups[record.model_name] = [];
    }
    modelGroups[record.model_name].push(record);
  });
  // Calculate success rates for each model
  const modelStats = {};
  for (const [model, modelRecords] of Object.entries(modelGroups)) {
    const total = modelRecords.length;
    const successful = modelRecords.filter(r => r.success).length;
    modelStats[model] = {
      total_calls: total,
      success_rate: (successful / total) * 100,
      avg_response_time: modelRecords.reduce((sum, r) => sum + r.response_time_ms, 0) / total
    };
  }
  return modelStats;
}
return calculateSegmentedSuccessRates(items.map(item => item.json));
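The same pattern works for prompt types, assuming you also log a prompt_type field with each call. That field is not part of the log data shown earlier, so treat this as a sketch:
// Group by prompt type instead of model (assumes a logged prompt_type field)
function calculateSuccessRatesByPromptType(records) {
  const stats = {};
  records.forEach(record => {
    const key = record.prompt_type || "unknown";
    if (!stats[key]) {
      stats[key] = { total_calls: 0, successful_calls: 0 };
    }
    stats[key].total_calls += 1;
    if (record.success) stats[key].successful_calls += 1;
  });
  // Derive success rates once counting is done
  for (const key of Object.keys(stats)) {
    stats[key].success_rate = (stats[key].successful_calls / stats[key].total_calls) * 100;
  }
  return stats;
}
return calculateSuccessRatesByPromptType(items.map(item => item.json));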
Step 9: Monitor Cost and Usage Together with Success Rates
Track the cost implications of your language model usage alongside success rates:
// In a Function node to calculate costs
function calculateCosts(records, costPerSuccessfulCall, costPerFailedCall) {
  const successfulCalls = records.filter(r => r.success).length;
  const failedCalls = records.length - successfulCalls;
  const successCost = successfulCalls * costPerSuccessfulCall;
  const failureCost = failedCalls * costPerFailedCall;
  return {
    total_cost: successCost + failureCost,
    success_cost: successCost,
    failure_cost: failureCost,
    cost_per_call: (successCost + failureCost) / records.length
  };
}
// Adjust these values based on your LLM provider's pricing
const costPerSuccessfulCall = 0.02; // $0.02 per successful call
const costPerFailedCall = 0.005; // $0.005 per failed call (some providers charge less for failures)
return calculateCosts(items.map(item => item.json), costPerSuccessfulCall, costPerFailedCall);
Step 10: Implement Scheduled Monitoring Reports
Set up a scheduled workflow to generate and send regular monitoring reports:
// In a Function node to generate a report
function generateReport(successRates, errorAnalysis, costData) {
  const now = new Date();
  const last24h = successRates.find(r => r.timeframe === "24 hours");
  return {
    report_title: `LLM Performance Report - ${now.toISOString().split('T')[0]}`,
    report_timestamp: now.toISOString(),
    summary: {
      overall_success_rate: last24h.success_rate,
      total_calls_24h: last24h.total_calls,
      avg_response_time_ms: last24h.average_response_time,
      total_cost_24h: costData.total_cost
    },
    success_rates: successRates,
    error_analysis: errorAnalysis,
    cost_analysis: costData,
    recommendations: generateRecommendations(successRates, errorAnalysis)
  };
}
function generateRecommendations(successRates, errorAnalysis) {
  const recommendations = [];
  if (successRates.find(r => r.timeframe === "24 hours").success_rate < 95) {
    recommendations.push("Investigate recent failures - success rate is below 95%");
  }
  if (errorAnalysis.most_common_error) {
    recommendations.push(`Address the most common error: "${errorAnalysis.most_common_error.error}"`);
  }
  return recommendations;
}
return generateReport($node["Success Rates"].json, $node["Error Analysis"].json, $node["Cost Analysis"].json);
Use an Email node to send this report to stakeholders on a daily or weekly basis.
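A small formatting step before the Email node keeps the message readable. Here is a minimal sketch that turns the report object into plain text; the field names follow the report generated above:
// In a Function node between the report generation and the Email node
const report = $json; // the report object produced in the previous node
const lines = [
  report.report_title,
  `Success rate (24h): ${report.summary.overall_success_rate.toFixed(2)}%`,
  `Total calls (24h): ${report.summary.total_calls_24h}`,
  `Avg response time: ${Math.round(report.summary.avg_response_time_ms)} ms`,
  `Cost (24h): $${report.summary.total_cost_24h.toFixed(2)}`,
  "",
  "Recommendations:",
  ...report.recommendations.map(r => `- ${r}`)
];
return [{ json: { subject: report.report_title, text: lines.join("\n") } }];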
Step 11: Implement A/B Testing for Different Prompts
Set up A/B testing to compare success rates between different prompt formats:
// In a Function node for A/B testing
function setupABTest(item, testId) {
  // Randomly assign variant A or B
  const variant = Math.random() < 0.5 ? 'A' : 'B';
  // Define different prompt templates for each variant
  const promptTemplates = {
    'A': `Standard prompt: ${item.json.promptBase}`,
    'B': `Enhanced prompt with examples: ${item.json.promptBase}\n\nFor example:\n${item.json.examples}`
  };
  return {
    test_id: testId,
    variant,
    prompt: promptTemplates[variant],
    original_item: item.json
  };
}
// Assign each item to a test group
return items.map(item => setupABTest(item, 'prompt-formatting-test'));
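Once the variant is stored alongside each logged call, comparing the two groups reuses the success-rate logic from Step 4. A minimal sketch, assuming a variant field was saved with each record:
// Compare success rates between A/B variants (assumes records carry a "variant" field)
function compareVariants(records) {
  const byVariant = { A: [], B: [] };
  records.forEach(record => {
    if (byVariant[record.variant]) byVariant[record.variant].push(record);
  });
  const rate = group =>
    group.length > 0 ? (group.filter(r => r.success).length / group.length) * 100 : 0;
  return {
    variant_a: { calls: byVariant.A.length, success_rate: rate(byVariant.A) },
    variant_b: { calls: byVariant.B.length, success_rate: rate(byVariant.B) },
    difference: rate(byVariant.A) - rate(byVariant.B)
  };
}
return [{ json: compareVariants(items.map(item => item.json)) }];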
Step 12: Create a Real-time Monitoring Dashboard with n8n
Build a real-time dashboard using n8n's webhook functionality:
// In a Function node to prepare dashboard data
function prepareDashboardData(successRates, currentStatus) {
  return {
    dashboard: {
      current_status: {
        status: currentStatus.success_rate > 98 ? "healthy" : "degraded",
        success_rate_current: currentStatus.success_rate,
        response_time_current: currentStatus.average_response_time
      },
      historical: {
        daily: successRates.filter(r => r.timeframe === "24 hours"),
        weekly: successRates.filter(r => r.timeframe === "7 days")
      },
      last_updated: new Date().toISOString()
    }
  };
}
// Use this with a Set node to update a static value
// Then create a webhook endpoint to expose this data
Then create a simple HTML dashboard that fetches this data via the webhook.
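On the page side, a small script that polls the webhook is enough to get started. This is a sketch; the URL and element IDs are placeholders, and the field names follow the dashboard object prepared above:
// Client-side script for the dashboard page (URL and element IDs are placeholders)
const WEBHOOK_URL = "https://your-n8n-instance/webhook/llm-dashboard";
async function refreshDashboard() {
  const response = await fetch(WEBHOOK_URL);
  const { dashboard } = await response.json();
  document.getElementById("status").textContent = dashboard.current_status.status;
  document.getElementById("success-rate").textContent =
    `${dashboard.current_status.success_rate_current.toFixed(2)}%`;
}
// Refresh every 60 seconds
refreshDashboard();
setInterval(refreshDashboard, 60 * 1000);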
Step 13: Implement Automatic Retries for Failed Calls
Set up automatic retries when language model calls fail:
// In a Function node
const maxRetries = 3;
const retryDelay = 1000; // ms
async function callLLMWithRetry(prompt, modelName, retryCount = 0) {
  try {
    // Your LLM API call
    const response = await makeAPICall(prompt, modelName);
    return {
      success: true,
      data: response,
      retries_needed: retryCount
    };
  } catch (error) {
    if (retryCount < maxRetries) {
      // Wait before retrying (simple linear backoff)
      await new Promise(resolve => setTimeout(resolve, retryDelay * (retryCount + 1)));
      return callLLMWithRetry(prompt, modelName, retryCount + 1);
    } else {
      return {
        success: false,
        error: error.message,
        retries_attempted: retryCount
      };
    }
  }
}
// Helper function for making the actual API call
// Implementation depends on your LLM provider; example for OpenAI using n8n's
// built-in HTTP request helper (available in the Code node - in older Function
// nodes you may need to route the call through an HTTP Request node instead).
// How you supply the API key also depends on your n8n setup, e.g. an environment variable.
const makeAPICall = async (prompt, modelName) => {
  const response = await this.helpers.httpRequest({
    method: "POST",
    url: "https://api.openai.com/v1/chat/completions",
    headers: {
      "Authorization": `Bearer ${$credentials.openAiApi.apiKey}`,
      "Content-Type": "application/json"
    },
    body: {
      model: modelName,
      messages: [{ role: "user", content: prompt }]
    },
    json: true
  });
  // The request helper returns the parsed response body
  return response;
};
return await callLLMWithRetry($json.prompt, $json.modelName);
Step 14: Implement Performance Benchmarking
Create a workflow that periodically runs benchmark tests against your language models:
// In a Function node to set up benchmark tests
function createBenchmarkTests() {
  const standardPrompts = [
    { id: "simple_qa", prompt: "What is the capital of France?", expected_type: "factual" },
    { id: "code_generation", prompt: "Write a function to calculate fibonacci numbers", expected_type: "code" },
    { id: "creative", prompt: "Write a short poem about artificial intelligence", expected_type: "creative" }
  ];
  const models = [
    "gpt-3.5-turbo",
    "gpt-4",
    "claude-2"
    // Add other models you use
  ];
  const benchmarkTests = [];
  // Create all combinations of prompts and models
  for (const prompt of standardPrompts) {
    for (const model of models) {
      benchmarkTests.push({
        test_id: `${prompt.id}_${model}`,
        prompt: prompt.prompt,
        model: model,
        expected_type: prompt.expected_type,
        timestamp: new Date().toISOString()
      });
    }
  }
  return benchmarkTests;
}
return createBenchmarkTests();
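After running each benchmark prompt through your model, a lightweight check can decide whether the response looks usable. This is a sketch with simple heuristics; the checks and the response_text field are assumptions, so adjust them to what you consider a pass:
// Score benchmark responses with simple per-type heuristics
function checkBenchmarkResponse(test, responseText) {
  const text = (responseText || "").trim();
  let passed = false;
  if (test.expected_type === "factual") {
    passed = text.length > 0 && text.length < 500; // short, non-empty answer
  } else if (test.expected_type === "code") {
    passed = /function|def |=>/.test(text); // contains something code-like
  } else if (test.expected_type === "creative") {
    passed = text.split(/\s+/).length >= 10; // at least a handful of words
  }
  return {
    test_id: test.test_id,
    model: test.model,
    passed,
    response_length: text.length,
    checked_at: new Date().toISOString()
  };
}
// Assumes each input item carries the benchmark test plus the model's response text
return items.map(item => ({
  json: checkBenchmarkResponse(item.json, item.json.response_text)
}));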
Step 15: Create a Comprehensive Monitoring System
Finally, tie all the previous components together into a comprehensive monitoring system:
Master workflow example:
// In a Function node at the start of your master monitoring workflow
const monitoringTasks = [
  {
    task: "collect_recent_data",
    description: "Retrieve recent LLM call data from database",
    next: "calculate_success_rates"
  },
  {
    task: "calculate_success_rates",
    description: "Calculate success rates for different time periods",
    next: "analyze_errors"
  },
  {
    task: "analyze_errors",
    description: "Analyze error patterns",
    next: "generate_cost_analysis"
  },
  {
    task: "generate_cost_analysis",
    description: "Calculate cost metrics",
    next: "run_benchmarks"
  },
  {
    task: "run_benchmarks",
    description: "Run standard benchmark tests",
    next: "generate_report"
  },
  {
    task: "generate_report",
    description: "Generate comprehensive monitoring report",
    next: "update_dashboard"
  },
  {
    task: "update_dashboard",
    description: "Update real-time monitoring dashboard",
    next: "send_notifications"
  },
  {
    task: "send_notifications",
    description: "Send alerts for any issues detected",
    next: null
  }
];
return { monitoringTasks };
By implementing this comprehensive monitoring system, you'll gain full visibility into the success rates of your language model calls in n8n, enabling you to quickly identify and address any issues that arise.