ETIMEDOUT errors in n8n occur when an LLM API call exceeds the HTTP connection timeout, usually because the model takes too long to generate a response. Fix this by increasing the n8n HTTP timeout via environment variables, reducing prompt complexity, setting lower max_tokens, enabling retry-on-fail, and using the HTTP Request node with custom timeout settings for long-running completions.
Why LLM Calls Time Out with ETIMEDOUT in n8n
When n8n sends a request to an LLM API, it waits for a response within a default HTTP timeout period (typically 300 seconds). If the model takes longer — due to complex prompts, high token counts, server congestion, or network latency — the connection is terminated with an ETIMEDOUT error. This is distinct from API-level timeouts; it happens at the Node.js HTTP client level. Large language models like GPT-4 and Claude can take 30-60+ seconds for complex completions, and during peak usage, response times can spike significantly.
Prerequisites
- A running n8n instance (self-hosted recommended for environment variable access)
- LLM API credentials configured in n8n (OpenAI, Anthropic, etc.)
- A workflow experiencing ETIMEDOUT errors on LLM nodes
- Access to n8n environment variables (self-hosted) or workflow settings (Cloud)
Step-by-step guide
Increase the n8n HTTP Request Timeout
The default HTTP timeout in n8n is 300 seconds (5 minutes). For self-hosted instances, increase this by setting the N8N_HTTP_TIMEOUT environment variable. Set it to 600000 (10 minutes in milliseconds) or higher for workflows that make complex LLM calls. In Docker, add this to your docker-compose.yml environment section. For n8n Cloud, this setting is managed by n8n and cannot be changed directly — use the HTTP Request node with a custom timeout instead.
```yaml
# Docker Compose environment variables
services:
  n8n:
    image: n8nio/n8n:latest
    environment:
      - N8N_HTTP_TIMEOUT=600000
      # Optional: also raise the Node.js HTTP header size limit
      - NODE_OPTIONS=--max-http-header-size=16384
```
Expected result: n8n now waits up to 10 minutes for HTTP responses before timing out.
Reduce Prompt Complexity and Max Tokens
The simplest way to avoid timeouts is to make the LLM respond faster. Reduce the max_tokens or 'Maximum Number of Tokens' setting in your LLM node to limit response length. For OpenAI, set it in the node options. For Anthropic Claude, set max_tokens in the Anthropic Chat Model sub-node. Also simplify your system prompt — remove unnecessary instructions, examples, and formatting requirements. Each additional instruction increases processing time. If your prompt includes few-shot examples, reduce them to 1-2 instead of 5-6.
Expected result: LLM responses complete faster, well within the timeout window.
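As a rough illustration, the Code node sketch below trims few-shot examples and passes a capped token budget downstream. The `examples`, `user_message`, and `max_tokens` field names are assumptions for this example, not n8n conventions; map them to whatever your LLM node actually reads.

```javascript
// Code node (Run Once for Each Item): shrink the prompt payload before the LLM node.
// Field names (examples, user_message, max_tokens) are illustrative only.
const examples = ($json.examples || []).slice(0, 2); // keep at most 2 few-shot examples

return [{
  json: {
    system_prompt: 'You are a concise assistant. Answer briefly in plain text.',
    examples,
    user_message: $json.user_message,
    max_tokens: 1024 // reference this in the LLM node's "Maximum Number of Tokens" option
  }
}];
```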
Use the HTTP Request Node with Custom Timeout
For maximum control over timeout behavior, bypass the dedicated LLM nodes and use the HTTP Request node to call the API directly. The HTTP Request node has an explicit timeout setting in its Options. Set it to the maximum you need. This approach also lets you set custom headers, handle streaming responses, and implement more granular error handling. Configure the request body to match the API's expected format.
```
// HTTP Request node configuration for OpenAI
// Method: POST
// URL: https://api.openai.com/v1/chat/completions
// Authentication: Header Auth
//   Name: Authorization
//   Value: Bearer {{ $credentials.openAiApi.apiKey }}
// Body (JSON):
{
  "model": "gpt-4o",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "{{ $json.user_message }}"
    }
  ],
  "max_tokens": 2048,
  "temperature": 0.7
}
// Options → Timeout: 120000 (120 seconds)
```
Expected result: LLM API calls use a custom timeout that you control per node, independent of n8n's global timeout.
Add Retry Logic for Timeout Failures
Timeouts are often transient — the same request may succeed on a second attempt if the API server is less loaded. On your LLM node or HTTP Request node, go to Settings and enable 'Retry On Fail'. Set Max Tries to 3 and Wait Between Tries to 5000 ms. This gives the API a chance to recover between attempts. For production workflows, also consider adding an exponential backoff pattern using a Code node and a Loop.
```javascript
// Code node for exponential backoff retry (advanced)
// Use this inside a loop workflow for custom retry logic

const maxRetries = 3;
const baseDelay = 2000; // 2 seconds
const currentRetry = $json.retry_count || 0;

if ($json.status === 'timeout_error' && currentRetry < maxRetries) {
  const delay = baseDelay * Math.pow(2, currentRetry);
  // Use the Wait node after this Code node with the calculated delay
  return [{
    json: {
      ...($json),
      retry_count: currentRetry + 1,
      wait_ms: delay,
      should_retry: true
    }
  }];
}

return [{
  json: {
    ...($json),
    should_retry: false,
    final_status: $json.status === 'timeout_error' ? 'failed_after_retries' : $json.status
  }
}];
```
Expected result: Transient timeout errors are automatically retried with increasing delays between attempts.
Switch to a Faster Model for Time-Sensitive Workflows
If your workflow is triggered by webhooks and needs to respond within a tight window, consider using a faster model. GPT-4o-mini is significantly faster than GPT-4o. Claude 3.5 Haiku is faster than Claude 3.5 Sonnet. Gemini Flash is faster than Gemini Pro. You can also implement a tiered approach: try the fast model first, and only fall back to the larger model if the task requires it. Use an IF node to route simple vs complex requests to different LLM nodes.
Expected result: Time-sensitive workflows use faster models that complete well within timeout limits.
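One way to implement the tiered approach is a small Code node that tags each item with a model choice, which an IF node (or the HTTP Request body) can then read. This is a sketch under assumed field names (`user_message`, `requires_reasoning`); adjust the heuristic to your own data.

```javascript
// Code node (Run Once for Each Item): pick a model tier with a simple heuristic.
// Field names here are assumptions for the example, not n8n conventions.
const prompt = $json.user_message || '';
const isSimple = prompt.length < 1500 && !$json.requires_reasoning;

return [{
  json: {
    ...$json,
    model: isSimple ? 'gpt-4o-mini' : 'gpt-4o', // route on this with an IF node
    use_fast_model: isSimple
  }
}];
```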
Complete working example
```javascript
// Code node: Run Once for Each Item
// Place AFTER the LLM node (with Continue On Fail + Retry On Fail enabled)
// This handles timeout errors and prepares data for retry or fallback

const item = $input.item;
const json = item.json;

// Detect timeout errors
const isTimeout = json.error && (
  json.error.message?.includes('ETIMEDOUT') ||
  json.error.message?.includes('ESOCKETTIMEDOUT') ||
  json.error.message?.includes('timeout') ||
  json.error.message?.includes('ECONNRESET') ||
  json.error.code === 'ETIMEDOUT'
);

if (isTimeout) {
  return [{
    json: {
      text: '',
      status: 'timeout_error',
      error_message: json.error.message,
      suggestion: 'Reduce prompt length, lower max_tokens, or try a faster model',
      timestamp: new Date().toISOString()
    }
  }];
}

// Detect other API errors
if (json.error) {
  return [{
    json: {
      text: '',
      status: 'api_error',
      error_message: json.error.message || 'Unknown error',
      error_code: json.error.code || 'unknown',
      timestamp: new Date().toISOString()
    }
  }];
}

// Success — extract response text
const text = json?.message?.content
  || json?.text
  || json?.output
  || json?.choices?.[0]?.message?.content
  || '';

return [{
  json: {
    text: text,
    status: text ? 'success' : 'empty_response',
    timestamp: new Date().toISOString()
  }
}];
```
Common mistakes when fixing ETIMEDOUT errors when calling a large language model in n8n
Mistake: Confusing ETIMEDOUT (an HTTP connection timeout) with API rate limiting (429 errors)
How to avoid: ETIMEDOUT means the connection itself timed out — the server did not respond in time. A 429 means the server responded but rejected the request because of rate limits. They need different fixes: a timeout calls for a longer timeout or a faster model; a 429 calls for request throttling (see the sketch after these items).
Mistake: Setting a very high global timeout without also setting a workflow-level timeout
How to avoid: With a high N8N_HTTP_TIMEOUT, any stuck HTTP request can hang for that entire duration. Set a workflow-level timeout in Workflow Settings → Timeout Workflow After so executions cannot run indefinitely.
Mistake: Not using the Respond to Webhook node, so webhook callers time out waiting for the LLM response
How to avoid: Place a Respond to Webhook node immediately after the Webhook trigger to send a 200 OK, then continue processing the LLM call. This keeps the caller from timing out.
Mistake: Using the same timeout for all LLM calls regardless of model speed
How to avoid: Use the HTTP Request node with per-node timeout settings instead of relying on the global timeout. Fast models (GPT-4o-mini) may only need 30 seconds; slower models (GPT-4o with high token counts) may need 120+ seconds.
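If you want to branch on the two failure modes explicitly, a Code node placed after an HTTP Request node (with Continue On Fail enabled) can classify the error. The exact shape of the error object varies by node and version, so treat the field names below as assumptions and inspect a real failed execution before relying on them.

```javascript
// Code node sketch: distinguish a connection timeout from rate limiting.
// The error object's shape (error.code, error.httpCode) is an assumption; verify it
// against an actual failed execution in your instance.
const err = $json.error || {};
const msg = String(err.message || '');

let failureType = 'none';
if (err.code === 'ETIMEDOUT' || msg.includes('ETIMEDOUT') || msg.includes('timeout')) {
  failureType = 'connection_timeout'; // raise the timeout, shorten the prompt, or use a faster model
} else if (err.httpCode === 429 || msg.includes('429')) {
  failureType = 'rate_limited'; // throttle or queue requests instead
}

return [{ json: { ...$json, failure_type: failureType } }];
```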
Best practices
- Set N8N_HTTP_TIMEOUT to at least 600000 ms (10 minutes) for workflows that call large language models
- Always set max_tokens on LLM nodes — unbounded token counts lead to unpredictable response times
- Use faster models (GPT-4o-mini, Haiku, Gemini Flash) for time-sensitive webhook workflows
- Enable 'Retry On Fail' with 3 retries and 5-second delays on all LLM nodes in production
- Monitor API response times by logging timestamps in a Code node before and after LLM calls (see the sketch after this list)
- For webhook-triggered workflows, return a 200 response immediately and process the LLM call asynchronously using the Respond to Webhook node
- Set workflow-level timeouts in Workflow Settings as a safety net against indefinitely hanging executions
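A simple way to do the timestamp logging mentioned above is a pair of Code nodes around the LLM node. The field names (`llm_started_at`, `llm_elapsed_ms`) and the node name in the comment are illustrative, not n8n conventions.

```javascript
// Code node placed BEFORE the LLM node (Run Once for Each Item):
// stamp each item with a start time.
return [{ json: { ...$json, llm_started_at: Date.now() } }];
```

```javascript
// Code node placed AFTER the LLM node: compute and log the elapsed time.
// If the LLM node does not pass input fields through, read the start time from the
// first Code node instead, e.g. $('Stamp start').item.json.llm_started_at
// ("Stamp start" being whatever you named that node).
const elapsedMs = Date.now() - $json.llm_started_at;
console.log(`LLM call took ${elapsedMs} ms`); // visible in the execution log
return [{ json: { ...$json, llm_elapsed_ms: elapsedMs } }];
```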
Still stuck?
Copy one of these prompts to get a personalized, step-by-step explanation.
I'm getting ETIMEDOUT errors in n8n when calling LLM APIs (OpenAI, Claude, Gemini). The models take too long to respond and the connection times out. How do I increase the timeout, add retry logic, and optimize my prompts to reduce response time?
Fix ETIMEDOUT errors in my n8n LLM workflow. Show me how to set N8N_HTTP_TIMEOUT in Docker, use the HTTP Request node with a custom timeout for OpenAI API calls, and add a Code node to detect and handle timeout errors with retry logic.
Frequently asked questions
What is the default HTTP timeout in n8n?
The default HTTP timeout in n8n is 300 seconds (5 minutes). You can change this for self-hosted instances by setting the N8N_HTTP_TIMEOUT environment variable to a value in milliseconds (e.g., 600000 for 10 minutes).
Is ETIMEDOUT the same as a 408 Request Timeout HTTP error?
No. ETIMEDOUT is a Node.js-level error that occurs when the TCP connection itself times out — the server never responded. A 408 is an HTTP-level response from the server indicating it timed out processing the request. ETIMEDOUT means no HTTP response was received at all.
Can I use streaming to avoid timeouts with LLM APIs in n8n?
n8n's built-in LLM nodes do not support streaming responses. However, you can use the HTTP Request node to call APIs with streaming enabled, though you would need a Code node to process the streamed chunks. In most cases, it is simpler to increase the timeout and reduce max_tokens.
Why do timeouts happen more often during certain times of day?
LLM APIs experience peak usage during US and European business hours. During these times, API response times increase significantly. If you experience time-of-day patterns, consider queuing requests, using faster models during peak hours, or implementing request throttling.
Does n8n Cloud support changing the HTTP timeout?
n8n Cloud does not expose the N8N_HTTP_TIMEOUT setting. Instead, use the HTTP Request node with a custom timeout in its Options section, or reduce prompt complexity and max_tokens to ensure responses complete within the default timeout.
Can RapidDev help optimize my n8n workflows to avoid timeout issues?
Yes. RapidDev's engineering team can audit your n8n workflows, optimize prompt design for faster responses, configure timeout and retry settings, and implement asynchronous processing patterns for long-running LLM calls.