To integrate OpenAI GPT with Bolt.new, install the openai npm package (pure JavaScript, works in WebContainers) and route API calls through a Next.js API route to keep your secret key server-side. Prompt Bolt with 'Add an OpenAI chat interface with streaming responses' and it auto-generates the complete setup. Direct client-side OpenAI calls may fail due to WebContainer CORS headers — always use an API route.
Building AI-Powered Features in Bolt.new with OpenAI GPT
OpenAI's GPT API is the most-searched Bolt integration by a wide margin, with nearly 10,000 impressions per month from developers looking to add AI features to their Bolt-generated apps. The good news: the openai npm package is written in pure JavaScript and installs cleanly inside Bolt's WebContainer runtime. Bolt's own AI agent understands the OpenAI API deeply — prompting 'Add an OpenAI chat interface' triggers automatic code generation for the API route, streaming setup, and chat component.
The main complication comes from WebContainer's browser security headers. Bolt enforces Cross-Origin-Embedder-Policy (COEP) and Cross-Origin-Opener-Policy (COOP) headers that can cause direct client-side fetch calls to external APIs to fail. This isn't an OpenAI-specific problem — it affects any API that doesn't explicitly whitelist StackBlitz origins. The standard fix is routing all OpenAI API calls through a server-side API route, which runs in the WebContainer's Node.js process and has no CORS restrictions. This also correctly keeps your OPENAI_API_KEY on the server, preventing it from being bundled into client JavaScript.
Streaming is the feature that makes AI chat interfaces feel responsive. Instead of waiting 5-10 seconds for a full GPT-4o response, streaming returns tokens as they're generated, creating a natural typing effect. The OpenAI SDK's streaming support combined with the ReadableStream API makes this straightforward to implement — Bolt can generate the complete streaming setup from a single prompt.
Integration method
The OpenAI npm package is pure JavaScript and installs cleanly in Bolt's WebContainer. However, direct client-side OpenAI calls may fail due to WebContainer's Cross-Origin-Embedder-Policy headers. The recommended pattern routes all API calls through a Next.js API route or Supabase Edge Function, which keeps your secret key server-side and handles CORS correctly. Bolt's AI agent can generate the complete streaming chat integration from a single prompt.
Prerequisites
- An OpenAI account with an API key from platform.openai.com/api-keys
- Sufficient OpenAI API credits (new accounts get $5 free credit, GPT-4o-mini is cheapest at ~$0.15/1M input tokens)
- A Bolt.new project using Next.js (recommended for API routes) or Vite with a Supabase backend
- Basic understanding of React component state and async JavaScript
Step-by-step guide
Prompt Bolt to Generate the OpenAI Integration
Bolt's AI agent has deep knowledge of the OpenAI SDK and can generate a complete working integration from a single well-crafted prompt. The agent will install the openai package, create the API route file, set up streaming, and generate a chat UI component automatically. You don't need to write the boilerplate code yourself — your job is to describe what you want clearly. The most important detail to specify is streaming behavior: without it, your app will wait for the full response before displaying anything, which feels sluggish for longer answers. Also specify that you want the API key read from environment variables (not hardcoded), which Bolt handles correctly by default but it's good to be explicit. After Bolt generates the code, you'll see a message in the chat asking you to add your API key to the .env file.
Integrate OpenAI GPT into my app. Create a Next.js API route at /api/chat that accepts POST requests with a messages array. Use the OpenAI SDK with GPT-4o-mini model. Stream the response back to the client using a ReadableStream. Read the API key from process.env.OPENAI_API_KEY. Also create a ChatInterface React component that sends messages to this route, displays streaming responses with a typing cursor effect, and maintains conversation history in state. Handle loading and error states.
Paste this in Bolt.new chat
Pro tip: If you're on a limited OpenAI budget, specify gpt-4o-mini in your prompt — it's 30x cheaper than gpt-4o while being excellent for most chat applications.
Expected result: Bolt generates the API route file at app/api/chat/route.ts and a ChatInterface component. The terminal shows 'openai' being added to package.json. Bolt prompts you to add OPENAI_API_KEY to your .env file.
Add Your API Key to the .env File
After Bolt generates the integration code, you need to add your real OpenAI API key to the .env file in your project root. In a Next.js project, server-side environment variables (without a NEXT_PUBLIC_ prefix) are only readable by API routes and server components — they never get bundled into client JavaScript. Your OPENAI_API_KEY must never have a NEXT_PUBLIC_ prefix. Find your API key at platform.openai.com/api-keys. If you don't have one yet, click 'Create new secret key' and give it a descriptive name like 'bolt-app'. Copy the key immediately — OpenAI only shows it once. Paste it into your .env file. You may also want to set an optional OPENAI_MAX_TOKENS variable to control response length and manage API costs. After saving .env, the Bolt preview should automatically restart the dev server and pick up the new variable.
```
# .env — add your real API key here, never commit this file
OPENAI_API_KEY=sk-proj-your-api-key-here
OPENAI_MODEL=gpt-4o-mini
# Optional: limit response length to control costs
OPENAI_MAX_TOKENS=1000
```
Pro tip: Create a separate API key for each project in the OpenAI dashboard so you can revoke access per-project without affecting other apps.
Expected result: The .env file contains your API key. The Bolt dev server restarts and the API route can now authenticate with OpenAI. Test by sending a message in the chat interface — you should see a streaming response appear.
Review and Understand the Streaming API Route
Bolt generates the API route automatically, but understanding the streaming pattern helps you customize it. The route uses OpenAI's stream: true option combined with a ReadableStream response. Instead of waiting for the full completion, the API route creates a ReadableStream that yields text chunks as they arrive from OpenAI. The client reads this stream and appends each chunk to the displayed message. This is fundamentally different from a regular JSON response — the HTTP connection stays open until OpenAI finishes generating, and data flows incrementally. One important detail: the route must set appropriate response headers (Content-Type: text/plain or text/event-stream) so the browser knows to process it as a stream rather than waiting for the complete response body. Bolt handles this correctly in the generated code, but if you see responses that appear all at once after a delay, check that the streaming headers are set correctly.
Show me the generated API route at app/api/chat/route.ts and explain what each section does. Also show the ChatInterface component and explain how it reads the streaming response.
Paste this in Bolt.new chat
```typescript
import OpenAI from 'openai';
import { NextResponse } from 'next/server';

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

export async function POST(request: Request) {
  try {
    const { messages } = await request.json();

    const stream = await openai.chat.completions.create({
      model: process.env.OPENAI_MODEL || 'gpt-4o-mini',
      messages,
      stream: true,
      max_tokens: parseInt(process.env.OPENAI_MAX_TOKENS || '1000'),
    });

    // Create a ReadableStream to pipe chunks to the client
    const readableStream = new ReadableStream({
      async start(controller) {
        const encoder = new TextEncoder();
        for await (const chunk of stream) {
          const text = chunk.choices[0]?.delta?.content || '';
          if (text) {
            controller.enqueue(encoder.encode(text));
          }
        }
        controller.close();
      },
    });

    return new Response(readableStream, {
      headers: {
        'Content-Type': 'text/plain; charset=utf-8',
        'Transfer-Encoding': 'chunked',
      },
    });
  } catch (error: unknown) {
    const message = error instanceof Error ? error.message : 'OpenAI API error';
    return NextResponse.json({ error: message }, { status: 500 });
  }
}
```
Expected result: You understand how the streaming API route works. The chat interface shows responses appearing token by token with a natural typing effect, rather than appearing all at once after a delay.
Add a System Prompt and Customize Behavior
The system prompt is the most powerful way to customize how GPT responds. It defines the AI's persona, knowledge scope, response format, and behavioral constraints. You pass it as the first message in the messages array with role: 'system'. Good system prompts are specific about format (bullet points vs paragraphs, response length), scope (what topics the AI should and shouldn't discuss), and persona (formal vs casual, expert vs beginner-friendly). In Bolt apps, the system prompt is typically hardcoded in the API route for security — you don't want users to override it via client-side manipulation. For more dynamic applications, you can store system prompts in your database and load them server-side before passing to OpenAI. Test your system prompt thoroughly before deploying, as it significantly affects user experience.
Update the /api/chat route to include a system prompt that defines the AI's persona. The system prompt should be: 'You are a helpful assistant for [my app]. Be concise, friendly, and accurate. Format long answers with bullet points. Never discuss topics unrelated to [my app's domain].' Add it as the first message before the user's messages array.
Paste this in Bolt.new chat
```typescript
// In your API route, construct messages with system prompt
const messagesWithSystem = [
  {
    role: 'system' as const,
    content: `You are a helpful assistant. Be concise and accurate.
Format responses with bullet points when listing multiple items.
If you don't know something, say so rather than guessing.`,
  },
  ...messages, // user conversation messages from the request
];

// Then pass messagesWithSystem to the OpenAI call
const stream = await openai.chat.completions.create({
  model: 'gpt-4o-mini',
  messages: messagesWithSystem,
  stream: true,
});
```
Pro tip: Keep system prompts under 500 tokens to minimize API costs. Detailed instructions in the system prompt count as tokens on every request.
Expected result: The AI now responds within the constraints defined by your system prompt. Responses match the expected persona and format. Off-topic questions receive the appropriate redirect or refusal.
Deploy to Netlify and Configure Production Environment Variables
Testing in Bolt's WebContainer preview is useful for development, but you must deploy to confirm the integration works in production. During Bolt development, direct client-to-OpenAI calls may occasionally encounter CORS issues due to WebContainer's security headers — these problems disappear in deployment because API routes execute as true server-side functions outside the browser sandbox. To deploy via Netlify: connect your GitHub repository in Bolt's settings, push your code, then log into Netlify and add your environment variables in Site Settings → Environment Variables. Add OPENAI_API_KEY, OPENAI_MODEL, and OPENAI_MAX_TOKENS. For Vercel, add them in Project Settings → Environment Variables. Note that incoming webhooks from OpenAI (for features like Assistants API callbacks) cannot be received during WebContainer development — you need a deployed URL. For rate limit management in production, consider adding request debouncing in the chat component to prevent users from sending messages before the previous response completes.
Pro tip: Set OpenAI usage limits in the OpenAI dashboard under Settings → Limits to cap your monthly spending. A $10 or $20 monthly limit prevents unexpected bills from heavy usage.
Expected result: Your deployed app at [name].netlify.app shows a working chat interface. Messages send and receive streaming responses. The OpenAI API key is visible in Netlify's environment variables panel but never appears in browser DevTools network requests.
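The step above suggests preventing a new message from being sent while a previous response is still streaming. A minimal framework-agnostic sketch of that guard (`createSendGuard` is a hypothetical helper name, and `send` stands in for your actual fetch to /api/chat):

```typescript
// Wraps a send function so that overlapping calls are dropped while
// a previous request is still in flight. Returns true if the send ran,
// false if it was ignored.
function createSendGuard(
  send: (text: string) => Promise<void>
): (text: string) => Promise<boolean> {
  let inFlight = false;
  return async (text: string): Promise<boolean> => {
    if (inFlight) return false; // previous response still streaming
    inFlight = true;
    try {
      await send(text);
      return true;
    } finally {
      inFlight = false;
    }
  };
}
```

In a React chat component you would wire the guarded function to the send button's onClick and additionally disable the button while a send is pending, so users get visual feedback as well.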
Common use cases
Customer Support Chat Interface
Replace a static FAQ page with an AI-powered chat interface that answers questions about your product. Use OpenAI's system prompt to provide the AI with your product documentation and restrict it to relevant topics. Streaming responses make the chat feel instant and natural.
Add an AI chat widget to my app. It should appear as a floating button in the bottom-right corner that opens a chat panel. Use the OpenAI API with GPT-4o-mini. The system prompt should say: 'You are a helpful customer support assistant for [my app name]. Answer questions about features, pricing, and troubleshooting. Keep responses concise and friendly. If asked something outside your knowledge, say so politely.' Stream responses token by token. Store the chat history in component state.
Copy this prompt to try it in Bolt.new
AI Content Generator
Build a content generation tool where users input a topic or brief and receive AI-generated blog posts, marketing copy, or product descriptions. The generation streams in real-time so users see progress immediately rather than waiting for the full output.
Create a content generator page. Users can select a content type (blog post, product description, social media post, email subject line), enter a topic or brief in a textarea, and click Generate. Call the OpenAI API with GPT-4o using a system prompt tailored to each content type. Stream the response into a readonly output textarea with a blinking cursor effect while generating. Include a Copy to Clipboard button. Add a word count display.
Copy this prompt to try it in Bolt.new
Document Summarizer and Q&A
Let users paste long text documents and ask questions about them using GPT-4o's large context window. This is useful for legal documents, research papers, or any situation where users need to quickly extract information from lengthy content.
Build a document Q&A tool. On the left side, there's a textarea where users paste a document (up to 50,000 characters). On the right side, there's a chat interface where users can ask questions about the document. For each question, include the full document text in the system prompt and stream the AI's answer. Show a character count on the document input and warn when approaching GPT-4o's context limit. Keep the Q&A conversation history visible.
Copy this prompt to try it in Bolt.new
Troubleshooting
CORS error in browser console: 'Failed to fetch' or 'blocked by CORS policy' when calling OpenAI
Cause: Direct client-side calls to api.openai.com fail because Bolt's WebContainer enforces COEP and COOP headers. The browser blocks cross-origin requests to APIs that don't explicitly whitelist StackBlitz's WebContainer origins.
Solution: Never call the OpenAI API directly from client-side React components. All OpenAI API calls must go through your Next.js API route (/api/chat) which runs server-side and has no CORS restrictions. Verify your fetch call in the client component targets /api/chat (relative URL) not https://api.openai.com directly.
```typescript
// Wrong — direct client-side call:
const response = await fetch('https://api.openai.com/v1/chat/completions', {...});

// Correct — through your API route:
const response = await fetch('/api/chat', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ messages }),
});
```
API route returns 401 Unauthorized: 'Incorrect API key provided'
Cause: The OPENAI_API_KEY environment variable is missing, misspelled, or the key was revoked in the OpenAI dashboard.
Solution: Check your .env file for the exact variable name OPENAI_API_KEY. Verify the key is active in the OpenAI dashboard under API Keys. Restart the Bolt dev server after editing .env — Next.js caches environment variables at startup. In production on Netlify/Vercel, verify the environment variable is set in the hosting dashboard's environment variables section.
Streaming responses appear all at once after a delay instead of token by token
Cause: The client is waiting for the complete response before rendering, either because the response isn't being read as a stream or the component updates aren't happening incrementally.
Solution: Ensure the client uses a ReadableStream reader loop to process chunks as they arrive. The API route must set Transfer-Encoding: chunked or Content-Type: text/event-stream headers. Verify the component calls setState on each chunk to trigger a re-render for each incoming token.
```typescript
// Client-side streaming read loop:
const response = await fetch('/api/chat', { method: 'POST', body: JSON.stringify({ messages }) });
const reader = response.body!.getReader();
const decoder = new TextDecoder();

while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  const text = decoder.decode(value, { stream: true });
  setCurrentMessage(prev => prev + text); // triggers re-render per chunk
}
```
OpenAI API returns 429: 'You exceeded your current quota'
Cause: Your OpenAI account has run out of credits or hit a rate limit. Free trial credits ($5) expire after 3 months even if unused.
Solution: Check your usage and billing at platform.openai.com/usage. Add a payment method if your trial credits are exhausted. For rate limits (too many requests per minute), add a debounce to the chat send button to prevent rapid successive requests. Consider switching to gpt-4o-mini which has higher rate limits and lower costs than gpt-4o.
Best practices
- Never call the OpenAI API from client-side code — always proxy through an API route to protect your secret key and avoid CORS issues in WebContainers
- Use gpt-4o-mini for most use cases and only upgrade to gpt-4o when you need higher reasoning quality — mini is 30x cheaper and sufficient for chat, summarization, and content generation
- Set a system prompt in the API route (not client-side) to define AI behavior — users cannot override server-side system prompts through client manipulation
- Implement streaming from the start rather than adding it later — non-streaming GPT-4o responses can take 5-15 seconds for longer outputs, making the UI feel broken
- Add a max_tokens limit to prevent runaway responses from consuming excessive API credits, especially in production with real users
- Store conversation history in your database rather than just component state so users can return to previous conversations
- Set usage limits in the OpenAI dashboard and configure billing alerts to prevent unexpected charges from heavy usage or prompt injection attacks
- Test your system prompt thoroughly for jailbreak resistance — adversarial users will try to bypass topic restrictions and persona constraints
Alternatives
IBM Watson offers enterprise-grade AI with stronger compliance certifications (HIPAA, GDPR) and is better suited for regulated industries that can't use OpenAI.
DeepAI provides simpler text generation APIs with lower pricing tiers, suitable for basic AI features without OpenAI's premium capabilities or cost.
Azure OpenAI Service offers the same GPT models through Microsoft's infrastructure, ideal for teams already using Azure with compliance requirements.
Google's Gemini API provides strong multimodal capabilities and competitive pricing, especially for tasks involving vision alongside text.
Frequently asked questions
Does the OpenAI SDK work in Bolt's WebContainer?
Yes, the openai npm package is pure JavaScript and installs cleanly in Bolt's WebContainer. However, direct client-side calls to api.openai.com may fail due to WebContainer's COEP/COOP security headers. The correct pattern is to route all OpenAI calls through a server-side API route, which has no CORS restrictions and keeps your API key secure.
Can I use OpenAI's Assistants API or fine-tuned models in Bolt?
Yes, the Assistants API and fine-tuned models work through the same API route pattern. The Assistants API does require webhook callbacks for async operations (run completion events), which need a deployed URL — you can't receive those webhooks in the Bolt preview. For development, use polling (checking run status repeatedly) instead of webhooks, then switch to webhooks after deploying.
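The polling approach mentioned above can be sketched as a generic helper. This is a sketch under assumptions: `fetchStatus` is a placeholder for whatever status check you use (e.g. retrieving an Assistants run server-side via the OpenAI SDK), the terminal-state names follow the Assistants run lifecycle, and the interval and attempt cap are arbitrary defaults:

```typescript
// Repeatedly call a status function until it reports a terminal state,
// giving up after maxAttempts checks. fetchStatus is a placeholder for
// your real status call (e.g. an Assistants run retrieval in your API route).
async function pollUntilDone(
  fetchStatus: () => Promise<string>,
  { intervalMs = 1000, maxAttempts = 30 } = {}
): Promise<string> {
  const terminal = new Set(['completed', 'failed', 'cancelled', 'expired']);
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const status = await fetchStatus();
    if (terminal.has(status)) return status;
    await new Promise((r) => setTimeout(r, intervalMs)); // wait before re-checking
  }
  throw new Error('Run did not finish within the polling window');
}
```

After deploying, you can replace this loop with a webhook handler and keep the polling path as a fallback for local development.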
How do I prevent my OpenAI integration from getting too expensive?
Use gpt-4o-mini instead of gpt-4o (30x cheaper), set max_tokens limits in the API route, add a monthly spending limit in the OpenAI dashboard under Settings → Limits, and implement rate limiting in your API route to prevent abuse. Store system prompts efficiently and trim conversation history to keep context windows small.
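Trimming conversation history, as suggested above, can be as simple as keeping the system prompt plus the newest messages that fit a budget. A minimal sketch using a character budget as a rough proxy for tokens (a real tokenizer such as tiktoken is more accurate; the 8000-character default is an arbitrary assumption):

```typescript
type ChatMessage = { role: 'system' | 'user' | 'assistant'; content: string };

// Keep any system messages plus the most recent turns that fit within
// a rough character budget (characters approximate tokens cheaply).
function trimHistory(messages: ChatMessage[], maxChars = 8000): ChatMessage[] {
  const system = messages.filter((m) => m.role === 'system');
  const rest = messages.filter((m) => m.role !== 'system');
  const kept: ChatMessage[] = [];
  let used = system.reduce((sum, m) => sum + m.content.length, 0);
  // Walk backwards so the most recent turns survive.
  for (let i = rest.length - 1; i >= 0; i--) {
    used += rest[i].content.length;
    if (used > maxChars) break;
    kept.unshift(rest[i]);
  }
  return [...system, ...kept];
}
```

Call this in the API route just before `openai.chat.completions.create` so the trimming policy lives server-side with the rest of your cost controls.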
How do I add conversation memory so the AI remembers previous messages?
Pass the full conversation history as the messages array to the API route on each request. Store messages in React state on the client, appending each user message and AI response. For persistent memory across sessions, save the messages array to your database (Supabase or Bolt Database) tied to the user's session ID and load it on page mount.
Can I use GPT-4o's vision features to analyze images in Bolt?
Yes. Upload images to S3 or Supabase Storage first (to get a public URL), then pass the URL in the message content array with type: 'image_url'. Vision analysis works through the same API route pattern. The image processing happens server-side at OpenAI, so no additional WebContainer limitations apply.
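A vision request pairs text and an image URL inside a single user message's content array. A sketch of building that payload before your API route forwards it to OpenAI (`buildVisionMessage` is a hypothetical helper and the URL is a placeholder):

```typescript
// Build a Chat Completions user message that asks about an image.
// The content array mixes a text part and an image_url part.
function buildVisionMessage(question: string, imageUrl: string) {
  return {
    role: 'user' as const,
    content: [
      { type: 'text' as const, text: question },
      { type: 'image_url' as const, image_url: { url: imageUrl } },
    ],
  };
}

// Your route would pass [buildVisionMessage('What is in this photo?',
// 'https://example.com/photo.png')] as `messages` with model 'gpt-4o'.
```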