Build an AI-powered resume parser in Lovable where candidates upload PDF resumes, an Edge Function extracts text and sends it to the Anthropic API for structured extraction, the parsed data is stored as JSONB in Supabase, and a side-by-side view shows the original PDF next to the AI-extracted profile — with manual correction support and bulk processing for recruiters.
What you're building
Resume parsing is really two problems: getting text out of a PDF file, then turning that unformatted text into structured information. This build handles both in a single step.
The Edge Function runs on Deno and leans on Claude's native PDF document support. It receives the candidate id and storage path, downloads the PDF from Supabase Storage, base64-encodes it, and sends it to the Anthropic API as a document content block with a structured extraction prompt. Claude reads the PDF layout directly, including tables and multi-column formats, which is more accurate than first flattening the file with a text-extraction library like pdf-parse (keep text extraction as a fallback for PDFs the document approach rejects). The prompt asks Claude to return a specific JSON schema: contact info, work experience (company, title, dates, bullets), education (institution, degree, year), skills (array of strings), and certifications.
Claude returns the structured JSON. The Edge Function validates the JSON shape and stores it in the candidates.parsed_data JSONB column alongside the original resume path. If any field is missing, the Edge Function fills it with null rather than failing — partial extraction is better than no extraction.
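For illustration, here is a minimal shape-normalization pass the Edge Function could run after JSON.parse. The field names follow the schema defined in the build steps below; the helper itself is a sketch, not generated code.

type ParsedResume = {
  contact: Record<string, unknown>
  summary: string | null
  experience: unknown[]
  education: unknown[]
  skills: string[]
  certifications: unknown[]
  languages: string[]
}

// Fill missing or malformed fields with safe defaults instead of throwing,
// so a partial extraction still produces a usable row
function normalizeParsed(raw: any): ParsedResume {
  return {
    contact: raw?.contact && typeof raw.contact === 'object' ? raw.contact : {},
    summary: typeof raw?.summary === 'string' ? raw.summary : null,
    experience: Array.isArray(raw?.experience) ? raw.experience : [],
    education: Array.isArray(raw?.education) ? raw.education : [],
    skills: Array.isArray(raw?.skills) ? raw.skills.filter((s: unknown): s is string => typeof s === 'string') : [],
    certifications: Array.isArray(raw?.certifications) ? raw.certifications : [],
    languages: Array.isArray(raw?.languages) ? raw.languages : [],
  }
}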
The side-by-side view uses an iframe to render the PDF from a signed Supabase Storage URL. The right panel renders the parsed_data JSONB as editable form fields so recruiters can correct any field that Claude got wrong. Saving writes the corrected JSONB back to the row; replacing the whole document is simplest from the client, while a small Postgres function wrapping jsonb_set works for targeted single-field patches.
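A sketch of the two data calls behind this view, assuming the project's generated supabase client is in scope (the function names are illustrative):

// Left panel: short-lived signed URL for the private PDF (valid for 1 hour)
async function getResumeUrl(resumePath: string): Promise<string | null> {
  const { data, error } = await supabase.storage
    .from('resumes')
    .createSignedUrl(resumePath, 3600)
  return error ? null : data.signedUrl
}

// Right panel: persist recruiter corrections by replacing the JSONB document
async function saveCorrections(candidateId: string, updated: Record<string, unknown>) {
  const { error } = await supabase
    .from('candidates')
    .update({ parsed_data: updated, manually_reviewed: true })
    .eq('id', candidateId)
  if (error) throw error
}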
Final result
A resume parser where AI does the extraction work, recruiters correct the inevitable errors in a clean UI, and the final structured data is searchable and queryable for candidate sourcing.
Tech stack
- Lovable (React frontend generation, Pro plan for Edge Functions)
- Supabase: Postgres with JSONB, Storage, and Deno Edge Functions
- Anthropic API: Claude with native PDF document support
Prerequisites
- Lovable Pro account for Edge Function generation (required)
- Supabase project with SUPABASE_URL and SUPABASE_SERVICE_ROLE_KEY in Cloud tab → Secrets
- Anthropic API key stored as ANTHROPIC_API_KEY in Cloud tab → Secrets
- Understanding of resume data structures: what fields you want to extract
- Sample PDF resumes to test extraction quality during development
Build steps
Create the candidates schema with JSONB fields
Ask Lovable to create the schema that stores both the raw resume reference and the structured parsed data. The JSONB column gives flexibility to store varying resume structures without rigid schema migration.
Create a resume parser schema in Supabase.

Tables:
- candidates: id, created_by (references auth.users), first_name (text, extracted), last_name (text, extracted), email (text, extracted), phone (text), location (text), linkedin_url (text), resume_path (text, Supabase Storage path), resume_file_name (text), parsed_data (jsonb), parse_status ('pending' | 'processing' | 'completed' | 'failed'), parse_error (text, nullable), parsed_at (timestamptz), manually_reviewed (bool default false), notes (text), created_at

The parsed_data JSONB should follow this structure:
{
  contact: { name, email, phone, location, linkedin, github },
  summary: string,
  experience: [{ company, title, location, start_date, end_date, is_current, bullets: string[] }],
  education: [{ institution, degree, field, graduation_year, gpa }],
  skills: string[],
  certifications: [{ name, issuer, year }],
  languages: string[]
}

Create a GIN index on candidates.parsed_data for fast JSONB queries.
Create a full-text search index: CREATE INDEX idx_candidates_fts ON candidates USING gin(to_tsvector('english', coalesce(first_name,'') || ' ' || coalesce(last_name,'') || ' ' || coalesce(parsed_data::text,''))).

Storage: create a private bucket 'resumes'. RLS: users can upload and view their own files.

RLS on candidates: authenticated users SELECT/INSERT/UPDATE their own rows (created_by = auth.uid()).

Pro tip: Add a job_id column (references jobs, nullable) to candidates so parsed resumes can be attached to specific job openings. This turns the parser into the intake layer of a full ATS. The candidates table then links naturally to your pipeline stages.
Expected result: The candidates table is created with the parsed_data JSONB column. GIN and full-text indexes are in place. The resumes bucket exists. TypeScript types are generated.
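Once the GIN index exists, JSONB containment queries stay fast. Two examples of queries this schema supports, assuming the generated supabase client (the search values are illustrative):

// All candidates whose skills array contains 'React' (jsonb @>, served by the GIN index)
const { data: reactDevs } = await supabase
  .from('candidates')
  .select('id, first_name, last_name, email')
  .contains('parsed_data', { skills: ['React'] })

// Simple substring fallback across the extracted scalar columns
const { data: byName } = await supabase
  .from('candidates')
  .select('id, first_name, last_name, email')
  .or('first_name.ilike.%rivera%,last_name.ilike.%rivera%,email.ilike.%rivera%')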
Build the PDF upload component
Ask Lovable to create the upload UI with progress feedback. The file goes to Supabase Storage and a pending candidate record is created before parsing begins.
Build a resume upload page at src/pages/UploadResume.tsx.

Requirements:
- A drag-and-drop upload zone. Accept only application/pdf. Show file name and size after selection.
- Single file upload flow:
  1. User selects or drops a PDF
  2. Show a progress bar while uploading to supabase.storage.from('resumes').upload()
  3. File path: resumes/{userId}/{Date.now()}-{sanitizedFilename}
  4. After upload, INSERT into candidates: resume_path, resume_file_name, parse_status='processing', created_by=auth.uid()
  5. Call the parse-resume Edge Function with the candidate id and storage path
  6. Show a Spinner with 'Analyzing resume...' while the Edge Function runs (typically 5-15 seconds)
  7. On success: navigate to /candidates/{id} to see the side-by-side view
  8. On failure: show an Alert with the error, set parse_status='failed'
- Bulk upload mode (toggle at top):
  - Allow multiple file selection
  - Show each file as a Card in a queue with individual status badges
  - Process files sequentially (not in parallel) to avoid rate limits
  - After all files are processed, show a summary: X parsed, Y failed

Expected result: Single file upload creates a candidate record and triggers parsing. The progress UI correctly shows upload progress, then the AI analysis spinner, then redirects on success. Bulk mode processes files one by one with per-file status.
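The single-file flow condenses to roughly this sketch, with error handling trimmed and the generated supabase client assumed in scope:

async function uploadAndParse(file: File, userId: string): Promise<string> {
  const sanitized = file.name.replace(/[^a-zA-Z0-9._-]/g, '_')
  const path = `${userId}/${Date.now()}-${sanitized}` // path inside the 'resumes' bucket

  // 1. Upload the PDF to the private bucket
  const { error: uploadError } = await supabase.storage.from('resumes').upload(path, file)
  if (uploadError) throw uploadError

  // 2. Create the pending candidate row before parsing starts
  const { data: candidate, error: insertError } = await supabase
    .from('candidates')
    .insert({ resume_path: path, resume_file_name: file.name, parse_status: 'processing', created_by: userId })
    .select('id')
    .single()
  if (insertError) throw insertError

  // 3. Trigger the Edge Function; the review page polls for the result
  const { error: fnError } = await supabase.functions.invoke('parse-resume', {
    body: { candidate_id: candidate.id, storage_path: path },
  })
  if (fnError) throw fnError
  return candidate.id
}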
Build the resume parsing Edge Function
This is the core AI logic. The Edge Function receives a storage path, fetches the PDF, extracts text, and calls Claude for structured extraction. Ask Lovable to generate it from this specification.
// supabase/functions/parse-resume/index.ts
import { serve } from 'https://deno.land/std@0.168.0/http/server.ts'
import { createClient } from 'https://esm.sh/@supabase/supabase-js@2'
import Anthropic from 'https://esm.sh/@anthropic-ai/sdk'

const corsHeaders = {
  'Access-Control-Allow-Origin': '*',
  'Access-Control-Allow-Headers': 'authorization, x-client-info, apikey, content-type',
  'Content-Type': 'application/json',
}

serve(async (req: Request) => {
  if (req.method === 'OPTIONS') return new Response('ok', { headers: corsHeaders })

  const supabase = createClient(
    Deno.env.get('SUPABASE_URL') ?? '',
    Deno.env.get('SUPABASE_SERVICE_ROLE_KEY') ?? ''
  )
  const anthropic = new Anthropic({ apiKey: Deno.env.get('ANTHROPIC_API_KEY') })
  // Keep the id outside the try block so the catch handler can mark the row
  // as failed (req.json() can only be consumed once)
  let candidate_id: string | undefined

  try {
    const body = await req.json()
    candidate_id = body.candidate_id
    const { storage_path } = body

    // Download the PDF from Storage
    const { data: fileData, error: fileError } = await supabase.storage
      .from('resumes')
      .download(storage_path)
    if (fileError) throw new Error(`Storage download failed: ${fileError.message}`)

    // Base64-encode in chunks; spreading a large Uint8Array into
    // String.fromCharCode overflows the call stack on multi-MB PDFs
    const bytes = new Uint8Array(await fileData.arrayBuffer())
    let binary = ''
    for (let i = 0; i < bytes.length; i += 8192) {
      binary += String.fromCharCode(...bytes.subarray(i, i + 8192))
    }
    const base64 = btoa(binary)

    // Use Claude's document support to parse the PDF directly
    const response = await anthropic.messages.create({
      model: 'claude-opus-4-5',
      max_tokens: 4096,
      messages: [{
        role: 'user',
        content: [
          {
            type: 'document',
            source: { type: 'base64', media_type: 'application/pdf', data: base64 },
          },
          {
            type: 'text',
            text: `Extract all resume information and return ONLY valid JSON matching this exact schema (use null for missing fields):
{
  "contact": { "name": string, "email": string, "phone": string, "location": string, "linkedin": string|null, "github": string|null },
  "summary": string|null,
  "experience": [{"company": string, "title": string, "location": string|null, "start_date": string, "end_date": string|null, "is_current": boolean, "bullets": string[]}],
  "education": [{"institution": string, "degree": string, "field": string|null, "graduation_year": number|null, "gpa": string|null}],
  "skills": string[],
  "certifications": [{"name": string, "issuer": string|null, "year": number|null}],
  "languages": string[]
}
Return ONLY the JSON object, no explanation or markdown.`,
          },
        ],
      }],
    })

    const rawJson = response.content[0].type === 'text' ? response.content[0].text.trim() : '{}'
    const parsed = JSON.parse(rawJson)

    // Split the extracted full name into first/last for the scalar columns
    const contactName = parsed.contact?.name ?? ''
    const [first_name, ...rest] = contactName.split(' ')
    const last_name = rest.join(' ')

    await supabase.from('candidates').update({
      parsed_data: parsed,
      first_name: first_name || null,
      last_name: last_name || null,
      email: parsed.contact?.email || null,
      phone: parsed.contact?.phone || null,
      location: parsed.contact?.location || null,
      linkedin_url: parsed.contact?.linkedin || null,
      parse_status: 'completed',
      parsed_at: new Date().toISOString(),
    }).eq('id', candidate_id)

    return new Response(JSON.stringify({ success: true }), { headers: corsHeaders })
  } catch (err) {
    const message = err instanceof Error ? err.message : 'Parse failed'
    if (candidate_id) {
      await supabase.from('candidates').update({
        parse_status: 'failed',
        parse_error: message,
      }).eq('id', candidate_id)
    }
    return new Response(JSON.stringify({ error: message }), { status: 500, headers: corsHeaders })
  }
})

Pro tip: Claude's native PDF support (document content type) reads the PDF layout directly, including tables and multi-column formats. This is significantly more accurate than extracting text first with a PDF library. Reserve the text-extraction fallback for PDFs that fail the document approach.
Expected result: The Edge Function downloads the PDF, sends it to Claude with the structured extraction prompt, parses the JSON response, and updates the candidates table with all extracted fields. The candidate's parse_status changes to 'completed'.
Build the side-by-side review view
The key UX of this app: original PDF on the left, AI-extracted data on the right, with inline editing. Ask Lovable to build this as the candidate detail page.
Build a candidate review page at src/pages/CandidateReview.tsx (route: /candidates/:id).

Layout: two-column at 50/50 width on desktop, stacked on mobile.

Left panel:
- Fetch a signed URL for the candidate's resume_path from supabase.storage.from('resumes').createSignedUrl(path, 3600)
- Render the PDF in an <iframe> with the signed URL
- Below the iframe: a Download Button and an 'Upload New Version' Button

Right panel:
- If parse_status = 'processing': show a Spinner and poll every 3 seconds
- If parse_status = 'failed': show an Alert with parse_error and a Retry Button
- If parse_status = 'completed': show the parsed_data as editable fields:
  - Contact section (Card): editable Inputs for name, email, phone, location, linkedin
  - Summary (Card): editable Textarea
  - Experience (Card per job): each entry shows company/title/dates. Clicking 'Edit' opens an inline form. A trash Button removes the entry. An 'Add Experience' Button adds a new empty entry.
  - Education (Card per entry): same pattern as Experience
  - Skills: a multi-value Input with removable Badges. Type a skill and press Enter to add.
  - Certifications: editable list
- A floating 'Save Changes' Button that appears when any field is edited (tracks dirty state with useForm)
- On save: UPDATE candidates SET parsed_data = {updated json}, manually_reviewed = true
- Show a 'Mark as Reviewed' Button that sets manually_reviewed = true

Expected result: The side-by-side view renders the PDF and parsed data simultaneously. Editing any field marks the form as dirty. Save updates the JSONB. The manually_reviewed flag is set when the recruiter confirms the data.
Build the candidate search and list view
Recruiters need to search across all parsed resumes. Ask Lovable to build the candidate list with full-text search and JSONB filtering.
Build a candidates list page at src/pages/Candidates.tsx.

Requirements:
- A DataTable with columns: Name (link to /candidates/:id), Email, Location, Parse Status Badge, Skills (first 3 as small Badges + '+N more'), Experience Years (calculate from parsed_data.experience dates), Reviewed Badge (checkmark if manually_reviewed), Upload Date
- Full-text search Input above table: search across name, email, location, and skills. Use Supabase's textSearch or ilike queries against parsed_data::text for broad matching.
- Filter panel (collapsible):
  - Skills filter: type a skill to filter candidates who have it in parsed_data->'skills'
  - Location filter: text Input
  - Min years experience: number Select
  - Review status: all / reviewed / unreviewed
  - Date range picker for upload date
- Clicking a row navigates to the candidate review page
- Bulk actions checkbox: select multiple candidates, then 'Delete' (with confirmation Dialog) or 'Export as CSV'
- Stats row above table: total candidates, pending review count, parse failures count

Expected result: The candidate list shows all uploaded resumes with their parse status. The search field filters results dynamically. Skills filtering works against JSONB data. Bulk delete with confirmation works.
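The Experience Years column needs a small date calculation, since parsed dates arrive as strings. A sketch, assuming 'YYYY' or 'YYYY-MM' style values (overlapping jobs double-count, which is acceptable for a table column):

function parseYearMonth(s?: string | null): Date | null {
  if (!s) return null
  const m = s.match(/(\d{4})(?:-(\d{2}))?/)
  return m ? new Date(Number(m[1]), m[2] ? Number(m[2]) - 1 : 0, 1) : null
}

type Stint = { start_date?: string; end_date?: string | null; is_current?: boolean }

// Sum months across all experience entries and round to whole years
function estimateYears(experience: Stint[]): number {
  let months = 0
  for (const job of experience ?? []) {
    const start = parseYearMonth(job.start_date)
    const end = job.is_current || !job.end_date ? new Date() : parseYearMonth(job.end_date)
    if (start && end && end > start) {
      months += (end.getFullYear() - start.getFullYear()) * 12 + (end.getMonth() - start.getMonth())
    }
  }
  return Math.round(months / 12)
}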
Complete code
import { serve } from 'https://deno.land/std@0.168.0/http/server.ts'
import { createClient } from 'https://esm.sh/@supabase/supabase-js@2'
import Anthropic from 'https://esm.sh/@anthropic-ai/sdk'

const cors = {
  'Access-Control-Allow-Origin': '*',
  'Access-Control-Allow-Headers': 'authorization, x-client-info, apikey, content-type',
  'Content-Type': 'application/json',
}

const PROMPT = `Extract all resume information and return ONLY valid JSON (no markdown).
Schema: { "contact": { "name": string, "email": string|null, "phone": string|null, "location": string|null, "linkedin": string|null, "github": string|null }, "summary": string|null, "experience": [{"company": string, "title": string, "location": string|null, "start_date": string, "end_date": string|null, "is_current": boolean, "bullets": string[]}], "education": [{"institution": string, "degree": string, "field": string|null, "graduation_year": number|null, "gpa": string|null}], "skills": string[], "certifications": [{"name": string, "issuer": string|null, "year": number|null}], "languages": string[] }`

// Chunked base64 encoding avoids call-stack overflow on large PDFs
function toBase64(bytes: Uint8Array): string {
  let binary = ''
  for (let i = 0; i < bytes.length; i += 8192) {
    binary += String.fromCharCode(...bytes.subarray(i, i + 8192))
  }
  return btoa(binary)
}

serve(async (req: Request) => {
  if (req.method === 'OPTIONS') return new Response('ok', { headers: cors })

  const supabase = createClient(Deno.env.get('SUPABASE_URL') ?? '', Deno.env.get('SUPABASE_SERVICE_ROLE_KEY') ?? '')
  const anthropic = new Anthropic({ apiKey: Deno.env.get('ANTHROPIC_API_KEY') })
  let candidate_id: string | undefined

  try {
    const body = await req.json()
    candidate_id = body.candidate_id
    const { storage_path } = body
    if (!candidate_id || !storage_path)
      return new Response(JSON.stringify({ error: 'candidate_id and storage_path required' }), { status: 400, headers: cors })

    const { data: fileData, error: fileError } = await supabase.storage.from('resumes').download(storage_path)
    if (fileError) throw new Error(`Storage error: ${fileError.message}`)

    const base64 = toBase64(new Uint8Array(await fileData.arrayBuffer()))

    const message = await anthropic.messages.create({
      model: 'claude-opus-4-5', max_tokens: 4096,
      messages: [{ role: 'user', content: [
        { type: 'document', source: { type: 'base64', media_type: 'application/pdf', data: base64 } },
        { type: 'text', text: PROMPT },
      ] }],
    })

    const textContent = message.content.find(c => c.type === 'text')
    if (!textContent || textContent.type !== 'text') throw new Error('No text in Claude response')
    // Strip markdown fences in case the model wraps the JSON anyway
    const cleaned = textContent.text.replace(/^```json\s*/i, '').replace(/\s*```$/, '').trim()
    const parsed = JSON.parse(cleaned)

    const fullName: string = parsed.contact?.name ?? ''
    const spaceIdx = fullName.indexOf(' ')
    const first_name = spaceIdx > -1 ? fullName.slice(0, spaceIdx) : fullName
    const last_name = spaceIdx > -1 ? fullName.slice(spaceIdx + 1) : null

    await supabase.from('candidates').update({
      parsed_data: parsed, first_name: first_name || null, last_name: last_name || null,
      email: parsed.contact?.email ?? null, phone: parsed.contact?.phone ?? null,
      location: parsed.contact?.location ?? null, linkedin_url: parsed.contact?.linkedin ?? null,
      parse_status: 'completed', parsed_at: new Date().toISOString(), parse_error: null,
    }).eq('id', candidate_id)

    return new Response(JSON.stringify({ success: true }), { headers: cors })
  } catch (err) {
    const msg = err instanceof Error ? err.message : 'Unknown error'
    if (candidate_id) await supabase.from('candidates').update({ parse_status: 'failed', parse_error: msg }).eq('id', candidate_id)
    return new Response(JSON.stringify({ error: msg }), { status: 500, headers: cors })
  }
})

Customization ideas
Job matching score
Add a jobs table with required_skills (text array) and a minimum_experience_years column. Create a Postgres function match_candidate_to_job(candidate_id, job_id) that returns a score (0-100) based on skill overlap and experience years. Display match scores in the candidate list when filtered to a specific job opening.
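A client-side sketch of the same scoring logic; the 70/30 weighting is an arbitrary starting point, and requiredSkills/minYears would come from the jobs table described above:

// Score 0-100: up to 70 points for skill overlap, up to 30 for experience
function matchScore(skills: string[], yearsExperience: number, requiredSkills: string[], minYears: number): number {
  const have = new Set(skills.map((s) => s.toLowerCase()))
  const hits = requiredSkills.filter((s) => have.has(s.toLowerCase())).length
  const skillScore = requiredSkills.length > 0 ? (hits / requiredSkills.length) * 70 : 70
  const experienceScore = minYears > 0 ? Math.min(yearsExperience / minYears, 1) * 30 : 30
  return Math.round(skillScore + experienceScore)
}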
Batch reparse with improved prompts
As you refine your extraction prompt, older resumes parsed with earlier versions may have worse quality. Add a 'Reparse Selected' action in the candidate list that re-queues candidates through the parse-resume Edge Function. Track prompt_version in the candidates table to know which version extracted which records.
LinkedIn profile enrichment
After parsing, if a LinkedIn URL is found, trigger a secondary Edge Function that fetches the LinkedIn profile data via a scraping service. Merge additional professional details (recent connections, endorsements, recommendations) into the parsed_data JSONB. Show an 'Enrich from LinkedIn' button on the candidate review page.
ATS pipeline integration
Add a pipeline_stages table (applied, screening, interview, offer, hired, rejected) and a stage_assignments table linking candidates to stages for each job. Add a Kanban board view where columns are pipeline stages and cards are candidates. Drag a candidate card from one column to another to update their stage.
Candidate email outreach
Add an outreach_templates table with email subjects and bodies using {{variables}} for personalization. Build a compose view that lets recruiters select a template, preview it with the candidate's parsed data substituted, and send via Resend. Track sent_at and open/reply status in a candidate_communications table.
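The substitution step is small. A sketch (missing values render as empty strings here, a policy you may want to tighten):

// Replace {{first_name}}-style placeholders with candidate data
function renderTemplate(template: string, vars: Record<string, string | null | undefined>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (_match, key: string) => vars[key] ?? '')
}

// e.g. renderTemplate('Hi {{first_name}}, your {{top_skill}} background caught our eye.',
//   { first_name: 'Ada', top_skill: 'TypeScript' })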
Common pitfalls
Pitfall: Calling the AI parse synchronously and blocking the upload response
How to avoid: Create the candidate record with parse_status='processing' immediately after upload, return success to the client, then trigger parsing asynchronously. The client polls or subscribes via Realtime to the parse_status column to know when parsing is complete.
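A Realtime subscription sketch for the client side, assuming the candidates table has Realtime enabled (otherwise fall back to polling):

function watchParseStatus(candidateId: string, onDone: (status: string) => void) {
  const channel = supabase
    .channel(`candidate-${candidateId}`)
    .on(
      'postgres_changes',
      { event: 'UPDATE', schema: 'public', table: 'candidates', filter: `id=eq.${candidateId}` },
      (payload) => {
        const status = (payload.new as { parse_status: string }).parse_status
        if (status === 'completed' || status === 'failed') {
          onDone(status)
          supabase.removeChannel(channel) // stop listening once parsing settles
        }
      }
    )
    .subscribe()
  return channel
}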
Pitfall: Not handling Claude's JSON response containing markdown code fences
How to avoid: Strip markdown fences before parsing: const cleaned = text.replace(/^```json\s*/i, '').replace(/\s*```$/, '').trim(). Then JSON.parse(cleaned). This handles both fenced and non-fenced responses.
Pitfall: Storing large base64-encoded PDFs in the Supabase request payload
How to avoid: Store the PDF in Supabase Storage and pass only the storage path to the Edge Function. The Edge Function downloads the PDF directly from Storage using the service role key. This keeps the request payload small and uses Supabase's optimized internal file transfer.
Pitfall: Not validating the JSON schema returned by the AI
How to avoid: After JSON.parse(), validate the structure: check that experience is an array, that contact is an object. Fill missing fields with null rather than failing. Log the raw Claude response in a separate column for debugging extraction quality over time.
Best practices
- Process parsing asynchronously. Create the candidate record immediately with status='processing', trigger the Edge Function fire-and-forget, and let the client poll or subscribe to status updates. This keeps the UI responsive during AI processing.
- Store the raw Claude response alongside the parsed JSONB during development. This makes debugging extraction failures much faster — you can see exactly what the model returned before your parsing code processed it.
- Use Claude's native PDF document support rather than extracting text first. Claude reads PDF layout including multi-column formats, tables, and headers that text extraction libraries flatten incorrectly.
- Track prompt_version and model version in the candidates table so you can identify when a model change affects extraction quality across your candidate database.
- Implement a manual review workflow: every parsed resume should be reviewed by a human before being used for candidate decisions. The manually_reviewed flag and side-by-side view support this workflow.
- Add error recovery: if parsing fails, allow recruiters to retry with a single button. Log the error for debugging. Common failures are malformed PDFs, password-protected files, and scanned image-only PDFs where text extraction returns nothing.
- Index the parsed_data JSONB column with a GIN index for performant skill and keyword queries. Without it, searching for all candidates with a specific skill requires a full table scan.
AI prompts to try
Copy these prompts to build this project faster.
I'm building a resume parser where a Deno Edge Function receives a PDF file as base64 and calls the Anthropic API with Claude's native document support to extract structured data. Help me write the extraction prompt that returns a consistent JSON schema with contact info, work experience array, education array, skills array, and certifications. The prompt should instruct Claude to use null for missing fields, never invent data, and return only the JSON object without markdown formatting.
Add a skills analytics page at /analytics. Show a bar chart of the top 20 skills across all candidates (count of candidates with each skill, sorted descending). Fetch by querying SELECT jsonb_array_elements_text(parsed_data->'skills') as skill, COUNT(*) as count FROM candidates WHERE parse_status='completed' GROUP BY skill ORDER BY count DESC LIMIT 20. Also show a treemap or word cloud of locations from parsed_data->'contact'->>'location'. Add filters: uploaded in the last 30/90/180 days.
In Supabase, write a SQL function search_candidates(p_search_text text, p_skills text[], p_min_experience_years int) that returns candidate rows matching: 1) full-text search on name, email, location using to_tsvector, 2) each skill in p_skills exists in parsed_data->'skills' array (using jsonb ? operator), 3) experience years estimated from the most recent experience entry's start_date (use current year minus start year). Return candidates matching ALL provided filters. Use a CTE for readability.
Frequently asked questions
How accurate is Claude at extracting resume data?
Claude achieves roughly 90-95% accuracy on well-formatted digital PDFs. Accuracy drops on scanned resumes (image-only PDFs), unusual layouts, and non-English resumes. The side-by-side review view exists precisely because you should expect and account for extraction errors. Always have a human review parsed data before using it for hiring decisions.
What if the resume is a scanned image PDF with no selectable text?
Claude's document support handles image-based PDFs by using its vision capabilities to read the content. This works reasonably well for clean scans. For very low-quality scans or handwritten resumes, accuracy degrades. Set a minimum file quality expectation in your upload instructions and reject obviously unreadable files with a clear error message.
How much does it cost to parse one resume with Claude?
A typical resume is 500-1500 tokens of text. With the extraction prompt (~300 tokens) and JSON output (~800 tokens), a single parse call uses roughly 2,000-3,000 tokens total. At claude-sonnet-4-6 pricing ($3 per million input tokens, $15 per million output tokens), one parse costs approximately $0.01-$0.02, so 1,000 resumes per month is $10-20 in API costs. Note that the Opus-class model in the example code is priced higher, and sending the PDF as a document block also consumes tokens for the page images, so real costs run somewhat above text-only estimates.
Can I parse resumes in formats other than PDF?
The Edge Function can be extended to handle .docx files. Use the mammoth library (available via esm.sh) to convert .docx to HTML or plain text, then send the text to Claude using a text content block instead of the document block. Accept additional MIME types in the file input: application/vnd.openxmlformats-officedocument.wordprocessingml.document.
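A sketch of that .docx branch using mammoth's extractRawText (the esm.sh import and option shape are assumptions to verify against the mammoth docs):

import mammoth from 'https://esm.sh/mammoth'

// Convert a downloaded .docx Blob to plain text for a text content block
async function extractDocxText(fileData: Blob): Promise<string> {
  const arrayBuffer = await fileData.arrayBuffer()
  const { value } = await mammoth.extractRawText({ arrayBuffer })
  return value
}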
How do I handle confidential resumes — should candidates know they are being processed by AI?
This depends on your jurisdiction and use case. In the EU, GDPR requires transparency about automated processing of personal data. In hiring contexts, candidates should be informed that AI tools are used in the screening process. Add a disclosure in your job application form or terms of service. Store only what you need and implement a data deletion policy.
What happens when the Claude API is unavailable?
The Edge Function catches API errors and updates parse_status='failed' with the error message. The UI shows a Retry button. For production systems, implement exponential backoff: if parsing fails, queue a retry after 60 seconds, then 5 minutes, then 30 minutes. Store retry_count in the candidates table to stop after 3 attempts.
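A sketch of the bookkeeping side, assuming hypothetical retry_count and next_retry_at columns plus some scheduler (pg_cron, or a client-initiated retry) that re-invokes parse-resume once next_retry_at passes:

const BACKOFF_MS = [60_000, 5 * 60_000, 30 * 60_000] // 1 min, 5 min, 30 min

async function scheduleRetry(candidateId: string, retryCount: number): Promise<void> {
  if (retryCount >= BACKOFF_MS.length) return // stop after 3 attempts
  await supabase
    .from('candidates')
    .update({
      parse_status: 'pending',
      retry_count: retryCount + 1, // hypothetical column
      next_retry_at: new Date(Date.now() + BACKOFF_MS[retryCount]).toISOString(), // hypothetical column
    })
    .eq('id', candidateId)
}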
Can I use a different AI model instead of Claude?
Yes. Replace the Anthropic client with the OpenAI SDK and use GPT-4o, which also supports vision and document inputs. The extraction prompt works similarly. Store OPENAI_API_KEY in Cloud tab → Secrets instead of ANTHROPIC_API_KEY. The rest of the Edge Function structure stays the same.
Can I get help building a full ATS or recruitment platform?
RapidDev builds Lovable apps with complete recruitment workflows including job board integration, interview scheduling, offer management, and HRIS sync. Reach out if you need features beyond resume parsing.