Add document scanning to FlutterFlow using the cunning_document_scanner or document_scanner_flutter package via Custom Actions. The scanner opens the camera with automatic edge detection and perspective correction. Send the scanned image to Google Cloud Vision API via a Cloud Function for OCR text extraction. Store the image in Firebase Storage and save the URL plus OCR text to Firestore for searchable document management.
Build a document scanner with OCR and cloud storage in FlutterFlow
A document scanner in FlutterFlow works in three stages: capture (open the camera with edge detection to auto-crop the document), process (send the image to an OCR API to extract the text), and store (upload the image to Firebase Storage and save the URL and text to Firestore). The FlutterFlow app handles the capture and triggers the upload. A Cloud Function handles the OCR to keep your Vision API key off the client. The resulting Firestore document is searchable by extracted text, title, and tags, giving you a mobile document management system built entirely on Firebase and Google Cloud (note that Cloud Vision is billed per request beyond its monthly free tier).
Prerequisites
- A FlutterFlow project connected to Firebase (Firestore + Storage)
- Google Cloud Vision API enabled in your Firebase project's Google Cloud Console
- Camera permission configured in FlutterFlow Settings → App Details → Permissions
- A physical iOS or Android device for testing — document scanning cannot be tested in the emulator
Step-by-step guide
Add the document scanner package to Pubspec Dependencies
Go to Custom Code → Pubspec Dependencies → Add Dependency. Add cunning_document_scanner with version ^1.0.5. This package provides a full-screen camera view with real-time edge detection, perspective correction, and brightness adjustment. After saving, also add image ^4.1.3 for optional image resizing before upload. Click Compile Code to verify. For iOS, you also need to add a camera usage description — in FlutterFlow, go to Settings → App Details → iOS → Info.plist Entries and add NSCameraUsageDescription with value 'Required for document scanning'.
Expected result: Packages compile successfully and camera permission is configured for iOS.
Create the Scan Document Custom Action
Go to Custom Code → Custom Actions → Add Action. Name it scanDocument. Import: import 'package:cunning_document_scanner/cunning_document_scanner.dart';. Call final pictures = await CunningDocumentScanner.getPictures(noOfPages: 5, isGalleryImportAllowed: true);. This opens the scanner UI — the user photographs the document, the app applies perspective correction, and the function returns a List<String> of file paths to the scanned images (the result may be null if the user cancels, so guard against that before using it). Return the first path as a String or iterate for multi-page documents. In the FlutterFlow Action Flow that calls scanDocument, store the returned path in a Page State variable scannedImagePath, then show an Image widget bound to that path so the user can review before uploading.
Expected result: Tapping the Scan button opens the camera scanner, and the captured image appears in a preview widget after the user confirms the scan.
Upload the scanned image to Firebase Storage
After the user confirms the scan preview, trigger an upload. In FlutterFlow, add an Action Flow to the Confirm Upload button: use the Backend Call action → Upload File to Firebase Storage. Pass the scannedImagePath from Page State. Set the upload path to documents/{userId}/{timestamp}.jpg. The Upload File action returns the download URL — store it in Page State variable documentUrl. Alternatively, create a Custom Action named uploadScannedDoc that uses the firebase_storage package directly for more control: final ref = FirebaseStorage.instance.ref('documents/${userId}/${DateTime.now().millisecondsSinceEpoch}.jpg'); await ref.putFile(File(scannedImagePath)); final url = await ref.getDownloadURL();. This Custom Action needs import 'dart:io'; and import 'package:firebase_storage/firebase_storage.dart';. Return the URL from the Custom Action.
Expected result: The scanned image uploads to Firebase Storage and the download URL is stored in Page State for use in the next step.
Run OCR via Google Cloud Vision API in a Cloud Function
In your Firebase project, create a Cloud Function named extractDocumentText (Node.js). The function receives: imageUrl, userId, documentTitle. It fetches the image from the Storage URL, sends it to Vision API: const [result] = await client.textDetection(imageUrl); const text = result.fullTextAnnotation.text;. Then creates a Firestore document in the documents collection: { userId, title: documentTitle, imageUrl, ocrText: text, createdAt: admin.firestore.FieldValue.serverTimestamp(), tags: [] }. In FlutterFlow, call this Cloud Function via an API Group: API Manager → Add API Group → your Cloud Functions base URL. Add an API Call named extractDocumentText with POST method, JSON body containing imageUrl, userId, documentTitle. Trigger this API Call after the successful Firebase Storage upload.
Expected result: After upload, the Cloud Function creates a Firestore document with the extracted OCR text visible in Firebase Console → Firestore.
Build the searchable document library
Create a DocumentLibrary page. Add a TextField named searchField with On Text Change triggering a Page State update for searchQuery. Add a Backend Query on the page body bound to the documents collection, filtered by userId == currentUser.uid and ordered by createdAt descending. Add a ListView using Generate Dynamic Children. Each document card shows: an Image widget bound to item.imageUrl, a Text for item.title, and a Text showing a truncated item.ocrText preview (first 100 characters). For search, add a Conditional to the Backend Query: if searchQuery is not empty, add an Array Contains filter matching the lowercased query against the tags field that the Cloud Function populates from keywords in the OCR text. Firestore cannot do substring matching on ocrText directly, so the tags array is what makes content search possible.
Expected result: The document library displays all scanned documents for the current user, and typing in the search field filters results in real time.
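The matching behavior the tag-based Firestore filter approximates can be sketched in plain JavaScript. The helper names (normalizeQuery, matchesQuery) and the sample documents below are illustrative, not FlutterFlow or Firestore APIs; the normalization mirrors what the Cloud Function applies when it builds the tags array:

```javascript
// Sketch of the search logic behind the tag-based filter. A document
// matches when the normalized query equals one of its tags (this mirrors
// Firestore's Array Contains condition) or, as a looser client-side
// fallback, when the query appears anywhere in the OCR text.
function normalizeQuery(query) {
  // Same normalization the Cloud Function applies when building tags
  return query.toLowerCase().replace(/[^a-z0-9]/g, '');
}

function matchesQuery(doc, query) {
  const q = normalizeQuery(query);
  if (!q) return true; // empty query shows every document
  return doc.tags.includes(q) || doc.ocrText.toLowerCase().includes(q);
}

const docs = [
  { title: 'Invoice', tags: ['invoice', 'march'], ocrText: 'Invoice March 2024' },
  { title: 'Receipt', tags: ['receipt'], ocrText: 'Grocery receipt' },
];
const results = docs.filter(d => matchesQuery(d, 'invoice'));
```

Because Array Contains needs an exact tag match, normalizing the user's query the same way the Cloud Function normalizes tags is what makes single-keyword search reliable.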
Complete working example
```javascript
// Cloud Function: extractDocumentText
// Deploy with: firebase deploy --only functions:extractDocumentText
const functions = require('firebase-functions');
const admin = require('firebase-admin');
const vision = require('@google-cloud/vision');

admin.initializeApp();
const visionClient = new vision.ImageAnnotatorClient();

exports.extractDocumentText = functions.https.onRequest(async (req, res) => {
  if (req.method !== 'POST') {
    return res.status(405).send('Method Not Allowed');
  }

  const { imageUrl, userId, documentTitle } = req.body;

  if (!imageUrl || !userId) {
    return res.status(400).json({ error: 'imageUrl and userId required' });
  }

  try {
    // Run OCR on the image URL
    const [result] = await visionClient.textDetection(imageUrl);
    const fullText = result.fullTextAnnotation
      ? result.fullTextAnnotation.text
      : '';

    // Extract keywords for search tags (first 20 words longer than
    // 3 characters, lowercased and deduplicated)
    const tags = [...new Set(
      fullText
        .split(/\s+/)
        .filter(w => w.length > 3)
        .map(w => w.toLowerCase().replace(/[^a-z0-9]/g, ''))
        .filter(Boolean)
        .slice(0, 20)
    )];

    // Save to Firestore
    const docRef = await admin.firestore()
      .collection('documents')
      .add({
        userId,
        title: documentTitle || 'Untitled Document',
        imageUrl,
        ocrText: fullText,
        tags,
        createdAt: admin.firestore.FieldValue.serverTimestamp(),
      });

    return res.status(200).json({
      documentId: docRef.id,
      ocrText: fullText,
      tags,
    });
  } catch (error) {
    console.error('OCR error:', error);
    return res.status(500).json({ error: 'OCR processing failed' });
  }
});
```
Common mistakes
Mistake: Sending the full-resolution scanned image to the Vision API
Why it's a problem: Full-resolution scans are large, so uploads take longer, Storage and bandwidth costs grow, and OCR latency increases, with little or no accuracy gain for printed documents.
How to avoid: Resize the scanned image to a maximum width of 1024 pixels in the Custom Action before upload using the image package: final img = decodeImage(File(path).readAsBytesSync())!; final resized = copyResize(img, width: 1024); final resizedBytes = encodeJpg(resized, quality: 85);. Write the resized bytes to a new temp file and upload that instead.
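The arithmetic behind that max-width resize is simple proportional scaling; here it is sketched in JavaScript (scaleToMaxWidth is an illustrative helper, and Dart's copyResize computes the height the same way when only width is given):

```javascript
// Compute target dimensions for a max-width resize that preserves the
// aspect ratio and never upscales an image that is already small enough.
function scaleToMaxWidth(width, height, maxWidth) {
  if (width <= maxWidth) return { width, height }; // nothing to do
  const ratio = maxWidth / width;
  return { width: maxWidth, height: Math.round(height * ratio) };
}

// A typical phone-camera scan (3024x4032) scaled down for OCR
const scaled = scaleToMaxWidth(3024, 4032, 1024);
```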
Mistake: Calling the Vision API directly from the FlutterFlow app
Why it's a problem: Any API key shipped in the client binary can be extracted and abused, running up charges on your Google Cloud account.
How to avoid: Route all Vision API calls through a Cloud Function. The function runs server-side with credentials from its Google Cloud service account, so no API key is needed in the client at all.
Mistake: Storing the raw unprocessed image path from the scanner without uploading
Why it's a problem: The scanner writes to a temporary directory that the operating system can clear at any time, so a stored local path may stop resolving after a restart or cache cleanup.
How to avoid: Upload the scanned image to Firebase Storage immediately after the scan completes (Step 3). Use the permanent Firebase Storage URL for all downstream operations, including display and OCR.
Best practices
- Show a loading indicator during the upload and OCR process — a combined scan, upload, and OCR flow typically takes 3-8 seconds and users will assume the app is frozen without feedback
- Allow users to rename documents after scanning — the default title from OCR is often the first line of text, which may not be meaningful
- Add a tags field to the Firestore document schema and allow users to manually add tags from the document detail page for better organization beyond OCR text search
- Store the page count in the Firestore document — multi-page scans should reference an array of Storage URLs rather than a single imageUrl field
- Implement a delete flow that removes both the Firestore document AND the Firebase Storage file — orphaned Storage files accumulate costs silently
- Cache recently viewed document thumbnails locally using the FlutterFlow cached network image widget to avoid re-downloading large images on every visit
- Request camera permissions with an explanation screen before opening the scanner — explain why scanning requires camera access to improve permission grant rates on iOS
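The delete-flow practice above needs the Storage object path, which can be recovered from the download URL saved in Firestore. A sketch, assuming the standard firebasestorage.googleapis.com URL shape (storagePathFromDownloadUrl is an illustrative helper, not an SDK function):

```javascript
// Firebase Storage download URLs look like:
//   https://firebasestorage.googleapis.com/v0/b/<bucket>/o/<url-encoded path>?alt=media&token=...
// Deleting the underlying file requires the decoded object path.
function storagePathFromDownloadUrl(url) {
  const match = url.match(/\/o\/([^?]+)/);
  if (!match) throw new Error('Not a Firebase Storage download URL');
  return decodeURIComponent(match[1]);
}

const path = storagePathFromDownloadUrl(
  'https://firebasestorage.googleapis.com/v0/b/my-app.appspot.com/o/documents%2Fuser123%2F1700000000.jpg?alt=media&token=abc'
);
// In a Cloud Function, the path can then be passed to
// admin.storage().bucket().file(path).delete() alongside the
// Firestore document delete.
```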
Still stuck?
Copy one of these prompts to get a personalized, step-by-step explanation.
I am building a document scanning app in FlutterFlow using the cunning_document_scanner Flutter package. Write a Dart Custom Action that: (1) calls CunningDocumentScanner.getPictures() to open the scanner, (2) takes the first returned image path, (3) resizes it to max 1024px wide using the image package, (4) returns the path to the resized image. Include imports and error handling for the case where the user cancels the scan.
Create a Scan Document button action flow in my FlutterFlow app. When tapped: call the scanDocument custom action to get the image path, store it in Page State scannedImagePath, show an Image widget preview bound to scannedImagePath, and show a Confirm and Upload button that calls a Cloud Function API call with the image URL and current user ID.
Frequently asked questions
What Flutter package should I use for document scanning in FlutterFlow?
The two most reliable options are cunning_document_scanner (^1.0.5) and document_scanner_flutter (^1.0.7). Both provide automatic edge detection and perspective correction. Add either one in Custom Code → Pubspec Dependencies. If one has a version conflict with FlutterFlow's current Flutter SDK, try the other. Both require camera permission configured in Settings → App Details.
Can I scan multiple pages into one document?
Yes. The cunning_document_scanner getPictures function accepts a noOfPages parameter that controls how many pages can be scanned in one session. Set it to 10 for multi-page documents. The function returns a List<String> of file paths. Upload each to Firebase Storage at documents/{userId}/{docId}/page_{n}.jpg and store the full array of URLs in a pages array field in the Firestore document.
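The per-page upload paths described above can be generated with a small helper (pagePaths is illustrative, not part of any package; the path convention follows the documents/{userId}/{docId}/page_{n}.jpg scheme from the answer):

```javascript
// Build the Firebase Storage path for each page of a multi-page scan.
// Pages are numbered from 1 to match the page order returned by the scanner.
function pagePaths(userId, docId, pageCount) {
  return Array.from(
    { length: pageCount },
    (_, i) => `documents/${userId}/${docId}/page_${i + 1}.jpg`
  );
}

const paths = pagePaths('user123', 'doc456', 3);
// Upload each scanner result to its path, then store the resulting
// download URLs in the Firestore document's pages array field.
```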
How accurate is the OCR text extraction?
Google Cloud Vision API's TEXT_DETECTION achieves over 95% accuracy on clearly printed text in good lighting. Handwriting accuracy is lower (70-85%) depending on legibility. For best results, ensure the document is flat, well-lit, and fills at least 60% of the camera frame. The cunning_document_scanner's automatic perspective correction significantly improves OCR accuracy compared to a raw photo.
How do I search documents by their content in FlutterFlow?
Firestore does not support full-text substring search natively. The approach in this tutorial uses keyword tags extracted from OCR text and stored as an array field — you can query with arrayContains for single keywords. For more powerful search (partial words, relevance ranking), add Algolia to your Cloud Function: after creating the Firestore document, also index it in Algolia using the algoliasearch npm package. Then add an Algolia search API call in FlutterFlow's API Manager.
Can I scan documents without an internet connection?
The camera scanning and image capture work offline. The OCR step requires internet (Cloud Function calls Vision API). To support offline workflows, scan and store the image locally using App State, then trigger the OCR Cloud Function call when connectivity is restored. Firebase's offline persistence can queue Firestore writes while offline and sync when the connection resumes.
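The offline workflow described here amounts to a pending-scan queue that drains when connectivity returns. A minimal sketch (PendingScanQueue and the injected upload function are illustrative, not a FlutterFlow or Firebase API):

```javascript
// Queue scans captured offline and upload them once a network connection
// is available. The upload function is injected so the queue stays
// independent of any particular storage backend.
class PendingScanQueue {
  constructor(uploadFn) {
    this.uploadFn = uploadFn; // async (localPath) => remoteUrl
    this.pending = [];
  }

  enqueue(localPath) {
    this.pending.push(localPath);
  }

  // Call when connectivity is restored; uploads in FIFO order and
  // returns the remote URLs for the drained scans.
  async drain() {
    const uploaded = [];
    while (this.pending.length > 0) {
      const path = this.pending.shift();
      uploaded.push(await this.uploadFn(path));
    }
    return uploaded;
  }
}
```

In the app, each drained URL would then trigger the OCR Cloud Function call, exactly as in the online flow.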
What if I need help building a more complex document management system?
Document management systems with folder hierarchies, version history, shared access, digital signatures, and full-text search involve significant backend and FlutterFlow architecture work. RapidDev has built document management apps in FlutterFlow for legal, healthcare, and field service clients. If your requirements go beyond basic scan-and-store, reach out for a scoping discussion.
Talk to an Expert
Our team has built 600+ apps. Get personalized help with your project.
Book a free consultation