Add image recognition to a FlutterFlow app using either on-device processing with the google_mlkit_image_labeling package (works offline, no API cost) or cloud-based Google Cloud Vision API via a Cloud Function (higher accuracy, returns bounding boxes and text). Capture an image with a Custom Action using camera or ImagePicker, send it to the recognition API, and display results as a list of labels with confidence scores. Only show labels with confidence above 70% to keep results user-friendly.
Add AI-powered image recognition to any FlutterFlow app
Image recognition lets your app understand what is in a photo — identifying objects, scenes, plants, animals, products, or custom categories trained for your domain. FlutterFlow supports two approaches: on-device ML using Google's MLKit (runs locally on the device, no internet required, no API cost) and cloud ML using Google Cloud Vision API (higher accuracy, supports text extraction, object localization, logo detection, and face detection, but requires internet and a Cloud Function proxy). Most apps start with on-device for basic classification and add the cloud API when they need more sophisticated detection.
Prerequisites
- A FlutterFlow project with Standard plan or higher (Custom Code required)
- A Google Cloud project with the Cloud Vision API enabled (for the cloud option)
- Basic understanding of Custom Actions and Custom Widgets in FlutterFlow
Step-by-step guide
Option A: On-device image labeling with MLKit
In FlutterFlow, go to Custom Code → Pubspec Dependencies → Add Dependency and add google_mlkit_image_labeling: ^0.11.0 and image_picker: ^1.0.4. Then create a Custom Action named classifyImageOnDevice that:
1. Opens the image picker with ImagePicker().pickImage(source: ImageSource.gallery) (or ImageSource.camera)
2. Creates an InputImage from the picked file's path
3. Builds an ImageLabeler with ImageLabelerOptions(confidenceThreshold: 0.5)
4. Runs imageLabeler.processImage(inputImage) to get a List<ImageLabel>
5. Filters to labels with confidence > 0.7
6. Sorts by confidence descending
7. Formats each label as a string like 'Dog (87%)' and stores the list in the App State variable recognitionResults (List of String)
Call this Custom Action from a Button's onTap action, and display recognitionResults in a ListView with one Text widget per item.
```dart
// Custom Action: classifyImageOnDevice
import 'package:image_picker/image_picker.dart';
import 'package:google_mlkit_image_labeling/google_mlkit_image_labeling.dart';

Future<void> classifyImageOnDevice() async {
  final picker = ImagePicker();
  final pickedFile = await picker.pickImage(
    source: ImageSource.gallery,
    maxWidth: 1024,
    maxHeight: 1024,
  );
  if (pickedFile == null) return;

  final inputImage = InputImage.fromFilePath(pickedFile.path);
  final labeler = ImageLabeler(
    options: ImageLabelerOptions(confidenceThreshold: 0.5),
  );

  final labels = await labeler.processImage(inputImage);
  await labeler.close();

  // Filter to useful labels, sort by confidence descending
  final results = labels
      .where((label) => label.confidence > 0.7)
      .toList()
    ..sort((a, b) => b.confidence.compareTo(a.confidence));

  // Format nicely, keeping only the top 5 labels
  final formatted = results
      .take(5)
      .map((l) => '${l.label} (${(l.confidence * 100).toStringAsFixed(0)}%)')
      .toList();

  // Store in App State: recognitionResults (List<String>)
  FFAppState().update(() {
    FFAppState().recognitionResults = formatted;
    FFAppState().classifiedImagePath = pickedFile.path;
  });
}
```

Expected result: Tapping the classify button opens the image picker, processes the image on-device, and populates App State with labels like 'Dog (94%)', 'Animal (91%)', 'Golden Retriever (87%)'.
Option B: Google Cloud Vision API via a Cloud Function
Deploy a Cloud Function named analyzeImage that accepts a base64-encoded image and a detection type (LABEL_DETECTION, OBJECT_LOCALIZATION, TEXT_DETECTION, FACE_DETECTION, or LOGO_DETECTION). The function calls the Vision API endpoint POST https://vision.googleapis.com/v1/images:annotate with your API key. For label detection it returns labels with descriptions and scores; for object localization it returns object names with bounding polygon coordinates. Enable the Cloud Vision API in Google Cloud Console under APIs & Services, and store the API key in the Cloud Function's environment variables as VISION_API_KEY. In FlutterFlow, add an API Call for analyzeImage to your Cloud Functions API Group. After the Custom Action captures the image and encodes it to base64, call the Cloud Function API and store the returned labels array in App State.
```javascript
// Cloud Function: analyzeImage
const functions = require('firebase-functions');
const fetch = require('node-fetch');

exports.analyzeImage = functions.https.onCall(async (data, context) => {
  const { imageBase64, detectionType = 'LABEL_DETECTION' } = data;
  const apiKey = process.env.VISION_API_KEY;

  const response = await fetch(
    `https://vision.googleapis.com/v1/images:annotate?key=${apiKey}`,
    {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        requests: [{
          image: { content: imageBase64 },
          features: [
            { type: detectionType, maxResults: 10 },
          ],
        }],
      }),
    }
  );

  const result = await response.json();
  const annotations = result.responses[0];

  if (detectionType === 'LABEL_DETECTION') {
    return {
      labels: (annotations.labelAnnotations || []).map(l => ({
        name: l.description,
        confidence: Math.round(l.score * 100),
      })),
    };
  }

  if (detectionType === 'OBJECT_LOCALIZATION') {
    return {
      objects: (annotations.localizedObjectAnnotations || []).map(o => ({
        name: o.name,
        confidence: Math.round(o.score * 100),
        bounds: o.boundingPoly.normalizedVertices,
      })),
    };
  }

  return { raw: annotations };
});
```

Expected result: The Cloud Function returns structured label or object data from Cloud Vision API, ready to display in a FlutterFlow ListView.
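On the FlutterFlow side, the Custom Action that feeds this function could look roughly like the sketch below. This is an illustration, not a definitive implementation: it assumes the cloud_functions package has been added as a pubspec dependency, that analyzeImage is deployed as a callable function (as in the code above), and the name analyzeImageInCloud is hypothetical.

```dart
// Hypothetical Custom Action: base64-encode a picked image and send it to
// the analyzeImage callable Cloud Function (Option B).
// Assumes the cloud_functions package is added as a dependency.
import 'dart:convert';
import 'dart:io';

import 'package:cloud_functions/cloud_functions.dart';

Future<List<String>> analyzeImageInCloud(String imagePath) async {
  // Read the (already resized) image file and base64-encode it
  final bytes = await File(imagePath).readAsBytes();
  final imageBase64 = base64Encode(bytes);

  // Invoke the callable Cloud Function deployed above
  final callable = FirebaseFunctions.instance.httpsCallable('analyzeImage');
  final result = await callable.call(<String, dynamic>{
    'imageBase64': imageBase64,
    'detectionType': 'LABEL_DETECTION',
  });

  // Format the returned labels as 'Dog (87%)' strings for App State
  final labels = (result.data['labels'] as List?) ?? [];
  return labels
      .map((l) => "${l['name']} (${l['confidence']}%)")
      .toList()
      .cast<String>();
}
```

The returned list can be stored in the same recognitionResults App State variable used by Option A, so the display UI works unchanged for both options.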
Build the image recognition UI in FlutterFlow
Create an ImageRecognition page:
- Add an Image widget bound to App State classifiedImagePath (initially showing a placeholder).
- Below the image, add a Row with two Buttons, 'From Camera' and 'From Gallery', each calling the classifyImageOnDevice Custom Action with the appropriate source.
- Add a CircularProgressIndicator that shows while the action runs (bind its visibility to an App State isProcessing boolean).
- Below that, add a Column with a Text header 'Detected:' and a ListView bound to App State recognitionResults (List of String).
- For each list item, use a Container with a Text widget displaying the label string (already formatted as 'Dog (87%)') and a LinearProgressIndicator showing the confidence visually; use a Custom Function to parse the percentage back out of the string.
- Add a 'Classify Another' button that clears App State recognitionResults and classifiedImagePath.
Expected result: A complete image recognition screen where users can capture or select a photo and see labeled results with confidence indicators.
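The Custom Function that parses the percentage out of the formatted label string could be a small sketch like this (confidenceFromLabel is an assumed name; it expects the 'Dog (87%)' format produced by the Custom Action):

```dart
// Hypothetical FlutterFlow Custom Function: extract the confidence from a
// formatted label string like 'Dog (87%)'.
// Returns a 0.0-1.0 value suitable for a LinearProgressIndicator.
double confidenceFromLabel(String label) {
  final match = RegExp(r'\((\d+)%\)').firstMatch(label);
  if (match == null) return 0.0; // no percentage found, show an empty bar
  return int.parse(match.group(1)!) / 100.0;
}
```

Bind the LinearProgressIndicator's value to this function applied to the current list item, e.g. confidenceFromLabel('Dog (87%)') yields 0.87.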
Add a bounding box overlay for object detection results
When using OBJECT_LOCALIZATION, Cloud Vision returns normalized vertex coordinates for each detected object's bounding polygon. To draw these overlays in FlutterFlow, create a Custom Widget named BoundingBoxOverlay. It accepts the image file path and a list of detected objects (name, confidence, normalized bounds). The widget uses a Stack: an Image widget at the bottom, and a CustomPaint widget on top that draws colored rectangles scaled to the image dimensions. The normalizedVertices coordinates (0.0 to 1.0) are multiplied by the image width and height to get pixel coordinates. Each rectangle has a label drawn above it with the object name and confidence. This Custom Widget replaces the plain Image widget on the recognition results page.
Expected result: Detected objects are highlighted with colored bounding boxes and labels overlaid on top of the original image.
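The painter inside the BoundingBoxOverlay Custom Widget could be sketched as follows. This is a minimal illustration under assumptions: each detected object is treated as a map with 'name', 'confidence', and 'bounds' keys (matching what the analyzeImage Cloud Function returns for OBJECT_LOCALIZATION), and BoundingBoxPainter is a hypothetical class name.

```dart
// Sketch of the CustomPainter used by the BoundingBoxOverlay Custom Widget.
// Each object: {'name': ..., 'confidence': ..., 'bounds': normalizedVertices}
import 'package:flutter/material.dart';

class BoundingBoxPainter extends CustomPainter {
  BoundingBoxPainter(this.objects);
  final List<Map<String, dynamic>> objects;

  @override
  void paint(Canvas canvas, Size size) {
    final boxPaint = Paint()
      ..style = PaintingStyle.stroke
      ..strokeWidth = 2
      ..color = Colors.greenAccent;

    for (final obj in objects) {
      // Scale normalized (0.0-1.0) vertices to the rendered image size
      final vertices = obj['bounds'] as List;
      final xs = vertices.map((v) => ((v['x'] ?? 0.0) as num) * size.width);
      final ys = vertices.map((v) => ((v['y'] ?? 0.0) as num) * size.height);
      final rect = Rect.fromLTRB(
        xs.reduce((a, b) => a < b ? a : b).toDouble(),
        ys.reduce((a, b) => a < b ? a : b).toDouble(),
        xs.reduce((a, b) => a > b ? a : b).toDouble(),
        ys.reduce((a, b) => a > b ? a : b).toDouble(),
      );
      canvas.drawRect(rect, boxPaint);

      // Draw 'Dog (87%)' just above the box
      final painter = TextPainter(
        text: TextSpan(
          text: "${obj['name']} (${obj['confidence']}%)",
          style: const TextStyle(color: Colors.greenAccent, fontSize: 12),
        ),
        textDirection: TextDirection.ltr,
      )..layout();
      painter.paint(canvas, Offset(rect.left, rect.top - 16));
    }
  }

  @override
  bool shouldRepaint(covariant BoundingBoxPainter old) =>
      old.objects != objects;
}
```

In the Custom Widget, wrap the Image in a Stack and place a CustomPaint with this painter on top, sized to match the rendered image.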
Complete working example
```
IMAGE RECOGNITION IN FLUTTERFLOW — BOTH OPTIONS

OPTION A: ON-DEVICE (MLKit)
├── Package: google_mlkit_image_labeling + image_picker
├── No API key, no internet required, works offline
├── Custom Action: classifyImageOnDevice
│   ├── ImagePicker → pick from gallery or camera
│   ├── ImageLabeler.processImage(inputImage)
│   ├── Filter: confidence > 0.7
│   ├── Format: 'Label (87%)'
│   └── Store in App State: recognitionResults
├── Accuracy: good for general categories
└── Supports: labels only (no text, no bounding boxes)

OPTION B: CLOUD VISION API
├── Cloud Function: analyzeImage
│   ├── Accepts: imageBase64, detectionType
│   ├── Calls: vision.googleapis.com/v1/images:annotate
│   └── Returns: labels[] or objects[] with bounds
├── Supports: LABEL_DETECTION, OBJECT_LOCALIZATION,
│   TEXT_DETECTION, FACE_DETECTION, LOGO_DETECTION
├── API key: stored in CF env var VISION_API_KEY
└── Cost: $1.50/1000 images (LABEL), $4.50/1000 (OBJECT)

FLUTTERFLOW UI:
├── Image widget (bound to App State: classifiedImagePath)
├── Row: 'Camera' button + 'Gallery' button
│   └── Both call classifyImageOnDevice Custom Action
├── CircularProgressIndicator (while processing)
└── ListView (App State: recognitionResults)
    └── Each item: label text + confidence bar

RESULT DISPLAY RULES:
├── Only show labels with confidence > 70%
├── Sort by confidence descending
├── Show top 5 labels maximum
├── Format: 'Golden Retriever (94%)' not '0.9412'
└── No confidence bar needed if label includes %

BOUNDING BOX OVERLAY (Custom Widget):
├── Stack: Image + CustomPaint overlay
├── normalizedVertices × image_size = pixel coordinates
└── Each box: colored border + name/confidence label
```

Common mistakes
Mistake: Displaying raw confidence scores like 0.8723 directly to users
How to avoid: Format confidence as a human-friendly percentage: 'Dog (87%)'. Only show labels above 70% confidence. Sort by confidence descending and show a maximum of 5 results. This makes the feature feel intuitive and accurate.
Mistake: Sending full-resolution images (4-8MB) to the Vision API on every classification
How to avoid: Resize images to a maximum of 1024 pixels on the longest side before encoding to base64. Set maxWidth: 1024, maxHeight: 1024 in the ImagePicker call. This reduces image size by 80-95% with no meaningful accuracy loss.
Mistake: Calling the Vision API directly from FlutterFlow with the API key in the request header
How to avoid: All Vision API calls must go through a Cloud Function. The Flutter app sends the base64 image to the Cloud Function URL, and the Cloud Function uses the API key stored as an environment variable. Never put Vision API keys in FlutterFlow.
Best practices
- Start with on-device MLKit (google_mlkit_image_labeling) for basic label detection — it is free, works offline, and requires no API key
- Switch to Google Cloud Vision when you need text extraction (OCR), face detection, logo recognition, or higher accuracy on specialized categories
- Always resize images to max 1024px before sending to any vision API to reduce upload time and API cost by 80-95%
- Filter results to labels with confidence above 70% and show a maximum of 5 results — more labels with lower confidence create noise, not value
- Store recognition results in App State so users can navigate away and return without re-processing the image
- Add a clear 'Classify Another' button that resets the results — do not let stale results from a previous image persist when a new image is selected
- Test image recognition across diverse lighting conditions, image angles, and compressed images typical of mobile photo sharing — accuracy degrades with poor image quality
Still stuck?
Copy one of these prompts to get a personalized, step-by-step explanation.
I am building a FlutterFlow app that needs to classify images taken by the user's camera. Explain the difference between using google_mlkit_image_labeling on-device versus Google Cloud Vision API via a Cloud Function. Which should I use for a [product identification / plant recognition / general object detection] use case? Show me the code for whichever you recommend, including how to display results as user-friendly labels with confidence percentages.
Add AI image recognition to my FlutterFlow app. Create a Custom Action called classifyImageOnDevice that uses image_picker to let the user select a photo from the gallery and google_mlkit_image_labeling to classify it. Filter results to labels with confidence above 70%, format them as 'Label (XX%)' strings, sort by confidence descending, and store the top 5 in App State. Create an ImageRecognition page with an Image widget and a ListView displaying the results.
Frequently asked questions
How do I add image recognition to a FlutterFlow app?
Add the google_mlkit_image_labeling and image_picker packages via Custom Code → Pubspec Dependencies. Create a Custom Action that picks an image, runs the MLKit labeler, filters results above 70% confidence, and stores formatted label strings in App State. Display the App State list in a ListView. This works offline with no API key. For higher accuracy or object localization, add a Cloud Function that calls Google Cloud Vision API and call it from FlutterFlow.
Does FlutterFlow have built-in image recognition features?
FlutterFlow does not have a built-in image recognition widget. You implement it through Custom Code: a Custom Action for capturing and processing images, and either the google_mlkit_image_labeling package for on-device inference or a Cloud Function call for the Google Cloud Vision API. Both approaches are fully supported within FlutterFlow's Custom Code system on the Standard plan and above.
What is the difference between Google MLKit and Google Cloud Vision for image recognition?
MLKit (google_mlkit_image_labeling) runs on the device using a pre-downloaded TensorFlow Lite model. It works offline, is free, processes images in under 200ms, but only returns general category labels and has lower accuracy for specialized domains. Google Cloud Vision runs in Google's data centers, requires internet, costs $1.50 per 1,000 images for label detection, processes in 1-3 seconds, and supports specialized detection types: text (OCR), object localization with bounding boxes, faces, logos, landmarks, explicit content.
Can I train a custom image recognition model for my specific use case?
Yes. Google Cloud AutoML Vision lets you train a custom model by uploading labeled images of your categories (minimum 100 images per class). Alternatively, use Google Vertex AI to train a TensorFlow Lite model for on-device deployment. For on-device custom models, add the tflite_flutter package as a Custom Widget in FlutterFlow and load your .tflite model from assets. Custom models are useful for domain-specific recognition like product identification, plant species, defect detection, or any category not covered by the general MLKit labeler.
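Running such a custom .tflite model on-device could be sketched roughly as below. This is a hedged illustration, not a definitive implementation: the asset path, class labels, input shape, and the classifyWithCustomModel name are all placeholders for your own trained model, and it assumes the tflite_flutter package is added as a dependency with the model bundled as an asset.

```dart
// Hypothetical sketch: run a custom TFLite classifier on-device with the
// tflite_flutter package. Input is assumed to be a preprocessed image
// tensor shaped [1, height, width, 3]; adjust to your model's real shape.
import 'package:tflite_flutter/tflite_flutter.dart';

Future<String> classifyWithCustomModel(
    List<List<List<List<double>>>> input) async {
  // Load the .tflite model bundled in the app's assets (placeholder path)
  final interpreter =
      await Interpreter.fromAsset('assets/custom_model.tflite');

  // Output buffer shaped [1, numClasses]; placeholder class names
  const labels = ['healthy_leaf', 'leaf_spot', 'rust'];
  final output = [List<double>.filled(labels.length, 0.0)];

  interpreter.run(input, output);
  interpreter.close();

  // Pick the highest-scoring class and format it like the MLKit results
  var best = 0;
  for (var i = 1; i < labels.length; i++) {
    if (output[0][i] > output[0][best]) best = i;
  }
  return '${labels[best]} '
      '(${(output[0][best] * 100).toStringAsFixed(0)}%)';
}
```

Formatting the result the same way as the MLKit labels ('Label (XX%)') lets the custom model reuse the same recognitionResults display UI.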
How much does Google Cloud Vision API cost for image recognition?
Google Cloud Vision API pricing (March 2026): LABEL_DETECTION — first 1,000 units/month free, then $1.50 per 1,000 images. OBJECT_LOCALIZATION — $4.50 per 1,000 images. TEXT_DETECTION (OCR) — $1.50 per 1,000. FACE_DETECTION — $1.50 per 1,000. For an app with 1,000 monthly active users each classifying 5 images per month, that is 5,000 label detection calls; after the 1,000 free units, the remaining 4,000 billable calls cost $6 per month. The on-device MLKit alternative is completely free.
What if I need help building a product scanning or plant identification feature?
Domain-specific image recognition — where you need a model trained on your product catalog, specific plant species, or custom defect categories — involves model training, evaluation, and deployment beyond what general MLKit or Vision API provides. RapidDev can build custom image recognition features in FlutterFlow, including model training on Google Vertex AI, Cloud Function integration, and the FlutterFlow UI for capture and display.
Talk to an Expert
Our team has built 600+ apps. Get personalized help with your project.
Book a free consultation