RapidDev - Software Development Agency

How to Implement Voice Authentication in FlutterFlow


What you'll learn

  • How to record audio in FlutterFlow using a custom plugin and upload it to Firebase Storage
  • How to call a speaker recognition API from a Cloud Function to extract voice embeddings
  • How to implement random challenge phrases to prevent voice replay attacks
  • Why voice authentication must always be a second factor, never a standalone authentication method
Difficulty: Intermediate · Reading time: 10 min · Build time: 90-120 min · Requirements: FlutterFlow Pro+ (code export required for the microphone custom plugin); Firebase Blaze plan for Cloud Functions · Updated: March 2026 · Author: RapidDev Engineering Team
TL;DR

Voice authentication in FlutterFlow works by recording a passphrase during enrollment, sending the audio to a Cloud Function that extracts a voice embedding via a speaker recognition API, and storing the embedding in Firestore. At login, a new recording is compared to the stored embedding. Use random challenge words to prevent replay attacks. Always combine with a second factor — voice can be recorded and replayed.

How Voice Biometric Authentication Works

Voice authentication is not about recognizing what you say — it is about recognizing who is speaking, regardless of the words. The process has two phases.

Enrollment: the user reads a passphrase, the audio is analyzed to extract a voice embedding (a numerical vector representing the unique acoustic fingerprint of their voice), and this embedding is stored securely in Firestore.

Verification: at login, the user reads a randomly selected challenge phrase. The audio is analyzed to produce a new embedding, which is compared to the stored enrollment embedding using cosine similarity. If the similarity exceeds a threshold, verification passes.

The random challenge phrase prevents replay attacks in which a recording of the enrollment passphrase is played back. Because voice can be recorded and deepfaked, voice authentication should always be one factor in a multi-factor flow, not the only factor.
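The comparison step can be sketched in a few lines. This is the same cosine similarity computation used later in the Cloud Function; the vectors here are toy two-dimensional examples purely for arithmetic, while real voice embeddings typically have hundreds of dimensions.

```javascript
// Toy illustration of the verification comparison.
function cosineSimilarity(a, b) {
  const dot = a.reduce((sum, val, i) => sum + val * b[i], 0);
  const magA = Math.sqrt(a.reduce((s, v) => s + v * v, 0));
  const magB = Math.sqrt(b.reduce((s, v) => s + v * v, 0));
  return dot / (magA * magB);
}

const enrolled = [0.6, 0.8];          // stored at enrollment
const sameSpeaker = [0.59, 0.81];     // new recording, nearly the same direction
const differentSpeaker = [0.8, -0.6]; // orthogonal direction, different voice

console.log(cosineSimilarity(enrolled, sameSpeaker) >= 0.85);      // true
console.log(cosineSimilarity(enrolled, differentSpeaker) >= 0.85); // false
```

Cosine similarity compares the direction of the vectors, not their length, which is why it tolerates small variations in recording volume between sessions.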

Prerequisites

  • FlutterFlow Pro plan with code export (microphone recording requires a custom plugin)
  • Firebase project with Firestore, Storage, and Cloud Functions (Blaze plan)
  • A speaker recognition API account (e.g., Microsoft Azure Speaker Recognition, or AssemblyAI speaker diarization)
  • Understanding of Firebase Authentication — voice auth supplements, not replaces, standard Firebase Auth

Step-by-step guide

1

Record audio using a custom Flutter plugin

FlutterFlow does not have a built-in microphone recorder widget, so you need to add the record package (pub.dev/packages/record) as a custom dependency. In FlutterFlow's Custom Code panel, add record to your pubspec.yaml dependencies. Create a Custom Action called startRecording that initializes the recorder, requests microphone permission, and starts recording to a temporary file path, and a matching stopRecording action that stops the recorder and returns the file path. Connect these to a hold-to-speak button in FlutterFlow: startRecording on press down, stopRecording on press up. While recording is active, display a waveform animation (a simple animated Container that pulses) to give visual feedback.

audioRecorder.dart
// Custom Action: audioRecorder.dart
import 'package:record/record.dart';
import 'package:path_provider/path_provider.dart';

final AudioRecorder _recorder = AudioRecorder();
String? _currentPath;

Future<void> startRecording() async {
  final dir = await getTemporaryDirectory();
  _currentPath =
      '${dir.path}/voice_sample_${DateTime.now().millisecondsSinceEpoch}.m4a';
  if (await _recorder.hasPermission()) {
    await _recorder.start(
      const RecordConfig(encoder: AudioEncoder.aacLc, sampleRate: 16000),
      path: _currentPath!,
    );
  }
}

Future<String?> stopRecording() async {
  await _recorder.stop();
  return _currentPath;
}

Expected result: Holding the record button captures an audio file at the temporary path returned by stopRecording.

2

Upload the audio to Firebase Storage securely

After recording stops, upload the audio file to Firebase Storage under a path scoped to the user: voice_samples/{userId}/{timestamp}.m4a. In FlutterFlow use the Upload File action with the file path returned by stopRecording. Set the Storage path dynamically using the current user's UID to ensure users cannot access each other's recordings. Set the Storage security rule to allow read and write only to the file owner: allow read, write: if request.auth.uid == userId (extracted from the path). After upload, retrieve the download URL and pass it to the next step. Delete the temporary local file after successful upload to avoid accumulating audio files on the device.
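The owner-only rule described above might look like the following in your Storage rules file (the path segment names mirror the voice_samples/{userId} layout used in this step):

```
rules_version = '2';
service firebase.storage {
  match /b/{bucket}/o {
    match /voice_samples/{userId}/{fileName} {
      allow read, write: if request.auth != null
                         && request.auth.uid == userId;
    }
  }
}
```

The {userId} wildcard binds to the second path segment, so the rule compares the authenticated UID directly against the folder name.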

Expected result: The audio file is in Firebase Storage under the correct user-scoped path and the download URL is returned.

3

Extract a voice embedding via a Cloud Function

Create an onCall Cloud Function called extractVoiceEmbedding. The function accepts the Storage download URL, downloads the audio, and calls your chosen speaker recognition API (e.g., Azure Cognitive Services Speaker Recognition or a self-hosted resemblyzer model). The API returns a voice embedding — an array of floats representing the acoustic fingerprint. For enrollment, store this embedding in the user's Firestore document under voiceEmbedding (Array of Numbers) along with enrollmentTimestamp. For verification, the function computes the cosine similarity between the new embedding and the stored one and returns a score between 0 and 1. A score above 0.85 typically indicates the same speaker.

functions/voiceAuth.js
// functions/voiceAuth.js
const { onCall, HttpsError } = require('firebase-functions/v2/https');
const { getFirestore } = require('firebase-admin/firestore');
const axios = require('axios');

function cosineSimilarity(a, b) {
  const dot = a.reduce((sum, val, i) => sum + val * b[i], 0);
  const magA = Math.sqrt(a.reduce((s, v) => s + v * v, 0));
  const magB = Math.sqrt(b.reduce((s, v) => s + v * v, 0));
  return dot / (magA * magB);
}

exports.enrollVoice = onCall(async (request) => {
  const { audioUrl } = request.data;
  const uid = request.auth?.uid;
  if (!uid) throw new HttpsError('unauthenticated', 'Sign in before enrolling.');
  // Call the speaker recognition API to extract the embedding
  const response = await axios.post(
    `${process.env.SPEAKER_API_URL}/enroll`,
    { audioUrl },
    { headers: { 'Ocp-Apim-Subscription-Key': process.env.SPEAKER_API_KEY } }
  );
  const embedding = response.data.embedding;
  await getFirestore().collection('users').doc(uid).update({
    voiceEmbedding: embedding,
    voiceEnrolledAt: new Date(),
  });
  return { success: true };
});

exports.verifyVoice = onCall(async (request) => {
  const { audioUrl, challengeWord } = request.data;
  const uid = request.auth?.uid;
  if (!uid) throw new HttpsError('unauthenticated', 'Sign in before verifying.');
  const userDoc = await getFirestore().collection('users').doc(uid).get();
  const storedEmbedding = userDoc.data()?.voiceEmbedding;
  if (!storedEmbedding) throw new HttpsError('failed-precondition', 'Voice not enrolled');
  const response = await axios.post(
    `${process.env.SPEAKER_API_URL}/verify`,
    { audioUrl, challengeWord },
    { headers: { 'Ocp-Apim-Subscription-Key': process.env.SPEAKER_API_KEY } }
  );
  const newEmbedding = response.data.embedding;
  const score = cosineSimilarity(storedEmbedding, newEmbedding);
  return { verified: score > 0.85, score };
});

Expected result: enrollVoice stores the embedding in Firestore. verifyVoice returns a verified boolean and similarity score.

4

Implement random challenge words to prevent replay attacks

A replay attack uses a recording of the legitimate user's enrollment passphrase to pass verification. Prevent this by making the user read a randomly selected challenge phrase at each verification attempt. Store a list of 20-30 short challenge phrases in a Firestore config document (challenges/words). In FlutterFlow, when the verification screen opens, call a Custom Action that fetches a random phrase from the list and displays it. The user reads the displayed phrase aloud. Your Cloud Function receives both the audio URL and the expected challenge word and can validate that the recorded audio contains the correct phrase (using speech recognition before the speaker verification step). Rotate the challenge list weekly.
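The phrase selection and transcript check can be sketched as two small helpers. The function names are illustrative, and the sketch assumes a speech-to-text step has already transcribed the recording before speaker verification runs:

```javascript
// Pick a random phrase from the Firestore-backed challenge list for this session.
function pickChallenge(phrases) {
  return phrases[Math.floor(Math.random() * phrases.length)];
}

// Check that the spoken transcript contains the expected challenge phrase.
// A replayed recording fails this check because it contains an old phrase.
function matchesChallenge(transcript, expected) {
  const norm = (s) =>
    s.toLowerCase().replace(/[^a-z0-9 ]/g, ' ').replace(/\s+/g, ' ').trim();
  return norm(transcript).includes(norm(expected));
}

const phrases = ['blue falcon seven', 'green harbor nine', 'silver comet two'];
const challenge = pickChallenge(phrases);
console.log(matchesChallenge('Blue falcon seven!', 'blue falcon seven')); // true
```

Normalizing case and punctuation before comparing keeps the check tolerant of transcription formatting without weakening the replay protection.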

Expected result: Each verification session displays a unique random phrase. Replaying a previous recording fails because it does not contain the current challenge phrase.

5

Wire voice verification as a second factor in the login flow

In FlutterFlow, implement voice authentication as a step after standard Firebase email/password or social login. After the user successfully authenticates with their password, navigate to a VoiceVerification page. This page shows the random challenge phrase, the hold-to-speak button, and a 'Verify' button. On tap: record audio, upload to Storage, call the verifyVoice Cloud Function, and check the returned verified boolean. If true, update a Firestore session field voiceMfaVerified to true and navigate to the main app. If false, show an error and allow 2 more attempts before locking the session. Check voiceMfaVerified in your Firestore security rules for any sensitive data access.

Expected result: Users must complete both password authentication and voice verification to access sensitive app features.
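The three-attempt rule can be kept in a small piece of page state. This is a minimal sketch with illustrative names; in FlutterFlow you would hold the counter in a page state variable and branch on it in the action flow:

```javascript
// Retry/lockout bookkeeping for the verification page.
// "Allow 2 more attempts" means 3 tries total, then the session locks.
const MAX_ATTEMPTS = 3;

function nextState(state, verified) {
  if (verified) {
    // Success: mark the session verified and reset the counter.
    return { verified: true, attempts: 0, locked: false };
  }
  const attempts = state.attempts + 1;
  return { verified: false, attempts, locked: attempts >= MAX_ATTEMPTS };
}

let state = { verified: false, attempts: 0, locked: false };
state = nextState(state, false); // attempt 1 fails
state = nextState(state, false); // attempt 2 fails
state = nextState(state, false); // attempt 3 fails
console.log(state.locked); // true
```

Keeping the transition in one pure function makes the lockout rule easy to test and keeps the action flow itself simple.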

Complete working example

functions/voiceAuth.js
const { onCall, HttpsError } = require('firebase-functions/v2/https');
const { getFirestore, FieldValue } = require('firebase-admin/firestore');
const { initializeApp } = require('firebase-admin/app');
const axios = require('axios');

initializeApp();

function cosineSimilarity(a, b) {
  if (!a || !b || a.length !== b.length) return 0;
  const dot = a.reduce((sum, val, i) => sum + val * b[i], 0);
  const magA = Math.sqrt(a.reduce((s, v) => s + v * v, 0));
  const magB = Math.sqrt(b.reduce((s, v) => s + v * v, 0));
  if (magA === 0 || magB === 0) return 0;
  return dot / (magA * magB);
}

async function getEmbedding(audioUrl, endpoint) {
  const response = await axios.post(
    `${process.env.SPEAKER_API_URL}/${endpoint}`,
    { audioUrl },
    {
      headers: { 'Ocp-Apim-Subscription-Key': process.env.SPEAKER_API_KEY },
      timeout: 15000,
    }
  );
  return response.data.embedding;
}

exports.enrollVoice = onCall(async (request) => {
  const uid = request.auth?.uid;
  if (!uid) throw new HttpsError('unauthenticated', 'Sign in before enrolling.');
  const { audioUrl } = request.data;
  const embedding = await getEmbedding(audioUrl, 'enroll');
  await getFirestore().collection('users').doc(uid).update({
    voiceEmbedding: embedding,
    voiceEnrolledAt: FieldValue.serverTimestamp(),
    voiceEnrollmentVersion: 1,
  });
  return { success: true };
});

exports.verifyVoice = onCall(async (request) => {
  const uid = request.auth?.uid;
  if (!uid) throw new HttpsError('unauthenticated', 'Sign in before verifying.');
  const { audioUrl } = request.data;
  const [newEmbedding, userDoc] = await Promise.all([
    getEmbedding(audioUrl, 'verify'),
    getFirestore().collection('users').doc(uid).get(),
  ]);
  const storedEmbedding = userDoc.data()?.voiceEmbedding;
  if (!storedEmbedding) throw new HttpsError('failed-precondition', 'Voice not enrolled');
  const score = cosineSimilarity(storedEmbedding, newEmbedding);
  const threshold = 0.85;
  const verified = score >= threshold;
  // Log attempt for security audit
  await getFirestore().collection('voiceAuthAttempts').add({
    uid, verified, score, timestamp: FieldValue.serverTimestamp(),
  });
  return { verified, score: Math.round(score * 100) / 100 };
});
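The functions above read SPEAKER_API_URL and SPEAKER_API_KEY from the environment. With 2nd-gen Cloud Functions you can supply these through a .env file in the functions directory; the values below are placeholders for whatever endpoint and key your speaker recognition provider issues:

```
# functions/.env  (placeholder values -- substitute your own)
SPEAKER_API_URL=https://your-speaker-api.example.com
SPEAKER_API_KEY=your-api-key-here
```

Then deploy with `firebase deploy --only functions`. Keep the key out of source control; .env files in the functions directory should be gitignored.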

Common mistakes

Mistake: Using voice authentication as the only authentication factor

Why it's a problem: Voice can be recorded, replayed, or deepfaked, so a voice match alone is not sufficient proof of identity.

How to avoid: Always combine voice authentication with another factor: password, email OTP, or a hardware key. Treat voice as 'something you are' — one of two required factors.

Mistake: Using a fixed passphrase for all voice verification sessions

Why it's a problem: A fixed passphrase can be captured once and replayed at every subsequent login.

How to avoid: Display a randomly selected challenge phrase at each verification session. Rotate the challenge list regularly. Optionally verify the spoken words match the challenge using speech recognition before running speaker verification.

Mistake: Storing the raw audio recording permanently in Firebase Storage

Why it's a problem: Raw voice recordings are biometric data; retaining them increases your privacy and compliance exposure and gives an attacker material for replay or voice cloning.

How to avoid: Delete the raw audio file from Storage immediately after the embedding is extracted by the Cloud Function. Store only the mathematical embedding vector, which cannot be used to reconstruct the original voice.

Best practices

  • Enroll users with at least 3 recording samples taken on different days to build a robust voice model that handles variation in background noise and vocal conditions.
  • Log all voice authentication attempts (verified or failed) with timestamp and score in a Firestore audit collection for security review.
  • Implement account lockout after 3 consecutive failed voice verifications — without rate limiting, brute-force audio attacks are possible.
  • Inform users clearly in your privacy policy that you collect and process voice biometric data, and obtain explicit consent before enrollment.
  • Provide an easy way for users to delete their voice enrollment and use a fallback authentication method instead.
  • Test voice verification in different acoustic environments (quiet room, open office, coffee shop) to tune your similarity threshold appropriately.
  • Use RapidDev to implement and audit your voice authentication flow if this is for a regulated industry — biometric authentication in healthcare or finance requires additional compliance review.

Still stuck?

Copy one of these prompts to get a personalized, step-by-step explanation.

ChatGPT Prompt

I want to implement voice biometric authentication as a second factor in a FlutterFlow app. Explain the full technical architecture: how voice embeddings work, how to prevent replay attacks with challenge phrases, what speaker recognition APIs are available, and how to implement this securely in a Firebase Cloud Function that integrates with FlutterFlow.

FlutterFlow Prompt

Write a Firebase Cloud Function in Node.js that accepts an audio file URL and a user ID, calls a speaker recognition API to extract a voice embedding, stores the embedding in Firestore during enrollment, and computes cosine similarity against the stored embedding during verification. Return a verified boolean and similarity score.

Frequently asked questions

What speaker recognition API should I use with FlutterFlow?

Microsoft Azure Cognitive Services Speaker Recognition is the most mature option with a generous free tier (10,000 verifications/month). AssemblyAI also provides speaker diarization. For self-hosted, the open-source resemblyzer library (Python) runs in a Cloud Function but requires more setup.

How accurate is voice authentication?

Modern speaker recognition achieves 95-99% accuracy in quiet environments. Accuracy drops significantly in noisy settings (open offices, outdoors) and for users with respiratory illness. This is why voice should never be a standalone factor: on its own, its accuracy is insufficient for security-critical decisions.

Can voice authentication work offline in FlutterFlow?

Not with the architecture described here. Cloud-based speaker recognition APIs require network access. For offline voice verification you need a local machine learning model, which requires code export from FlutterFlow and a significantly more complex implementation.

Is storing voice embeddings subject to biometric data privacy laws?

Yes. Mathematical voice embeddings derived from biometric voice recordings are classified as biometric data under GDPR, CCPA, and BIPA in Illinois. You must: obtain explicit informed consent before enrollment, provide a right to deletion, disclose the processing purpose in your privacy policy, and implement appropriate security controls.

How long does voice enrollment and verification take?

Enrollment requires a 3-5 second recording for the user. Server-side embedding extraction typically takes 1-3 seconds via a cloud API. Verification (record + upload + compare) end-to-end takes roughly 4-6 seconds on a good connection. Optimize by streaming audio to the function rather than waiting for the full recording.

What should I do if a user's voice changes (illness, aging)?

Implement periodic re-enrollment prompts (e.g., every 6 months) or when verification fails multiple times in a row. Allow fallback authentication via email OTP for users who cannot complete voice verification. Never lock a user out permanently based on voice alone.
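The periodic re-enrollment prompt reduces to a timestamp check against the voiceEnrolledAt field stored at enrollment. A minimal sketch, using the six-month interval suggested above (approximated as 183 days; the function name is illustrative):

```javascript
// Decide whether to prompt the user to re-enroll their voice model.
const SIX_MONTHS_MS = 183 * 24 * 60 * 60 * 1000;

function needsReenrollment(enrolledAt, now = new Date()) {
  return now.getTime() - enrolledAt.getTime() >= SIX_MONTHS_MS;
}

const enrolledAt = new Date('2026-01-01T00:00:00Z');
console.log(needsReenrollment(enrolledAt, new Date('2026-09-01T00:00:00Z'))); // true
console.log(needsReenrollment(enrolledAt, new Date('2026-02-01T00:00:00Z'))); // false
```

Run this check when the verification page opens and route the user to the enrollment flow instead of the verification flow when it returns true.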
