
How to Implement a Voice Command System in FlutterFlow

What you'll learn

  • How to set up the speech_to_text package for voice recognition in a FlutterFlow Custom Action
  • How to build a command dictionary and match spoken phrases using keyword matching instead of exact strings
  • How to provide spoken feedback using flutter_tts so users know their command was understood
  • How to build a floating microphone button UI that shows listening state visually
Beginner · 10 min read · 35-45 min build · FlutterFlow Pro+ (code export required for custom packages) · March 2026 · RapidDev Engineering Team
TL;DR

Build a voice command system in FlutterFlow by using speech_to_text to convert speech to text, matching the result against a command dictionary using contains() or fuzzy matching instead of exact string comparison, and using flutter_tts to give spoken feedback. A floating mic button starts and stops listening. Exact string matching fails for natural speech variations — always use keyword-based or fuzzy matching.

Building a Voice-Controlled Interface in FlutterFlow

A voice command system lets users navigate and control your app hands-free by speaking natural commands like 'go to home' or 'show my orders'. The core challenge is that speech recognition never returns the exact string you expect — background noise, accents, and natural speech patterns mean 'show orders' might be transcribed as 'show my orders', 'show the orders', or 'show order'. Exact string matching handles none of these variations. Instead, you need keyword-based matching that triggers on the presence of key words regardless of surrounding words. This tutorial covers the complete pipeline from microphone capture to spoken feedback, with a command dictionary approach that handles real-world speech naturally.

Prerequisites

  • FlutterFlow Pro plan with code export enabled
  • speech_to_text and flutter_tts packages added to pubspec.yaml
  • Microphone permission configured in iOS Info.plist and Android AndroidManifest.xml
  • Basic familiarity with FlutterFlow Custom Actions and App State variables

Step-by-step guide

1

Add the speech_to_text and flutter_tts packages

In FlutterFlow, go to Settings → Pubspec Dependencies and add speech_to_text: ^6.6.2 and flutter_tts: ^3.8.5. Then configure the required platform permissions. In FlutterFlow's Settings → App Permissions panel, enable Microphone. For iOS, this adds the NSMicrophoneUsageDescription and NSSpeechRecognitionUsageDescription keys to Info.plist automatically. For Android, the RECORD_AUDIO and INTERNET permissions are added to AndroidManifest.xml. The speech_to_text package requests INTERNET on Android because Google's speech recognition service may send audio to the cloud when on-device recognition is unavailable.
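For reference, the exported project's pubspec.yaml ends up with entries like the following (versions match the ones used in this step; newer releases may also work):

```yaml
dependencies:
  flutter:
    sdk: flutter
  # Added via Settings → Pubspec Dependencies in FlutterFlow
  speech_to_text: ^6.6.2
  flutter_tts: ^3.8.5
```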

Expected result: Both packages resolve in pubspec.yaml and the app builds successfully on both iOS and Android simulators.

2

Create the startListening and stopListening Custom Actions

Create two Custom Actions. The first, startListening, initialises a SpeechToText instance, calls speechToText.initialize(), and then calls speechToText.listen() with an onResult callback. In the callback, when finalResult is true, store the recognised text in a non-persisted App State variable named lastVoiceCommand. The second action, stopListening, calls speechToText.stop(); set the Boolean App State variable isListening back to false in the same Action Flow. Store the SpeechToText instance as a shared top-level variable so both actions use the same instance. Call startListening from the mic button's On Press action and stopListening from the same button when isListening is true.

speech_actions.dart
// Custom Action pair: startListening + stopListening
import 'package:speech_to_text/speech_to_text.dart';

// Shared instance — created once and reused by both actions
final _speech = SpeechToText();
bool _isInitialized = false;

// Action 1: startListening
// No parameters. Updates App State: lastVoiceCommand (String), isListening (Bool)
Future<void> startListening(
  Future<void> Function(String) onCommandRecognized,
) async {
  if (!_isInitialized) {
    _isInitialized = await _speech.initialize(
      onError: (err) => print('Speech error: ${err.errorMsg}'),
    );
  }
  if (!_isInitialized) return;

  await _speech.listen(
    onResult: (result) {
      if (result.finalResult) {
        onCommandRecognized(result.recognizedWords.toLowerCase());
      }
    },
    listenFor: const Duration(seconds: 10),
    pauseFor: const Duration(seconds: 3),
    localeId: 'en_US',
  );
}

// Action 2: stopListening (no parameters)
Future<void> stopListening() async {
  await _speech.stop();
}

Expected result: Pressing the mic button starts speech recognition. Speaking a phrase stores the lowercase transcript in App State. Pressing again or waiting 3 seconds stops listening.

3

Build a command dictionary with keyword matching

Create a Custom Action named processVoiceCommand that accepts the recognized text string and returns a command identifier string. Inside the action, define a command dictionary as a Map where keys are command identifiers and values are lists of trigger keywords. For each entry in the dictionary, check if the input text contains any of the trigger keywords using String.contains(). Return the first matching command identifier, or 'unknown' if no keywords match. In your FlutterFlow Action Flow, after startListening sets the App State variable, call processVoiceCommand with the transcript, then use a Switch or series of Conditional Actions to navigate or trigger the appropriate action based on the returned command identifier.

process_voice_command.dart
// Custom Action: processVoiceCommand
// Parameters: transcript (String)
// Returns: String (command identifier)
Future<String> processVoiceCommand(String transcript) async {
  final text = transcript.toLowerCase().trim();

  // Command dictionary: id -> trigger keywords (any match triggers the command)
  final commands = <String, List<String>>{
    'navigate_home': ['home', 'main', 'dashboard', 'start'],
    'navigate_profile': ['profile', 'account', 'settings', 'my account'],
    'navigate_orders': ['order', 'orders', 'purchase', 'bought'],
    'navigate_search': ['search', 'find', 'look for', 'browse'],
    'action_back': ['back', 'go back', 'previous', 'return'],
    'action_refresh': ['refresh', 'reload', 'update'],
    'action_help': ['help', 'assist', 'what can', 'commands'],
  };

  for (final entry in commands.entries) {
    for (final keyword in entry.value) {
      if (text.contains(keyword)) {
        return entry.key;
      }
    }
  }

  return 'unknown';
}

Expected result: Saying 'take me to my orders' returns 'navigate_orders'. Saying 'can you help me' returns 'action_help'. Saying unrecognised words returns 'unknown'.

4

Add spoken feedback with flutter_tts

Create a Custom Action named speakFeedback that accepts a message string and uses the flutter_tts FlutterTts class to speak it aloud. Call flutterTts.setLanguage('en-US'), flutterTts.setSpeechRate(0.5) for a natural speaking pace, and flutterTts.speak(message). In your command processing flow, call speakFeedback after identifying the command: 'Going to Home' for navigate_home, 'Showing your orders' for navigate_orders, and 'Sorry, I did not understand that. Try saying home, orders, or help.' for unknown. Spoken feedback is essential because users' eyes are often elsewhere when using voice control — they need audio confirmation that their command was heard.
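This step has no generated snippet, so here is a minimal sketch of what the speakFeedback Custom Action can look like, using the flutter_tts calls named above (setLanguage, setSpeechRate, speak); the _feedback map is an illustrative subset you would extend with one entry per command:

```dart
import 'package:flutter_tts/flutter_tts.dart';

final _tts = FlutterTts();

// Command id -> spoken confirmation. Extend with one entry per command.
const _feedback = <String, String>{
  'navigate_home': 'Going to Home',
  'navigate_orders': 'Showing your orders',
  'unknown':
      'Sorry, I did not understand that. Try saying home, orders, or help.',
};

// Custom Action: speakFeedback
// Parameters: commandId (String)
Future<void> speakFeedback(String commandId) async {
  await _tts.setLanguage('en-US');
  await _tts.setSpeechRate(0.5); // 0.5 reads at a natural pace on most devices
  await _tts.speak(_feedback[commandId] ?? _feedback['unknown']!);
}
```

Falling back to the 'unknown' message for any unmapped id means a newly added command can never fail silently.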

Expected result: After every recognised command, the device speaks a confirmation message before navigating. For unrecognised commands, the device speaks the fallback help message.

5

Build the floating mic button with listening state animation

Create a FloatingActionButton in your app's main scaffold. Bind its icon to the isListening App State variable: show Icons.mic_none when not listening and Icons.mic when listening. Add a Container behind the button that scales from 1.0 to 1.3 and pulses red when isListening is true, using FlutterFlow's built-in animation settings on the Container. Connect the FAB's On Tap action to a Conditional Action: if isListening is false, call startListening and set isListening to true; if isListening is true, call stopListening and set isListening to false. This gives users clear visual feedback that the app is actively capturing their voice.

Expected result: The FAB icon changes when listening starts. A pulsing animation surrounds the button. Tapping again stops listening and the animation stops.

Complete working example

process_voice_command.dart
// Complete voice command processing action
// Place in FlutterFlow Custom Actions panel
// Parameters: transcript (String)
// Returns: String (command identifier)

Future<String> processVoiceCommand(String transcript) async {
  final text = transcript.toLowerCase().trim();
  if (text.isEmpty) return 'empty';

  // Command dictionary: command_id -> trigger keywords
  // Order matters: more specific phrases first
  final commands = <String, List<String>>{
    'navigate_home': ['go home', 'main screen', 'home screen', 'home', 'dashboard'],
    'navigate_profile': ['my profile', 'my account', 'profile', 'account', 'settings'],
    'navigate_orders': ['my orders', 'order history', 'purchases', 'orders', 'order'],
    'navigate_search': ['search for', 'find', 'look for', 'search', 'browse'],
    'navigate_cart': ['my cart', 'shopping cart', 'cart', 'basket'],
    'navigate_notifications': ['notifications', 'alerts', 'messages'],
    'action_back': ['go back', 'previous page', 'back'],
    'action_refresh': ['refresh', 'reload', 'update page'],
    'action_logout': ['log out', 'sign out', 'logout', 'signout'],
    'action_help': ['what can you do', 'help me', 'commands', 'help'],
  };

  // Check each command's keywords
  for (final entry in commands.entries) {
    for (final keyword in entry.value) {
      if (text.contains(keyword)) {
        return entry.key;
      }
    }
  }

  // Partial word matching fallback
  final words = text.split(' ');
  final allKeywords = <String, String>{};
  for (final entry in commands.entries) {
    for (final kw in entry.value) {
      allKeywords[kw] = entry.key;
    }
  }

  for (final word in words) {
    for (final kw in allKeywords.keys) {
      if (word.length > 3 && kw.contains(word)) {
        return allKeywords[kw]!;
      }
    }
  }

  return 'unknown';
}
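The introduction also mentions fuzzy matching as an alternative to contains(). A minimal pure-Dart sketch using Levenshtein edit distance is shown below; the fuzzyMatch helper, its threshold of two edits, and the minimum word length are illustrative choices, not part of any package API:

```dart
// Fuzzy-match fallback: accept a keyword if some word in the transcript
// is within a small edit distance of it. Helper names and thresholds
// here are illustrative choices.

// Classic dynamic-programming Levenshtein edit distance.
int levenshtein(String a, String b) {
  var prev = List<int>.generate(b.length + 1, (j) => j);
  for (var i = 1; i <= a.length; i++) {
    final curr = List<int>.filled(b.length + 1, 0)..[0] = i;
    for (var j = 1; j <= b.length; j++) {
      final cost = a[i - 1] == b[j - 1] ? 0 : 1;
      curr[j] = [
        prev[j] + 1, // deletion
        curr[j - 1] + 1, // insertion
        prev[j - 1] + cost, // substitution
      ].reduce((x, y) => x < y ? x : y);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Returns the first command whose keyword is within maxEdits of any
// transcript word; short words are skipped to limit false positives.
String fuzzyMatch(String transcript, Map<String, List<String>> commands,
    {int maxEdits = 2}) {
  final words = transcript.toLowerCase().trim().split(RegExp(r'\s+'));
  for (final entry in commands.entries) {
    for (final keyword in entry.value) {
      for (final word in words) {
        if (word.length > 3 && levenshtein(word, keyword) <= maxEdits) {
          return entry.key;
        }
      }
    }
  }
  return 'unknown';
}

void main() {
  final commands = {
    'navigate_orders': ['orders', 'purchases'],
    'action_refresh': ['refresh', 'reload'],
  };
  // 'ordors' is one substitution away from 'orders'
  print(fuzzyMatch('show my ordors', commands)); // prints: navigate_orders
}
```

Use this as the final fallback after the contains() pass, so exact keyword hits stay fast and the edit-distance scan only runs for otherwise unmatched transcripts.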

Common mistakes when implementing a Voice Command System in FlutterFlow

Mistake: Using exact string matching for voice commands

Why it's a problem: Speech recognition rarely returns the exact phrase you expect — 'show orders' may arrive as 'show my orders' or 'show the orders', so exact comparisons silently fail.

How to avoid: Use a keyword dictionary where each command has multiple trigger words. Check whether the transcript contains any trigger keyword using String.contains(). This handles natural language variations reliably.

Mistake: Creating a new SpeechToText instance on every call to startListening

Why it's a problem: Each new instance must re-initialise the platform speech engine, which is slow and can fail intermittently, and stopListening ends up targeting a different instance than the one that is actually listening.

How to avoid: Create the SpeechToText instance as a shared top-level variable used by both startListening and stopListening. Initialize it once behind a guard check (if (!_isInitialized)) and reuse it for all subsequent listen calls.

Mistake: Not providing spoken feedback for unrecognised commands

Why it's a problem: Users' eyes are often elsewhere during voice control; silence after a failed command leaves them unsure whether the app heard anything at all.

How to avoid: Always call speakFeedback even for unknown commands, with a message like 'Sorry, I did not understand. You can say home, orders, search, or help.' This teaches users the available commands at the moment of failure.

Best practices

  • Always use keyword-based matching instead of exact string comparison for voice commands
  • Provide at least 5-10 trigger keywords per command to cover natural speech variations
  • Give spoken feedback for every command outcome — success, ambiguous, and failure
  • Show a visible listening indicator (animated mic, pulsing ring) so users know the app is capturing audio
  • Set a maximum listening duration (10 seconds) to prevent accidentally capturing background audio
  • Lowercase all transcripts before matching — speech recognition sometimes capitalises randomly
  • Test your command dictionary with multiple speakers including non-native English speakers

Still stuck?

Copy one of these prompts to get a personalized, step-by-step explanation.

ChatGPT Prompt

I am building a voice command system in FlutterFlow. The speech_to_text package returns transcripts that vary each time the user says a command. For example, 'show orders' might come back as 'show my orders' or 'display orders'. How do I build a command matching system in Dart that handles these variations without requiring exact string matches?

FlutterFlow Prompt

Create a FlutterFlow Custom Action called processVoiceCommand that takes a transcript String parameter. It should contain a command dictionary mapping command identifiers to lists of trigger keywords, match the transcript against keywords using String.contains(), and return the matching command identifier as a String, or 'unknown' if no keywords match. Include commands for: navigate_home, navigate_profile, navigate_orders, action_back, and action_help.

Frequently asked questions

Does speech_to_text work offline?

On iOS, speech_to_text uses on-device recognition that works offline for short phrases. On Android, it typically requires an internet connection because it uses Google's cloud speech API. For fully offline voice commands on Android, consider using Vosk, which is an open-source offline speech recognition library, though it requires a larger custom plugin setup.

What languages does speech_to_text support?

The speech_to_text package supports any language installed on the device's speech recognition engine. On iOS, this includes dozens of languages. On Android, it depends on the Google Speech Recognition service. You can call speech.locales() to get a list of available locale IDs on the current device, and pass the localeId parameter to listen() to set the recognition language.

How do I handle voice commands in different languages?

For multilingual apps, detect the user's preferred language from their app settings or device locale, then pass the corresponding localeId to the speech.listen() call. Maintain separate command dictionaries for each language in your processVoiceCommand action. You can also use a single universal dictionary if your app only supports one language for voice commands but multiple languages for UI text.
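The per-locale dictionary approach can be sketched in pure Dart; the locale IDs, keyword lists, and the processLocalized helper below are illustrative:

```dart
// Per-locale command dictionaries keyed by the same localeId you pass
// to speech.listen(). Locale IDs and keywords here are illustrative.
const dictionaries = <String, Map<String, List<String>>>{
  'en_US': {
    'navigate_home': ['home', 'dashboard'],
    'navigate_orders': ['orders', 'purchases'],
  },
  'es_ES': {
    'navigate_home': ['inicio', 'principal'],
    'navigate_orders': ['pedidos', 'compras'],
  },
};

String processLocalized(String transcript, String localeId) {
  // Fall back to the English dictionary for locales without a translation.
  final commands = dictionaries[localeId] ?? dictionaries['en_US']!;
  final text = transcript.toLowerCase().trim();
  for (final entry in commands.entries) {
    for (final keyword in entry.value) {
      if (text.contains(keyword)) return entry.key;
    }
  }
  return 'unknown';
}

void main() {
  print(processLocalized('muéstrame mis pedidos', 'es_ES')); // prints: navigate_orders
}
```

Because every dictionary maps to the same command identifiers, the rest of the Action Flow (navigation, spoken feedback) stays language-agnostic.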

Can I use the Gemini or OpenAI API instead of speech_to_text for better accuracy?

Yes. For higher accuracy, use the device microphone to record audio (with the record package in AAC format), upload the audio to OpenAI's Whisper API or Google Cloud Speech-to-Text API, and use the transcript from those services instead of speech_to_text. Cloud APIs are significantly more accurate for noisy environments and accented speech, at a cost of about $0.006 per minute for OpenAI Whisper.

Why does voice recognition stop working after the app is in the background?

iOS suspends background audio after a short time unless your app declares the audio background mode in Info.plist. Speech recognition does not qualify as background audio processing. For continuous background voice recognition (like 'Hey Siri' style always-on detection), you need a dedicated wake-word detection library like Porcupine, which runs a lightweight model locally and never sends audio to the cloud.

How do I show the recognised words in real-time as the user is speaking?

The speech_to_text listen() callback fires repeatedly with partial results as the user speaks. Partial updates arrive with result.finalResult set to false — update a Page State variable with result.recognizedWords on every callback. Bind a Text widget to this Page State variable to show the live transcription. Display it in a subtle text field below the mic button so users can see what was captured before the command is processed.
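A sketch of the callback wiring, using the speech_to_text API from earlier in this tutorial; the two callback parameters are illustrative stand-ins for FlutterFlow's Update Page State actions:

```dart
import 'package:speech_to_text/speech_to_text.dart';

final _speech = SpeechToText();

// onResult fires repeatedly while the user speaks: partial results have
// finalResult == false, and exactly one final result has it == true.
Future<void> startLiveTranscription(
  void Function(String partial) onPartial, // e.g. update a liveTranscript Page State
  void Function(String finalText) onFinal, // e.g. hand off to processVoiceCommand
) async {
  if (!await _speech.initialize()) return;
  await _speech.listen(
    onResult: (result) {
      if (result.finalResult) {
        onFinal(result.recognizedWords);
      } else {
        onPartial(result.recognizedWords);
      }
    },
  );
}
```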
