How I built a voice-enabled chatGPT app (Part II)

Jul 14, 2023

As promised, this is part II of my journey developing a chatGPT-enabled application. In this part, I will discuss the cross-platform mobile application that I built using Google's Flutter framework.

Flutter

As I had experience with Flutter, it was my go-to choice for building the application.

For those who are unfamiliar, according to Google, "Flutter is an open-source framework for building beautiful, natively compiled, multi-platform applications from a single codebase."

Architecture

As I already had a successful proof of concept, I decided to keep the overall architecture of the application. This meant implementing the same three main components: voice recognition, interaction with chatGPT, and text-to-speech functionality.

Voice recognition

For optimal performance, I chose to use the device's built-in voice recognition capabilities instead of the cloud solution I used in the proof of concept (POC).

The Flutter package speech_to_text wraps both Android's SpeechRecognizer and iOS's Speech framework, so I only needed to write the code once; the package calls the appropriate native library on whichever device it runs on.

The only additional step for this part is setting up permissions to prompt the user for microphone and speech recognition access when they first open the app. The documentation of the package explains how to handle this.
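
For reference, the declarations look roughly like this, based on the speech_to_text documentation (the description strings are placeholders you should adapt to your app):

<!-- ios/Runner/Info.plist -->
<key>NSSpeechRecognitionUsageDescription</key>
<string>This app transcribes your questions so it can answer them.</string>
<key>NSMicrophoneUsageDescription</key>
<string>This app uses the microphone to listen to your questions.</string>

<!-- android/app/src/main/AndroidManifest.xml -->
<uses-permission android:name="android.permission.RECORD_AUDIO" />
<queries>
  <intent>
    <action android:name="android.speech.RecognitionService" />
  </intent>
</queries>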

Here is a code snippet for the speech recognition process:

import 'package:flutter/material.dart';
import 'package:speech_to_text/speech_to_text.dart' as stt;

class SampleWidget extends StatefulWidget {
  const SampleWidget({super.key});

  @override
  State<SampleWidget> createState() => _SampleWidgetState();
}

class _SampleWidgetState extends State<SampleWidget> {
  late stt.SpeechToText speech;

  @override
  void initState() {
    super.initState();
    speech = stt.SpeechToText();
  }

  Future<void> listen() async {
    // Initialize the plugin; returns false if speech recognition
    // is unavailable or the user denied permission.
    bool available = await speech.initialize(
      onStatus: (val) => print('onStatus: $val'),
      onError: (val) => print('onError: $val'),
    );
    if (!available) {
      return;
    }
    speech.listen(
      onResult: (val) {
        // do something with the result
        if (val.finalResult) stopListening();
      },
    );
  }

  Future<void> stopListening() async {
    await speech.stop();
  }

  @override
  Widget build(BuildContext context) {
    return const Placeholder(); // UI widget here
  }
}

Interaction with chatGPT

Just like with the POC, the interaction with chatGPT is straightforward. The only tricky part is setting up dotenv to prevent the OpenAI API key from accidentally being uploaded to GitHub.
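
The Env class imported by the service below is generated at build time. Here is a minimal sketch of how it could look, assuming the envied package (one common way to generate such a class) with the key stored in a .env file, listed in .gitignore, under the variable name OPEN_AI_API_KEY:

// env/env.dart
import 'package:envied/envied.dart';

part 'env.g.dart';

@Envied(path: '.env')
abstract class Env {
  // Reads OPEN_AI_API_KEY from .env and bakes it in at build time.
  @EnviedField(varName: 'OPEN_AI_API_KEY')
  static const String apiKey = _Env.apiKey;
}

Running dart run build_runner build then generates the env.g.dart part file.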

Here is the chatGPT service implementation from the app:

import 'package:dart_openai/openai.dart';

import '../env/env.dart';

class ChatGptService {
  ChatGptService() {
    OpenAI.apiKey = Env.apiKey; // Initializes the package with that API key
  }

  Future<String> askGPT(String question) async {
    OpenAIChatCompletionModel chatCompletion =
        await OpenAI.instance.chat.create(
      model: "gpt-3.5-turbo",
      messages: [
        OpenAIChatCompletionChoiceMessageModel(
          content: question,
          role: OpenAIChatMessageRole.user,
        ),
      ],
    );
    if (chatCompletion.choices.isEmpty) {
      throw ("ChatGPT: I'm sorry, I don't know.");
    }
    return chatCompletion.choices.first.message.content;
  }
}

Text to speech

Just like with voice recognition, I chose to utilize the built-in text-to-speech capabilities offered by Android and iOS.

To interact with the native APIs mentioned above, I used the Flutter package flutter_tts. This package provided a solid interface, allowing me to easily configure the results to my liking and even add support for French and Arabic.

Below is the full implementation of my Speech service:

import 'package:flutter_tts/flutter_tts.dart';

class SpeakService {
  late FlutterTts flutterTts;

  SpeakService() {
    flutterTts = FlutterTts();
  }

  Future<void> speak(String content, String localeId) async {
    // Make speak() resolve only once the utterance has finished.
    await flutterTts.awaitSpeakCompletion(true);
    await flutterTts.setLanguage(localeId);
    await flutterTts.setVolume(1.0);
    await flutterTts.setSpeechRate(0.55);
    await flutterTts.setPitch(0.8);
    await flutterTts.setIosAudioCategory(
      IosTextToSpeechAudioCategory.ambient,
      [
        IosTextToSpeechAudioCategoryOptions.allowBluetooth,
        IosTextToSpeechAudioCategoryOptions.allowBluetoothA2DP,
        IosTextToSpeechAudioCategoryOptions.mixWithOthers,
        IosTextToSpeechAudioCategoryOptions.defaultToSpeaker,
      ],
      IosTextToSpeechAudioMode.voicePrompt,
    );
    await flutterTts.speak(content);
  }

  Future<void> stopSpeaking() async {
    await flutterTts.stop();
  }
}
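
To give an idea of how the three pieces fit together, here is a rough sketch of the ask-and-answer flow (the function name and wiring are illustrative, not the app's actual widget code):

Future<void> askAndSpeak(String question, String localeId) async {
  final chatGpt = ChatGptService();
  final speaker = SpeakService();

  // Send the transcribed question to chatGPT, then read the answer aloud.
  final answer = await chatGpt.askGPT(question);
  await speaker.speak(answer, localeId);
}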

Deployment

To demonstrate my app to my friends and family (most importantly my mom), the last step was to deploy the app to a mobile device.

Icon

Before deploying the app, I needed to update the launcher icon. I searched online for inspiration and came up with a simple design using Pixelmator. Then, with the help of the Flutter library flutter_launcher_icons, I updated the app with the custom icon.
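
The package is driven by a small block in pubspec.yaml; a minimal configuration, assuming the icon image lives at assets/icon/icon.png, looks something like this:

flutter_launcher_icons:
  android: true
  ios: true
  image_path: "assets/icon/icon.png"

Running flutter pub run flutter_launcher_icons then regenerates the platform icons.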

Android

The Flutter documentation provides a comprehensive guide for deploying an Android app, whether for testing purposes or to the Google Play Store.

I just needed the app running on an Android device, so I followed the first option. Once my code was stable enough, I ran the command flutter build apk --split-per-abi. When the build finished, I copied the APK from [project]/build/app/outputs/apk/release/ onto my device and installed it.
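
With a device connected over USB, the copy-and-install step can also be done with adb (assuming an arm64 device, which determines which of the split APKs to pick):

adb install build/app/outputs/apk/release/app-arm64-v8a-release.apk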

iOS

After successfully testing the app on an Android device, I realized the convenience of having ChatGPT in my pocket, so I wanted it on my primary phone, an iPhone.

However, deploying the app on iOS proved to be more challenging. While Flutter's documentation provides detailed instructions on deploying through the App Store, I opted for a simpler approach.

Initially, I configured Xcode with the appropriate settings as outlined in the documentation, paying particular attention to the app's display name and bundle identifier.

After that, I built the application using Xcode to ensure everything compiled correctly. Then, I used the command flutter build ios --release to create the app bundle, which gets generated in [project]/build/ios/iphoneos/.

To get the app running on the iPhone, I used Sideloadly to get around Apple's certificate complications. The necessary steps can be found in Sideloadly's FAQ.

The only challenge I encountered was locating the IPA file, since the app bundle generated by Flutter is a Runner.app directory. It turns out that an IPA file is essentially a ZIP file containing a folder named Payload with the Runner.app inside it. To ensure Sideloadly detects the file, the extension needs to be manually changed from ZIP to IPA.
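
For anyone repeating this, the repackaging boils down to a few shell commands run from the project root (the output file name is arbitrary):

mkdir Payload
cp -r build/ios/iphoneos/Runner.app Payload/
zip -r app.zip Payload
mv app.zip app.ipa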

Conclusion

In this second part, I provided a brief overview of how I developed a cross-platform voice-enabled chatGPT mobile application using Flutter.

The application's codebase closely resembled the POC version, consisting of three primary components: voice recognition, interaction with chatGPT, and text-to-speech.

This time I used native phone APIs instead of cloud services. Additionally, since I was working with a mobile application running on more than one platform, deployment was a tricky part of the equation.

To summarize, building a voice-enabled chatGPT application proved to be a fun adventure that taught me a lot about mobile development with Flutter. I hope that anyone reading through the steps outlined in this article will be inspired to create something remarkable with this framework.