Developing an Intelligent Assistant Based on the Android TTS Service

Introduction

In this article, I will show you a basic Android application that can listen to the user's voice and convert it into text data. In addition, this application can also perform text analysis and then execute corresponding commands to implement data storage and user response functions.

Note: The source code for this article can be downloaded from https://github.com/sitepoint-editors/SpeechApplication.

The program snapshot is as follows:

[Image: screenshot of the application]

Create an application

Open Android Studio and create a new project. Set the minimum SDK to API 18 and add an empty Activity, which is the only Activity in this project.

To display the view full screen without an action bar, open AndroidManifest.xml and set the following theme attribute on the activity:

    android:theme="@style/Theme.AppCompat.NoActionBar"

This configuration will hide the ActionBar in our current activity.

You now have a full-screen layout with a white background and a single TextView control. To improve its appearance, you can add a gradient shape as the background of the RelativeLayout.

Next, right-click the drawable folder and select New->Drawable resource file. Name this resource file background and replace the original content with the following code:

    <?xml version="1.0" encoding="UTF-8"?>
    <shape xmlns:android="http://schemas.android.com/apk/res/android"
        android:shape="rectangle">
        <gradient
            android:type="linear"
            android:startColor="#FF85FBFF"
            android:endColor="#FF008080"
            android:angle="45" />
    </shape>

In fact, you can modify the color and angle as you like.

Note: The ImageButton control in the layout uses an image from https://design.google.com/icons/#ic_mic_none. You can download it and add it as a resource.

Next, update the code in the activity_main.xml file:

    <?xml version="1.0" encoding="utf-8"?>
    <RelativeLayout xmlns:android="http://schemas.android.com/apk/res/android"
        xmlns:tools="http://schemas.android.com/tools"
        android:layout_width="match_parent"
        android:layout_height="match_parent"
        android:background="@drawable/background"
        android:id="@+id/rel"
        tools:context="com.example.theodhor.speechapplication.MainActivity">

        <ImageButton
            android:layout_width="wrap_content"
            android:layout_height="wrap_content"
            android:id="@+id/microphoneButton"
            android:layout_centerVertical="true"
            android:layout_centerHorizontal="true"
            android:src="@drawable/ic_mic_none_white_48dp"
            android:background="@null" />

    </RelativeLayout>

Add speaking function

Now that the user interface is complete, it is time to write the Java code inside MainActivity.

First, declare a TextToSpeech field above the onCreate method:

    private TextToSpeech tts;

Then, add the following code to the onCreate method:

    tts = new TextToSpeech(this, new TextToSpeech.OnInitListener() {
        @Override
        public void onInit(int status) {
            if (status == TextToSpeech.SUCCESS) {
                // Ask the engine to use US English
                int result = tts.setLanguage(Locale.US);
                if (result == TextToSpeech.LANG_MISSING_DATA || result == TextToSpeech.LANG_NOT_SUPPORTED) {
                    Log.e("TTS", "This language is not supported");
                }
                speak("Hello");
            } else {
                Log.e("TTS", "Initialization failed!");
            }
        }
    });

The above code will start the TextToSpeech service in the system. The speak() method uses a String type parameter, which is the text you want your Android device to read out.

Next, create this method and add the following code:

    private void speak(String text) {
        if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.LOLLIPOP) {
            tts.speak(text, TextToSpeech.QUEUE_FLUSH, null, null);
        } else {
            tts.speak(text, TextToSpeech.QUEUE_FLUSH, null);
        }
    }

The Build.VERSION check is needed because the three-argument form tts.speak(param, param, param) was deprecated in API level 21 (Android 5.0 Lollipop), which introduced the four-argument form.

Create another method after the speak() method to stop the TextToSpeech service when the user closes the program:

    @Override
    public void onDestroy() {
        if (tts != null) {
            tts.stop();
            tts.shutdown();
        }
        super.onDestroy();
    }

At this point, once you start the program, it can say "Hello". The next step is to make the program listen.

Add listening function

To enable the program to listen, you need to use the microphone button. To do this, add the following code to the onCreate method:

    findViewById(R.id.microphoneButton).setOnClickListener(new View.OnClickListener() {
        @Override
        public void onClick(View v) {
            listen();
        }
    });

When the ImageButton control is clicked, the following function will be called:

    private void listen() {
        Intent i = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
        i.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL, RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
        i.putExtra(RecognizerIntent.EXTRA_LANGUAGE, Locale.getDefault());
        i.putExtra(RecognizerIntent.EXTRA_PROMPT, "Say something");
        try {
            startActivityForResult(i, 100);
        } catch (ActivityNotFoundException a) {
            Toast.makeText(MainActivity.this, "Your device doesn't support Speech Recognition", Toast.LENGTH_SHORT).show();
        }
    }

This method will start the listening Activity, which will display a dialog box with a text prompt. The language used for the speech is provided by the device through the Locale.getDefault() method.

The startActivityForResult(i, 100) method starts the recognition activity and arranges for its result to be delivered back to the current activity. The value 100 is an arbitrary request code attached to the started activity; any number that suits your needs will do. The same code is passed back with the result, so you can use it to tell multiple results apart.
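The role of the request code can be illustrated without any Android classes. The following is only a sketch: ResultRouter, REQUEST_SPEECH, and REQUEST_PHOTO are hypothetical names that do not appear in the project, and the String payload stands in for the Intent that Android would deliver.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Plain-Java stand-in for routing returned results by request code. */
final class ResultRouter {
    // Hypothetical request codes; any distinct integers would work.
    static final int REQUEST_SPEECH = 100;
    static final int REQUEST_PHOTO = 200;

    private final Map<Integer, Function<String, String>> handlers = new HashMap<>();

    /** Associates a handler with the request code used when starting an activity. */
    void register(int requestCode, Function<String, String> handler) {
        handlers.put(requestCode, handler);
    }

    /** Returns the matching handler's output, or null if no handler knows the code. */
    String dispatch(int requestCode, String payload) {
        Function<String, String> handler = handlers.get(requestCode);
        return handler == null ? null : handler.apply(payload);
    }
}
```

Naming the code with a constant instead of repeating the literal 100 also keeps the call site and the result handler in sync.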

To capture the results from the launched activity, you need to add the following override method:

    @Override
    protected void onActivityResult(int requestCode, int resultCode, Intent data) {
        super.onActivityResult(requestCode, resultCode, data);
        if (requestCode == 100) {
            if (resultCode == RESULT_OK && null != data) {
                ArrayList<String> res = data.getStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS);
                String inSpeech = res.get(0);
                recognition(inSpeech);
            }
        }
    }

This method receives every result returned to the activity and uses requestCode to pick out the one that came from the speech recognizer. If requestCode equals 100, resultCode equals RESULT_OK, and the returned data is not null, the recognized text is available from res.get(0).
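Calling get(0) assumes the list exists and has at least one entry. As a hedged sketch, a small helper can make that assumption explicit; RecognitionResults and bestCandidate are hypothetical names, and the code relies on the recognizer generally listing its most confident candidate first.

```java
import java.util.List;

/** Defensive selection of a recognition candidate from a possibly empty list. */
final class RecognitionResults {
    /** Returns the first non-blank candidate, trimmed, or null if there is none. */
    static String bestCandidate(List<String> candidates) {
        if (candidates == null) {
            return null;
        }
        for (String candidate : candidates) {
            if (candidate != null && !candidate.trim().isEmpty()) {
                return candidate.trim();
            }
        }
        return null;
    }
}
```

With such a helper, the activity could skip the call to recognition() whenever the returned value is null.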

Next, create a new method, recognition, which will take a parameter of type String:

    private void recognition(String text) {
        Log.e("Speech", "" + text);
    }

So far, when the user clicks the microphone button, the program can listen and convert the user's speech into text. The result is written to the log at the error level.

Add learning function

To make the program more interesting, in this section you will enable the application to learn some simple things, such as your name. To achieve this, you need to use the local storage feature.

First, declare the following fields above the onCreate method:

    private SharedPreferences preferences;
    private SharedPreferences.Editor editor;
    private static final String PREFS = "prefs";
    private static final String NAME = "name";
    private static final String AGE = "age";
    private static final String AS_NAME = "as_name";

Then, add the following code to the onCreate method:

    preferences = getSharedPreferences(PREFS, 0);
    editor = preferences.edit();

First, you need to make the application ask questions, so you need to change speak("Hello") to speak("What is your name?").

The logic can be kept simple: when the app asks "What is your name?" and the answer is "My name is Dori", take the name from the answer. An easy way to do this is to split the answer on spaces and take the last word.

So, we need to update the code in the recognition method as follows:

    private void recognition(String text) {
        Log.e("Speech", "" + text);
        // create an array that contains the words of the answer
        String[] speech = text.split(" ");
        // the last word is the name
        String name = speech[speech.length - 1];
        // store the name in local storage and save the change
        editor.putString(NAME, name).apply();
        // make the app say the name
        speak("Your name is " + preferences.getString(NAME, null));
    }
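The last-word rule can be tried in isolation from Android. This is a minimal sketch under the assumption that the name is a single word at the end of the utterance; NameParser and lastWord are hypothetical names, and the extra handling of blank input is an addition that the project code does not make.

```java
/** Extracts the final word of an utterance, as in "My name is Dori" -> "Dori". */
final class NameParser {
    /** Returns the last whitespace-separated word, or null if the text is blank. */
    static String lastWord(String text) {
        if (text == null) {
            return null;
        }
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return null;
        }
        // Split on runs of whitespace so repeated spaces do not yield empty words.
        String[] words = trimmed.split("\\s+");
        return words[words.length - 1];
    }
}
```

The rule has an obvious limit: for a two-word name such as "Mary Ann" it keeps only "Ann".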

The recognition method receives every utterance from the user. Since utterances differ, you can tell them apart by checking for certain words they may contain.

For example, the code in this method could be:

    private void recognition(String text) {
        Log.e("Speech", "" + text);
        String[] speech = text.split(" ");
        // if the speech contains these words, the user is saying their name
        if (text.contains("my name is")) {
            String name = speech[speech.length - 1];
            Log.e("Your name", "" + name);
            editor.putString(NAME, name).apply();
            speak("Your name is " + preferences.getString(NAME, null));
        }
    }
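String.contains is case-sensitive, so a check for "my name is" would miss an utterance that arrives as "My name is Dori". Whether the recognizer capitalizes its output is an assumption here and may vary by device. A hedged sketch of a case-insensitive check, using the hypothetical names PhraseMatcher and containsPhrase:

```java
import java.util.Locale;

/** Case-insensitive phrase matching for recognized speech. */
final class PhraseMatcher {
    /** Returns true if the utterance contains the phrase, ignoring letter case. */
    static boolean containsPhrase(String utterance, String phrase) {
        if (utterance == null || phrase == null) {
            return false;
        }
        // Locale.ROOT avoids locale-specific case rules (for example Turkish dotless i).
        String haystack = utterance.toLowerCase(Locale.ROOT);
        String needle = phrase.toLowerCase(Locale.ROOT);
        return haystack.contains(needle);
    }
}
```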

But it's still a simple interaction with the app. You can have it learn your age, or give it a name.

In the same method, you can try these simple conditions:

    // This must be the age
    // Just say: I am x years old.
    if (text.contains("years") && text.contains("old")) {
        String age = speech[speech.length - 3];
        Log.e("THIS", "" + age);
        editor.putString(AGE, age).apply();
    }

    // Then ask it for your age
    if (text.contains("how old am I")) {
        speak("You are " + preferences.getString(AGE, null) + " years old.");
    }
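Taking the third word from the end works for "I am x years old" but not when extra words follow. One possible alternative, sketched here with the hypothetical names AgeParser and extractAge, looks for the word directly before "years" and assumes the recognizer returns the number as digits.

```java
/** Finds the age in utterances such as "I am 25 years old". */
final class AgeParser {
    /** Returns the number directly before "years", or -1 if none can be parsed. */
    static int extractAge(String text) {
        if (text == null) {
            return -1;
        }
        String[] words = text.trim().split("\\s+");
        // Start at index 1 so that words[i - 1] always exists.
        for (int i = 1; i < words.length; i++) {
            if (words[i].equalsIgnoreCase("years")) {
                try {
                    return Integer.parseInt(words[i - 1]);
                } catch (NumberFormatException e) {
                    // The word was spelled out ("twenty") or is not a number at all.
                    return -1;
                }
            }
        }
        return -1;
    }
}
```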

The app can tell you the time:

    // Ask: What time is it?
    if (text.contains("what time is it")) {
        SimpleDateFormat sdfDate = new SimpleDateFormat("HH:mm");
        Date now = new Date();
        String[] strDate = sdfDate.format(now).split(":");
        // on the hour, say "o'clock" instead of "00"
        if (strDate[1].equals("00")) strDate[1] = "o'clock";
        speak("The time is " + strDate[0] + " " + strDate[1]);
    }
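The wording of the spoken time can be checked outside the app if the current time is passed in rather than read inside the method. This is only a sketch: SpokenTime and describe are hypothetical names, and the explicit Locale and TimeZone arguments are additions made so the output is reproducible.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

/** Builds the sentence the assistant speaks when asked for the time. */
final class SpokenTime {
    /** Formats the given instant as hours and minutes, using "o'clock" on the hour. */
    static String describe(Date now, TimeZone zone) {
        SimpleDateFormat format = new SimpleDateFormat("HH:mm", Locale.US);
        format.setTimeZone(zone);
        String[] parts = format.format(now).split(":");
        String minutes = parts[1].equals("00") ? "o'clock" : parts[1];
        return "The time is " + parts[0] + " " + minutes;
    }
}
```

In the activity, the result of describe(new Date(), TimeZone.getDefault()) could be handed to speak().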

Summary

There are more examples in the GitHub project I created (https://github.com/sitepoint-editors/SpeechApplication), so you can fully experiment and develop your own real Android assistant application.

In the end, I hope you enjoyed this tutorial and that you were able to have a truly useful conversation with your phone.
