Tutorial on using TensorFlow on iOS (Part 1)

Before we can use a deep learning network to make predictions, we first need to train it. There are many tools for training neural networks, and TensorFlow is one of the most popular choices.

You can use TensorFlow to train your own machine learning models and use them to make predictions. Training is usually done on a powerful machine or in the cloud, but you may be surprised to learn that TensorFlow also works on iOS, albeit with some limitations.

In this blog post, we will look at the design ideas behind TensorFlow, use it to train a simple classifier, and see how to put the trained classifier into your iOS application.

In this example, we will use the "Gender Determination from Voice and Conversation Analysis" dataset to learn how to determine whether a voice is male or female based on an audio recording. Dataset address: https://www.kaggle.com/primaryobjects/voicegender

Get the relevant code: You can get the source code of this example through the corresponding project on GitHub: https://github.com/hollance/TensorFlow-iOS-Example

What is TensorFlow and why do we need to use it?

TensorFlow is a software library for building computational graphs for machine learning.

Other tools work at a higher level of abstraction. For example, in Caffe, you need to interconnect different types of "layers" to design a neural network. BNNS and MPSCNN on iOS can also achieve similar functions.

In TensorFlow, you can also work with these layers, but at a much deeper level—right down to the individual calculations in your algorithm.

You can think of TensorFlow as a toolset for implementing new machine learning algorithms, while other deep learning tools are used to help users use these algorithms.

Of course, this does not mean that users need to build everything from scratch in TensorFlow. TensorFlow has a complete set of reusable building blocks, including Keras and other resource libraries that provide TensorFlow users with a large number of convenient modules.

So using TensorFlow does not require advanced math skills. But if you do want to build things yourself at the level of the math, TensorFlow gives you the tools to do so.

Using Logistic Regression to Implement Binary Classification

In today's blog post, we're going to create a classifier using the logistic regression algorithm. Yes, we're going to build it from scratch, so be prepared - this is going to be a bit of a complex task. A classifier basically takes input data and tells the user what category - or class - that data belongs to. In this project, we're only going to have two categories: male and female - so we're going to build a binary classifier.

Note: Binary classifiers are the simplest type of classifiers, but their basic concepts and design ideas are exactly the same as those used to distinguish hundreds or thousands of different categories. Therefore, although we will not go too deep into this tutorial, I believe that you can still get a glimpse of the design of classifiers.

For the input data, we will use 20 numbers that describe various acoustic properties of a recording of someone speaking, such as audio frequencies and other related information. I will explain these in detail later in the article.

The following block diagram shows how this logistic classifier works:

In the diagram, all 20 numbers are connected to a small box called sum. These connections have different weights, which represent how important each of the 20 numbers is to the classifier.

In the sum box, the inputs x0 to x19 are multiplied by their corresponding connection weights w0 to w19 and the products are added up. This is an ordinary dot product:

    sum = x[0]*w[0] + x[1]*w[1] + x[2]*w[2] + ... + x[19]*w[19] + b

We also add b at the end, which is called the bias term. It is just another number.

The weights in the array w and the value b represent what this classifier has learned. Training the classifier means finding the right numbers for w and b. Initially, we set all of w and b to 0. After several rounds of training, w and b will contain a set of numbers that the classifier uses to distinguish male from female voices in the input.

To convert sum into a probability, a value between 0 and 1, we use the logistic sigmoid function:

    y_pred = 1 / (1 + exp(-sum))

This equation looks scary, but what it does is very simple: if sum is a large positive number, the sigmoid function returns a value close to 1, or a probability of 100%; if sum is a large negative number, it returns a value close to 0. So for large positive or negative numbers, we get a confident "yes" or "no" prediction.

However, if sum is close to 0, the sigmoid function gives a probability close to 50% because it cannot be sure of the prediction. When we first start training the classifier, its initial predictions will be 50% because it has not learned anything yet and has no confidence in its judgment. As training progresses, the probabilities it gives move closer to 1 and 0, meaning the classifier becomes more certain about the result.
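
The weighted sum and the sigmoid can be tried out in a few lines of plain Python. This is only a minimal sketch: the function names and the three example features (instead of 20) are made up for illustration, and it uses the standard library rather than TensorFlow.

```python
import math

def sigmoid(z):
    """Logistic sigmoid: maps any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict(x, w, b):
    """Weighted sum of the inputs plus the bias, squashed by the sigmoid."""
    total = sum(xi * wi for xi, wi in zip(x, w)) + b
    return sigmoid(total)

# Hypothetical example values (3 features instead of 20, for brevity).
x = [0.5, -1.2, 3.0]
w = [0.0, 0.0, 0.0]   # untrained weights, all zero
b = 0.0

print(predict(x, w, b))   # untrained: the weighted sum is 0, so this prints 0.5
print(sigmoid(10.0))      # large positive sum: very close to 1
print(sigmoid(-10.0))     # large negative sum: very close to 0
```

With all weights at zero the classifier answers 50% for any input, which is exactly the "no confidence" starting point described above.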

Now y_pred contains the predicted probability that the voice is male. If the probability is higher than 0.5 (or 50%), we consider the voice to be male; otherwise, it is female.

This is the basic design of our binary classifier using logistic regression. The input to the classifier is a set of 20 numbers that describe the acoustic features of an audio recording. We calculate a weighted sum and apply the sigmoid function, and the output is the probability that the speaker is male.

However, we still need a mechanism for training the classifier, and this is where TensorFlow comes in.

Implementing this classifier in TensorFlow

To use this classifier in TensorFlow, we first need to convert its design into a computational graph. A computational graph consists of nodes that perform calculations, with data flowing between the nodes.

The computational graph of our logistic regression algorithm is as follows:

It looks a bit different from the diagram given before, but this is mainly because the input x is no longer 20 independent numbers, but a vector containing 20 elements. Here, the weights are represented by the matrix W. Therefore, the dot product obtained before is also replaced by a matrix multiplication here.

In addition, this diagram also includes an input y. It is used to train the classifier and to check how well it works. The dataset we use here contains 3168 example voice recordings, each of which is marked as male or female. These known outcomes are called the labels of the dataset, and they are what we feed into y.

To train our classifier, we load an example into x and let the computational graph make a prediction: is the voice male or female? Since the initial weights are all 0, the classifier is likely to make an incorrect prediction. We need a way to measure how wrong it is, and that is the job of the loss function. The loss function compares the prediction y_pred with the correct output y.

After computing the loss for the training examples, we use a technique called backpropagation to go backward through the computational graph and make small adjustments to W and b in the right direction. If the prediction was male but the correct answer was female, the weights are adjusted slightly up or down so that the classifier is more likely to say "female" the next time it sees the same input.

This training procedure is repeated over and over again using all the examples in the dataset until the computation graph itself has acquired an optimal set of weights, and the loss function, which measures how wrong the predictions are, gets lower over time.

Backpropagation is a great way to train a computational graph, but the math involved can get messy. This is where TensorFlow excels: we just express all the "forward" operations as nodes in the computational graph, and it automatically works out the "backward" operations needed for backpropagation, without us having to do any of the math ourselves. Awesome!
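
To get a feel for what TensorFlow automates, here is a minimal hand-written sketch of one training step for logistic regression. It relies on the standard result that the derivative of the log loss with respect to the weighted sum is simply y_pred - y. Note the assumptions: it uses plain gradient descent rather than the optimizer chosen later in this tutorial, and the function name gradient_step, the learning rate lr, and the toy data are all made up for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gradient_step(X, y, w, b, lr):
    """One step of plain gradient descent on the mean log loss."""
    n = len(X)
    grad_w = [0.0] * len(w)
    grad_b = 0.0
    for xi, yi in zip(X, y):
        y_pred = sigmoid(sum(a * c for a, c in zip(xi, w)) + b)
        error = y_pred - yi   # derivative of the log loss w.r.t. the weighted sum
        for j, xij in enumerate(xi):
            grad_w[j] += error * xij / n
        grad_b += error / n
    # Move every parameter a small step against its gradient.
    new_w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
    new_b = b - lr * grad_b
    return new_w, new_b

# Hypothetical toy data: two features per example, label 1 = male, 0 = female.
X = [[1.0, 2.0], [2.0, 1.0], [-1.0, -2.0], [-2.0, -1.0]]
y = [1, 1, 0, 0]

w, b = [0.0, 0.0], 0.0
for _ in range(100):
    w, b = gradient_step(X, y, w, b, lr=0.5)

preds = [sigmoid(sum(a * c for a, c in zip(xi, w)) + b) for xi in X]
print(preds)   # first two close to 1, last two close to 0
```

In TensorFlow none of the gradient code has to be written by hand; only the forward computation is declared.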

What exactly is TensorFlow?

In the above computational graph, data flows from left to right, from input to output. This is where the "flow" in TensorFlow comes from. But what is a tensor?

All data flowing through this computational graph takes the form of tensors. A tensor is simply an n-dimensional array. I mentioned that W is a weight matrix, but from TensorFlow's point of view it is a second-order tensor, in other words a two-dimensional array.

  • A scalar is a zeroth-order tensor.
  • A vector is a first-order tensor.
  • A matrix is a second-order tensor.
  • A three-dimensional array is a third-order tensor.

And so on...
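
As a small illustration of this list, the order of a tensor stored as nested Python lists is just the nesting depth. The helper name tensor_order is hypothetical and the sketch assumes non-empty, regularly shaped lists.

```python
def tensor_order(value):
    """Count the nesting depth of a (non-empty) nested list."""
    order = 0
    while isinstance(value, list):
        order += 1
        value = value[0]   # descend into the first element
    return order

print(tensor_order(3.14))                          # scalar
print(tensor_order([1, 2, 3]))                     # vector
print(tensor_order([[1, 2], [3, 4]]))              # matrix
print(tensor_order([[[1], [2]], [[3], [4]]]))      # three-dimensional array
```

The four calls print 0, 1, 2 and 3, matching the list above.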

That’s what Tensor is all about. In deep learning scenarios like convolutional neural networks, you’ll need to work with four-dimensional tensors. But the logistic classifier mentioned in this example is much simpler, so we’ll only deal with second-order tensors here, that is, matrices.

I mentioned earlier that x is a vector, or a first-order tensor, but we will treat it as a matrix. The same is true for y. This allows us to calculate the loss for the entire dataset at once.

A single example contains 20 data elements. If you load all 3168 examples into x, x becomes a 3168 x 20 matrix. Multiplying x by W gives y_pred, a 3168 x 1 matrix. In other words, y_pred contains one prediction for each example in the dataset.

By expressing our computation graph in the form of a matrix/tensor, we can make predictions for multiple examples at once.
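
The shape bookkeeping can be checked with a tiny matrix multiplication written in plain Python. This is a sketch with made-up numbers: a 4 x 3 batch stands in for the 3168 x 20 matrix, and the helper matmul is hypothetical, written out only to avoid any dependencies.

```python
def matmul(A, B):
    """Multiply an (n x k) matrix by a (k x m) matrix, both given as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

# Hypothetical toy batch: 4 examples with 3 features each.
X = [[1.0, 2.0, 3.0],
     [0.0, 1.0, 0.0],
     [2.0, 2.0, 2.0],
     [1.0, 0.0, 1.0]]
W = [[0.5], [-1.0], [0.25]]    # 3 x 1 weight matrix

Z = matmul(X, W)               # 4 x 1: one weighted sum per example
print(len(Z), len(Z[0]))       # prints: 4 1
print(Z)                       # prints: [[-0.75], [-1.0], [-0.5], [0.75]]
```

Each row of the result is the dot product for one example, so a single multiplication replaces a loop over all examples.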

Install TensorFlow

OK, that covers the theory behind this tutorial. Now let's put it into practice.

We will use TensorFlow with Python. Your Mac may already have a version of Python installed, but it may be an older version. In this tutorial, I use Python 3.6, so it is best to install the same version.

Installing Python 3.6 is very simple using the Homebrew package manager. If you don't have Homebrew installed yet, follow its installation instructions first.

Next, open a terminal and enter the following command to install the latest version of Python:

    brew install python3

Python also has its own package manager, pip, which we will use to install other packages we need. Enter the following command in the terminal:

    pip3 install numpy
    pip3 install scipy
    pip3 install scikit-learn
    pip3 install pandas
    pip3 install tensorflow

In addition to TensorFlow, we also need to install NumPy, SciPy, pandas, and scikit-learn:

NumPy is a library for working with n-dimensional arrays. Does that sound familiar? NumPy doesn't call them tensors, but as mentioned before, an array is really the same thing as a tensor. The TensorFlow Python API is built on top of NumPy.

SciPy is a library for numerical computing. Some of the other packages depend on it.

Pandas is responsible for loading and cleaning the data set.

Scikit-learn can be considered a competitor of TensorFlow in a sense, because it is also a library for machine learning. We use it in this project because it has many convenient features. Since both TensorFlow and scikit-learn use NumPy arrays, they can work together smoothly.

Actually, you don’t need pandas and scikit-learn to use TensorFlow, but they do provide convenience features that every data scientist would love to use.

These packages are installed in /usr/local/lib/python3.6/site-packages. If you need to look at TensorFlow source code that is not documented on its official website, you can find it there.

Note: pip should automatically install the best version of TensorFlow for your system. If you wish to install a different version, please refer to the official installation guide. Alternatively, you can build TensorFlow yourself from source, which we will explain later in the Building TensorFlow for iOS section.

Let's do a quick test to make sure everything is installed. Create a new tryit.py file with the following content:

    import tensorflow as tf

    a = tf.constant([1, 2, 3])
    b = tf.constant([4, 5, 6])
    sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
    print(sess.run(a + b))

Then run this script through the terminal:

    python3 tryit.py

It will show some debugging information about the device TensorFlow is running on (mostly CPU information, but if you are using a Mac with an Nvidia GPU, it may also provide GPU information). The final result is:

    [5 7 9]

This represents the sum of two vectors a and b. In addition, you may also see the following information:

    W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.

If this message appears, it means that the TensorFlow build installed on your system is not fully optimized for your CPU. One way to fix this is to build TensorFlow yourself from source, because that lets you configure all the options. For this example it makes little difference, so you can simply ignore the warning.

A closer look at the training dataset

To train a classifier, we naturally need data.

In this project, we use the "Gender from Voice" dataset from Kory Becker. To make this tutorial a little different from the usual TensorFlow guides based on MNIST digit recognition, I decided to look for a dataset on Kaggle.com and ended up choosing this one.

So how can we determine gender based on audio? After downloading the dataset and opening the voice.csv file, you will see that it contains rows of numbers:

First of all, note that this is not actual audio data! Instead, these numbers represent different acoustic features of the speech recordings. These features were extracted from the audio recordings by a script and written to this CSV file. The extraction method is beyond the scope of this article, but if you are interested, you can look at the original R source code.

This dataset contains 3168 examples (each example is a row in the table above), and basically half of them are male voice recordings and half are female voice recordings. There are 20 acoustic features in each example, such as:

  • Average frequency in kHz
  • Standard deviation of frequency
  • Spectral flatness
  • Spectral entropy
  • Kurtosis
  • The maximum fundamental frequency measured in the acoustic signal
  • Modulation Index
  • and so on

Don't worry, I don't know what most of these actually mean either, and it doesn't matter for this tutorial. What we really care about is how to use this data to train our classifier so that it can distinguish between male and female voices based on these features.

If you want to use this classifier in an application to detect the gender of a voice from a recording or from microphone input, you first need to extract these acoustic features from the audio data. Once you have these 20 numbers, you can give them to the trained classifier, which will tell you whether the voice is male or female.

Therefore, our classifier does not process the audio recordings directly, but rather the acoustic features extracted from them.

Note: This is a good place to point out the difference between deep learning and traditional algorithms such as logistic regression. The classifier we are training cannot learn very complex things, so you need to help it by extracting features from the data in a preprocessing step. For this particular dataset, that means extracting the acoustic features from the audio recordings.

The cool thing about deep learning is that you can train a neural network to learn how to extract these acoustic features on its own. This way, you can use a deep learning system to take raw audio as input and extract whatever acoustic features it deems important and then classify it without any preprocessing.

This is certainly an interesting direction for deep learning exploration, but it is not within the scope of our discussion today, so perhaps we will write a separate article in the future.

Create a training set and a test set

Earlier, I mentioned that we train the classifier using the following steps:

  • Feed it all the examples from the dataset.
  • Measure how wrong the predictions are.
  • Adjust the weights according to the loss.

It turns out that we shouldn't use all of the data for training. We need to set aside a portion of it, the test set, to evaluate how well our classifier actually works. So we'll split the entire dataset into two parts: a training set, which we'll use to train our classifier, and a test set, which we'll use to see how well our classifier predicts.

In order to split the data into training and test sets, I created a Python script called split_data.py, the contents of which are as follows:

    import numpy as np                               # 1
    import pandas as pd

    df = pd.read_csv("voice.csv", header=0)          # 2

    labels = (df["label"] == "male").values * 1      # 3
    labels = labels.reshape(-1, 1)                   # 4

    del df["label"]                                  # 5
    data = df.values

    # 6
    from sklearn.model_selection import train_test_split
    X_train, X_test, y_train, y_test = train_test_split(
        data, labels, test_size=0.3, random_state=123456)

    np.save("X_train.npy", X_train)                  # 7
    np.save("X_test.npy", X_test)
    np.save("y_train.npy", y_train)
    np.save("y_test.npy", y_test)

Let's take a look at how this script works step by step:

  • First, import the NumPy and pandas packages. Pandas makes it easy to load CSV files and preprocess the data.
  • We use pandas to load the dataset from voice.csv into a dataframe. This object is very similar to a spreadsheet or SQL table.
  • The label column contains the labels for the dataset: whether the example is male or female. Here, we extract these labels into a new NumPy array. The original labels are text, but we convert them to numbers, where 1 = male and 0 = female. (The assignment of numbers is arbitrary; in a binary classifier, we usually use 1 for the "positive" class, the class we are trying to detect.)
  • The new labels array is one-dimensional, but our TensorFlow script expects a two-dimensional tensor with 3168 rows and one column. So we "reshape" the array to make it two-dimensional. This does not change the data in memory, only the way NumPy interprets it.
  • Once we are done with the label column, we remove it from the dataframe so that we are left with only the 20 features that describe the input. We also convert the dataframe into a regular NumPy array.
  • Here, we use a helper function from scikit-learn to split the data and labels arrays into two parts. It randomly shuffles the examples based on random_state, the seed for the random number generator. The actual value doesn't matter; as long as it stays the same, the experiment is reproducible.
  • Finally, save the four new arrays in NumPy's binary file format. Now we have a training set and a test set!
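
To make the idea behind the seeded split concrete, here is a minimal standard-library sketch. It is a simplified stand-in written for illustration, not scikit-learn's actual implementation: the function name split_train_test is made up, and rounding the test-set size up is an assumption, chosen because it turns 3168 examples into 2217 training and 951 test examples. The particular rows that end up in each part will differ from what train_test_split produces.

```python
import math
import random

def split_train_test(data, labels, test_size=0.3, seed=123456):
    """Shuffle row indices with a fixed seed, then cut them into two parts."""
    indices = list(range(len(data)))
    random.Random(seed).shuffle(indices)         # same seed -> same shuffle
    n_test = math.ceil(len(data) * test_size)    # assumption: round the test size up
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    # Index data and labels with the same indices so rows stay paired.
    return ([data[i] for i in train_idx], [data[i] for i in test_idx],
            [labels[i] for i in train_idx], [labels[i] for i in test_idx])

# Hypothetical stand-in data: 3168 rows with a single feature each.
data = [[float(i)] for i in range(3168)]
labels = [[i % 2] for i in range(3168)]

X_train, X_test, y_train, y_test = split_train_test(data, labels)
print(len(X_train), len(X_test))   # prints: 2217 951
```

Calling the function again with the same seed returns exactly the same split, which is what makes the experiment reproducible.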

You can also perform some additional preprocessing to adjust the data in the script, such as scaling the features so that they have zero mean and equal variance, but since this example project is relatively simple, there is no need to make in-depth adjustments.
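
For reference, the kind of feature scaling mentioned here could look like the following sketch. The function name standardize and the sample values are hypothetical, and the code assumes that no column is constant (otherwise the division would fail). In a real pipeline the means and standard deviations would normally be computed on the training set only and then reused for the test set.

```python
import statistics

def standardize(rows):
    """Rescale every column to zero mean and unit variance."""
    columns = list(zip(*rows))
    means = [statistics.mean(col) for col in columns]
    stds = [statistics.pstdev(col) for col in columns]   # population std deviation
    return [[(value - m) / s for value, m, s in zip(row, means, stds)]
            for row in rows]

# Hypothetical values: two features on very different scales.
rows = [[0.1, 100.0], [0.2, 300.0], [0.3, 200.0]]
scaled = standardize(rows)
for row in scaled:
    print(row)
```

After scaling, both columns have comparable ranges, so no single feature dominates the weighted sum just because of its units.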

Run the script in the terminal using the following command:

  1. python3 split_data.py

This will give us 4 new files containing the training examples (X_train.npy), the corresponding labels for those examples (y_train.npy), and the test examples (X_test.npy) and their corresponding labels (y_test.npy).

Note: You may be wondering why some of these variable names are uppercase and some are lowercase. In mathematics, matrices are usually uppercase and vectors are lowercase. In our script, X represents a matrix and y represents a vector. This is a convention that is followed in most machine learning code.

Building a computational graph

Now that we have organized our data, we can write a script to train this logistic classifier using TensorFlow. This script is called train.py. To save space, I will not list the whole script here, but you can view it in the project on GitHub.

As usual, we first need to import the required packages. After that, we load the training data into two NumPy arrays, X_train and y_train. (We will not use the test data in this script.)

    import numpy as np
    import tensorflow as tf

    X_train = np.load("X_train.npy")
    y_train = np.load("y_train.npy")

Now we can build our computational graph. First, we define so-called placeholders for our inputs x and y:

    num_inputs = 20
    num_classes = 1

    with tf.name_scope("inputs"):
        x = tf.placeholder(tf.float32, [None, num_inputs], name="x-input")
        y = tf.placeholder(tf.float32, [None, num_classes], name="y-input")

tf.name_scope("...") can be used to group different parts of the graph into different scopes, making it easier to understand the contents of the graph. We add x and y to the "inputs" scope. We also name them "x-input" and "y-input" so that they can be easily referenced later.

As you'll recall, each input example is a vector with 20 elements. Each example also has a label (1 for male voice, 0 for female voice). I also mentioned earlier that we can combine all the examples into a single matrix so that we can perform full computations on them at once. That's why we define x and y as 2D tensors: x has dimensions [None, 20] and y has dimensions [None, 1].

None means that the first dimension is flexible and not yet known. For the training set, we will put 2217 examples into x and y; for the test set, 951 examples.

Now that TensorFlow knows what our inputs are, let's define the parameters of the classifier:

    with tf.name_scope("model"):
        W = tf.Variable(tf.zeros([num_inputs, num_classes]), name="W")
        b = tf.Variable(tf.zeros([num_classes]), name="b")

The tensor W contains the weights that the classifier will learn (it is a 20 x 1 matrix because it contains 20 input features and 1 output result), and b contains the bias value. Both are declared as TensorFlow variables, which means that they can be updated during the backpropagation process.

With everything in place, we can now declare the computational flow at the heart of our logistic regression classifier:

    y_pred = tf.sigmoid(tf.matmul(x, W) + b)

Here, x is multiplied by W, the bias b is added, and then the logistic sigmoid is applied. The result in y_pred is the probability that the speaker is male, given the acoustic features of the audio data in x.

Note: The above code doesn't actually do any computation yet - so far, we've just built the necessary computational graph. This line simply adds nodes to the computational graph for matrix multiplication (tf.matmul), addition (+), and the sigmoid function (tf.sigmoid). Once the entire computational graph is built, we can create a TensorFlow session and run it with actual data.

The task is not yet complete. In order to train this model, we need to define a loss function. For a binary logistic regression classifier, we need to use log loss. Fortunately, TensorFlow itself has a built-in log_loss() function that can be used directly:

    with tf.name_scope("loss-function"):
        loss = tf.losses.log_loss(labels=y, predictions=y_pred)
        loss += regularization * tf.nn.l2_loss(W)

The log_loss graph node takes as input y, the labels for the examples we are currently looking at, and compares them with our predictions y_pred. The result is a single number, the loss value.

At the beginning of training, the prediction y_pred for all examples will be 0.5 (or 50% male), because the classifier does not yet know what the right answer is. The initial loss is therefore -ln(0.5), which the training script reports as 0.693146. As training progresses, the loss value will become smaller and smaller.
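
The initial loss value is easy to verify by hand. The sketch below computes the mean log loss (binary cross-entropy) in plain Python; the function is a hypothetical re-implementation for illustration, and the small eps that guards against log(0) is an assumption about how such losses are typically stabilized, not a statement about TensorFlow's internals.

```python
import math

def log_loss(labels, predictions, eps=1e-7):
    """Mean binary cross-entropy; eps guards against taking log(0)."""
    total = 0.0
    for y, p in zip(labels, predictions):
        total += -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))
    return total / len(labels)

labels = [1, 0, 1, 0]   # hypothetical labels: 1 = male, 0 = female

untrained = log_loss(labels, [0.5, 0.5, 0.5, 0.5])
better = log_loss(labels, [0.9, 0.1, 0.8, 0.2])

print(untrained)   # about 0.693, i.e. -ln(0.5)
print(better)      # lower, because the predictions are closer to the labels
```

Confident correct predictions push the loss toward 0, while confident wrong predictions are penalized heavily.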

The second line of code adds so-called L2 regularization to the loss. This is done to prevent overfitting, that is, to keep the classifier from simply memorizing the training data. That is not much of a risk here, since our classifier's "memory" consists of only 20 weights and a bias value. However, regularization is a common machine learning technique, so it is worth including here.
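
The regularization term itself is simple arithmetic: tf.nn.l2_loss computes half the sum of the squared weights, and that value is scaled by the regularization strength before being added to the loss. A minimal sketch, with made-up numbers for the weights and the data loss:

```python
def l2_loss(weights):
    """Half the sum of squared weights, the quantity tf.nn.l2_loss computes."""
    return sum(w * w for w in weights) / 2.0

# Hypothetical values for illustration only.
data_loss = 0.25          # stand-in for the log loss on the training data
regularization = 1e-5     # regularization strength
weights = [3.0, -4.0]

penalty = l2_loss(weights)                       # (9 + 16) / 2 = 12.5
total_loss = data_loss + regularization * penalty
print(penalty, total_loss)
```

Large weights increase the penalty, so the optimizer is nudged toward smaller weights unless they clearly reduce the data loss.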

The regularization value here is another placeholder:

    with tf.name_scope("hyperparameters"):
        regularization = tf.placeholder(tf.float32, name="regularization")
        learning_rate = tf.placeholder(tf.float32, name="learning-rate")

We used placeholders before to define our inputs x and y, but they are also useful for defining hyperparameters.

Hyperparameters let you configure the model and how it is trained. They are called "hyper" parameters because, unlike the regular parameters W and b, they are not learned by the model itself; you need to set them to appropriate values yourself.

The learning_rate hyperparameter tells the optimizer how much adjustment to make. The optimizer is responsible for performing backpropagation: it extracts the loss value and passes it back to the computational graph to determine what adjustments need to be made to the weights and biases. There are many different optimizer options to choose from, and the one we use is ADAM:

    with tf.name_scope("train"):
        optimizer = tf.train.AdamOptimizer(learning_rate)
        train_op = optimizer.minimize(loss)

This creates a node in the computational graph called train_op. We will run this node later to train the classifier.

To determine how well the classifier is doing, we will also take occasional snapshots during training and count how many examples in the training set it predicts correctly. Accuracy on the training set is not the final word on how well the classifier performs, but tracking it helps us follow the training process and see the trend in prediction accuracy. In particular, if the training accuracy is getting worse, then something is wrong!

Next we define a graph node that computes the accuracy:

    with tf.name_scope("score"):
        correct_prediction = tf.equal(tf.to_float(y_pred > 0.5), y)
        accuracy = tf.reduce_mean(tf.to_float(correct_prediction), name="accuracy")

We can run the accuracy node to see how many examples were correctly predicted. As you remember, y_pred contains a probability between 0 and 1. By doing tf.to_float(y_pred > 0.5), we get a value of 0 if the prediction was female and 1 if it was male. We can compare this to y, which contains the correct values. The accuracy value is the number of correct predictions divided by the total number of predictions.
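
The same thresholding and averaging can be written out in plain Python, which may make the graph node easier to follow. The function name accuracy_score and the sample values are made up for illustration.

```python
def accuracy_score(probabilities, labels, threshold=0.5):
    """Fraction of examples whose thresholded prediction matches the label."""
    predicted = [1 if p > threshold else 0 for p in probabilities]
    correct = sum(1 for p, y in zip(predicted, labels) if p == y)
    return correct / len(labels)

# Hypothetical predictions and labels (1 = male, 0 = female).
y_pred = [0.9, 0.2, 0.6, 0.4, 0.5]
y = [1, 0, 0, 0, 1]

print(accuracy_score(y_pred, y))   # prints: 0.6 (3 of 5 correct)
```

Note that a probability of exactly 0.5 is not greater than the threshold, so it counts as a "female" prediction, just as in the y_pred > 0.5 comparison.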

After this, we will use the same accuracy node on the test set to see how well the classifier actually works.

In addition, we need to define another node. This node is used to make predictions for data that we do not have corresponding labels for:

    with tf.name_scope("inference"):
        inference = tf.to_float(y_pred > 0.5, name="inference")

To use this classifier in our application, we need to record a few words of spoken text, analyze them to extract 20 acoustic features, and then feed them to the classifier. Since we are dealing with brand new data, not data from the training or test set, we obviously don't have the labels associated with it. We can only feed the data directly to the classifier and hope that it can give the correct prediction. This is where the inference node comes in.

OK, so we've put a lot of work into building this computational graph. Now we want to actually train it using our training set.

Training the classifier

Training is usually done in an infinite loop. That is overkill for this simple logistic classifier, which trains in less than a minute. But for deep neural networks, you may need to run the script for hours or even days, until it achieves satisfactory accuracy or you start to lose patience.

Here is the first part of the training loop in train.py:

    with tf.Session() as sess:
        tf.train.write_graph(sess.graph_def, checkpoint_dir, "graph.pb", False)

        sess.run(init)

        step = 0
        while True:
            # here comes the training code

We first create a new TensorFlow session object. To run the computation graph, you need to establish a session. Calling sess.run(init) will reset W and b to 0.

We also write the computational graph to a file. This serializes all the nodes we created earlier into the /tmp/voice/graph.pb file. We will use this graph definition later to run the classifier on the test set and to put the trained classifier into the iOS application.

Inside the while True: loop, we use the following:

    perm = np.arange(len(X_train))
    np.random.shuffle(perm)
    X_train = X_train[perm]
    y_train = y_train[perm]

First, we randomly shuffle the training examples. This is important because you certainly don’t want the classifier to make decisions based on the exact order of the examples — rather than their acoustic features.

Now comes the most important part: we ask the session to run the train_op node. This performs a single training step on the computational graph:

    feed = {x: X_train, y: y_train, learning_rate: 1e-2,
            regularization: 1e-5}
    sess.run(train_op, feed_dict=feed)

When calling sess.run(), you also need to provide a feed dictionary. This tells TensorFlow the actual values to use for the placeholder nodes.

Since this is just a very simple classifier, we always train on the entire training set at once, so we pass the X_train array into the placeholder x and the y_train array into the placeholder y. (For larger datasets, you would train on small batches of data instead, for example between 100 and 1000 examples at a time.)
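
For larger datasets, the mini-batch approach mentioned in parentheses could be sketched as follows. This is not part of train.py: the generator name iterate_minibatches and the toy data are hypothetical, and in the real script each batch would be passed to sess.run() through the feed dictionary instead of being printed.

```python
def iterate_minibatches(X, y, batch_size):
    """Yield consecutive (X_batch, y_batch) slices of at most batch_size rows."""
    for start in range(0, len(X), batch_size):
        yield X[start:start + batch_size], y[start:start + batch_size]

# Hypothetical toy data: 10 examples with one feature each.
X_train = [[float(i)] for i in range(10)]
y_train = [[i % 2] for i in range(10)]

batches = list(iterate_minibatches(X_train, y_train, batch_size=4))
sizes = [len(X_batch) for X_batch, _ in batches]
print(sizes)   # prints: [4, 4, 2]
```

The last batch is smaller when the dataset size is not a multiple of the batch size, which is why the placeholders leave their first dimension flexible.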

That's all there is to it. Since we use an infinite loop, the train_op node is executed over and over again. In each iteration, backpropagation makes small adjustments to the weights W and b. Over time, this gradually brings the weights closer to their optimal values.

Of course, we need to understand the progress of the training, so we need to output progress reports regularly (in this example project, the results are output every 1000 training runs):

    if step % print_every == 0:
        train_accuracy, loss_value = sess.run([accuracy, loss], feed_dict=feed)
        print("step: %4d, loss: %.4f, training accuracy: %.4f" % \
              (step, loss_value, train_accuracy))

This time we don't run the train_op node, but the accuracy and loss nodes. We use the same feed dictionary, so both accuracy and loss are calculated based on the training set. As mentioned before, a high prediction accuracy on the training set does not mean that the classifier will perform well on the test set, but of course we hope that the accuracy value will continue to increase as training progresses. At the same time, the loss value should continue to decrease.

In addition, we also need to save a checkpoint from time to time:

    if step % save_every == 0:
        checkpoint_file = os.path.join(checkpoint_dir, "model")
        saver.save(sess, checkpoint_file)
        print("*** SAVED MODEL ***")

This takes the W and b values that the classifier has learned so far and saves them to a checkpoint file. We need this checkpoint later when we run the classifier on the test set. The checkpoint file is also saved in the /tmp/voice/ directory.

Run the training script in your terminal using the following command:

    python3 train.py

The output should look like this:

    Training set size: (2217, 20)
    Initial loss: 0.693146
    step: 0, loss: 0.7432, training accuracy: 0.4754
    step: 1000, loss: 0.4160, training accuracy: 0.8904
    step: 2000, loss: 0.3259, training accuracy: 0.9170
    step: 3000, loss: 0.2750, training accuracy: 0.9229
    step: 4000, loss: 0.2408, training accuracy: 0.9337
    step: 5000, loss: 0.2152, training accuracy: 0.9405
    step: 6000, loss: 0.1957, training accuracy: 0.9553
    step: 7000, loss: 0.1819, training accuracy: 0.9594
    step: 8000, loss: 0.1717, training accuracy: 0.9635
    step: 9000, loss: 0.1652, training accuracy: 0.9666
    *** SAVED MODEL ***
    step: 10000, loss: 0.1611, training accuracy: 0.9702
    step: 11000, loss: 0.1589, training accuracy: 0.9707
    . . .

When you find that the loss value no longer drops, wait until you see the next *** SAVED MODEL *** message. At this time, press Ctrl+C to stop training.

With the regularization and learning rate hyperparameters I chose, you should see the accuracy on the training set reach about 97%, with a loss value of about 0.157. (If you set regularization to 0 in the feed dictionary, the loss can be reduced even further.)

The first half ends here, and the second half will look at the actual results of the training and how to use TensorFlow on iOS. Finally, we will discuss the advantages and disadvantages of using TensorFlow on iOS.
