Building a simple logistic regression model from scratch using TensorFlow

TensorFlow is a machine learning framework with a Python API. After finishing the logistic regression material in the Coursera course, I wanted to re-implement in TensorFlow what the course implements in MATLAB, as a stepping stone for learning both Python and the framework.

Target audience

Readers who know what logistic regression is, know a little Python, and have heard of TensorFlow.

Dataset

ex2data1.txt from Andrew Ng's Machine Learning course on Coursera. The task is to predict whether a student will be admitted based on his or her two exam scores.

Environment

Python 2.7 - 3.x

pandas, matplotlib, numpy

Install TensorFlow

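Install TensorFlow on your computer; the installation process is not described here. The CPU version is easier to install, while the GPU version requires CUDA support, so choose whichever suits your needs. Note that the code in this article uses the TensorFlow 1.x API (tf.placeholder, tf.Session); under TensorFlow 2.x these calls are available through tf.compat.v1 with eager execution disabled.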

Getting started

Create a folder (for example, tensorflow), create a Python file main.py in it, and put the dataset file in the same folder.

Data format:

The first two columns are the scores of the two exams (x1, x2), and the last column indicates whether the student was admitted (y): 1 means admitted and 0 means not admitted.

In the source file main.py, we first import the required packages:

import pandas as pd  # used to read the data file
import tensorflow as tf
import matplotlib.pyplot as plt  # used for plotting
import numpy as np  # used for later calculations

pandas is a data processing package that can read data sets and perform many other operations on them; matplotlib is used to plot our data set as a chart.

Then we read the dataset file into the program for subsequent training:

# Read the data file
df = pd.read_csv("ex2data1.txt", header=None)
train_data = df.values

The pandas function read_csv reads the data in the CSV (comma-separated values) file into the DataFrame df, and df.values converts the DataFrame into a two-dimensional array.

After we have the data, we need to put the features (x1, x2) and labels (y) into two variables respectively so that we can substitute them into the formula during training:

# Separate features and labels, and get the data dimensions
train_X = train_data[:, :-1]
train_y = train_data[:, -1:]
feature_num = len(train_X[0])
sample_num = len(train_X)
print("Size of train_X: {}x{}".format(sample_num, feature_num))
print("Size of train_y: {}x{}".format(len(train_y), len(train_y[0])))

Output:

Size of train_X: 100x2
Size of train_y: 100x1

As you can see, there are 100 samples in our data set, and the number of features of each sample is 2.
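To see why the two slices give a two-column feature array and a one-column label array, here is a minimal sketch on a few made-up rows in the same three-column layout. It reads from an in-memory string with NumPy's loadtxt instead of pandas, purely so the snippet runs without the dataset file; the sample values and the name sample_text are illustrative assumptions, not rows from ex2data1.txt.

```python
import io
import numpy as np

# Hypothetical rows in the same layout: exam 1 score, exam 2 score, admitted (1) or not (0)
sample_text = """35.0,78.0,0
60.5,86.0,1
79.0,75.5,1
45.0,56.5,0
"""

data = np.loadtxt(io.StringIO(sample_text), delimiter=",")

features = data[:, :-1]      # every column except the last
labels = data[:, -1:]        # the slice -1: keeps a column vector of shape (n, 1)
labels_flat = data[:, -1]    # the plain index -1 would drop to shape (n,)

print(features.shape, labels.shape, labels_flat.shape)  # (4, 2) (4, 1) (4,)
```

Keeping y as a column matters later: XW + b has shape (n, 1), and a label vector of shape (n,) would broadcast against it into an (n, n) matrix instead of an element-wise product.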

TensorFlow model design

In logistic regression, the prediction function (Hypothesis) we use is:

hθ(x)=sigmoid(XW+b)

Here sigmoid is the activation function, and its output hθ(x) represents the probability that a student is admitted:

P(y=1|x,θ)

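The sigmoid function is an S-shaped curve that maps any real number into the interval (0, 1), with sigmoid(0) = 0.5. If you are not familiar with its shape, look up a plot of it.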

W and b are the parameters we want to learn. W is the weight matrix (weights), and b is the bias (also called the intercept).
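The hypothesis can be sketched in a few lines of NumPy before moving to TensorFlow. This is only an illustration under assumed inputs: the names X_demo, W_demo and b_demo and their values are made up, and the shapes follow the convention used in this article (X is n x 2, W is 2 x 1).

```python
import numpy as np

def sigmoid(z):
    """Map real numbers into the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def hypothesis(X, W, b):
    """Return sigmoid(XW + b) as a column of probabilities with shape (n, 1)."""
    return sigmoid(X @ W + b)

# Hypothetical inputs: 3 samples with 2 features each
X_demo = np.array([[30.0, 40.0], [60.0, 70.0], [90.0, 95.0]])
W_demo = np.zeros((2, 1))   # zero weights
b_demo = 0.0                # zero bias

probs = hypothesis(X_demo, W_demo, b_demo)
print(probs.ravel())  # with zero weights and zero bias every probability is 0.5
```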

The loss function we use is:

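$$J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log\left(h_\theta(x^{(i)})\right) + \left(1 - y^{(i)}\right)\log\left(1 - h_\theta(x^{(i)})\right)\right]$$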

Since our data set has only two features, overfitting is not a concern here, so we omit the regularization term from the loss function.
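As a quick sanity check of the loss formula, the following NumPy sketch evaluates it on a tiny made-up data set (X_demo and y_demo are assumptions for illustration). With all parameters at zero, every prediction is 0.5, so the loss should come out as log(2) regardless of the labels.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cross_entropy_loss(X, y, W, b):
    """J = -(1/m) * sum(y*log(h) + (1-y)*log(1-h)), with h = sigmoid(XW + b)."""
    m = X.shape[0]
    h = sigmoid(X @ W + b)
    return float(-np.sum(y * np.log(h) + (1 - y) * np.log(1 - h)) / m)

# Hypothetical data: 3 samples, 2 features, labels as a column vector
X_demo = np.array([[30.0, 40.0], [60.0, 70.0], [90.0, 95.0]])
y_demo = np.array([[0.0], [1.0], [1.0]])

loss_at_zero = cross_entropy_loss(X_demo, y_demo, np.zeros((2, 1)), 0.0)
print(loss_at_zero)  # log(2), about 0.693, because every prediction is 0.5
```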

First, we use TensorFlow to define two variables to store our training data:

# Dataset
X = tf.placeholder(tf.float32)
y = tf.placeholder(tf.float32)

Here, X and y are not ordinary variables but placeholders: their values are left unspecified until you start training the model, at which point you feed the given data into them.

Next, we define the W and b we want to train:

# Training targets
W = tf.Variable(tf.zeros([feature_num, 1]))
b = tf.Variable([-.9])

Here, their type is Variable, which means that these two values keep changing during the training iterations and eventually reach the values we expect. As you can see, we set the initial value of W to a zero vector with feature_num rows and the initial value of b to -0.9 (an arbitrary choice, don't mind it 😶).

Next, we need to express the loss function using TensorFlow:

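# Linear part XW + b; W already has shape [feature_num, 1], so the reshape leaves it unchanged
db = tf.matmul(X, tf.reshape(W, [-1, 1])) + b
# Predicted probability of admission
hyp = tf.sigmoid(db)

# The two terms inside the sum of the loss function
cost0 = y * tf.log(hyp)
cost1 = (1 - y) * tf.log(1 - hyp)
# Apply the constant factor -1/m, then sum over all samples
cost = (cost0 + cost1) / -sample_num
loss = tf.reduce_sum(cost)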

As you can see, I express the loss function in three steps: first compute the two terms inside the sum separately, then add them and multiply by the constant factor -1/m, and finally sum the resulting vector to get the value of the loss function.

Next, we define the optimization method to use:

optimizer = tf.train.GradientDescentOptimizer(0.001)
train = optimizer.minimize(loss)

The first line selects the optimizer; here we choose gradient descent. The second line sets the optimization target: as the function name suggests, our goal is to minimize the value of the loss function.
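To make the idea of minimizing the loss by gradient descent concrete, here is a hand-written NumPy sketch. It is not what TensorFlow executes internally; it uses the standard gradient of this loss, X^T(h - y)/m for W and mean(h - y) for b, on a small made-up data set. The data, the learning rate of 0.1 and the 200 steps are assumptions chosen only so the effect is visible quickly.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(X, y, W, b):
    h = sigmoid(X @ W + b)
    return float(-np.mean(y * np.log(h) + (1 - y) * np.log(1 - h)))

def gradient_step(X, y, W, b, learning_rate):
    """One gradient descent update of W and b for the logistic regression loss."""
    m = X.shape[0]
    error = sigmoid(X @ W + b) - y        # shape (m, 1)
    grad_W = X.T @ error / m              # shape (features, 1)
    grad_b = float(np.mean(error))
    return W - learning_rate * grad_W, b - learning_rate * grad_b

# Hypothetical, small-valued data so that a few hundred steps show clear progress
X_demo = np.array([[0.5, 1.0], [1.5, 2.0], [3.0, 2.5], [4.0, 4.5]])
y_demo = np.array([[0.0], [0.0], [1.0], [1.0]])

W_demo, b_demo = np.zeros((2, 1)), 0.0
loss_before = loss(X_demo, y_demo, W_demo, b_demo)
for _ in range(200):
    W_demo, b_demo = gradient_step(X_demo, y_demo, W_demo, b_demo, learning_rate=0.1)
loss_after = loss(X_demo, y_demo, W_demo, b_demo)

print(loss_before, loss_after)  # the loss after training is lower than before
```

In the TensorFlow version the gradients are derived automatically from the definition of loss, so only the optimizer and the target need to be specified.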

Note: Keep the learning rate small (0.001 here). With a larger rate the parameters can overshoot, the sigmoid output can saturate to exactly 0 or 1, and the loss calculation then runs into log(0).
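The log(0) problem can be reproduced in isolation. The NumPy sketch below shows a large raw score saturating the sigmoid to exactly 1.0 in floating point, which makes log(1 - h) evaluate to negative infinity. The article itself only lowers the learning rate; clipping the probability before taking the logarithm is a common alternative guard, shown here as an assumption and not as part of the original code. The value of eps is illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# A large raw score XW + b, as can occur once the weights have overshot
z_large = np.array([40.0])
h = sigmoid(z_large)                    # rounds to exactly 1.0 in floating point

with np.errstate(divide="ignore"):
    unclipped = np.log(1.0 - h)         # log(0) evaluates to -inf

# Guard: clip the probability away from 0 and 1 before taking logs
eps = 1e-7                              # illustrative value, not from the article
h_safe = np.clip(h, eps, 1.0 - eps)
clipped = np.log(1.0 - h_safe)          # finite

print(h[0], unclipped[0], clipped[0])
```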

Training

After completing the above work, we can start training our model.

In TensorFlow, we first need to initialize the previously defined Variable:

init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)

Here we see tf.Session(), which is what actually executes the work. Everything we defined above only describes the steps and structure a model needs to produce results, something like a flowchart. A flowchart alone is not enough; we need something to actually run it, and that is the role of the Session.
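The define-first, run-later idea can be imitated in plain Python. The sketch below is only an analogy and does not use or reproduce any TensorFlow API: placeholder, add, multiply and ToySession are hypothetical names invented for this illustration. Building the expression computes nothing; values appear only when the toy session runs it with fed data.

```python
# A toy "graph": each node is a function that computes its value from a feed dictionary.
# Nothing is computed when the nodes are defined.

def placeholder(name):
    return lambda feed: feed[name]

def add(a, b):
    return lambda feed: a(feed) + b(feed)

def multiply(a, b):
    return lambda feed: a(feed) * b(feed)

class ToySession:
    """Hypothetical stand-in for the role a session plays: it executes a node."""

    def run(self, node, feed):
        return node(feed)

x_node = placeholder("x")
w_node = placeholder("w")
b_node = placeholder("b")
y_node = add(multiply(x_node, w_node), b_node)   # only describes x*w + b

result = ToySession().run(y_node, {"x": 3.0, "w": 2.0, "b": 1.0})
print(result)  # 7.0
```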

----------Special Tips----------

If you are using the GPU version of TensorFlow and want to train the model while the graphics card is already heavily used (for example, by a game), you must allocate a fixed amount of video memory to TensorFlow when creating the session. Otherwise training may fail with an error and exit immediately:

2017-06-27 20:39:21.955486: E c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\stream_executor\cuda\cuda_blas.cc:365] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
Traceback (most recent call last):
  File "C:\Users\DYZ\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\client\session.py", line 1139, in _do_call
    return fn(*args)
  File "C:\Users\DYZ\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\client\session.py", line 1121, in _run_fn
    status, run_metadata)
  File "C:\Users\DYZ\Anaconda3\envs\tensorflow\lib\contextlib.py", line 66, in __exit__
    next(self.gen)
  File "C:\Users\DYZ\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InternalError: Blas GEMV launch failed: m=2, n=100
         [[Node: MatMul = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false, _device="/job:localhost/replica:0/task:0/gpu:0"](_arg_Placeholder_0_0/_3, Reshape)]]

In that case, create the Session as follows:

gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.333)
sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))

The 0.333 here is the fraction of your total video memory that this process is allowed to use.

----------End Special Tips----------

Now we use our dataset to train the model:

# Data fed to the placeholders X and y
feed_dict = {X: train_X, y: train_y}

for step in range(1000000):
    sess.run(train, feed_dict)
    # Print the current parameters every 100 steps
    if step % 100 == 0:
        print(step, sess.run(W).flatten(), sess.run(b).flatten())

First, we store the data to be fed in the variable feed_dict and pass it to sess.run() at each training step. We run 1,000,000 training iterations and print the current values of the target parameters W and b every 100 iterations.

At this point the training code is complete, and you can run it with your python command. If you followed the code above exactly and no errors occur, you should see the training status being printed continuously in the console.

Graphical representation of results

When training is complete, you have a W and a b, so we can display the data set and the fitted result together in a chart.

While writing this article, I trained the model with the code above and obtained the following parameters, which we write directly into the code:

w = [0.12888144, 0.12310864]
b = -15.47322273
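As a small usage sketch, these parameters can be plugged into the hypothesis to estimate the admission probability for one student. The scores 45 and 85 are a hypothetical example, not a row from the data set, and the 0.5 threshold is the usual convention that corresponds to XW + b = 0. For these inputs the probability comes out at roughly 0.69.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Parameters reported above
w = np.array([0.12888144, 0.12310864])
b = -15.47322273

# Hypothetical student: the scores 45 and 85 are illustrative values
scores = np.array([45.0, 85.0])
probability = sigmoid(scores @ w + b)
admitted = bool(probability >= 0.5)   # the 0.5 threshold corresponds to XW + b = 0

print(probability, admitted)
```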

Let's first represent the data set on a chart (x1 is the horizontal axis and x2 is the vertical axis):

x1 = train_data[:, 0]
x2 = train_data[:, 1]
y = train_data[:, -1:]

for x1p, x2p, yp in zip(x1, x2, y):
    if yp == 0:
        plt.scatter(x1p, x2p, marker='x', c='r')
    else:
        plt.scatter(x1p, x2p, marker='o', c='g')

Here, a red x represents a student who was not admitted and a green o represents a student who was admitted.

Next, we plot the decision boundary XW + b = 0 obtained through training on the graph:

# Compute the decision boundary w[0]*x1 + w[1]*x2 + b = 0 from the trained parameters
x = np.linspace(20, 100, 10)
y = []
for i in x:
    # Solve the boundary equation for x2 at the x1 value i
    y.append((-w[0] * i - b) / w[1])

plt.plot(x, y)
plt.show()

At this point, if your code is correct, run it again and you will get a chart showing the data set with the decision boundary drawn through it.

As you can see, the parameters obtained through training define a straight line that separates the two classes of samples well.
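The line being plotted comes from solving w[0]*x1 + w[1]*x2 + b = 0 for x2, with x1 on the horizontal axis. The following NumPy sketch checks that reasoning numerically: every point on the line should have a raw score of 0 and therefore a predicted probability of 0.5. The helper name boundary_x2 is introduced here only for illustration.

```python
import numpy as np

# Parameters from the training run reported in this article
w = [0.12888144, 0.12310864]
b = -15.47322273

def boundary_x2(x1):
    """Solve w[0]*x1 + w[1]*x2 + b = 0 for x2."""
    return (-w[0] * x1 - b) / w[1]

x1_values = np.linspace(20, 100, 5)
x2_values = boundary_x2(x1_values)

# On the boundary the raw score is 0, so the sigmoid gives 0.5
raw_scores = w[0] * x1_values + w[1] * x2_values + b
probabilities = 1.0 / (1.0 + np.exp(-raw_scores))

print(np.round(probabilities, 6))
```

Points above the line get a probability greater than 0.5 (predicted admitted), and points below it get less than 0.5.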

At this point, a complete and simple logistic regression model has been implemented. I hope this article gives you a preliminary understanding of how machine learning models are implemented in TensorFlow. I am still a beginner myself, so if anything here is wrong, please point it out in the comments. If you run into problems while implementing the code above, feel free to ask there as well.
