How did I learn Machine Learning : part 3 - Implement a simple neural network from scratch III

Hi,

In my previous post, I derived the basic equations you need to implement a simple neural network. Since these equations are the basis of the network, it is essential to understand how to derive them. Before starting today's post, I will summarize all the equations we derived previously, because in this post I am going to show you how to program a simple neural network using them.


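eqn (1) : $z_1^{(i)} = w_1 x_1^{(i)} + w_2 x_2^{(i)} + \dots + w_{12288} x_{12288}^{(i)} + b$
eqn (2) : $\hat{y}^{(i)} = a_1^{(i)} = g(z_1^{(i)})$
eqn (3) : $L(a_1^{(i)}, y^{(i)})^{(i)} = -y^{(i)} \log(a_1^{(i)}) - (1 - y^{(i)}) \log(1 - a_1^{(i)})$
eqn (4) : $J = \frac{1}{m} \sum_{i=1}^{m} L(a_1^{(i)}, y^{(i)})^{(i)}$
eqn (6) : $\partial w_1 = \frac{1}{m} \sum_{i=1}^{m} (a_1^{(i)} - y^{(i)}) x_1^{(i)}$
The eqn (6) should be applied for all the weights. Therefore,
$\partial w_2 = \frac{1}{m} \sum_{i=1}^{m} (a_1^{(i)} - y^{(i)}) x_2^{(i)}$ ...
$\partial w_{12288} = \frac{1}{m} \sum_{i=1}^{m} (a_1^{(i)} - y^{(i)}) x_{12288}^{(i)}$
eqn (7) : $\partial b = \frac{\partial J}{\partial b} = \frac{1}{m} \sum_{i=1}^{m} (a_1^{(i)} - y^{(i)})$
eqn (8) : $w_1 = w_1 - \alpha \, \partial w_1$
The eqn (8) should be applied for all the weights. Therefore,
$w_2 = w_2 - \alpha \, \partial w_2$ ...
$w_{12288} = w_{12288} - \alpha \, \partial w_{12288}$
eqn (9) : $b = b - \alpha \, \partial b$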
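
To make the equations concrete, here is a minimal sketch of one way they could be turned into loop-based Python. It is not the code linked in this post. It assumes that g is the sigmoid function (which is consistent with the gradients in eqn (6) and eqn (7)), and the names `sigmoid`, `forward`, `train_step` and the tiny 2-feature dataset are hypothetical, chosen only for illustration. With the real data there would be 12288 features instead of 2.

```python
import math


def sigmoid(z):
    """Assumed activation g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(w, b, x):
    """eqn (1) and eqn (2) for a single example x (a list of features)."""
    z = b
    for j in range(len(w)):
        z += w[j] * x[j]
    return sigmoid(z)


def train_step(w, b, X, Y, alpha):
    """One gradient descent iteration over all m examples.

    Returns the updated weights, the updated bias, and the cost J
    computed with the weights that were passed in.
    """
    m, n = len(X), len(w)
    dw = [0.0] * n
    db = 0.0
    J = 0.0
    for i in range(m):
        a = forward(w, b, X[i])
        # eqn (3): loss of example i, accumulated for eqn (4)
        J += -Y[i] * math.log(a) - (1 - Y[i]) * math.log(1 - a)
        # eqn (6) and eqn (7): accumulate the sums over the examples
        for j in range(n):
            dw[j] += (a - Y[i]) * X[i][j]
        db += a - Y[i]
    J /= m
    # eqn (8) and eqn (9): divide the sums by m and update
    w = [w[j] - alpha * dw[j] / m for j in range(n)]
    b = b - alpha * db / m
    return w, b, J


# Hypothetical toy data: 4 examples, 2 features each (logical AND).
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
Y = [0, 0, 0, 1]

w, b = [0.0, 0.0], 0.0
costs = []
for _ in range(100):
    w, b, J = train_step(w, b, X, Y, alpha=0.5)
    costs.append(J)

print("first cost:", costs[0])
print("last cost:", costs[-1])
```

The nested `for` loops mirror the sums in the equations one-to-one, which makes the sketch easy to compare against the derivation but also slow when the number of features is large.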

Using the above equations, I have implemented a simple neural network. You can find the code here. You can also find the training and test data here. In this post, I am not going to explain each line, because I have added the necessary comments to the code. If anything is unclear, please ask. I also recommend that you implement the code on your own and compare it with mine.

I need to discuss a few points based on my code.

  1. If you run the code, you will notice that the training accuracy and the test accuracy are very low, even though the cost is decreasing. The reason is that I am training the network for only a small number of iterations (100), which is not enough. You can check this by increasing the number of iterations.
  2. If you run the code, you will notice that it takes a considerable amount of time to train the network (more than 5 minutes on my laptop). The reason is that I have used for loops throughout the code, which slows down its execution. In the next post, I will show you how to speed up the training of the network using vectorization.
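
The effect described in point 1 can be tried on a toy problem. The sketch below is only an illustration under stated assumptions, not the code linked in this post: it assumes a sigmoid activation, assumes a prediction of 1 whenever the activation is above 0.5, and uses hypothetical helper names (`predict`, `accuracy`, `train`) and a made-up 4-example dataset. It trains the same loop-based model for a small and a large number of iterations and prints the training accuracy for each.

```python
import math


def predict(w, b, x):
    """Return 1 if the activation exceeds 0.5, else 0 (assumed threshold)."""
    z = b + sum(wj * xj for wj, xj in zip(w, x))
    a = 1.0 / (1.0 + math.exp(-z))
    return 1 if a > 0.5 else 0


def accuracy(w, b, X, Y):
    """Fraction of examples whose predicted label matches the true label."""
    correct = sum(1 for x, y in zip(X, Y) if predict(w, b, x) == y)
    return correct / len(Y)


def train(X, Y, alpha, num_iterations):
    """Loop-based gradient descent starting from zero weights and bias."""
    m, n = len(X), len(X[0])
    w, b = [0.0] * n, 0.0
    for _ in range(num_iterations):
        dw, db = [0.0] * n, 0.0
        for x, y in zip(X, Y):
            z = b + sum(wj * xj for wj, xj in zip(w, x))
            err = 1.0 / (1.0 + math.exp(-z)) - y  # (a - y) for this example
            for j in range(n):
                dw[j] += err * x[j] / m
            db += err / m
        w = [wj - alpha * dwj for wj, dwj in zip(w, dw)]
        b -= alpha * db
    return w, b


# Hypothetical toy data: 4 examples, 2 features each (logical AND).
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
Y = [0, 0, 0, 1]

for iters in (5, 2000):
    w, b = train(X, Y, alpha=0.5, num_iterations=iters)
    print(iters, "iterations -> training accuracy", accuracy(w, b, X, Y))
```

On real image data the numbers will of course differ, but the pattern is the same: the cost can fall steadily while the predictions have not yet crossed the 0.5 threshold, so accuracy lags behind until the model has been trained for long enough.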

Please feel free to raise any concerns/suggestions on this blog post. Let's meet in the next post.

My previous posts:
How did I learn Machine Learning : part 1 - Create the coding environment
How did I learn Machine Learning : part 2 - Setup conda environment in PyCharm
How did I learn Machine Learning : part 3 - Implement a simple neural network from scratch I
How did I learn Machine Learning : part 3 - Implement a simple neural network from scratch II

References:
Coursera