Hi,
If you go through the code I presented in the previous blog, you will find that there are several for loops. They are the main reason the training process is slow. Because training takes so long, we cannot train the network for a large number of iterations, and because of that, we cannot reach a high training/test accuracy. Vectorization is the answer to those issues.
Let's consider eqn (1) from the previous blog.
eqn (1) : z_1^{(i)} = w_1 x_1^{(i)} + w_2 x_2^{(i)} + ... + w_{12288} x_{12288}^{(i)} + b
According to that equation, we take a single input image (the i-th image, 12288 pixel values in total), multiply each pixel value by its corresponding weight, and add the products together along with the bias value. Since this covers only a single image, we need to repeat it for every image in the training dataset. By using matrix operations, we can do the same computation much faster.
Let X represent the whole training dataset, where each column is one training image and each row is one pixel position. In this case the dimensions of X are 12288*209 (there are 209 training images). Instead of treating each weight as a separate scalar, let's collect the weights in a matrix W with dimensions 1*12288. Now we can write eqn (1) for the whole input dataset as follows.
eqn (10) : Z = WX + B
---- eqn (1)
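To make the difference concrete, here is a minimal NumPy sketch that computes the same values with the nested loops of eqn (1) and with the single matrix product of eqn (10). The sizes are scaled down and the data is random, both purely for illustration; the names n_pixels, Z_loop and rng are my own assumptions and not taken from the earlier code.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pixels, m = 8, 5               # illustrative sizes; the post uses 12288 and 209
X = rng.random((n_pixels, m))    # one image per column
W = rng.random((1, n_pixels))    # one weight per pixel
b = 0.1                          # scalar bias

# Loop version of eqn (1): one image and one pixel at a time.
Z_loop = np.zeros((1, m))
for i in range(m):
    z = b
    for j in range(n_pixels):
        z += W[0, j] * X[j, i]
    Z_loop[0, i] = z

# Vectorized version of eqn (10): a single matrix product.
Z = np.dot(W, X) + b             # b is broadcast to every column

print(Z.shape)                   # one z value per image
print(np.allclose(Z, Z_loop))    # both versions agree
```

With the real sizes, the loop version runs about 2.5 million Python-level multiplications per pass, while np.dot hands the whole product to optimized native code.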
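In eqn (10), Z (and therefore Ŷ below) has dimensions 1*209, because W is 1*12288 and X is 12288*209. B is the bias value b, added to every one of the 209 entries. In the same way, we can write vectorized equations that are equivalent to the equations we derived in the previous posts. The line under each equation names the earlier equation it replaces.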
eqn (11): Ŷ = g(Z)
---- eqn (2)
eqn (12): J = np.sum(-Y*log(Ŷ) - (1 - Y)*log(1 - Ŷ))/m
---- eqn (3) and (4)
(Here np.sum means adding up all the elements of the matrix, * is element-wise multiplication, and m is the number of training images.)
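As a rough sketch, eqns (11) and (12) could look like this in NumPy. I am assuming here that g is the sigmoid function, and the function names sigmoid and cost are only illustrative. The sum is divided by m, the number of images, so that the cost is an average and matches the /m in the gradient equations.

```python
import numpy as np

def sigmoid(Z):
    # Assumed activation g: squashes every element of Z into (0, 1).
    return 1.0 / (1.0 + np.exp(-Z))

def cost(Y_hat, Y):
    # Eqn (12): element-wise cross-entropy, summed over all images
    # and averaged over the m columns.
    m = Y.shape[1]
    return np.sum(-Y * np.log(Y_hat) - (1 - Y) * np.log(1 - Y_hat)) / m

# Tiny made-up example: three "images" that already have their z values.
Z = np.array([[0.0, 2.0, -2.0]])
Y = np.array([[1, 1, 0]])

Y_hat = sigmoid(Z)    # eqn (11), same shape as Z
J = cost(Y_hat, Y)    # a single number
print(Y_hat.shape, J)
```

Note that there is no loop over the images: np.log and the arithmetic operators work on the whole 1*m matrix at once.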
eqn (13): ∂W = (Ŷ - Y)X^T/m
---- eqn (6)
eqn (14): ∂B = np.sum(Ŷ - Y)/m
---- eqn (7)
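The two gradient equations translate almost one-to-one into NumPy. The sketch below is only an illustration with small random data; the function name gradients and the sigmoid activation are my assumptions. The point to notice is the shapes: (Ŷ - Y) is 1*m and X^T is m*n, so ∂W comes out as 1*n, the same shape as W.

```python
import numpy as np

def sigmoid(Z):
    # Assumed activation g.
    return 1.0 / (1.0 + np.exp(-Z))

def gradients(W, b, X, Y):
    m = X.shape[1]                        # number of images
    Y_hat = sigmoid(np.dot(W, X) + b)     # eqns (10) and (11)
    dW = np.dot(Y_hat - Y, X.T) / m       # eqn (13), same shape as W
    dB = np.sum(Y_hat - Y) / m            # eqn (14), a single number
    return dW, dB

rng = np.random.default_rng(1)
X = rng.random((4, 6))                    # 4 "pixels", 6 "images" (made up)
Y = rng.integers(0, 2, size=(1, 6))       # random 0/1 labels
W = np.zeros((1, 4))

dW, dB = gradients(W, 0.0, X, Y)
print(dW.shape, dB)
```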
eqn (15): W = W - α∂W
---- eqn (8)
eqn (16): B = B - α∂B
---- eqn (9)
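Putting eqns (10) to (16) together, one possible training loop is sketched below. This is not the final code of this series, just a small illustration on made-up data: the function name train, the learning rate 0.1, the 200 iterations and the zero initialization are all assumptions of mine, and g is again taken to be the sigmoid.

```python
import numpy as np

def sigmoid(Z):
    # Assumed activation g.
    return 1.0 / (1.0 + np.exp(-Z))

def train(X, Y, alpha=0.1, iterations=200):
    n, m = X.shape
    W = np.zeros((1, n))                  # one weight per pixel
    b = 0.0                               # scalar bias
    costs = []
    for _ in range(iterations):           # the only loop left: over iterations
        Y_hat = sigmoid(np.dot(W, X) + b)                       # eqns (10), (11)
        J = np.sum(-Y * np.log(Y_hat)
                   - (1 - Y) * np.log(1 - Y_hat)) / m           # eqn (12)
        costs.append(J)
        dW = np.dot(Y_hat - Y, X.T) / m                         # eqn (13)
        dB = np.sum(Y_hat - Y) / m                              # eqn (14)
        W = W - alpha * dW                                      # eqn (15)
        b = b - alpha * dB                                      # eqn (16)
    return W, b, costs

# Made-up data: 3 "pixels", 20 "images"; the label depends on the first pixel.
rng = np.random.default_rng(2)
X = rng.random((3, 20))
Y = (X[0:1, :] > 0.5).astype(float)

W, b, costs = train(X, Y)
print(costs[0], costs[-1])                # cost before and after training
```

Every pass over the dataset is now a handful of matrix operations, so the loops over images and pixels from the earlier code are gone.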
In the next blog, I will show you the code written according to these equations. You will be able to see that the new code runs much faster and reaches a high training/test accuracy.
Please feel free to raise any concerns/suggestions on this blog post. Let's meet in the next post.
My previous blogs,
How did I learn Machine Learning : part 1 - Create the coding environment
How did I learn Machine Learning : part 2 - Setup conda environment in PyCharm
How did I learn Machine Learning : part 3 - Implement a simple neural network from scratch I
How did I learn Machine Learning : part 3 - Implement a simple neural network from scratch II
How did I learn Machine Learning : part 3 - Implement a simple neural network from scratch III