Here is another one I liked, which focused mainly on the mathematical explanation and coding.
In the "Multi-class perceptron" section you say that "Each column of the weight matrix contains the weights of a separate linear classifier - one for each class. Instead of the dot product wTx, we compute xW, which returns a vector in RC, each of whose entries can be seen as the output of the dot product for a different column of the weight matrix." Forgive me if I simply don't understand what you're saying, because I'm not too familiar with linear algebra, but shouldn't the entries of the vector be the dot product of the vector with one row of the matrix, not a column? And similarly, shouldn't the matrix then have C rows and not C columns?
Edit: I have found my mistake. You were right of course. Thank you for this great tutorial.
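For anyone else who stumbles over the same point, here is a small NumPy sketch of the convention the tutorial uses (the shapes `D` and `C` and the variable names are illustrative, not from the tutorial): with a row vector x of shape (D,) and a weight matrix W of shape (D, C), each entry of xW is the dot product of x with one *column* of W, so each column acts as one class's linear classifier.

```python
import numpy as np

rng = np.random.default_rng(0)
D, C = 4, 3                      # feature dimension, number of classes (illustrative)
x = rng.standard_normal(D)       # one input example, treated as a row vector
W = rng.standard_normal((D, C))  # one weight column per class

scores = x @ W                   # vector in R^C of per-class scores

# Each entry of xW equals the dot product of x with a column of W.
for c in range(C):
    assert np.isclose(scores[c], x @ W[:, c])

# The row-based convention is the same thing with W transposed:
# W.T has shape (C, D), and W.T @ x gives identical scores.
assert np.allclose(scores, W.T @ x)
print(scores.shape)  # (3,)
```

So both views are consistent: columns of a (D, C) matrix in the xW convention are exactly the rows of the (C, D) matrix in the Wx convention.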
Thanks for the article. I loved going through it. Very informative and impactful. This is now at the top of my bookmark list, with this one about tensorflow in second place.