Wednesday, September 26, 2018

Forward Propagation and Backward Propagation

I have been trying my hand at machine learning. I started with supervised learning and later moved on to unsupervised learning. While I was able to breeze through these quickly, I was stuck when I started with deep learning. Apparently, if you want to understand the underlying algorithms/processes, you need to be very clear about mathematical concepts (algebra, calculus, coordinate geometry). Considering I had almost forgotten the basics, I really had a hard time understanding the deep learning concepts.

After spending half a day browsing through various portals, I was finally able to make sense of what forward and backward propagation are all about. I will be putting down the details in mostly layman's terms. A rather complex but easy-to-follow (could qualify as an oxymoron) walkthrough can be found here. Continuing further -

In simple terms - forward propagation is about calculating the output (output node) based on a given input (input node) and a multiplier value (weight). This calculated value may or may not match the actual expected value. The resulting difference between the calculated and the actual expected output is what is called the error. This is where the forward propagation part ends. You may have ReLU, Sigmoid, etc. as activation functions in the equation, where the input node value and the weight are used to calculate the output node (more on this later). A tiny sketch of this idea is below.
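To make this concrete, here is a minimal toy sketch of a single-neuron forward pass (the numbers and the sigmoid choice are my own made-up example, not from any particular library): multiply the input by the weight, pass it through the activation, and compare against the expected value to get the error.

```python
import numpy as np

def sigmoid(z):
    # Squashes any real number into the (0, 1) range
    return 1.0 / (1.0 + np.exp(-z))

# Toy values - made up purely for illustration
x = 1.5           # input node
w = 0.8           # weight (multiplier)
y_expected = 0.7  # actual expected output

# Forward propagation: weighted input -> activation -> output node
z = w * x
y_calculated = sigmoid(z)

# Error between calculated and expected output (squared error here)
error = 0.5 * (y_calculated - y_expected) ** 2
print(y_calculated, error)
```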

Coming to backward propagation - this is effectively a super-set of forward propagation. The intent here is to reduce the calculated error so that the optimal value of the weight can be identified. This becomes a series of repeated cycles back and forth: calculate the error, then keep reducing it by adjusting the weight. As we go back from the output node towards the input node, this involves taking the derivative (basic calculus - the derivative of the error with respect to the weight tells us which way to change the weight to minimize the error). A sketch of one such cycle follows.
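Extending the same toy example, here is a minimal sketch of the repeated forward/backward cycle (again, the numbers and learning rate are made up; this is plain gradient descent on a single weight, not a full network):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Same toy setup as before - illustrative numbers only
x = 1.5
w = 0.1            # start with a guessed weight
y_expected = 0.7
learning_rate = 0.5

for step in range(200):
    # Forward propagation
    z = w * x
    y_calculated = sigmoid(z)
    error = 0.5 * (y_calculated - y_expected) ** 2

    # Backward propagation: chain rule gives d(error)/d(weight)
    d_error_d_y = y_calculated - y_expected      # derivative of squared error
    d_y_d_z = y_calculated * (1 - y_calculated)  # derivative of sigmoid
    d_z_d_w = x                                  # derivative of w*x w.r.t. w
    gradient = d_error_d_y * d_y_d_z * d_z_d_w

    # Nudge the weight in the direction that reduces the error
    w -= learning_rate * gradient

print(w, error)
```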

This takes us to the question of why we even bother with this exercise - well, apart from the fun of doing it, the intent is to get the machine to learn to produce the right output for an arbitrary set of inputs. Deep learning is probably easier to do with libraries like Keras (yes, a Python fan - still working on this).
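For completeness, here is a minimal Keras sketch of the same idea (assuming the TensorFlow-bundled Keras is installed; the data and the y = 2x relationship are invented for illustration). The library takes care of both the forward and backward propagation passes internally when fit() is called.

```python
import numpy as np
from tensorflow import keras

# Made-up data: learn y = 2x from a handful of points
x = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([[2.0], [4.0], [6.0], [8.0]])

# One input node, one output node, one weight (plus a bias)
model = keras.Sequential([
    keras.Input(shape=(1,)),
    keras.layers.Dense(1),
])
model.compile(optimizer="sgd", loss="mse")  # gradient descent on squared error
model.fit(x, y, epochs=500, verbose=0)

print(model.predict(np.array([[5.0]])))  # should be close to 10
```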

P.S. - Please correct me if you are an expert here. And don't bother asking questions about complex algorithms if you happen to land here by mistake, as I am also exploring and may not have answers to your queries.