Neural networks explained
Below is a sneak peek of the articles in our Mobile Machine Learning course with Nimish Narang.
A neural network is a network of interconnected nodes. Each node has weights and a bias associated with it. Data flows through the network from the input nodes to the output nodes.
Each neural network contains layers of nodes.
Each node in a layer is connected to all the nodes in the next layer. Every input travels through the network until it reaches the very end. Along the way, it passes through values that determine how important each particular pathway is.
In more general terms, neural networks are a set of algorithms designed to recognize patterns.
This makes them great at image classification and image recognition. To the machine or model that you build, an image is just a set of data: an array of values, typically between 0 and 255, that represent pixels. If the model can associate certain patterns with certain outcomes, it can learn to solve a problem.
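To make the "array of values" idea concrete, here is a minimal sketch of a tiny grayscale image as the model would see it. The pixel values are made up, and the helper name flatten_and_scale is hypothetical. Scaling the values into the 0 to 1 range is a common preprocessing step, but it is an assumption here rather than something this article prescribes.

```python
# A tiny 2x3 grayscale "image": each value is a pixel intensity from 0 to 255.
# The pixel values are invented purely for illustration.
image = [
    [0, 128, 255],
    [64, 192, 32],
]


def flatten_and_scale(pixels):
    """Flatten the rows into one list and scale each value into 0.0-1.0."""
    return [value / 255 for row in pixels for value in row]


inputs = flatten_and_scale(image)
print(len(inputs))            # 6 values, one per pixel
print(inputs[0], inputs[2])   # 0.0 1.0
```

Each of these six numbers would feed one input node of the network.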
The name ‘neural network’ comes from this structure of a network of nodes. It’s loosely modeled on the human brain, which contains many neurons that are connected to one another and require activation to produce results.
Each network contains several layers. Each layer contains operations to either process inputs or map pathways through the network. These pathways produce specific outputs.
Each layer is itself a mini-network of many interconnected nodes. The number of nodes depends on the complexity of the problem you’re trying to solve. Each node is assigned weights and a bias.
Weights are assigned an initial value that changes over time. When you train your model, you alter the weights. Each weight is assigned to a specific connection between two nodes rather than to a single node.
A bias is a constant value assigned to each node.
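As a rough illustration of where these values live, the sketch below builds the parameters for one layer: one weight per connection and one bias per node. The function name make_layer, the random starting range, and the zero starting biases are all assumptions chosen for this example.

```python
import random


def make_layer(n_inputs, n_nodes, seed=0):
    """Build a hypothetical layer: one weight per connection, one bias per node."""
    rng = random.Random(seed)
    # weights[j][i] belongs to the connection from input i to node j.
    weights = [
        [rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
        for _ in range(n_nodes)
    ]
    # One bias per node; starting at zero is an assumption of this sketch.
    biases = [0.0 for _ in range(n_nodes)]
    return weights, biases


weights, biases = make_layer(n_inputs=3, n_nodes=2)
n_weights = sum(len(row) for row in weights)
print(n_weights, len(biases))  # 6 2
```

With 3 inputs and 2 nodes there are 6 connections, so 6 weights, but only 2 biases.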
How is the weight useful to us?
The weight determines which path to take once a node receives an input. More specifically, it determines how important that path is. This repeats until we reach the end of the network.
At each layer, an activation function is attached to each node. Certain activation functions are better for certain tasks. Each node sums up its weighted inputs and adds its bias. The activation function then transforms that sum, in a way that depends on the function you use, and produces the node’s output.
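Here is a minimal sketch of what a single node might compute. It assumes the sigmoid function as the activation, which is one common choice among many and is not named in this article. The input, weight, and bias numbers are made up.

```python
import math


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def node_output(inputs, weights, bias):
    """Weighted sum of the inputs plus the bias, passed through sigmoid."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(total)


# Made-up numbers: three inputs, one weight per incoming connection.
out = node_output(inputs=[0.5, 0.1, 0.9], weights=[0.4, -0.6, 0.2], bias=0.1)
print(round(out, 3))  # roughly 0.6
```

The weighted sum here is 0.42, and sigmoid maps it to a value a little above 0.6.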
The output will propagate throughout the network until the end. At the end, you can sum all the input values, perform a function, and produce a meaningful output.
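The propagation described above can be sketched as a loop over layers, where each layer's outputs become the next layer's inputs. The layer sizes, weights, and biases below are invented, and sigmoid is again an assumed activation function.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(inputs, weights, biases):
    """Compute every node's output in one fully connected layer."""
    return [
        sigmoid(sum(w * x for w, x in zip(node_weights, inputs)) + bias)
        for node_weights, bias in zip(weights, biases)
    ]


def network_forward(inputs, layers):
    """Feed the outputs of each layer in as the inputs of the next."""
    values = inputs
    for weights, biases in layers:
        values = layer_forward(values, weights, biases)
    return values


# Hypothetical network: 3 inputs, a middle layer of 2 nodes, then 1 output node.
layers = [
    ([[0.2, -0.4, 0.1], [0.7, 0.3, -0.5]], [0.0, 0.1]),
    ([[0.6, -0.3]], [0.05]),
]
result = network_forward([0.5, 0.1, 0.9], layers)
print(result)  # a one-element list holding a value between 0 and 1
```

The single number that comes out the far end is the network's answer for that input.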
Machine learning example
For example, suppose we’re using a neural network to determine if an image is of a face, and the final activation function produces an output of 0.1 for a given image. The closer the output value is to 1, the more likely it is that the image contains a face.
Based on this example, an output of 0.1 means we probably don’t have a face. If the output after the activation function is 0.8, you probably have an image of a face.
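One simple way to turn such a score into a yes-or-no answer is to compare it against a cutoff. The 0.5 threshold and the function name looks_like_face below are assumptions for illustration. The article only says that values closer to 1 mean a face is more likely.

```python
def looks_like_face(score, threshold=0.5):
    """Treat scores at or above the (assumed) threshold as 'face'."""
    return score >= threshold


for score in (0.1, 0.8):
    label = "face" if looks_like_face(score) else "not a face"
    print(score, "->", label)
# 0.1 -> not a face
# 0.8 -> face
```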
For each node in the middle of the network, the input of a node is the output of another node. During training, you feed a large amount of data into the network over and over again, and the weights are adjusted so that the network starts to learn certain patterns and associate them with outputs.
The network adjusts the weights so that if you get a face image, the data takes certain pathways, whereas if you get a non-face image, the data takes different pathways through the weights we’ve trained. The data runs from one side to the other until a pattern is recognized, and similar data ends up mapping to similar pathways.
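To show what "adjusting the weights over and over" can look like, here is a minimal sketch that trains a single sigmoid node. The article does not name a training method, so this assumes plain gradient descent on log loss. The two-feature data set is made up, and the sketch also nudges the bias during training, which is a common choice but an assumption here.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def predict(inputs, weights, bias):
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)


def train(samples, epochs=2000, learning_rate=0.5):
    """Fit one sigmoid node with gradient descent (an assumed method)."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = predict(inputs, weights, bias) - target
            # With sigmoid and log loss, each weight's gradient is error * input.
            weights = [
                w - learning_rate * error * x
                for w, x in zip(weights, inputs)
            ]
            bias -= learning_rate * error
    return weights, bias


# Made-up data: (features, label), where 1 means "face" and 0 means "not a face".
samples = [
    ([0.9, 0.8], 1), ([0.8, 0.9], 1),
    ([0.1, 0.2], 0), ([0.2, 0.1], 0),
]
weights, bias = train(samples)
face_score = predict([0.85, 0.9], weights, bias)
other_score = predict([0.15, 0.1], weights, bias)
print(face_score > 0.5, other_score < 0.5)  # True True
```

After training, inputs that resemble the "face" examples score high and inputs that resemble the others score low, which is the pattern-to-output association described above.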