Neural Networks: The Core of Deep Learning
What are neural networks?
Neural networks are a subset of machine learning and are at the heart of deep learning algorithms. Their name and structure are inspired by the human brain, mimicking the way that biological neurons signal to one another¹.
Neural networks, also known as artificial neural networks (ANNs) or simulated neural networks (SNNs), consist of layers of nodes: an input layer, one or more hidden layers, and an output layer. Each node, or artificial neuron, connects to nodes in the next layer and has an associated weight and threshold. If the output of any individual node is above the specified threshold value, that node is activated, sending data to the next layer of the network. Otherwise, no data is passed along to the next layer of the network.
How do neural networks work?
Think of each node as its own linear regression model, composed of input data, weights, a bias (or threshold), and an output. The formula would look something like this:
$$\sum w_ix_i + bias = w_1x_1 + w_2x_2 + w_3x_3 + bias$$
$$output = f(x) = 1 \text{ if } \sum w_ix_i + bias \geq 0; 0 \text{ if } \sum w_ix_i + bias < 0$$
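To make the formula concrete, here is a minimal sketch of a single node in plain Python. The function name `node_output` and the example numbers are made up for illustration; the logic is just the weighted sum plus bias, followed by the 0/1 threshold rule above.

```python
def node_output(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through a step activation."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    # Weighted sum: w_1*x_1 + w_2*x_2 + ... + bias
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    # Step activation: the node outputs 1 when z >= 0, otherwise 0
    return 1 if z >= 0 else 0


# Hypothetical values, chosen only to exercise both branches
print(node_output([1, 0, 0], [0.5, 0.5, 0.5], -0.25))  # weighted sum is 0.25
print(node_output([0, 0, 0], [0.5, 0.5, 0.5], -0.25))  # weighted sum is -0.25
```

Note that a weighted sum of exactly 0 still activates the node, because the rule uses "greater than or equal to".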
Once an input layer is determined, weights are assigned. These weights help determine the importance of any given variable, with larger ones contributing more significantly to the output compared to other inputs. All inputs are then multiplied by their respective weights and summed. Afterwards, the weighted sum is passed through an activation function, which determines the node's output. If that output exceeds a given threshold, it “fires” (or activates) the node, passing data to the next layer in the network. As a result, the output of one node becomes the input of the next node. This process of passing data from one layer to the next defines this neural network as a feedforward network².
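Here is a rough sketch of what "passing data from one layer to the next" could look like in plain Python. The helper names (`step`, `layer_forward`, `feedforward`) and the weights in `XOR_LAYERS` are assumptions picked by hand for illustration; in a real network the weights would be learned, not written down like this.

```python
def step(z):
    """Step activation: 1 if the weighted sum is >= 0, else 0."""
    return 1 if z >= 0 else 0


def layer_forward(inputs, weights, biases):
    """Compute the output of every node in one layer.

    `weights` holds one list of weights per node; `biases` holds one bias per node.
    """
    return [
        step(sum(w * x for w, x in zip(node_weights, inputs)) + b)
        for node_weights, b in zip(weights, biases)
    ]


def feedforward(inputs, layers):
    """Feed the outputs of each layer in as the inputs of the next one."""
    activations = list(inputs)
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations


# Hand-picked (not learned) weights: a hidden layer with an OR node and a
# NAND node, followed by an AND output node. Together they compute XOR.
XOR_LAYERS = [
    ([[1, 1], [-1, -1]], [-1, 1]),  # hidden layer: OR, NAND
    ([[1, 1]], [-2]),               # output layer: AND
]

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", feedforward([a, b], XOR_LAYERS))
```

The point of the two-layer example is that the output node never sees the raw inputs, only the outputs of the hidden nodes.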
Let’s break down what one single node might look like using binary values. We can apply this concept to a more tangible example, like whether you should go surfing (Yes: 1, No: 0). The decision to go or not to go is our predicted outcome, or y-hat.
Let’s assume that there are three factors influencing your decision-making:
Are the waves good? (Yes: 1, No: 0)
Is the line-up empty? (Yes: 1, No: 0)
Has there been a recent shark attack? (Yes: 1, No: 0)
Then, let’s assume the following weights and bias values:
Weight for waves: 3
Weight for the line-up: 2
Weight for shark attack: -5
Bias: -2
We can then calculate our output as follows:
$$output = f(x) = 1 \text{ if } 3x_1 + 2x_2 -5x_3 -2 \geq 0; 0 \text{ if } 3x_1 + 2x_2 -5x_3 -2 < 0$$
If we plug in some values for our inputs (e.g. x1 = 1, x2 = 0, x3 = 0), we get:
$$output = f(x) = 1 \text{ if } 3(1) + 2(0) -5(0) -2 \geq 0; 0 \text{ if } 3(1) + 2(0) -5(0) -2 < 0$$
$$output = f(x) = 1 \text{ if } 3 -2 \geq 0; 0 \text{ if } 3 -2 < 0$$
$$output = f(x) = 1 \text{ if } 1 \geq 0; 0 \text{ if } 1 < 0$$
$$output = f(x) = 1$$
This means that the node is activated and sends a positive signal to the next layer, indicating that you should go surfing.
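If you want to check the arithmetic yourself, the surfing node can be sketched in a few lines of Python. The weights and bias are the ones from the example; the names `should_surf`, `WEIGHTS`, and `BIAS` are just labels chosen for this sketch, and the inputs use the 0/1 encodings listed earlier.

```python
from itertools import product

WEIGHTS = (3, 2, -5)  # waves, line-up, shark attack (values from the example)
BIAS = -2


def should_surf(x1, x2, x3):
    """Return 1 (go surfing) if the weighted sum plus bias is >= 0, else 0."""
    z = WEIGHTS[0] * x1 + WEIGHTS[1] * x2 + WEIGHTS[2] * x3 + BIAS
    return 1 if z >= 0 else 0


# The worked example: x1 = 1, x2 = 0, x3 = 0 gives 3 - 2 = 1, so the node fires
print(should_surf(1, 0, 0))

# Every possible combination of the three binary inputs
for x1, x2, x3 in product((0, 1), repeat=3):
    print(x1, x2, x3, "->", should_surf(x1, x2, x3))
```

Running through all eight combinations shows how much the third weight dominates: whenever x3 = 1, the -5 outweighs the other two inputs and the output is 0.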
How can we use neural networks in deep learning?
To use neural networks in deep learning, we need to learn how to create them, train them, and evaluate them using a deep learning framework such as TensorFlow or PyTorch.
Here are some steps to follow to use neural networks in deep learning:
Define the problem and the data: What are we trying to achieve? What kind of data do we have? How can we preprocess and split the data into training, validation, and test sets?
Define the network architecture: How many layers do we need? How many nodes per layer? What kind of activation functions do we use? What kind of loss function do we use? What kind of optimizer do we use?
Train the network: How do we feed the data to the network? How do we update the weights and biases using backpropagation and gradient descent? How do we monitor the training progress and performance?
Evaluate the network: How do we measure the accuracy and generalization of the network on unseen data? How do we identify and avoid overfitting or underfitting? How do we fine-tune the network parameters and hyperparameters?
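The four steps above can be sketched end to end without any framework. This is a toy illustration under several assumptions, not how TensorFlow or PyTorch do it internally: the data is synthetic (points labelled 1 when x1 + x2 > 1), the "network" is a single sigmoid node, the loss is log loss, and the split sizes, learning rate, and epoch count are arbitrary choices. For a single node, backpropagation reduces to one application of the chain rule, which gives the `prediction - label` error term used below.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(x, weights, bias):
    """Output of one sigmoid node: a value between 0 and 1."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def mean_log_loss(data, weights, bias):
    eps = 1e-12  # keeps log() away from zero
    total = 0.0
    for x, y in data:
        p = predict_proba(x, weights, bias)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(data)


def accuracy(data, weights, bias):
    correct = sum((predict_proba(x, weights, bias) >= 0.5) == (y == 1) for x, y in data)
    return correct / len(data)


def train(train_set, val_set, epochs=1000, lr=0.5):
    """Full-batch gradient descent; records validation loss every 100 epochs."""
    weights, bias = [0.0, 0.0], 0.0
    history = []
    n = len(train_set)
    for epoch in range(epochs):
        grad_w, grad_b = [0.0, 0.0], 0.0
        for x, y in train_set:
            error = predict_proba(x, weights, bias) - y  # d(loss)/d(weighted sum)
            for i, xi in enumerate(x):
                grad_w[i] += error * xi
            grad_b += error
        # Gradient descent update: move against the average gradient
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
        if epoch % 100 == 0:
            history.append((epoch, mean_log_loss(val_set, weights, bias)))
    return weights, bias, history


# Step 1: synthetic data, split into training, validation, and test sets
rng = random.Random(0)
points = [(rng.random(), rng.random()) for _ in range(200)]
data = [((x1, x2), 1 if x1 + x2 > 1 else 0) for x1, x2 in points]
train_set, val_set, test_set = data[:120], data[120:160], data[160:]

# Steps 2 and 3: one sigmoid node, trained with gradient descent
weights, bias, history = train(train_set, val_set)

# Step 4: evaluate on data the node has never seen
test_accuracy = accuracy(test_set, weights, bias)
print(f"test accuracy: {test_accuracy:.2f}")
```

The `history` list is where you would watch for overfitting: if validation loss starts rising while training loss keeps falling, the model is memorizing the training set rather than generalizing.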
Conclusion
Neural networks are a subset of machine learning and are at the heart of deep learning algorithms. They are inspired by the human brain and consist of layers of nodes that connect and perform mathematical operations on data. Neural networks can learn from data and improve their accuracy over time, solving complex problems in computer science and artificial intelligence. Neural networks can be created, trained, and evaluated using deep learning frameworks such as TensorFlow or PyTorch.
I hope this blog post helped you understand what neural networks are and how they can be used in deep learning. If you have any questions or feedback, please let me know in the comments below.
Thanks :)