
Building a simple Rust Neural Network from scratch

Goal

I recently took on the challenge of building a Neural Network to identify digits (i.e.: 0 to 9) using Rust, for study purposes. I did not want to use existing libraries, but rather to build everything from scratch so that I could learn the concepts behind Neural Networks.

Dataset

I’ve decided to use the MNIST dataset available on Kaggle, which contains thousands of handwritten digits as 28x28-pixel images. There are two files: one used for training and another one used for testing the Neural Network.
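In the Kaggle CSV version, each row of the training file holds a label followed by the 784 pixel values (0-255) of one image. Here is a minimal loading sketch, assuming that layout and using the csv crate; the load_mnist_csv helper is my own illustration, not the repository's loader:

use ndarray::Array2;

// Parse the Kaggle training CSV into a (rows, 784) pixel matrix and a
// (rows, 1) label matrix. csv::Reader skips the header row by default.
fn load_mnist_csv(path: &str) -> (Array2<f32>, Array2<f32>) {
    let mut reader = csv::Reader::from_path(path).expect("failed to open CSV");
    let mut labels: Vec<f32> = Vec::new();
    let mut pixels: Vec<f32> = Vec::new();
    for record in reader.records() {
        let record = record.expect("malformed row");
        labels.push(record[0].parse().unwrap());
        for value in record.iter().skip(1) {
            // scale pixel intensities from 0..255 down to 0..1
            pixels.push(value.parse::<f32>().unwrap() / 255.0);
        }
    }
    let rows = labels.len();
    (
        Array2::from_shape_vec((rows, 784), pixels).unwrap(),
        Array2::from_shape_vec((rows, 1), labels).unwrap(),
    )
}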

Layers

The Neural Network that I’ve created contains three layers.

The input layer has a size of 784 (28 * 28 pixels) and receives the pixel data of each image. The network is configured as follows:

let mut model = DeepNeuralNetwork {
    layers: vec![
        Layer {
            size: 784,
            activation_function: Box::new(ReLU {}),
        },
        Layer {
            size: 10,
            activation_function: Box::new(ReLU {}),
        },
        Layer {
            size: 10,
            activation_function: Box::new(Softmax {}), // Sigmoid also supported
        },
    ],
    learning_rate: 0.1,
    params: HashMap::new(),
};

Then there is a hidden layer of size 10, and finally an output layer, also of size 10.
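During the forward pass, each layer computes z = w · a + b and then applies its activation function (the ActivationFunction trait is shown later in this post). A minimal sketch of that step, assuming weights of shape (size, prev_size) and biases of shape (size, 1) per layer; the actual params layout in the repository may differ:

use ndarray::Array2;

// One forward step: linear combination followed by the layer's activation.
// `b` has shape (size, 1) and is broadcast across the batch dimension.
fn forward_layer(
    w: &Array2<f32>,
    b: &Array2<f32>,
    a_prev: &Array2<f32>,
    activation: &dyn ActivationFunction,
) -> Array2<f32> {
    let z = w.dot(a_prev) + b;
    activation.activate(z)
}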

The output layer has to contain 10 classes (size 10). The reason is that this is not a Neural Network that produces a binary true/false output (e.g.: a network that checks whether an input image contains a cat), but one performing multiclass classification. In other words, we give it an input that can be any digit from zero to nine and ask the network to classify it. Assuming we feed it the pixels of an image containing the number 9, the output layer will look similar to:

output    probability
0         0.0417
1         0.0178
2         0.0946
3         0.1927
4         0.1958
5         0.0125
6         0.2850
7         0.1818
8         0.3102
9         0.9127

As you can see, it outputs the likelihood of the input pixels being classified as each of the possible digits, nine being the highest in our case.
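To turn these probabilities into an actual prediction, we simply pick the class with the highest value (argmax). A small sketch, with a hypothetical predict helper:

use ndarray::ArrayView1;

// Return the index of the largest probability in one output column.
fn predict(column: ArrayView1<f32>) -> usize {
    column
        .iter()
        .enumerate()
        .max_by(|(_, a), (_, b)| a.partial_cmp(b).unwrap())
        .map(|(index, _)| index)
        .unwrap()
}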

Other types of problems that neural networks can tackle include:

  • Multilabel classification
  • Regression
  • Sequence to sequence
  • Generative models

Activation functions

I also wanted to code the activation functions manually instead of relying on existing implementations. In the network I’ve developed, I’m using ReLU for the hidden layer and Softmax for the output layer. I have also implemented Sigmoid and created simple interfaces so that other activation functions can easily be added.

use ndarray::Array2;
use std::f32::consts::E;

pub trait ActivationFunction {
    // applies the activation element-wise to the pre-activation matrix z
    fn activate(&self, z: Array2<f32>) -> Array2<f32>;
    // computes the gradient with respect to z during backpropagation
    fn derive(&self, da: Array2<f32>, z: Array2<f32>, labels: Array2<f32>) -> Array2<f32>;
}

pub struct Sigmoid {}
pub struct ReLU {}
pub struct Softmax {}

impl ActivationFunction for Sigmoid {
    fn activate(&self, z: Array2<f32>) -> Array2<f32> {
        z.mapv(|x| sigmoid(&x))
    }

    fn derive(&self, da: Array2<f32>, z: Array2<f32>, _labels: Array2<f32>) -> Array2<f32> {
        da * z.mapv(|x| sigmoid_derivative(&x))
    }
}

// sigmoid functions
fn sigmoid(z: &f32) -> f32 {
    1.0 / (1.0 + E.powf(-z))
}

fn sigmoid_derivative(z: &f32) -> f32 {
    sigmoid(z) * (1.0 - sigmoid(z))
}

// ...code for ReLU and Softmax in the repository
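For reference, here is a sketch of how ReLU and Softmax could be implemented against the same trait. This is my own approximation, not the repository's exact code; in particular, the Softmax derivative below assumes a cross-entropy loss, which collapses the gradient to a - y:

use ndarray::{Array2, Axis};

impl ActivationFunction for ReLU {
    fn activate(&self, z: Array2<f32>) -> Array2<f32> {
        z.mapv(|x| x.max(0.0))
    }

    fn derive(&self, da: Array2<f32>, z: Array2<f32>, _labels: Array2<f32>) -> Array2<f32> {
        // ReLU'(z) is 1 for positive z and 0 otherwise
        da * z.mapv(|x| if x > 0.0 { 1.0 } else { 0.0 })
    }
}

impl ActivationFunction for Softmax {
    fn activate(&self, z: Array2<f32>) -> Array2<f32> {
        // exponentiate and normalize each column so it sums to 1
        // (a robust version would subtract each column's max first)
        let e = z.mapv(f32::exp);
        let sums = e.sum_axis(Axis(0)).insert_axis(Axis(0));
        e / sums
    }

    fn derive(&self, _da: Array2<f32>, z: Array2<f32>, labels: Array2<f32>) -> Array2<f32> {
        // with cross-entropy loss, the softmax gradient simplifies to a - y
        self.activate(z) - labels
    }
}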

Code

You can check the full code in my rust-neuralnets repo. Please feel free to share any feedback.

What’s next

  • Add tests (!!!)
  • Fix the scoring function (currently broken; a simple accuracy sketch follows this list)
  • Fix the loss function (currently broken)
  • Support dynamic parameter initialization methods
  • Test the neural network with binary-output cases and multilabel classification
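For the scoring fix, a plain accuracy metric is probably the simplest baseline. A sketch reusing the predict helper from earlier (again my own illustration, not the repository's code):

use ndarray::{Array2, Axis};

// Fraction of output columns whose argmax matches the true label.
fn accuracy(output: &Array2<f32>, labels: &[usize]) -> f32 {
    let mut correct = 0;
    for (column, &label) in output.axis_iter(Axis(1)).zip(labels.iter()) {
        if predict(column) == label {
            correct += 1;
        }
    }
    correct as f32 / labels.len() as f32
}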