< Back

ML Model Implementation

Building models from scratch, writing the core mathematical logic

After working on RL for humanoids and other ML projects, I realized I need a better understanding of how things actually work under the hood before I can follow more complex ML solutions/algorithms and contribute to them. To build that understanding, I will implement the basic models from scratch with minimal dependencies: linear regression, logistic regression, and classification.

Sidenote: Recently I have been working on implementing activation functions, which are applied to a neuron's output within a neural network to introduce non-linearity, helping the model fit complex data. The problems are from TensorTonic. It's really fun!
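As a taste of what those problems look like, here is a minimal sketch of a few common activation functions written with NumPy (my own small versions for illustration, not taken from any library or from TensorTonic):

```python
import numpy as np

def relu(x):
    # Zeroes out negative inputs; cheap and widely used in hidden layers.
    return np.maximum(0, x)

def sigmoid(x):
    # Squashes any real number into (0, 1); useful for probabilities.
    return 1 / (1 + np.exp(-x))

def tanh(x):
    # Like sigmoid but centred at zero, with outputs in (-1, 1).
    return np.tanh(x)

print(relu(np.array([-1.0, 2.0])))  # → [0. 2.]
print(sigmoid(0.0))                 # → 0.5
```

Each of these is element-wise, so it applies to a whole layer's outputs at once.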

Linear Regression

Linear regression finds the relationship between features (the data we have) and a label (the output we are trying to predict). Given a scatter plot of data points, a linear regression model tries to find the weight and bias that minimize loss.

Linear model: y' = mx + b (representation of a model with one feature/column)

  • y' is the predicted label
  • m is the weight
  • b is the bias
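The model above can be sketched in a few lines of NumPy; the weight, bias, and inputs here are made-up values just to show the shape of the computation:

```python
import numpy as np

def predict(x, m, b):
    # One feature: each prediction y' is weight * feature + bias.
    return m * x + b

x = np.array([1.0, 2.0, 3.0])
print(predict(x, m=2.0, b=0.5))  # → [2.5 4.5 6.5]
```

With more than one feature, m becomes a vector of weights and the product becomes a dot product, but the idea is the same.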

There are different measures of loss: absolute difference (L1), mean absolute error (MAE), squared difference (L2), mean squared error (MSE), and root mean squared error (RMSE).

It also matters when to use a certain loss measure. For example, MSE moves the weight and bias further in response to outliers, since squaring makes large errors even larger, whereas MAE treats every error linearly.
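This effect is easy to see numerically. In the sketch below the data is invented so that one prediction is a large outlier; the outlier dominates MSE far more than MAE:

```python
import numpy as np

y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.1, 2.1, 3.1, 14.0])  # last prediction is an outlier

errors = y_pred - y_true
mae = np.mean(np.abs(errors))  # each error counts linearly
mse = np.mean(errors ** 2)     # the outlier's error gets squared

print(mae)  # → 2.575
print(mse)  # → 25.0075
```

Three of the four errors are only 0.1, yet the single error of 10 pushes MSE an order of magnitude above MAE.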

To find the optimal weight and bias (which graphically changes the line), gradient descent is used. Starting with weight 0 and bias 0, we first calculate the loss, then find the amount that moves the weight and bias closer to the optimal solution, and repeat the process, with the weight and bias updated in every iteration and used as the input for the next.
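The loop described above can be sketched as follows. This is a minimal version using the MSE gradients for one weight and one bias; the synthetic data and learning rate are my own choices for illustration:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0        # synthetic data generated with m=2, b=1

m, b = 0.0, 0.0          # start both parameters at zero
lr = 0.05                # learning rate

for _ in range(2000):
    error = (m * x + b) - y
    # Partial derivatives of MSE with respect to m and b.
    dm = 2 * np.mean(error * x)
    db = 2 * np.mean(error)
    m -= lr * dm         # step against the gradient
    b -= lr * db

print(round(m, 2), round(b, 2))  # → 2.0 1.0
```

Because the data was generated with m=2 and b=1, the loop recovers those values, which is a handy sanity check when writing this from scratch.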

Hyperparameters control how the model is trained.

  • Learning rate - a number you set that controls the magnitude of the weight and bias updates on every iteration. It's important to choose the right rate so that the model converges appropriately.
  • Batch size - # of examples model ingests before making changes to the variables
  • Epochs - # of times the model processes the same data

Something I found interesting was how testing works in industry. Testing regular software is like testing a calculator: you ensure the output exactly matches what you expect. Testing a machine learning model is more like testing a child: you give them a textbook, let them study, and then give them an exam to see if they got the concept, knowing they won't get every single question perfectly right. This is exactly why we test for convergence within boundaries.
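In code, that "exam" style of test might look like the sketch below: rather than asserting exact parameter values, we assert that training lands inside a tolerance band (the training function and tolerances here are my own illustrative choices):

```python
import numpy as np

def train(x, y, lr=0.05, steps=2000):
    # Plain gradient descent on MSE for one weight and one bias.
    m, b = 0.0, 0.0
    for _ in range(steps):
        error = (m * x + b) - y
        m -= lr * 2 * np.mean(error * x)
        b -= lr * 2 * np.mean(error)
    return m, b

x = np.array([1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0            # true parameters: m=2, b=1

m, b = train(x, y)
# Convergence test: accept anything within a small boundary,
# rather than demanding an exact match.
assert abs(m - 2.0) < 1e-2
assert abs(b - 1.0) < 1e-2
```

The tolerance plays the role of the exam's pass mark: the model doesn't need a perfect score, just to be demonstrably close to the right answer.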

Now that I understand these core formulas, how to implement them with numpy, and how the iterations can be tuned, I am excited to dive into more complex problems!