Machine learning

Matrix operations

When you start to peel away the layers of a machine learning model, you’ll find it’s just a bunch of matrix multiplications under the hood. Whether the input data to your model is images, text, categorical, or numerical it’ll all be converted into matrices.
Sara Robinson, Predicting Stack Overflow Tags with Google’s Cloud AI

The basic matrix operation is i*w+b=p (i=inputs, w=weights, b=bias, p=prediction): the inputs are multiplied by the weights, and the bias is added to the result.
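The formula can be sketched in plain Python as follows. This is only an illustration: the function name predict and the example numbers are made up, and a real model would use an optimized matrix library rather than nested loops.

```python
def predict(inputs, weights, bias):
    """Compute p = i * w + b for a batch of inputs.

    inputs:  list of samples, each a list of n_features numbers
    weights: n_features rows, each a list of n_outputs numbers
    bias:    list of n_outputs numbers
    """
    n_outputs = len(bias)
    predictions = []
    for row in inputs:
        out = []
        for j in range(n_outputs):
            # Dot product of one input row with column j of the weights
            dot = sum(x * weights[k][j] for k, x in enumerate(row))
            out.append(dot + bias[j])
        predictions.append(out)
    return predictions


# Hypothetical example values: two samples, two features, one output
inputs = [[1.0, 2.0], [3.0, 4.0]]
weights = [[0.5], [-1.0]]
bias = [2.0]

p = predict(inputs, weights, bias)
print(p)  # [[0.5], [-0.5]]
```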

Free-form text can be converted into matrices with, for example, the bag-of-words model, which represents each text by the counts of the words it contains.
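A minimal sketch of a bag-of-words conversion, assuming whitespace tokenization and lowercasing (real pipelines usually also handle punctuation, stop words and so on). The function name bag_of_words and the sample sentences are illustrative only.

```python
from collections import Counter


def bag_of_words(documents):
    """Return (vocabulary, matrix); matrix[d][v] counts word v in document d."""
    tokenized = [doc.lower().split() for doc in documents]
    # Sorted vocabulary gives every word a fixed column index
    vocabulary = sorted({word for tokens in tokenized for word in tokens})
    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)
        matrix.append([counts[word] for word in vocabulary])
    return vocabulary, matrix


docs = ["the cat sat", "the dog sat on the cat"]
vocabulary, matrix = bag_of_words(docs)
print(vocabulary)  # ['cat', 'dog', 'on', 'sat', 'the']
print(matrix)      # [[1, 0, 0, 1, 1], [1, 1, 1, 1, 2]]
```

Each row of the matrix is one document, and each column is one word of the vocabulary. Word order is lost, which is why the model is called a "bag" of words.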

Learning paradigms

There are three major learning paradigms:

Supervised learning
Unsupervised learning
Reinforcement learning

Supervised vs unsupervised learning

In supervised learning algorithms, a dependent variable is assigned the role of a target variable.

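Then, known values (labels) are provided for the target variable. In unsupervised learning, there is no target variable, and the algorithm looks for structure in the data itself.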

Supervised learning

There are two categories of supervised learning:

Classification (which predicts the class, from a set of discrete classes, that the data belongs to)
Regression (which predicts a continuous numeric value for the data)

Some classification algorithms are

K-Nearest-Neighbors (K-NN, one of the simplest)
Naïve Bayes
Decision trees
Logistic regression
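As an illustration of how simple K-NN is, here is a minimal sketch that classifies a point by a majority vote among its k nearest training samples. It assumes squared Euclidean distance and made-up sample data; the function name knn_classify is hypothetical.

```python
from collections import Counter


def knn_classify(query, samples, labels, k=3):
    """Return the most common label among the k samples closest to query."""
    distances = []
    for sample, label in zip(samples, labels):
        # Squared Euclidean distance (the square root does not change the ranking)
        d = sum((a - b) ** 2 for a, b in zip(query, sample))
        distances.append((d, label))
    distances.sort(key=lambda pair: pair[0])
    nearest = [label for _, label in distances[:k]]
    return Counter(nearest).most_common(1)[0][0]


# Hypothetical training data: two features per sample, two classes
samples = [(1.0, 1.0), (1.5, 2.0), (5.0, 5.0), (6.0, 5.5)]
labels = ["small", "small", "large", "large"]

print(knn_classify((1.2, 1.4), samples, labels))  # small
print(knn_classify((5.5, 5.0), samples, labels))  # large
```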

Logistic regression is a special case of regression in which the output is binary (a boolean yes or no): the model estimates the probability of the "yes" class, and a threshold turns that probability into a decision. Logistic regression might be used to determine whether an email message is spam or not. In a way, logistic regression is a classifier with two classes.
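The spam example might look like the sketch below. The features (number of suspicious words, number of links), the weights and the bias are invented for illustration; in practice the weights are learned from labelled messages.

```python
import math


def sigmoid(z):
    """Map any real number to the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def spam_probability(features, weights, bias):
    """Estimated probability that a message is spam."""
    z = sum(x * w for x, w in zip(features, weights)) + bias
    return sigmoid(z)


def is_spam(features, weights, bias, threshold=0.5):
    """Turn the probability into a yes/no decision."""
    return spam_probability(features, weights, bias) >= threshold


# Hypothetical, hand-picked parameters (not learned from data)
weights = [1.5, 2.0]  # weight per suspicious word, weight per link
bias = -3.0

print(is_spam([3, 1], weights, bias))  # True
print(is_spam([0, 1], weights, bias))  # False
```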

Unsupervised learning

Two commonly cited categories of unsupervised learning are

Clustering
Association

Regression

Examples of regression include

Linear regression
Support vector regression (SVR)
Regression trees
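For the simplest of these, linear regression with a single input, the best-fitting line can be computed directly with the ordinary least squares formulas. The sketch below uses made-up data that lies exactly on y = 2x + 1; the function name fit_line is illustrative.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both without the 1/n factor)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]  # generated from y = 2x + 1

slope, intercept = fit_line(xs, ys)
print(slope, intercept)  # 2.0 1.0
```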

Cost function

The cost function estimates how bad (or wrong) the performance of a model is.

The goal of training a model is to minimize the cost function.
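One common cost function for regression is the mean squared error (MSE), shown here as an example. It is only one of many possible choices, and the numbers are invented.

```python
def mean_squared_error(predictions, targets):
    """Average of the squared differences between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


targets = [1.0, 2.0, 3.0]
good_model = [1.0, 2.0, 3.0]  # perfect predictions
bad_model = [2.0, 4.0, 6.0]   # predictions that are far off

print(mean_squared_error(good_model, targets))  # 0.0
print(mean_squared_error(bad_model, targets) > mean_squared_error(good_model, targets))  # True
```

The worse the predictions, the larger the value. Training adjusts the model's parameters so that this value shrinks.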

Automatic summarization

Automatic summarization tries to determine the semantic meaning of a text in order to produce a shorter version that keeps its most important content.

Approaches:

Extraction: select a subset of words, phrases or sentences
Abstraction: build an internal semantic representation
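The extraction approach can be illustrated with a deliberately naive sketch: score each sentence by the average frequency of its words in the whole text and keep the top-scoring sentences. Splitting sentences on "." and scoring by word frequency are simplifying assumptions, not how production summarizers work; the function name extractive_summary is hypothetical.

```python
from collections import Counter


def extractive_summary(text, max_sentences=1):
    """Select the sentences whose words are most frequent in the text."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    frequency = Counter(text.lower().replace(".", " ").split())

    def score(sentence):
        tokens = sentence.lower().split()
        return sum(frequency[t] for t in tokens) / len(tokens)

    ranked = sorted(sentences, key=score, reverse=True)
    selected = set(ranked[:max_sentences])
    # Keep the selected sentences in their original order
    return [s for s in sentences if s in selected]


text = "Cats sleep a lot. Cats hunt mice. Dogs bark."
print(extractive_summary(text))  # ['Cats hunt mice']
```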

Automatic summarization is used, for example, in search engines.

Deep learning

Deep learning is a class (or subfield) of machine learning algorithms that uses multiple layers to progressively extract higher-level features from the raw input.

Deep learning is also referred to as representation learning (see Jones, The learning machines, 2014).

Training the model, hyperparameters etc.

Setting so-called hyperparameters makes it possible to control the training process of a model.

Hyperparameters include

Number of epochs	How many times the entire training data set is passed through the neural net
Batch size	The number of samples that are processed in one forward/backward pass. The model's parameters are adjusted once per batch (rather than after each individual sample)
Learning rate	A scalar by which the gradient is multiplied to adjust the model's parameters

An epoch consists of the following two parts

Train loop: Use the training samples to try to converge to the optimal parameters
Validation/test loop: Use the test samples to check if the model performance is improving.

A larger batch size may result in faster training (for example because of parallel processing), but also requires more memory.

A larger learning rate may lead to faster convergence, at the risk of overshooting the minimum or causing the loss to oscillate.

The training samples are typically reshuffled before each epoch, so that the batches differ from epoch to epoch, which helps to reduce overfitting.
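The interplay of epochs, batch size and learning rate can be sketched with mini-batch gradient descent on a tiny linear model y = w*x + b. Everything here is an assumption made for illustration: the model, the mean squared error loss, the data and the default hyperparameter values.

```python
import random


def mse(w, b, samples):
    """Mean squared error of the model y = w*x + b on the given samples."""
    return sum((w * x + b - y) ** 2 for x, y in samples) / len(samples)


def train(train_samples, test_samples, epochs=300, batch_size=2,
          learning_rate=0.05, seed=0):
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    samples = list(train_samples)
    history = []
    for _ in range(epochs):
        # Train loop: reshuffle, then update the parameters once per batch
        rng.shuffle(samples)
        for start in range(0, len(samples), batch_size):
            batch = samples[start:start + batch_size]
            grad_w = sum(2 * (w * x + b - y) * x for x, y in batch) / len(batch)
            grad_b = sum(2 * (w * x + b - y) for x, y in batch) / len(batch)
            w -= learning_rate * grad_w
            b -= learning_rate * grad_b
        # Validation/test loop: measure the loss on samples not used for training
        history.append(mse(w, b, test_samples))
    return w, b, history


# Hypothetical data generated from y = 2x + 1
train_samples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
test_samples = [(4.0, 9.0), (5.0, 11.0)]

w, b, history = train(train_samples, test_samples)
print(round(w, 2), round(b, 2))
print(history[0] > history[-1])  # the test loss should have gone down
```

With four training samples and a batch size of 2, the parameters are updated twice per epoch.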

Maximum inner product search (MIPS)

MIPS refers to the problem of finding the vector («embedding») in a dataset that has the maximum inner product with a given query vector.

MIPS is essential for solving large-scale classification and retrieval tasks, such as those found in recommendation systems.
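A brute-force sketch of the problem statement: compute the inner product of the query with every vector and keep the largest. The function names and the example vectors are illustrative. This exhaustive scan takes time proportional to the number of vectors times their dimension, which is why large systems typically rely on approximate search structures instead.

```python
def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))


def mips(query, embeddings):
    """Return (index, score) of the embedding with the largest inner product."""
    best_index = max(
        range(len(embeddings)),
        key=lambda i: inner_product(query, embeddings[i]),
    )
    return best_index, inner_product(query, embeddings[best_index])


# Hypothetical two-dimensional embeddings
embeddings = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
query = [1.0, 0.5]

print(mips(query, embeddings))  # (2, 3.0)
```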

MLOps (Machine Learning Operations)

MLOps is a core function of Machine Learning engineering.

MLOps focuses on streamlining the process of taking machine learning models to production, and then maintaining and monitoring them.

MLOps is a collaborative function, often comprising data scientists, DevOps engineers and IT staff.

Misc

The terms machine learning, pattern recognition, data mining and knowledge discovery in databases (KDD) overlap in scope. Thus, they are hard to separate.

RBML = Rule-based machine learning

Links

TensorFlow.js is a library for developing and training ML models in JavaScript, and for deploying them in the browser or on Node.js.

ManimML is a Python project focused on providing animations and visualizations of common machine learning concepts.