With automation gaining momentum, Artificial Intelligence (AI) and Machine Learning (ML) are becoming must-have skills for the workforce of the future. AI and ML are closely related: Machine Learning trains a computer on a set of known quantities so that, when it is applied to a set of unknown quantities, the machine (computer) can arrive at a conclusion on its own without human intervention. This is the basis of Artificial Intelligence.

ML is a subset of AI in which the machine learns from data sets and then uses what it has learned to solve problems. One needs to be careful when choosing these data sets: substandard data and logic will train the machine inadequately, which leads to inappropriate or inaccurate output from the computer.

Thus, it is important to first understand how machine learning works and the best practices for training an ML model.

How do machines learn?

Machine Learning places emphasis on training computers with historic data to make them capable of making human-like decisions. A learning algorithm (set of instructions) is applied to this data, and the trained result, which maps inputs to outputs, is known as a model.

There are three different techniques that are used for training a machine: supervised learning, unsupervised learning, and reinforcement learning.

Supervised learning trains a Machine Learning model using known input and output data so that it can predict future outputs. In unsupervised learning, only input data is available, and the model has to interpret that data without any known outputs to guide it. Reinforcement learning is used whenever inaccurate outcomes have consequences: it penalizes the wrong outcome and rewards the correct solution. This type of machine learning is useful for designing driverless cars.

We’ll discuss how machines learn in the simplest possible way using a supervised machine learning technique.

The training data set

Machine Learning relies on the concept of functions.

Consider a simple mathematical function: Y=f(X)

Where Y is the output and X is the input. There is no limit to the number of inputs, and each input has a corresponding output. In mathematics, we provide the input value and derive the output from the given function.

For example, D=2R is the function that calculates the diameter of a circle when the radius is known. If R (radius) = 2, then D (diameter) will be 4.

But in the case of (supervised) machine learning, both the input and output values are known. The developer has to determine the function that is valid for the different sets of input and output values. To build an efficient ML model, a larger set of input and output values is required, and these are best represented by matrices.

In the world of Machine Learning, we provide input and output vectors, each of which can be a row matrix or a column matrix. We then need to figure out the function that will predict the output for any given input, whether or not that input was part of the training data.
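As a rough illustration of recovering a function from known inputs and outputs, the sketch below reuses the D=2R example and estimates the multiplier from a handful of radius/diameter pairs. The assumed form D = w * R, the sample values, and the helper name `fit_multiplier` are illustrative choices, not something taken from the webinar.

```python
def fit_multiplier(inputs, outputs):
    """Least-squares estimate of w in output = w * input (no intercept)."""
    numerator = sum(x * y for x, y in zip(inputs, outputs))
    denominator = sum(x * x for x in inputs)
    return numerator / denominator


# Known inputs (radii) and known outputs (diameters): the training data.
radii = [1.0, 2.0, 3.0, 4.0]
diameters = [2.0, 4.0, 6.0, 8.0]

w = fit_multiplier(radii, diameters)

# Use the learned function on an input that was not in the training data.
predicted = w * 10.0
print(w, predicted)
```

With clean data like this, the estimated multiplier matches the known function; with noisy or poorly chosen data it would drift, which is why the quality of the training set matters.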

The ML model will depend on the quality of data (inputs and outputs) and the function. However, the model may not always be accurate.

The level of accuracy in Machine Learning is measured by the cost function. The cost function helps in understanding how good the machine’s predictions are with respect to the provided data.

A machine learning algorithm can comprise more than one function, i.e. there can be a network of functions, where the output from one function acts as the input for the next function. This is known as an artificial neural network, which is inspired by the real-life network of neurons (brain cells).
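One common choice of cost function is the mean squared error. The text does not say which cost function is meant, so treat the sketch below as one possible example; the candidate multipliers 2.0 and 1.5 and the sample data are assumptions made for illustration.

```python
def mean_squared_error(predictions, targets):
    """Average of the squared differences between predictions and targets."""
    squared = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return sum(squared) / len(targets)


sample_radii = [1.0, 2.0, 3.0]
sample_diameters = [2.0, 4.0, 6.0]

# Cost of a candidate function that matches the data exactly.
good_cost = mean_squared_error([2.0 * r for r in sample_radii], sample_diameters)

# Cost of a candidate function that is systematically too low.
poor_cost = mean_squared_error([1.5 * r for r in sample_radii], sample_diameters)

print(good_cost, poor_cost)
```

A lower cost means the candidate function agrees more closely with the provided data, so training amounts to searching for the function with the lowest cost.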
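The idea of feeding one function's output into the next can be sketched with plain Python functions. The two functions `double` and `shift` and the helper `chain` are hypothetical names chosen for this example; a real network would use weighted neurons instead.

```python
def double(x):
    return 2.0 * x


def shift(x):
    return x + 1.0


def chain(functions, x):
    """Apply each function in order, passing each output on as the next input."""
    for f in functions:
        x = f(x)
    return x


forward = chain([double, shift], 3.0)   # (3 * 2) + 1
reverse = chain([shift, double], 3.0)   # (3 + 1) * 2
print(forward, reverse)
```

The two results differ, which shows that the order in which the functions are chained changes the overall function.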

The artificial neural network

Artificial neurons are the elementary units of an artificial neural network and are inspired by real-life neurons. In a real neuron, the dendrites receive input from one of our five senses (touch, smell, vision, taste, and sound); the neuron processes this input and generates a reaction that is sent out through the axon.

Similarly, in machine learning, inputs are equivalent to dendrites and outputs are equivalent to the axon. To understand how the system works, let’s take a simple example of an artificial neuron.

In the above figure, consider the following notations:

1. i1 is the first input and let’s assume its value as 5.

2. i2 is the second input and let’s assume its value as 2.

3. The line for i1 is thinner than the line for i2. In the above visual, the line width indicates the importance of each input.

4. The weight for i1 is 1 and the weight for i2 is 12.

5. o1 and o2 are the outputs; their values are already known: 29 and -19 respectively.

6. The first output is the sum of the weighted inputs: o1 = (5 × 1) + (2 × 12) = 29. The second output is the difference of the weighted inputs: o2 = (5 × 1) - (2 × 12) = -19.

The functions that are used to calculate outputs are known as activation functions.
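The arithmetic of the single-neuron example can be checked with a few lines of Python. The variable names are illustrative; the input values, weights, and expected outputs are the ones listed above.

```python
inputs = [5, 2]      # i1, i2
weights = [1, 12]    # weight for i1, weight for i2

# Multiply each input by its weight.
weighted = [i * w for i, w in zip(inputs, weights)]

o1 = weighted[0] + weighted[1]   # sum of the weighted inputs
o2 = weighted[0] - weighted[1]   # difference of the weighted inputs
print(o1, o2)
```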

The primary goal of the machine learning model is to correctly establish the relationship between inputs and outputs when their values are known. In other words, if the activation function is already fixed, the only values we can change are the weights of the inputs.
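To make the idea of adjusting weights concrete, here is a minimal sketch that starts from a wrong weight for i2 and nudges it until the neuron reproduces the known output of 29. The update rule used is gradient descent on the squared error, which is a technique brought in for illustration and not named in the text; the learning rate, the starting weight, and the choice to keep the weight for i1 fixed at 1 are likewise assumptions.

```python
i1, i2 = 5.0, 2.0
w1 = 1.0            # kept fixed in this sketch
target = 29.0       # known output o1

w2 = 0.0            # deliberately wrong starting weight
learning_rate = 0.1

for step in range(50):
    prediction = i1 * w1 + i2 * w2
    error = prediction - target
    # Derivative of error**2 with respect to w2.
    gradient = 2.0 * error * i2
    w2 -= learning_rate * gradient

print(w2)
```

The loop settles on a weight very close to 12, the value used in the example above, because that is the weight for which the prediction equals the known output.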

In the above example, we have used a single neuron. But, in reality, applications rely on more than one neuron or a combination of neurons - a neural network.

Usually, people use linear or very basic non-linear functions as activation functions because they are easy to choose. However, the efficiency of the ML model depends on the selection of the activation functions as well as the order in which they are used in the network: changing the order changes the weights and might result in a different model.

The layers of artificial neural network

ML models used in real-life applications comprise a series of neurons (i.e. the output from one neuron is the input to the next neuron).

There can be any number of layers between the input neurons and the final output. All the intermediate layers are called hidden layers. The depth of the neural network is determined by the number of layers. Theoretically, the more layers a neural network has, the more capable it is. The efficiency of an ML model does not depend solely on the number of layers, but in general, ML models with more layers tend to be more accurate.

For example, consider screen pixels. The more pixels, the better the image quality. But if you’re looking at a white picture, the number of pixels doesn’t really matter. Even just 5 pixels will make the picture look white.

More layers in the neural network will make the model highly sensitive to the inputs. It also results in more unknown weights, occupies more memory, and will require more time for training.
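The growth in unknown weights can be estimated with a short calculation. The sketch below assumes fully connected layers and ignores bias terms, and the layer sizes are made up for illustration; `count_weights` is a hypothetical helper name.

```python
def count_weights(layer_sizes):
    """Number of connection weights in a fully connected network (no biases)."""
    return sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))


# 2 inputs, one hidden layer of 4 neurons, 1 output.
shallow = count_weights([2, 4, 1])

# Same inputs and output, but three hidden layers of 4 neurons each.
deep = count_weights([2, 4, 4, 4, 1])

print(shallow, deep)
```

Each extra hidden layer adds another block of weights that must be stored and trained, which is where the additional memory and training time come from.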

This ends part one of a two-part series; the discussion continues in part 2 of this blog post.

Skill-Lync understands the role that Machine Learning will play in the career of a student and has launched a course - Machine Learning and Artificial Learning for Mechanical Engineers.

This blog has been written based on the webinar - Fundamentals of AI-ML conducted by Sarang, co-founder of Skill-Lync.
