An introduction to Machine Learning

So, you have heard a lot of people talking about machine learning and you wonder what the hype is about.

Very briefly, before we go into a proper explanation, let me excite you with a simple scenario. From the day you were born, you started learning a lot of things by interacting with them. For example, you learnt not to touch fire because you were told that it can hurt you. But you were probably very curious and decided to find out for yourself, so you stepped near a lit candle, tried touching the beautiful teardrop-like rainbow and, ouch! You discovered that this beautiful “rainbow” ain’t so nice at all. You have now learnt what touching a teardrop-like rainbow feels like. Based on this little experience of yours, the million-dollar question is:

WOULD YOU TOUCH A RAINBOW ON A GAS STOVE?

Your answer is probably DEFINITELY NOT! That is because you have been able to generalize an experience to other scenarios without being explicitly told. At first you might be scared to touch fire even in a textbook picture, but over time, as you interact with more drawings in books, you learn to work out in which scenarios touching “the rainbow” can actually hurt you.

You feel smart, right? You are able to learn from your experiences. Even a dog, which is not human, can learn from experience and can tell whether you are a danger or not. Even the little cockroach can sense danger and dribble around you to avoid death. Oh! God must be smart to have created these. What if you could do the same? Wow! You would feel like God, right? Well, that’s what we call ARTIFICIAL INTELLIGENCE. Artificial Intelligence is simply creating systems which are able to reason and make deductions, simulating human intelligence. This means Artificial Intelligence is not just some piece of metal doing mechanical work; rather, it is a system which tries to reason before making a decision (and, of course, can be used to power some arranged piece of metal to carry out a task). Now, let’s get a bit technical.

Machine Learning and Artificial Intelligence

A common mistake people make in the industry is thinking machine learning and artificial intelligence are the same. Although a lot of people use artificial intelligence as a synonym for machine learning, they are not the same thing: machine learning is one part of the broader field of artificial intelligence.

Artificial Intelligence (AI)

AI is a broad discipline which aims at creating intelligent machines. Intelligence in this case is the ability to reason and make deductions or inferences when given a certain task. Alan Turing proposed that a system can be deemed intelligent if it can trick a person into thinking it is human.

Turing Test

Given an interrogator, a human participant and a computer participant, the interrogator, who is unaware of the two participants’ locations, communicates with both of them. The computer is deemed “intelligent” if the interrogator cannot tell with certainty which participant is the computer and which is the human.

What is machine learning?

Machine Learning is a branch of Artificial Intelligence which gives systems the ability to learn and improve from experience without being explicitly programmed. In machine learning, we simply provide some data to an algorithm and allow the algorithm to learn the patterns in that data, thereby developing an ability to predict (and generalize to) unseen but related scenarios. The primary aim of machine learning is to allow computers to learn automatically from data with as little human intervention as possible. Today, machine learning forms a major part of Artificial Intelligence.

Supervised Learning

Supervised Machine Learning algorithms make use of labelled data to train. This means that we feed the algorithm data for which we already know the output.

The algorithm thus learns patterns in the features of the data and maps them to the output provided. The labelled data thus serves as guidance for the model. The model examines the relationship between the features of an instance, makes a prediction, and takes a peek at the actual output to see whether the prediction was correct. If not, it adjusts what it has learnt to move towards an optimal solution (reducing the difference between the actual labels and the predictions).
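To make the "predict, peek at the label, adjust" loop concrete, here is a minimal sketch using a perceptron, which is just one simple supervised algorithm chosen for illustration. The data (a made-up "heat" and "glow" score for a few objects, labelled 1 if touching them burns and 0 if not) and all function names are assumptions invented for this example.

```python
# Made-up labelled data: (heat, glow) scores between 0 and 1 -> 1 = burns, 0 = safe.
training_data = [
    ((0.9, 0.8), 1),
    ((0.8, 0.3), 1),
    ((0.7, 0.9), 1),
    ((0.2, 0.9), 0),
    ((0.1, 0.2), 0),
    ((0.3, 0.5), 0),
]

def predict(weights, bias, features):
    """Return 1 (burns) if the weighted sum is positive, else 0 (safe)."""
    total = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if total > 0 else 0

def train_perceptron(data, learning_rate=0.1, max_epochs=1000):
    """Adjust the weights only when a prediction disagrees with its label."""
    weights = [0.0] * len(data[0][0])
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for features, label in data:
            error = label - predict(weights, bias, features)  # 0 when correct
            if error != 0:
                mistakes += 1
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, features)]
                bias += learning_rate * error
        if mistakes == 0:  # every training example is now predicted correctly
            break
    return weights, bias

weights, bias = train_perceptron(training_data)
print(predict(weights, bias, (0.85, 0.6)))  # prediction for an unseen object
```

The key point is the `error` line: the label is only used to check the prediction, and the model changes only when the two disagree.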

Supervised machine learning tasks are broadly classified into two types:

Classification and Regression.

Classification: a supervised machine learning task which maps input features to a discrete label. In classification, the target value (dependent variable) is discrete. Classification tasks can be binary or multi-class: a binary classification task has only 2 possible target values (e.g. Yes or No; Fraud or Not Fraud; Normal or Abnormal), while a multi-class classification task has more than 2 (e.g. A to Z; 0 to 9; Triangle, Square or Circle).
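As a rough illustration of a multi-class classification task, the sketch below labels a shape as Triangle, Square or Circle by copying the label of the most similar labelled example (a 1-nearest-neighbour classifier, picked here only because it is short). The two features (number of corners and a "roundness" score) and all the values are made up for this example.

```python
import math

# Made-up labelled examples: (number of corners, roundness score) -> shape name.
labelled_shapes = [
    ((3.0, 0.60), "triangle"),
    ((3.0, 0.55), "triangle"),
    ((4.0, 0.79), "square"),
    ((4.0, 0.75), "square"),
    ((0.0, 1.00), "circle"),
    ((0.0, 0.97), "circle"),
]

def classify(features, examples):
    """Return the label of the closest labelled example (1-nearest neighbour)."""
    _, nearest_label = min(examples, key=lambda ex: math.dist(features, ex[0]))
    return nearest_label

print(classify((3.0, 0.58), labelled_shapes))  # triangle
print(classify((0.0, 0.99), labelled_shapes))  # circle
```

Notice that the output is always one of a fixed set of names, never a number in between; that is what makes it classification.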

Regression: a supervised machine learning task which maps input features to a continuous value. In regression, the target value (dependent variable) is continuous. Examples of regression tasks are predicting the weight of a product or the height of an object.
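For comparison, here is a small sketch of a regression task: fitting a straight line through a few points with ordinary least squares and using it to predict a continuous value. The house areas and prices are invented numbers (chosen to lie exactly on the line price = 8 × area + 100), and `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one input feature."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up training data: area in square metres -> price in thousands.
areas = [50.0, 80.0, 100.0, 120.0]
prices = [500.0, 740.0, 900.0, 1060.0]

slope, intercept = fit_line(areas, prices)
predicted_price = slope * 90.0 + intercept  # a house size not in the data
print(predicted_price)  # 820.0
```

Unlike the classification case, the prediction can be any number on a continuous scale, including values that never appeared in the training data.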

Quick Exercise:

Is each of these a classification problem or a regression problem?

1. Predicting House Prices based on area

2. Predicting whether a document is related to sighting of UFOs

3. Predicting Stock prices in finance

4. Predicting Power usage in high-performance computing

5. Predicting Age of employees at Eblocks

6. Predicting nationality of a person

7. Predicting whether stock price of a company will increase tomorrow

8. Predicting the rate of exchange between Dollar and Rand on 1st December 2019

9. Predicting if Eblocks will close for the year in December

10. Predicting the gender of a person by his/her handwriting style

11. Predicting the number of copies of a music album that will be sold next month

12. Predicting the momentum of CBR for the next iteration.

13. Predicting the strength of Rand against Dollar

Unsupervised Learning

Unsupervised Machine Learning algorithms are those which do not require labelled data to train. You can feed unstructured or unlabelled data to your model and it will learn from the data even though it has no access to the actual outcome. This differs considerably from supervised learning, where the model has some guidance and is able to adjust its learning by minimizing the difference between the actual outcome and the predicted outcome.

In unsupervised learning, the algorithm simply examines the relationships between the features of each instance and observes how similar patterns are repeated across other instances in the data provided. Unsupervised learning is very useful as it can help uncover previously unknown patterns in data. It can also help you find which features may be useful for classification. Another advantage of unsupervised learning over supervised learning is that in the real world, unlabelled data is easier to get and more common than labelled data.
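One way to picture this is clustering. The sketch below uses k-means, a common unsupervised algorithm chosen here purely as an example, to split unlabelled points into two groups. The points are made up, and starting from the first k points as centroids is a simplification assumed for this sketch.

```python
import math

def k_means(points, k, iterations=10):
    """Group points into k clusters; no labels are ever given to the algorithm."""
    centroids = list(points[:k])  # naive start: the first k points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Move each centroid to the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members)
                                     for axis in zip(*members))
    return labels, centroids

# Made-up unlabelled data: two loose groups of 2D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]

labels, centroids = k_means(points, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

The algorithm was never told which point belongs where; the two groups come out of the distances between the points alone.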

Quick Exercise

1. Can you separate all employees of Eblocks software into groups based on your own discretion?

2. Can you separate Eblocks Boardroom based on your discretion?

Semi Supervised Learning

Semi-Supervised Learning is a hybrid of supervised and unsupervised machine learning. In semi-supervised learning, you provide partially labelled data for a model to learn from. Only a few labelled samples are available, so you want the model to learn from that labelling. In this case, the model is partially guided towards the particular outcomes we are interested in; it looks at all the other instances and tries to separate them based on the outcomes (or target values) provided.
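A very simple way to sketch this idea is to let the few known labels spread to nearby unlabelled points: repeatedly take the unlabelled point closest to an already-labelled one and give it that neighbour's label. This is only one naive flavour of label spreading, and the points, the group names "A" and "B", and the function name are all assumptions made up for the example.

```python
import math

def propagate_labels(labelled, unlabelled):
    """Spread a few known labels to unlabelled points, nearest point first."""
    labelled = dict(labelled)      # point -> label (copy, so the input is kept)
    remaining = list(unlabelled)
    while remaining:
        # Find the unlabelled point that is closest to any labelled point.
        point, neighbour = min(
            ((p, q) for p in remaining for q in labelled),
            key=lambda pair: math.dist(pair[0], pair[1]),
        )
        labelled[point] = labelled[neighbour]
        remaining.remove(point)
    return labelled

# Only two points come with a label; the rest are unlabelled (made-up data).
known = {(1.0, 1.0): "A", (9.0, 9.0): "B"}
unknown = [(2.0, 1.5), (3.5, 3.0), (8.0, 8.0), (6.5, 7.0), (4.5, 4.0)]

result = propagate_labels(known, unknown)
for point in unknown:
    print(point, result[point])
```

Here the two labelled points act as the partial guidance, and the remaining points end up in group "A" or "B" depending on which labelled neighbourhood they are closest to.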

Quick Exercise

If Tayla and Heather are selected to be part of one group and Tafadzwa and Sashi are selected to be part of another group, in which groups would you put Candice, Jeremiah and Deon?

Reinforcement Learning

This type of machine learning is a bit different from the ones mentioned above. In reinforcement learning, the algorithm is made to interact with its environment and learns through a carrot-and-stick approach. The algorithm is penalized if it makes errors and rewarded if it does well. The goal is to minimize the penalties and obtain the maximum reward.
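The carrot and stick can be sketched with tabular Q-learning, one well-known reinforcement learning method used here only as an example. The environment is a made-up corridor of 5 positions: the agent starts at position 0, gets a reward of +1 (the carrot) for reaching position 4 and a penalty of -0.1 (the stick) for every other step. All names and parameter values are assumptions for this sketch.

```python
import random

N_STATES = 5      # positions 0..4 in a corridor; position 4 is the goal
ACTIONS = (0, 1)  # 0 = step left, 1 = step right

def step(state, action):
    """Move the agent; +1 (carrot) at the goal, -0.1 (stick) for any other step."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    if next_state == N_STATES - 1:
        return next_state, 1.0, True
    return next_state, -0.1, False

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a table of action values by trial, reward and penalty."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the length of one episode
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore: try a random action
            else:
                action = max(ACTIONS, key=lambda a: q[state][a])  # exploit
            next_state, reward, done = step(state, action)
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q

q_table = train()
# Best action in each non-goal position; 1 means "step right".
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

If training went well, every entry of the printed policy is 1: the agent has learnt that heading straight for the goal collects the carrot with the fewest penalties.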

Quick Exercise

Jeremiah is a one-year-old baby. His elder sister Tayla has been instructed by their father, Sashi, to teach Jeremiah to walk within a few months. If Jeremiah is not able to walk within the given time, Tayla will be grounded for 6 months and won’t be given her usual upkeep of R150 a day. Tayla wants to teach Jeremiah to walk, but Jeremiah would rather crawl around looking for sweets. HOW CAN TAYLA HELP JEREMIAH TO WALK TO AVOID BEING GROUNDED?