Machine Learning uses computers to analyze data and find patterns in large datasets. There are, however, several different ways to approach that analysis.
Classification - Looking at elements in a dataset and determining which category each element belongs to, based on a model built from examples whose categories are already known.
Figure: elements divided by a line, where the line comes from a function created by Classification.
Imagine that I work in a post office and have to sort mail into two categories: packages and letters. The only information I am given is the weight of each item. From the weights of mail that has already been sorted, a model can be built that classifies incoming mail as a package or a letter.
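To make the post office example concrete, here is a minimal sketch in plain Python. The weights are made up for illustration, and the rule I use (put the cutoff halfway between the heaviest known letter and the lightest known package) is just one simple way to build such a model. It assumes the two groups do not overlap in weight, which real mail would not guarantee.

```python
# Hypothetical, already-sorted mail: (weight in grams, category).
# These numbers are invented for illustration only.
training_mail = [
    (20, "letter"), (35, "letter"), (50, "letter"), (80, "letter"),
    (400, "package"), (750, "package"), (1200, "package"), (2500, "package"),
]


def fit_threshold(examples):
    """Learn a weight cutoff from labeled examples.

    Assumes every letter is lighter than every package, so the cutoff
    can sit halfway between the heaviest letter and the lightest package.
    """
    heaviest_letter = max(w for w, label in examples if label == "letter")
    lightest_package = min(w for w, label in examples if label == "package")
    return (heaviest_letter + lightest_package) / 2


def classify(weight, threshold):
    """Assign a category to a new piece of mail based on its weight."""
    return "package" if weight > threshold else "letter"


threshold = fit_threshold(training_mail)
print("cutoff in grams:", threshold)
print("30 g  ->", classify(30, threshold))
print("900 g ->", classify(900, threshold))
```

The important part is that the categories were known for the training mail. The model only learns where to draw the line between them.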
Clustering - Grouping elements in a dataset into categories, based on attributes of the elements.
For example, suppose I had a dataset of what students use to read textbooks. The program could first look at the data and separate the electronic methods from the non-electronic ones. That gives us 2 groups, which we could label Physical Books and Electronic Devices. The program could then go on to separate the electronic devices by screen size, leaving us with 3 groups: Physical Books, Phones, and Laptops.
This differs from Classification in that the data is not labeled beforehand. Clustering is an unsupervised Machine Learning method, so the program has to discover the groups on its own.
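The second step of that example, splitting the electronic devices by screen size, can be sketched with a tiny two-group version of k-means, a common clustering algorithm. The screen sizes below are made up, and the function name `two_means` is my own. Notice that no labels are given to the program. It only sees the numbers.

```python
# Hypothetical screen sizes in inches; no labels are provided.
screen_sizes = [5.5, 6.1, 6.7, 13.3, 14.0, 15.6, 6.4, 16.0]


def two_means(values, iterations=10):
    """Split numbers into two groups with a minimal 1-D k-means (k = 2)."""
    # Start the two group centers at the smallest and largest value.
    centers = [min(values), max(values)]
    for _ in range(iterations):
        groups = [[], []]
        # Assignment step: each value joins the group with the nearest center.
        for v in values:
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[nearest].append(v)
        # Update step: move each center to the average of its group.
        centers = [
            sum(g) / len(g) if g else c
            for g, c in zip(groups, centers)
        ]
    return centers, groups


centers, groups = two_means(screen_sizes)
print("smaller screens:", groups[0])
print("larger screens: ", groups[1])
```

The program finds the two groups, but it is still up to us to look at them and decide that "Phones" and "Laptops" are sensible names.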
Regression - Creating a function that predicts a numeric value from a given set of parameters.
Let’s say I wanted to buy a car. I could create a model that looks at the car's year, gas mileage, and other information, and produces an estimated price for the car.
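Here is a rough sketch of the car example using only one input, the year, so that the math fits in a few lines. It fits a straight line with ordinary least squares, which is one common regression technique. The years and prices are invented for illustration, and a real model would also use gas mileage and the other details.

```python
# Hypothetical cars: model year and sale price. Invented numbers.
years = [2012, 2014, 2016, 2018, 2020]
prices = [8000, 10500, 12000, 15500, 17000]


def fit_line(xs, ys):
    """Fit price = slope * year + intercept by ordinary least squares."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Slope: how the ys move with the xs, divided by the spread of the xs.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(year, slope, intercept):
    """Estimate a price for a car from a given year."""
    return slope * year + intercept


slope, intercept = fit_line(years, prices)
estimate = predict(2017, slope, intercept)
print("price change per model year:", slope)
print("estimated price for a 2017 car:", estimate)
```

Unlike Classification, the output here is a number on a continuous scale rather than a category.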
These are three of the most common types of machine learning algorithms. I'll be delving into them in more detail in my later posts.