Intro to Machine Learning: Supervised learning
Machine learning is commonly divided into two main categories: supervised learning and unsupervised learning. In this article, we are going to focus on supervised learning only. We'll discuss what supervised learning is and how it works, and finish with a few links to resources where you can do further research on this topic.
What is supervised learning?
Supervised learning is the machine learning task of inferring a function from labeled training data. (Source: Wikipedia)
In other words, a supervised learning algorithm generalizes from a given set of example data (samples) and tries to predict the output for new data.
Supervised learning problems can be grouped into two types:
- Classification: A classification problem is when the output variable is a category, such as “red” or “blue”, or “disease” or “no disease”.
- Regression: A regression problem is when the output variable is a real value, such as “dollars” or “weight”.
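To make the two types concrete, here is a minimal sketch using scikit-learn. The tiny datasets, the variable names, and the choice of LogisticRegression as the classifier are assumptions made for illustration only; they are not part of the original example.

```python
from sklearn.linear_model import LinearRegression, LogisticRegression

# Classification: the label is a category (made-up toy data).
X_cls = [[1.0], [2.0], [3.0], [8.0], [9.0], [10.0]]
y_cls = ["no disease", "no disease", "no disease",
         "disease", "disease", "disease"]
classifier = LogisticRegression().fit(X_cls, y_cls)
cls_pred = classifier.predict([[1.5], [9.5]])
print(cls_pred)  # one category per new input

# Regression: the label is a real value (made-up toy data, e.g. dollars).
X_reg = [[1.0], [2.0], [3.0], [4.0]]
y_reg = [50.0, 60.0, 70.0, 80.0]
regressor = LinearRegression().fit(X_reg, y_reg)
reg_pred = regressor.predict([[5.0]])
print(reg_pred)  # one real number per new input, close to 90 here
```

In both cases the workflow is the same (fit on labeled examples, then predict); only the kind of output differs.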
Example: given a set of examples where the input is [1, 2, 3, 4, 5, 6] and the output is [1, 4, 9, 16, 25, 36], what is the output when the new input is 7? You will answer 49.
Why?
How does supervised learning work?
These are the steps of supervised learning:
- First, train a machine learning model using labeled data.
  - "Labeled data" is a set of given examples (the dataset) together with their outputs (the labels). Example: for the dataset with Input: [1, 2, 3, 4, 5, 6] and Output: [1, 4, 9, 16, 25, 36], the Output values are the known labels for the given attributes.
  - The "machine learning model" learns the relationship between the attributes of the data and their labels. In the case above we have 1=>1, 2=>4, 3=>9, 4=>16, 5=>25, 6=>36, so we can generalize that the relationship is x^2.
- Then, make predictions on new data for which the label is unknown. We now have a "machine learning model" where output = input^2, so for input = 7 the prediction is 7^2 = 49.
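The two steps can be sketched with NumPy alone. This is only an illustration: it assumes we already suspect the relationship is a quadratic a*x^2 + b*x + c and uses np.polyfit (a least-squares fit) as the "training" step; the names inputs, labels, and predict are made up for this sketch.

```python
import numpy as np

# Step 1: labeled data, i.e. inputs together with their known outputs.
inputs = np.array([1, 2, 3, 4, 5, 6], dtype=float)
labels = np.array([1, 4, 9, 16, 25, 36], dtype=float)

# "Training": find the a, b, c that best fit a*x^2 + b*x + c to the labels.
# np.polyfit returns the coefficients from the highest degree down.
a, b, c = np.polyfit(inputs, labels, deg=2)

# Step 2: predict the label of new data using the learned relationship.
def predict(x):
    return a * x**2 + b * x + c

prediction = predict(7.0)
print(prediction)  # approximately 49
```

The fitted coefficients come out close to a = 1, b = 0, c = 0, which is the x^2 relationship we generalized by hand.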
Supervised learning in practice
Now, let's train our machine to solve the example above.
To get things started, we are going to use scikit-learn, a Python machine learning library, and the LinearRegression algorithm to model our dataset.
But first we need to install the scikit-learn library using pip:
pip install -U scikit-learn
Then let's train our model:
import numpy as np
from sklearn.linear_model import LinearRegression
# inputs as a column vector: array([[1], [2], [3], [4], [5], [6]])
x = np.arange(1, 7)[:, None]
# labels: the inputs flattened to 1-D and squared: array([ 1,  4,  9, 16, 25, 36])
y = np.ravel(x) ** 2
# exponents used to build the two features [x, x^2]
p = np.array([1, 2])
# train with the given examples; x ** p broadcasts to shape (6, 2)
model = LinearRegression().fit(x ** p, y)
# predict the outcome for new inputs
print(model.predict([7 ** p]))
# prints approximately [49.]
print(model.predict([11 ** p]))
# prints approximately [121.]
You should see values of approximately 49 and 121, so yes, it works.
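The x ** p step matters: LinearRegression can only fit a linear combination of the features it is given, so the squared term has to be supplied as a feature. The sketch below, using the inputs 1 to 6 from the example, compares a fit on x alone with a fit that includes x^2. Building the squared feature with scikit-learn's PolynomialFeatures and make_pipeline is an alternative chosen here for illustration, not the method used in the code above, and the variable names are made up.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

x = np.arange(1, 7)[:, None]   # inputs 1..6 as a column vector
y = np.ravel(x) ** 2           # labels 1, 4, 9, 16, 25, 36

# Straight line only: the model never sees a squared feature.
line_only = LinearRegression().fit(x, y)
line_pred = line_only.predict([[7]])[0]

# Pipeline: expand each input into [x, x^2], then fit a linear model.
quadratic = make_pipeline(
    PolynomialFeatures(degree=2, include_bias=False),
    LinearRegression(),
).fit(x, y)
quad_pred = quadratic.predict([[7]])[0]

print(line_pred)  # roughly 39.67, the best straight line misses 49
print(quad_pred)  # approximately 49
```

With the pipeline, new inputs can be passed as plain values such as [[7]] and the squared feature is generated automatically at prediction time.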
Resources
- http://work.caltech.edu/library/014.html
- https://www.saylor.org/site/wp-content/uploads/2011/11/CS405-6.2.1.2-WIKIPEDIA.pdf
- http://blog.kaggle.com/2015/04/08/new-video-series-introduction-to-machine-learning-with-scikit-learn/
- http://machinelearningmastery.com/get-your-hands-dirty-with-scikit-learn-now/
- http://machinelearningmastery.com/a-gentle-introduction-to-scikit-learn-a-python-machine-learning-library/
- http://stackoverflow.com/questions/33710829/linear-regression-with-quadratic-terms/33712121
- https://docs.scipy.org/doc/numpy/reference/generated/numpy.ravel.html