Brief Introduction to Supervised Learning


Supervised learning is the machine learning task of learning a function that maps an input to an output based on example input-output pairs. A supervised learning algorithm analyzes the training data and produces an inferred function, which can be used for mapping new examples.

Supervised learning uses classification algorithms and regression techniques to develop predictive models. Common algorithms include linear regression, logistic regression, neural networks, decision trees, Support Vector Machines (SVMs), random forests, naive Bayes, and k-nearest neighbors.
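To make the classification/regression split concrete, here is a minimal scikit-learn sketch. The synthetic datasets and the particular model choices (a decision tree for classification, linear regression for regression) are placeholder choices for illustration only:

from sklearn.datasets import make_classification, make_regression
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeClassifier

# Classification: the target is a discrete class label.
X_cls, y_cls = make_classification(n_samples=100, random_state=0)
clf = DecisionTreeClassifier(random_state=0).fit(X_cls, y_cls)
print(clf.predict(X_cls[:5]))   # predicted class labels (0s and 1s)

# Regression: the target is a continuous value.
X_reg, y_reg = make_regression(n_samples=100, random_state=0)
reg = LinearRegression().fit(X_reg, y_reg)
print(reg.predict(X_reg[:5]))   # predicted real-valued targets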

Supervised machine learning is used across a wide range of sectors, such as finance, online advertising, and analytics, because it lets you train a system to make pricing predictions, campaign adjustments, customer recommendations, and much more, with the system adjusting itself and making decisions as new data arrives.

Requirements

For this post, you will need to install the following software, if you haven't already done so:

Python 3
NumPy
scikit-learn
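Both NumPy and scikit-learn can be installed with pip (for example, pip install numpy scikit-learn). If you want to confirm that the packages are available, a quick check like the following works; the printed version numbers will depend on your installation:

import numpy
import sklearn

# Print the installed versions to confirm both packages import correctly.
print(numpy.__version__)
print(sklearn.__version__)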

A simple example

Let's look at a quick example of what machine learning is. Here, we're using scikit-learn's datasets submodule to create two objects, X and y. X is a matrix with examples along the row axis and variables, also known as features, along the column axis. y is a vector with the same number of values as there are rows in X; in this case, y is a class label. For the sake of an example, y here could be a binary label corresponding to a real-world occurrence, such as the malignancy of a tumor. X is then a matrix of attributes that describe y: one feature could be the diameter of the tumor, and another could indicate its density. The preceding explanation can be seen in the following code:

import numpy as np
from sklearn.datasets import make_classification

# Generate a small synthetic dataset: X is a 100 x 20 matrix of features
# (20 is make_classification's default), and y holds the binary class labels.
X, y = make_classification(n_samples=100, random_state=42)

print(X)

[[-2.02514259  0.0291022  -0.47494531 ... -0.33450124  0.86575519
  -1.20029641]
 [ 1.61371127  0.65992405 -0.15005559 ...  1.37570681  0.70117274
  -0.2975635 ]
 [ 0.16645221  0.95057302  1.42050425 ...  1.18901653 -0.55547712
  -0.63738713]
 ...
 [-0.03955515 -1.60499282  0.22213377 ... -0.30917212 -0.46227529
  -0.43449623]
 [ 1.08589557  1.2031659  -0.6095122  ... -0.3052247  -1.31183623
  -1.06511366]
 [-0.00607091  1.30857636 -0.17495976 ...  0.99204235  0.32169781
  -0.66809045]]

print(y)

[0 0 1 1 0 0 0 1 0 1 1 0 0 0 1 1 1 0 0 1 1 0 0 0 0 1 1 0 1 0 0 0 0 0 0 1 0
 0 1 1 1 0 1 0 0 1 1 0 0 1 1 1 0 1 0 0 1 1 0 1 1 1 1 1 0 1 0 0 1 0 1 0 1 0
 1 1 1 0 0 0 1 0 1 0 1 1 1 1 1 0 0 1 0 1 1 0 1 1 0 0]


Next, we fit a logistic regression model to X and y, and then ask the fitted model to predict the labels for the same data:

from sklearn.linear_model import LogisticRegression

# Fit a logistic regression classifier on the feature matrix X and labels y,
# then predict labels for the same examples the model was trained on.
model = LogisticRegression().fit(X, y)
decision = model.predict(X)
print(decision)


[0 0 1 1 0 0 0 1 0 1 1 0 0 0 1 1 1 0 0 1 1 0 0 0 0 1 1 0 1 0 0 0 0 0 0 1 0
 0 1 1 1 0 1 0 0 1 1 0 0 1 1 1 0 1 0 0 1 1 0 1 1 1 1 1 0 1 0 0 1 0 1 0 1 0
 1 1 1 0 0 0 1 0 1 0 1 1 1 1 1 0 0 1 0 1 1 0 1 1 0 0]
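The predicted labels reproduce y exactly here, which is not surprising, since we predicted on the same data the model was trained on. If you want to quantify the agreement, a simple check such as the following works (accuracy_score is scikit-learn's standard accuracy metric):

from sklearn.metrics import accuracy_score

# Fraction of training examples whose predicted label matches the true label.
print(accuracy_score(y, decision))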

So, now we're at a point where we can define specifically what supervised learning is. Supervised learning is precisely the example we just described: given a matrix of examples, X, and a vector of corresponding labels, y, an algorithm learns a function that approximates the value of y.
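Because the point of learning that function is to map new examples, not just the training data, a more realistic workflow holds out part of the data for evaluation. Here is a minimal sketch of that idea using scikit-learn's train_test_split; the split ratio and random_state are arbitrary choices for illustration:

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Split the data so the model is evaluated on examples it has never seen.
X, y = make_classification(n_samples=100, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)

# Learn the mapping from features to labels on the training portion only.
model = LogisticRegression().fit(X_train, y_train)

# Accuracy on the held-out test set estimates how well the learned
# function generalizes to new examples.
print(accuracy_score(y_test, model.predict(X_test)))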


