Transcript and Presenter's Notes

Title: Machine Learning ppt


1
Machine Learning-1
  • CODE -18CSE392T
  • III/IOT/5

2
UNIT -1
  • Machine learning: What and Why? - Types of
    machine learning - Supervised learning -
    Unsupervised learning - Reinforcement learning -
    Linear regression - The curse of dimensionality -
    Overfitting and underfitting - Bias and variance
    tradeoff - Testing - Cross-validation -
    Regularization - Learning curve - Classification -
    Error and noise - Parametric vs. non-parametric
    models - Linear algebra for machine learning

3
What is machine learning?
  • Machine learning (ML) is a type of artificial
    intelligence (AI) that allows software
    applications to become more accurate at
    predicting outcomes without being explicitly
    programmed to do so.
  • Machine learning algorithms use historical data
    as input to predict new output values.
  • Recommendation engines are a common use case for
    machine learning. Other popular uses include
    fraud detection, spam filtering, malware threat
    detection, business process automation (BPA)
    and predictive maintenance.

4
(No Transcript)
5
Why is machine learning important?
  • Machine learning is important because it gives
    enterprises a view of trends in customer behavior
    and business operational patterns, as well as
    supports the development of new products.
  • Many of today's leading companies, such as
    Facebook, Google and Uber, make machine learning
    a central part of their operations.
  • Machine learning has become a significant
    competitive differentiator for many companies.

6
When Do We Use Machine Learning?
  • ML is used when
  • Human expertise does not exist (navigating on
    Mars)
  • Humans can't explain their expertise (speech
    recognition)
  • Models must be customized (personalized
    medicine)

7
A classic example of a task that requires machine
learning: it is very hard to say what makes a 2
8
Some more examples of tasks that are best solved
by using a learning algorithm
  • Recognizing patterns
  • Facial identities or facial expressions
  • Handwritten or spoken words
  • Medical images
  • Generating patterns
  • Generating images or motion sequences
  • Recognizing anomalies
  • Unusual credit card transactions
  • Unusual patterns of sensor readings in a
    nuclear power plant
  • Prediction
  • Future stock prices or currency exchange rates

9
Applications
  • Email spam detection
  • Face detection and matching (e.g., iPhone X)
  • Web search
  • Sports predictions
  • Post office (e.g., sorting letters by zip
    codes)
  • ATMs (e.g., reading checks)
  • Credit card fraud
  • Stock predictions
  • Product recommendations (e.g., Netflix, Amazon)
  • Self-driving cars (e.g., Uber, Tesla)
  • Language translation (Google Translate)

10
Definition of learning
  • Definition
  • A computer program is said to learn from
    experience E with respect to some class of tasks
    T and performance measure P, if its performance
    at tasks T, as measured by P, improves with
    experience E.
  • A computer program which learns from experience
    is called a machine learning program or simply a
    learning program. Such a program is sometimes
    also referred to as a learner.

11
Examples
  • A chess learning problem
  • Task T: Playing chess
  • Performance measure P: Percent of games won
    against opponents
  • Training experience E: Playing practice games
    against itself

12
What are the different types of machine
learning?
  • Classical machine learning is often categorized
    by how an algorithm learns to become more
    accurate in its predictions.
  • There are four basic approaches:
  • Supervised Learning, 
  • Unsupervised Learning,
  • Semi-supervised Learning,
  • Reinforcement Learning.
  • The type of algorithm data scientists choose to
    use depends on what type of data they want to
    predict.

13
 Supervised Learning
  • Supervised learning is a type of machine learning
    method in which we provide sample labeled data to
    the machine learning system in order to train it,
    and on that basis, it predicts the output.
  • The system creates a model using the labeled data
    to understand the datasets. Once training and
    processing are done, we test the model by
    providing sample data to check whether it
    predicts the correct output.

14
  • Supervised learning is a process of providing
    input data as well as correct output data to the
    machine learning model.
  • The aim of a supervised learning algorithm is
    to find a mapping function to map the input
    variable(x) with the output variable(y).
  • In the real-world, supervised learning can be
    used for Risk Assessment, Image classification,
    Fraud Detection, spam filtering, etc.

15
How Supervised Learning Works?
  • In supervised learning, models are trained using a
    labeled dataset, where the model learns about
    each type of data.
  • Once the training process is completed, the
    model is tested on test data (a sample of data
    held out from training), and then it predicts
    the output.

16
  • Now, after training, we test our model using the
    test set, and the task of the model is to
    identify the shape.
  • The machine is already trained on all types of
    shapes, and when it finds a new shape, it
    classifies the shape on the basis of its number
    of sides, and predicts the output.

17
Steps Involved in Supervised Learning
  • First, determine the type of training dataset
  • Collect/Gather the labeled training data.
  • Split the dataset into a training dataset, a
    test dataset, and a validation dataset.
  • Determine the input features of the training
    dataset, which should have enough knowledge so
    that the model can accurately predict the output.

18
  • Determine the suitable algorithm for the model,
    such as support vector machine, decision tree,
    etc.
  • Execute the algorithm on the training dataset.
    Sometimes we need validation sets as the control
    parameters, which are the subset of training
    datasets.
  • Evaluate the accuracy of the model by providing
    the test set. If the model predicts the correct
    output, then our model is accurate.
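The steps above can be sketched in a few lines of code. Below is a minimal, illustrative example using scikit-learn; the synthetic dataset, the decision tree choice, and the 80/20 split are assumptions for demonstration, not part of the original slides.

```python
# Minimal sketch of the supervised learning steps listed above.
# Assumptions: scikit-learn is available; the data is synthetic.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Steps 1-2: gather labeled training data (synthetic here)
X, y = make_classification(n_samples=500, n_features=10, random_state=42)

# Step 3: split the dataset (validation set omitted for brevity)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Steps 4-5: choose a suitable algorithm and execute it on the training data
model = DecisionTreeClassifier(random_state=42)
model.fit(X_train, y_train)

# Step 6: evaluate accuracy by providing the test set
predictions = model.predict(X_test)
print("Test accuracy:", accuracy_score(y_test, predictions))
```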

19
Types of supervised Machine learning Algorithms
  • 1. Regression
  • Regression algorithms are used if there is a
    relationship between the input variable and the
    output variable.
  • It is used for the prediction of continuous
    variables, such as Weather forecasting, Market
    Trends, etc.

20
2. Classification
  • Classification algorithms are used when the
    output variable is categorical, meaning there
    are discrete classes such as Yes-No, Male-Female,
    True-False, etc.
  • Advantages of Supervised learning
  • With the help of supervised learning, the model
    can predict the output on the basis of prior
    experiences.
  • In supervised learning, we can have an exact idea
    about the classes of objects.
  • Supervised learning models help us solve
    various real-world problems such as fraud
    detection, spam filtering, etc.

21
Disadvantages of supervised learning
  • Supervised learning models are not suitable for
    handling complex tasks.
  • Supervised learning cannot predict the correct
    output if the test data is different from the
    training dataset.

22
What is Unsupervised Learning?
  • Unsupervised learning is a machine learning
    technique in which models are not supervised
    using a training dataset.
  • Instead, the model itself finds hidden patterns
    and insights in the given data.
  • It can be compared to the learning that takes
    place in the human brain while learning new
    things. It can be defined as
  • Unsupervised learning is a type of machine
    learning in which models are trained using
    unlabeled dataset and are allowed to act on that
    data without any supervision.

23
  • Unsupervised learning cannot be directly applied
    to a regression or classification problem because
    unlike supervised learning, we have the input
    data but no corresponding output data.
  • The goal of unsupervised learning is to find the
    underlying structure of dataset, group that data
    according to similarities, and represent that
    dataset in a compressed format.
  • Example
  • Suppose the unsupervised learning algorithm is
    given an input dataset containing images of
    different types of cats and dogs. The algorithm
    is never given labels, so it must identify the
    image features on its own and group similar
    images together.

24
Why use Unsupervised Learning?
  • Below are some main reasons which describe the
    importance of Unsupervised Learning
  • Unsupervised learning is helpful for finding
    useful insights from the data.
  • Unsupervised learning is very similar to how a
    human learns to think from their own experiences,
    which makes it closer to real AI.
  • Unsupervised learning works on unlabeled and
    uncategorized data, which makes unsupervised
    learning more important.
  • In the real world, we do not always have input
    data with corresponding output, so to solve such
    cases, we need unsupervised learning.

25
Working of Unsupervised Learning
  • Working of unsupervised learning can be
    understood by the below diagram

Here, we have taken unlabeled input data, which
means it is not categorized and no corresponding
outputs are given. This unlabeled input data is
fed to the machine learning model in order to
train it. First, the model interprets the raw
data to find the hidden patterns, and then
suitable algorithms such as k-means clustering or
hierarchical clustering are applied.
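As a concrete illustration of this workflow, here is a minimal k-means sketch in scikit-learn; the blob data and the choice of three clusters are assumptions for demonstration only.

```python
# Minimal k-means clustering sketch on synthetic, unlabeled data.
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans

# Unlabeled input data: the true labels are deliberately discarded
X, _ = make_blobs(n_samples=300, centers=3, random_state=0)

# The model finds hidden structure (clusters) without any supervision
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(X)

print(cluster_ids[:10])         # cluster assigned to the first 10 points
print(kmeans.cluster_centers_)  # discovered cluster centers
```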
26
Types of Unsupervised Learning Algorithm
  • The unsupervised learning algorithm can be
    further categorized into two types of problems:

27
  • Clustering: Clustering is a method of grouping
    objects into clusters such that objects with the
    most similarities remain in a group and have few
    or no similarities with the objects of another
    group.
  • Cluster analysis finds the commonalities between
    the data objects and categorizes them as per the
    presence and absence of those commonalities.
  • Association: An association rule is an
    unsupervised learning method which is used for
    finding relationships between variables in a
    large database.
  • It determines the sets of items that occur
    together in the dataset. Association rules make
    marketing strategy more effective.
  • For example, people who buy item X (say, bread)
    also tend to purchase item Y (butter or jam).
  • A typical example of an association rule is
    Market Basket Analysis.

28
  • Advantages of Unsupervised Learning
  • Unsupervised learning is used for more complex
    tasks as compared to supervised learning because,
    in unsupervised learning, we don't have labeled
    input data.
  • Unsupervised learning is preferable as it is easy
    to get unlabeled data in comparison to labeled
    data.
  • Disadvantages of Unsupervised Learning
  • Unsupervised learning is intrinsically more
    difficult than supervised learning as it does not
    have corresponding output.
  • The result of the unsupervised learning algorithm
    might be less accurate as input data is not
    labeled, and algorithms do not know the exact
    output in advance.

29
What is Reinforcement Learning?
  • Reinforcement Learning is a feedback-based
    Machine learning technique in which an agent
    learns to behave in an environment by performing
    the actions and seeing the results of actions.
  • For each good action, the agent gets positive
    feedback, and for each bad action, the agent gets
    negative feedback or penalty.
  • In reinforcement learning, the agent learns
    automatically from feedback without any labeled
    data, unlike supervised learning.
  • Since there is no labeled data, the agent is
    bound to learn from its experience only.
  • RL solves a specific type of problem where
    decision making is sequential, and the goal is
    long-term, such as game-playing, robotics, etc.

30
  • "Reinforcement learning is a type of machine
    learning method where an intelligent agent
    (computer program) interacts with the environment
    and learns to act within that.
  • It is a core part of Artificial intelligence, and
    all AI agent works on the concept of
    reinforcement learning. Here we do not need to
    pre-program the agent, as it learns from its own
    experience without any human intervention.
  • Example 
  • The agent continues doing these three things
    (take action, change state/remain in the same
    state, and get feedback), and by doing these
    actions, he learns and explores the environment.
  • The agent learns that what actions lead to
    positive feedback or rewards and what actions
    lead to negative feedback penalty. As a positive
    reward, the agent gets a positive point, and as a
    penalty, it gets a negative point.

31
(No Transcript)
32
Key Features of Reinforcement Learning
  • In RL, the agent is not instructed about the
    environment and what actions need to be taken.
  • It is based on a trial-and-error process.
  • The agent takes the next action and changes
    states according to the feedback of the previous
    action.
  • The agent may get a delayed reward.
  • The environment is stochastic, and the agent
    needs to explore it to obtain the maximum
    positive reward.

33
How does Reinforcement Learning Work?
  • To understand the working process of RL, we
    need to consider two main things
  • Environment: It can be anything, such as a room,
    maze, football ground, etc.
  • Agent: An intelligent agent, such as an AI robot.

34
  • Let's take an example of a maze environment that
    the agent needs to explore. Consider the below
    image

35
  • In the above image, the agent is at the very
    first block of the maze. The maze consists of an
    S6 block, which is a wall, S8, a fire pit, and
    S4, a diamond block.
  • The agent cannot cross the S6 block, as it is a
    solid wall. If the agent reaches the S4 block, it
    gets a +1 reward; if it reaches the fire pit, it
    gets a -1 reward point. It can take four actions:
    move up, move down, move left, and move right.
  • The agent can take any path to reach the final
    point, but it needs to do so in the fewest
    possible steps. Suppose the agent follows the
    path S9-S5-S1-S2-S3; then it will get the +1
    reward point.
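A tabular Q-learning agent for a tiny grid world like this maze can be sketched as follows. The 3x3 layout, the cell indices for the goal, pit, and wall, the rewards, and the hyperparameters are all illustrative assumptions, not taken from the slides.

```python
# Tabular Q-learning sketch for a tiny 3x3 grid world.
# All layout choices and hyperparameters below are assumed.
import random

N = 3                       # 3x3 grid, states 0..8
GOAL, PIT, WALL = 2, 4, 5   # assumed cell indices
START = 6                   # assumed start cell (bottom-left)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def step(state, action):
    """Apply an action; bumping into a wall or edge leaves the state unchanged."""
    r, c = divmod(state, N)
    dr, dc = ACTIONS[action]
    nr, nc = r + dr, c + dc
    if not (0 <= nr < N and 0 <= nc < N) or nr * N + nc == WALL:
        return state, 0.0, False
    nxt = nr * N + nc
    if nxt == GOAL:
        return nxt, 1.0, True    # positive feedback (diamond)
    if nxt == PIT:
        return nxt, -1.0, True   # negative feedback (fire pit)
    return nxt, 0.0, False

Q = [[0.0] * 4 for _ in range(N * N)]
alpha, gamma, epsilon = 0.1, 0.9, 0.2  # assumed hyperparameters

for _ in range(2000):                  # episodes
    state = START
    for _ in range(50):                # step cap per episode
        # epsilon-greedy: explore sometimes, otherwise act greedily
        if random.random() < epsilon:
            action = random.randrange(4)
        else:
            action = max(range(4), key=lambda a: Q[state][a])
        nxt, reward, done = step(state, action)
        # Q-learning update from the feedback of the chosen action
        Q[state][action] += alpha * (
            reward + gamma * max(Q[nxt]) - Q[state][action])
        state = nxt
        if done:
            break

best = [max(range(4), key=lambda a: Q[s][a]) for s in range(N * N)]
print("greedy action per state (0=up 1=down 2=left 3=right):", best)
```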

36
Types of Reinforcement learning
  • There are mainly two types of reinforcement
    learning, which are
  • Positive Reinforcement
  • Negative Reinforcement

37
  • Positive Reinforcement
  • Positive reinforcement learning means adding
    something to increase the tendency that the
    expected behavior will occur again.
  • It impacts the behavior of the agent positively
    and increases the strength of the behavior.
  • This type of reinforcement can sustain the
    changes for a long time, but too much positive
    reinforcement may lead to an overload of states
    that can diminish the results.
  • Negative Reinforcement
  • Negative reinforcement learning is the opposite
    of positive reinforcement, as it increases the
    tendency that the specific behavior will occur
    again by avoiding the negative condition.
  • It can be more effective than positive
    reinforcement depending on the situation and
    behavior, but it provides reinforcement only to
    meet the minimum behavior.

38
(No Transcript)
39
(No Transcript)
40
The Curse of Dimensionality
  • Handling high-dimensional data is very difficult
    in practice; this difficulty is commonly known as
    the curse of dimensionality.
  • If the dimensionality of the input dataset
    increases, any machine learning algorithm and
    model becomes more complex.
  • As the number of features increases, the number
    of samples required also increases
    proportionally, and the chance of overfitting
    increases as well.
  • If a machine learning model is trained on
    high-dimensional data, it becomes overfitted and
    results in poor performance.
  • Hence, it is often required to reduce the number
    of features, which can be done with
    dimensionality reduction.

41
  • Problems
  • High-dimensional data is difficult to work with
    since
  • Adding more features can increase the noise, and
    hence the error
  • There usually aren't enough observations to get
    good estimates
  • This causes
  • Increase in running time
  • Overfitting
  • A rapidly growing number of samples required

42
Overfitting and Under fitting in Machine
Learning
  • Before understanding overfitting and
    underfitting, let's understand some basic terms
    that will help to understand this topic well
  • Signal: It refers to the true underlying pattern
    of the data that helps the machine learning model
    to learn from the data.
  • Noise: Noise is unnecessary and irrelevant data
    that reduces the performance of the model.
  • Bias: Bias is a prediction error that is
    introduced in the model due to oversimplifying
    the machine learning algorithm. It is the
    difference between the predicted values and the
    actual values.
  • Variance: If the machine learning model performs
    well with the training dataset but does not
    perform well with the test dataset, then variance
    occurs.

43
Overfitting
  • Overfitting occurs when our machine learning
    model tries to cover all the data points, or more
    than the required data points, present in the
    given dataset.
  • Because of this, the model starts capturing noise
    and inaccurate values present in the dataset, and
    all these factors reduce the efficiency and
    accuracy of the model. The overfitted model has
    low bias and high variance.
  • The chances of overfitting increase the more
    training we provide to our model.
  • It means the more we train our model, the more
    likely we are to end up with an overfitted model.
  • Overfitting is the main problem that occurs in
    supervised learning.

44
  • Example: The concept of overfitting can be
    understood from the below graph of a linear
    regression output

45
  • As we can see from the above graph, the model
    tries to cover all the data points present in the
    scatter plot.
  • It may look efficient, but in reality it is not.
    Because the goal of the regression model is to
    find the best-fit line, and here we have not got
    a best fit, the model will generate prediction
    errors.
  • How to avoid overfitting in the model
  • Both overfitting and underfitting cause
    degraded performance of the machine learning
    model.
  • But the main cause is overfitting, so there are
    some ways by which we can reduce the occurrence
    of overfitting in our model (see the sketch after
    this list).
  • Cross-Validation
  • Training with more data
  • Removing features
  • Early stopping the training
  • Regularization
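Of the remedies listed, regularization is the easiest to demonstrate in code. The sketch below contrasts a plain high-degree polynomial regression with a ridge (L2-regularized) version on noisy synthetic data; the data, the polynomial degree, and the alpha value are assumptions for illustration.

```python
# Minimal sketch: L2 regularization (ridge) as a defense against overfitting.
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(60, 1))
y = np.sin(X).ravel() + rng.normal(0, 0.3, size=60)  # noisy signal

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# A high-degree polynomial invites overfitting; ridge shrinks coefficients.
plain = make_pipeline(PolynomialFeatures(12), LinearRegression()).fit(X_tr, y_tr)
ridge = make_pipeline(PolynomialFeatures(12), Ridge(alpha=1.0)).fit(X_tr, y_tr)

print("plain test MSE:", mean_squared_error(y_te, plain.predict(X_te)))
print("ridge test MSE:", mean_squared_error(y_te, ridge.predict(X_te)))
```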

46
Underfitting
  • Underfitting occurs when our machine learning
    model is not able to capture the underlying trend
    of the data.
  • To avoid overfitting, the feeding of training
    data can be stopped at an early stage, due to
    which the model may not learn enough from the
    training data.
  • As a result, it may fail to find the best fit for
    the dominant trend in the data.
  • In the case of underfitting, the model is not
    able to learn enough from the training data, and
    hence it reduces the accuracy and produces
    unreliable predictions.
  • An underfitted model has high bias and low
    variance.

47
  • How to avoid underfitting
  • By increasing the training time of the model.
  • By increasing the number of features.

48
Linear Regression
  • Linear regression is one of the easiest and most
    popular Machine Learning algorithms. It is a
    statistical method that is used for predictive
    analysis.
  • Linear regression makes predictions for
    continuous/real or numeric variables such
    as sales, salary, age, product price, etc.
  • The linear regression algorithm shows a linear
    relationship between a dependent variable (y) and
    one or more independent variables (x), hence it
    is called linear regression.
  • Since linear regression shows a linear
    relationship, it finds how the value of the
    dependent variable changes according to the value
    of the independent variable.

49
  • The linear regression model provides a sloped
    straight line representing the relationship
    between the variables. Consider the below image

50
  • y = a0 + a1x + e
  • Here,
  • y: Dependent variable (target variable)
  • x: Independent variable (predictor variable)
  • a0: Intercept of the line (gives an additional
    degree of freedom)
  • a1: Linear regression coefficient (scale factor
    applied to each input value)
  • e: Random error
  • The values for x and y variables are training
    datasets for Linear Regression model
    representation.
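To make the equation concrete, here is a minimal NumPy sketch that estimates a0 and a1 by ordinary least squares on synthetic (x, y) pairs; the true coefficients used to generate the data are assumptions for the demo.

```python
# Fitting y = a0 + a1*x + e by ordinary least squares (minimal sketch).
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 50)
y = 2.0 + 0.5 * x + rng.normal(0, 1, 50)   # assumed true a0=2.0, a1=0.5

# Design matrix with a column of ones for the intercept a0
A = np.column_stack([np.ones_like(x), x])
(a0, a1), *_ = np.linalg.lstsq(A, y, rcond=None)

print(f"estimated intercept a0 = {a0:.2f}, slope a1 = {a1:.2f}")
```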

51
Types of Linear Regression
  • Linear regression can be further divided into
    two types of algorithm:
  • Simple Linear Regression
  • If a single independent variable is used to
    predict the value of a numerical dependent
    variable, then such a Linear Regression algorithm
    is called Simple Linear Regression.
  • Multiple Linear regression
  • If more than one independent variable is used to
    predict the value of a numerical dependent
    variable, then such a Linear Regression algorithm
    is called Multiple Linear Regression.

52
(No Transcript)
53
Linear Regression Line
  • A straight line showing the relationship between
    the dependent and independent variables is called
    a regression line. A regression line can show two
    types of relationship
  • Positive Linear Relationship
  • If the dependent variable increases on the Y-axis
    as the independent variable increases on the
    X-axis, then such a relationship is termed a
    positive linear relationship.

54
  • Negative Linear Relationship
  • If the dependent variable decreases on the Y-axis
    as the independent variable increases on the
    X-axis, then such a relationship is called a
    negative linear relationship.

55
Bias Variance tradeoff
  • When we work with a supervised machine learning
    algorithm, the model learns from the training
    data.
  • The model always tries to best estimate the
    mapping function between the output variable(Y)
    and the input variable(X).
  • The estimate of the target function may generate
    a prediction error, which can be divided mainly
    into bias error and variance error. These errors
    can be explained as
  • Bias Error: Bias is a prediction error which is
    introduced in the model due to oversimplifying
    the machine learning algorithm. It is the
    difference between predicted output and actual
    output. There are two types of bias
  • High Bias: If the predicted values are very
    different from the actual values, then it is
    called high bias.

56
  • Due to high bias, an algorithm may miss the
    relevant relationships between the input features
    and target output, which is called underfitting.
  • Low Bias: If the predicted values are close to
    the actual values, then it is called low bias.
  • Variance Error: If the machine learning model
    performs well with the training dataset but does
    not perform well with the test dataset, then
    variance occurs.
  • It can also be defined as an error caused by the
    model's sensitivity to small fluctuations in the
    training dataset.
  • High variance causes overfitting in the machine
    learning model, which means the algorithm models
    noise along with the underlying pattern in the
    data.

57
  • In the machine learning model, we always try to
    have low bias and low variance, and
  • If we try to increase the bias, the variance
    decreases
  • If we try to increase the variance, the bias
    decreases.
  • Hence, finding an optimal balance of bias and
    variance is called the bias-variance trade-off.
    We can illustrate it using the bull's-eye diagram
    given below.

58
(No Transcript)
59
  • There are four cases of bias and variance
  • If there is low bias and low variance, the
    predicted output is mostly close to the desired
    output.
  • If there is low bias and high variance, the model
    is not consistent.
  • If there is high bias and low variance, the model
    is consistent but the predicted results are far
    away from the actual output.
  • If there is high bias and high variance, then the
    model is inconsistent, and its predictions are
    also very different from the actual values. It is
    the worst case of bias and variance.

60
What is ML Testing?
  • During the model development phase, data
    scientists test model performance by comparing
    the model outputs (predicted values) with the
    actual values.
  • One of the techniques used to perform black-box
    testing on ML models is metamorphic testing,
    which attempts to alleviate the test oracle
    problem.

61
Model evaluation in machine learning testing
  • Usually, software testing includes
  • Unit tests. The program is broken down into
    blocks, and each element (unit) is tested
    separately.
  • Regression tests. They cover already tested
    software to see that it doesn't suddenly break.
  • Integration tests. This type of testing observes
    how multiple components of the program work
    together.

62
Cross-Validation in Machine Learning
  • Cross-validation is a technique for validating
    model efficiency by training the model on a
    subset of the input data and testing it on a
    previously unseen subset of the input data. We
    can also say that it is a technique to check how
    a statistical model generalizes to an independent
    dataset.
  • In machine learning, there is always a need to
    test the stability of the model, and we cannot do
    that by fitting and evaluating the model on the
    training dataset alone. For this purpose, we
    reserve a particular sample of the dataset, which
    was not part of the training dataset.
  • After that, we test our model on that sample
    before deployment, and this complete process
    comes under cross-validation.
  • This is something different from the general
    train-test split.
  • Hence the basic steps of cross-validation are
  • Reserve a subset of the dataset as a validation
    set.
  • Provide the training to the model using the
    training dataset.
  • Now, evaluate model performance using the
    validation set. If the model performs well with
    the validation set, perform the further step,
    else check for the issues.
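In scikit-learn, these steps are bundled into helpers such as cross_val_score. A minimal sketch, with the synthetic data and logistic regression model assumed for illustration:

```python
# Minimal cross-validation sketch with scikit-learn.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=200, random_state=0)
model = LogisticRegression(max_iter=1000)

# Each fold is reserved in turn as the validation set
scores = cross_val_score(model, X, y, cv=5)
print("per-fold accuracy:", scores, "mean:", scores.mean())
```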

63
Methods used for Cross-Validation
  • There are some common methods that are used for
    cross-validation. These methods are given below
  • Validation Set Approach
  • Leave-P-out cross-validation
  • Leave one out cross-validation
  • K-fold cross-validation
  • Stratified k-fold cross-validation

64
Validation Set Approach
  • In the validation set approach, we divide our
    input dataset into a training set and a test or
    validation set. Both subsets are given 50% of the
    dataset.
  • But it has one big disadvantage: we are using
    only 50% of the dataset to train our model, so
    the model may fail to capture important
    information in the dataset. It also tends to give
    an underfitted model.

65
Leave-P-out cross-validation
  • In this approach, p data points are left out of
    the training data.
  • This means that if there are n total data points
    in the original input dataset, then n-p data
    points will be used as the training dataset and
    the p data points as the validation set.
  • This complete process is repeated for all
    possible combinations, and the average error is
    calculated to determine the effectiveness of the
    model.
  • A disadvantage of this technique is that it can
    be computationally expensive for large p.

66
Leave-one-out cross-validation (LOOCV)
  • Leave-one-out cross-validation (LOOCV) is an
    exhaustive cross-validation technique. It is a
    special case of LpOCV with p = 1.
  • For a dataset having n rows, the 1st row is
    selected for validation, and the remaining (n-1)
    rows are used to train the model. For the next
    iteration, the 2nd row is selected for validation
    and the rest to train the model. Similarly, the
    process is repeated until all n rows have served
    as the validation set.
  • Both of the above cross-validation techniques are
    types of exhaustive cross-validation. Exhaustive
    cross-validation methods learn and test on all
    possible ways to divide the original sample into
    training and validation sets. They have the same
    pros and cons, discussed below
  • Pros
  • Simple, easy to understand, and implement.
  • Cons
  • The error estimate has low bias but may have high
    variance.
  • The computation time required is high.
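scikit-learn exposes these exhaustive schemes directly as LeaveOneOut and LeavePOut. A brief LOOCV sketch, again on assumed synthetic data:

```python
# LOOCV sketch: n models, each validated on a single held-out row.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut, cross_val_score

X, y = make_classification(n_samples=50, random_state=0)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y,
                         cv=LeaveOneOut())
print("LOOCV mean accuracy:", scores.mean())  # average over n=50 fits
```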

67
(No Transcript)
68
K-Fold Cross-Validation
  • The k-fold cross-validation approach divides the
    input dataset into k groups of samples of equal
    size. These samples are called folds. For each
    learning set, the prediction function uses k-1
    folds, and the remaining fold is used as the
    test set. This approach is a very popular CV
    approach because it is easy to understand, and
    the output is less biased than with other
    methods.
  • The steps for k-fold cross-validation are
  • Split the input dataset into k groups
  • For each group
  • Take one group as the reserve or test data set.
  • Use the remaining groups as the training dataset
  • Fit the model on the training set and evaluate
    the performance of the model using the test set
    (see the sketch after this list).
  • Let's take an example of 5-fold
    cross-validation. The dataset is grouped into 5
    folds. On the 1st iteration, the first fold is
    reserved to test the model, and the rest are used
    to train the model. On the 2nd iteration, the
    second fold is used to test the model, and the
    rest are used to train the model. This process
    continues until each fold has been used as the
    test fold.
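The explicit loop behind these steps can be written with scikit-learn's KFold splitter; the data and model below are assumptions for illustration.

```python
# Manual 5-fold cross-validation loop (sketch).
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=200, random_state=0)
kf = KFold(n_splits=5, shuffle=True, random_state=0)

for i, (train_idx, test_idx) in enumerate(kf.split(X), start=1):
    model = LogisticRegression(max_iter=1000)
    model.fit(X[train_idx], y[train_idx])         # fit on k-1 folds
    acc = accuracy_score(y[test_idx], model.predict(X[test_idx]))
    print(f"fold {i}: accuracy = {acc:.3f}")      # evaluate on held-out fold
```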

69
  • The final accuracy of the model is computed by
    taking the mean accuracy over the k models'
    validation data.
  • LOOCV is a variant of k-fold cross-validation
    where k = n.
  • Pros
  • The estimate has low bias
  • Lower time complexity than exhaustive methods
  • The entire dataset is utilized for both training
    and validation.
  • Cons
  • Not suitable for an imbalanced dataset.

70
(No Transcript)
71
(No Transcript)
72
Stratified k-fold cross-validation
  • The cross-validation techniques discussed above
    may not work well with an imbalanced dataset.
    Stratified k-fold cross-validation solves the
    problem of an imbalanced dataset.
  • In stratified k-fold cross-validation, the
    dataset is partitioned into k groups or folds
    such that the validation data has an equal
    proportion of instances of each target class
    label. This ensures that one particular class is
    not over-represented in the validation or train
    data, especially when the dataset is imbalanced.
  • It can be understood with an example of housing
    prices, where the price of some houses can be
    much higher than that of other houses.
  • To tackle such situations, the stratified k-fold
    cross-validation technique is useful.
  • Pros
  • Works well for an imbalanced dataset.
  • Cons
  • Not suitable for a time series dataset.
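scikit-learn's StratifiedKFold preserves the class proportions in every split. A minimal sketch on an assumed imbalanced dataset (roughly 90/10):

```python
# Stratified k-fold sketch: class ratios preserved in each fold.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold

# Assumed imbalanced data: roughly 90% class 0, 10% class 1
X, y = make_classification(n_samples=200, weights=[0.9, 0.1], random_state=0)

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, test_idx in skf.split(X, y):
    # Each validation fold keeps ~10% positives, mirroring the full dataset
    print("positives in fold:", int(np.sum(y[test_idx])), "of", len(test_idx))
```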

73
  • You may wonder why the same picture appears with
    different headings for stratified and k-fold
    cross-validation, but both types of CV work in a
    similar way. The only difference is that whenever
    a test-train split is done, the proportion of
    classes in the target variable is distributed
    equally among the test and train sets, so the
    test-train split is balanced. Thus the
    disadvantage of k-fold is solved by stratified
    cross-validation.

74
Time Series cross-validation
  • The order of the data is very important for
    time-series-related problems. For a time-related
    dataset, a random or k-fold split of the data
    into train and validation sets may not yield good
    results.
  • For a time-series dataset, the split of data into
    train and validation sets is done according to
    time. This is also referred to as the
    forward-chaining method or rolling
    cross-validation. For a particular iteration, the
    next instance after the train data can be treated
    as validation data (see the sketch below).
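scikit-learn's TimeSeriesSplit implements this forward-chaining idea. A minimal sketch with assumed toy data:

```python
# Forward-chaining (rolling) cross-validation sketch for ordered data.
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

X = np.arange(10).reshape(-1, 1)   # 10 time-ordered observations
tscv = TimeSeriesSplit(n_splits=3)

for train_idx, val_idx in tscv.split(X):
    # Training always precedes validation in time; no shuffling
    print("train:", train_idx, "-> validate:", val_idx)
```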

75
  • As shown in the above diagram, for the 1st
    iteration, the first 3 rows are considered train
    data and the next instance, T4, is validation
    data. The window of train and validation data is
    rolled forward for further iterations.

76
Limitations of Cross-Validation
  • There are some limitations of the
    cross-validation technique, which are given
    below
  • Under ideal conditions, it provides optimal
    output. But for inconsistent data, it may produce
    drastically different results.
  • The data evolves over time, due to which it may
    face differences between the training set and
    validation sets. For example, if we create a
    model for the prediction of stock market values
    and the model is trained on the previous 5 years'
    stock values, the realistic future values for the
    next 5 years may be drastically different, so it
    is difficult to expect correct output in such
    situations.

77
Applications of Cross-Validation
  • This technique can be used to compare the
    performance of different predictive modeling
    methods.
  • It has great scope in the medical research field.
  • It can also be used for the meta-analysis, as it
    is already being used by the data scientists in
    the field of medical statistics.

78
Learning Curve
  • A learning curve is a correlation between a
    learner's performance on a task and the number of
    attempts or the amount of training required to
    complete the task; this can be represented as a
    curve on a graph.

79
  • The diagram below should help you visualize the
    process described so far.
  • In the training set column you can see that we
    constantly increase the size of the training set.
    This causes a slight change in our model f.
  • In the first row, where n = 1 (n is the number of
    training instances), the model fits that single
    training data point perfectly.
  • However, the very same model fits a validation
    set of 20 different data points really badly. So
    the model's error is 0 on the training set, but
    much higher on the validation set.
  • As we increase the training set size, the model
    can no longer fit the training set perfectly.
  • So the training error becomes larger. However,
    the model is trained on more data, so it manages
    to fit the validation set better.
  • Thus, the validation error decreases. To remind
    you, the validation set stays the same across all
    three cases.
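scikit-learn's learning_curve utility automates exactly this experiment: it trains on growing subsets and scores each model on cross-validated validation data. The dataset and model below are assumptions for illustration.

```python
# Learning-curve sketch: training vs. validation score as data grows.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve

X, y = make_classification(n_samples=500, random_state=0)
sizes, train_scores, val_scores = learning_curve(
    LogisticRegression(max_iter=1000), X, y,
    train_sizes=np.linspace(0.1, 1.0, 5), cv=5)

for n, tr, va in zip(sizes, train_scores.mean(axis=1), val_scores.mean(axis=1)):
    print(f"n={n:4d}  train acc={tr:.3f}  validation acc={va:.3f}")
```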

80
(No Transcript)
81
Classification
  • As we know, supervised machine learning
    algorithms can be broadly classified into
    regression and classification algorithms.
  • With regression algorithms, we predict the output
    for continuous values, but to predict categorical
    values, we need classification algorithms.

82
  • Classification is the process of categorizing a
    given set of data into classes. It can be
    performed on both structured and unstructured
    data.
  • The process starts with predicting the class of
    given data points. The classes are often referred
    to as targets, labels or categories.
  • The classification predictive modeling is the
    task of approximating the mapping function from
    input variables to discrete output variables.
  • The main goal is to identify which class/category
    the new data will fall into.
  • Let us try to understand this with a simple
    example.
  • Heart disease detection can be framed as a
    classification problem; this is a binary
    classification since there can be only two
    classes, i.e., has heart disease or does not have
    heart disease. The classifier, in this case,
    needs training data to understand how the given
    input variables are related to the class. And
    once the classifier is trained accurately, it can
    be used to detect whether heart disease is
    present or not for a particular patient.
  • Since classification is a type of supervised
    learning, even the targets are also provided with
    the input data. Let us get familiar with the
    classification in machine learning terminologies.

83
What is the Classification Algorithm?
  • The Classification algorithm is a Supervised
    Learning technique that is used to identify the
    category of new observations on the basis of
    training data.
  • In classification, a program learns from the
    given dataset or observations and then classifies
    new observations into a number of classes or
    groups.
  • Examples: Yes or No, 0 or 1, Spam or Not Spam,
    cat or dog, etc. Classes can be called
    targets/labels or categories.
  • Unlike regression, the output variable of
    Classification is a category, not a value, such
    as "Green or Blue", "fruit or animal", etc.
  • Since the classification algorithm is a
    supervised learning technique, it takes labeled
    input data, which means it contains input with
    the corresponding output.
  • In a classification algorithm, a discrete output
    function (y) is mapped to an input variable (x).

84
  • y = f(x), where y = categorical output
  • The best example of an ML classification
    algorithm is Email Spam Detector.
  • The main goal of the Classification algorithm is
    to identify the category of a given dataset, and
    these algorithms are mainly used to predict the
    output for the categorical data.
  • Classification algorithms can be better
    understood using the below diagram.
  • In the below diagram, there are two classes,
    class A and Class B.
  • These classes have features that are similar to
    each other and dissimilar to other classes.

85
  • The algorithm which implements classification on
    a dataset is known as a classifier. There are two
    types of classification
  • Binary Classifier: If the classification problem
    has only two possible outcomes, then it is called
    a Binary Classifier. Examples: YES or NO, MALE or
    FEMALE, SPAM or NOT SPAM, CAT or DOG, etc. A
    minimal sketch follows below.
  • Multi-class Classifier: If a classification
    problem has more than two outcomes, then it is
    called a Multi-class Classifier. Examples:
    classification of types of crops, classification
    of types of music.
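Here is the minimal binary-classification sketch referred to above; the synthetic two-class data and the logistic regression classifier are illustrative assumptions.

```python
# Binary classification sketch: two possible outcomes (e.g., spam / not spam).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=300, n_classes=2, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("predicted classes:", clf.predict(X_te[:5]))   # discrete outputs
print("test accuracy:", clf.score(X_te, y_te))
```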

86
Learners in Classification Problems
  • In the classification problems, there are two
    types of learners
  • Lazy Learners: A lazy learner first stores the
    training dataset and waits until it receives the
    test dataset. In the lazy learner case,
    classification is done on the basis of the most
    related data stored in the training dataset. It
    takes less time in training but more time for
    predictions. Examples: K-NN algorithm, case-based
    reasoning.
  • Eager Learners: Eager learners develop a
    classification model based on a training dataset
    before receiving a test dataset. Opposite to lazy
    learners, an eager learner takes more time in
    learning, and less time in prediction. Examples:
    decision trees, Naïve Bayes, ANN.

87
Types of ML Classification Algorithms
  • Classification algorithms can be further divided
    into mainly two categories
  • Linear Models
  • Logistic Regression
  • Support Vector Machines
  • Non-linear Models
  • K-Nearest Neighbours
  • Kernel SVM
  • Naïve Bayes
  • Decision Tree Classification
  • Random Forest Classification

88
Error in ML
  • For supervised learning applications in machine
    learning and statistical learning theory,
    generalization error (also known as the
    out-of-sample error or the risk) is a measure of
    how accurately an algorithm is able to predict
    outcome values for previously unseen data.
  • Error-rate is normally measured on the testing
    set only. In this case, it may also be called the
    off-training-set error-rate or OTS error.
  • It may also be called the prediction error-rate,
    or generalization error-rate.

89
  • If error is calculated on the training set, then
    it would be called the training error-rate.
  • For binary classification problems, there are two
    primary types of errors: Type 1 errors (false
    positives) and Type 2 errors (false negatives).
  • It's often possible, through model selection and
    tuning, to decrease one while increasing the
    other, and often one must choose which error type
    is more acceptable.

90
What are the common types of error in machine
learning?
  • Below we will cover the following types of
    error measurements (computed in the sketch that
    follows)
  • Specificity or True Negative Rate (TNR)
  • Precision or Positive Predictive Value (PPV)
  • Recall, Sensitivity, Hit Rate or True Positive
    Rate (TPR)
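These rates all derive from the confusion matrix. A minimal sketch computing them, with assumed toy labels:

```python
# Computing TPR (recall/sensitivity), TNR (specificity) and PPV (precision)
# from a confusion matrix; the labels below are assumed toy values.
from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print("TPR / recall / sensitivity:", tp / (tp + fn))
print("TNR / specificity:        ", tn / (tn + fp))
print("PPV / precision:          ", tp / (tp + fp))
```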

91
Noise
  • Errors in the data are known as noise. Without
    proper handling, data noise can create issues for
    machine learning algorithms, as the algorithm can
    mistake the noise for a pattern and start
    generalizing from it.
  • The image below shows how different types of
    noise can impact datasets

92
What does noise mean in ML?
  • Noise is anything that is spurious and extraneous
    to the original data; it is not intended to be
    present in the first place, but was introduced
    due to a faulty capturing process.

93
How does machine learning deal with noise?
  • The simplest way to handle noisy data is to
    collect more data.
  • The more data you collect, the better you will be
    able to identify the underlying phenomenon that
    is generating the data. This will eventually help
    in reducing the effect of noise.