Transcript and Presenter's Notes

Title: Machine Learning ppt


1
Machine Learning-1
  • CODE -18CSE392T
  • III/IOT/5

2
UNIT -1
  • Machine learning: What and Why? - Types of
    machine learning - Supervised learning -
    Unsupervised learning - Reinforcement learning -
    Linear regression - The curse of dimensionality -
    Overfitting and underfitting - Bias and variance
    tradeoff - Testing - Cross-validation -
    Regularization - Learning curve - Classification -
    Error and noise - Parametric vs. non-parametric
    models - Linear algebra for machine learning

3
What is machine learning?
  • Machine learning (ML) is a type of artificial
    intelligence (AI) that allows software
    applications to become more accurate at
    predicting outcomes without being explicitly
    programmed to do so.
  • Machine learning algorithms use historical data
    as input to predict new output values.
  • Recommendation engines are a common use case for
    machine learning. Other popular uses include
    fraud detection, spam filtering, malware threat
    detection, business process automation (BPA)
    and predictive maintenance.

4
(No Transcript)
5
Why is machine learning important?
  • Machine learning is important because it gives
    enterprises a view of trends in customer behavior
    and business operational patterns, as well as
    supports the development of new products.
  • Many of today's leading companies, such as
    Facebook, Google and Uber, make machine learning
    a central part of their operations.
  • Machine learning has become a significant
    competitive differentiator for many companies.

6
When Do We Use Machine Learning?
  • ML is used when
  • Human expertise does not exist (navigating on
    Mars)
  • Humans can't explain their expertise (speech
    recognition)
  • Models must be customized (personalized
    medicine)

7
A classic example of a task that requires machine
learning: it is very hard to say what makes a 2
8
Some more examples of tasks that are best solved
by using a learning algorithm
  • Recognizing patterns
  • Facial identities or facial expressions
  • Handwritten or spoken words
  • Medical images
  • Generating patterns
  • Generating images or motion sequences
  • Recognizing anomalies
  • Unusual credit card transactions
  • Unusual patterns of sensor readings in a
    nuclear power plant
  • Prediction
  • Future stock prices or currency exchange rates

9
Applications
  • Email spam detection
  • Face detection and matching (e.g., iPhone X)
  • Web search
  • Sports predictions
  • Post office (e.g., sorting letters by zip
    codes)
  • ATMs (e.g., reading checks)
  • Credit card fraud
  • Stock predictions
  • Product recommendations (e.g., Netflix, Amazon)
  • Self-driving cars (e.g., Uber, Tesla)
  • Language translation (Google Translate)

10
Definition of learning
  • Definition
  • A computer program is said to learn from
    experience E with respect to some class of tasks
    T and performance measure P, if its performance
    at tasks T, as measured by P, improves with
    experience E.
  • A computer program which learns from experience
    is called a machine learning program or simply a
    learning program. Such a program is sometimes
    also referred to as a learner.

11
Examples
  • A chess learning problem
  • Task T: Playing chess
  • Performance measure P: Percent of games won
    against opponents
  • Training experience E: Playing practice games
    against itself

12
What are the different types of machine
learning?
  • Classical machine learning is often categorized
    by how an algorithm learns to become more
    accurate in its predictions.
  • There are four basic approaches:
  • Supervised Learning, 
  • Unsupervised Learning,
  • Semi-supervised Learning,
  • Reinforcement Learning.
  • The type of algorithm data scientists choose to
    use depends on what type of data they want to
    predict.

13
 Supervised Learning
  • Supervised learning is a type of machine learning
    method in which we provide sample labeled data to
    the machine learning system in order to train it,
    and on that basis, it predicts the output.
  • The system creates a model using the labeled data
    to understand the datasets. Once training and
    processing are done, we test the model by
    providing sample data to check whether it
    predicts the correct output.

14
  • Supervised learning is a process of providing
    input data as well as correct output data to the
    machine learning model.
  • The aim of a supervised learning algorithm is
    to find a mapping function to map the input
    variable(x) with the output variable(y).
  • In the real-world, supervised learning can be
    used for Risk Assessment, Image classification,
    Fraud Detection, spam filtering, etc.

15
How Supervised Learning Works?
  • In supervised learning, models are trained using a
    labeled dataset, where the model learns about
    each type of data.
  • Once the training process is completed, the
    model is tested on test data (a sample of data
    held out from training), and then it predicts
    the output.

16
  • Now, after training, we test our model using the
    test set, and the task of the model is to
    identify the shape.
  • The machine is already trained on all types of
    shapes, and when it finds a new shape, it
    classifies the shape on the basis of its number
    of sides, and predicts the output.

17
Steps Involved in Supervised Learning
  • First, determine the type of training dataset
  • Collect/Gather the labeled training data.
  • Split the dataset into a training dataset, a
    test dataset, and a validation dataset.
  • Determine the input features of the training
    dataset, which should have enough knowledge so
    that the model can accurately predict the output.

18
  • Determine the suitable algorithm for the model,
    such as support vector machine, decision tree,
    etc.
  • Execute the algorithm on the training dataset.
    Sometimes we need validation sets as the control
    parameters, which are the subset of training
    datasets.
  • Evaluate the accuracy of the model by providing
    the test set. If the model predicts the correct
    output, then our model is accurate.
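The steps above can be sketched in a few lines of code. Below is a minimal, illustrative example using scikit-learn; the synthetic dataset, the decision tree choice, and the 80/20 split are assumptions for demonstration, not part of the original slides.

```python
# Minimal sketch of the supervised learning steps listed above.
# Assumptions: scikit-learn is available; the data is synthetic.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Steps 1-2: gather labeled training data (synthetic here)
X, y = make_classification(n_samples=500, n_features=10, random_state=42)

# Step 3: split the dataset (validation set omitted for brevity)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Steps 4-5: choose a suitable algorithm and execute it on the training data
model = DecisionTreeClassifier(random_state=42)
model.fit(X_train, y_train)

# Step 6: evaluate accuracy by providing the test set
predictions = model.predict(X_test)
print("Test accuracy:", accuracy_score(y_test, predictions))
```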

19
Types of supervised Machine learning Algorithms
  • 1. Regression
  • Regression algorithms are used if there is a
    relationship between the input variable and the
    output variable.
  • It is used for the prediction of continuous
    variables, such as Weather forecasting, Market
    Trends, etc.

20
2. Classification
  • Classification algorithms are used when the
    output variable is categorical, meaning there
    are discrete classes such as Yes-No, Male-Female,
    True-False, etc.
  • Advantages of Supervised learning
  • With the help of supervised learning, the model
    can predict the output on the basis of prior
    experiences.
  • In supervised learning, we can have an exact idea
    about the classes of objects.
  • Supervised learning models help us solve
    various real-world problems such as fraud
    detection, spam filtering, etc.

21
Disadvantages of supervised learning
  • Supervised learning models are not suitable for
    handling complex tasks.
  • Supervised learning cannot predict the correct
    output if the test data is different from the
    training dataset.

22
What is Unsupervised Learning?
  • Unsupervised learning is a machine learning
    technique in which models are not supervised
    using a training dataset.
  • Instead, the model itself finds hidden patterns
    and insights in the given data.
  • It can be compared to the learning that takes
    place in the human brain while learning new
    things. It can be defined as
  • Unsupervised learning is a type of machine
    learning in which models are trained using
    unlabeled dataset and are allowed to act on that
    data without any supervision.

23
  • Unsupervised learning cannot be directly applied
    to a regression or classification problem because
    unlike supervised learning, we have the input
    data but no corresponding output data.
  • The goal of unsupervised learning is to find the
    underlying structure of dataset, group that data
    according to similarities, and represent that
    dataset in a compressed format.
  • Example
  • Suppose the unsupervised learning algorithm is
    given an input dataset containing images of
    different types of cats and dogs. The algorithm
    is never given labels, so it must identify the
    image features on its own and group similar
    images together.

24
Why use Unsupervised Learning?
  • Below are some main reasons which describe the
    importance of Unsupervised Learning
  • Unsupervised learning is helpful for finding
    useful insights from the data.
  • Unsupervised learning is very similar to how a
    human learns to think from their own experiences,
    which makes it closer to real AI.
  • Unsupervised learning works on unlabeled and
    uncategorized data, which makes unsupervised
    learning more important.
  • In the real world, we do not always have input
    data with corresponding output, so to solve such
    cases, we need unsupervised learning.

25
Working of Unsupervised Learning
  • Working of unsupervised learning can be
    understood by the below diagram

Here, we have taken unlabeled input data, which
means it is not categorized and no corresponding
outputs are given. This unlabeled input data is
fed to the machine learning model in order to
train it. First, the model interprets the raw
data to find the hidden patterns, and then
suitable algorithms such as k-means clustering or
hierarchical clustering are applied.
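As a concrete illustration of this workflow, here is a minimal k-means sketch in scikit-learn; the blob data and the choice of three clusters are assumptions for demonstration only.

```python
# Minimal k-means clustering sketch on synthetic, unlabeled data.
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans

# Unlabeled input data: the true labels are deliberately discarded
X, _ = make_blobs(n_samples=300, centers=3, random_state=0)

# The model finds hidden structure (clusters) without any supervision
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(X)

print(cluster_ids[:10])         # cluster assigned to the first 10 points
print(kmeans.cluster_centers_)  # discovered cluster centers
```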
26
Types of Unsupervised Learning Algorithm
  • The unsupervised learning algorithm can be
    further categorized into two types of problems:

27
  • Clustering: Clustering is a method of grouping
    objects into clusters such that objects with the
    most similarities remain in a group and have few
    or no similarities with the objects of another
    group.
  • Cluster analysis finds the commonalities between
    the data objects and categorizes them as per the
    presence and absence of those commonalities.
  • Association: An association rule is an
    unsupervised learning method which is used for
    finding relationships between variables in a
    large database.
  • It determines the sets of items that occur
    together in the dataset. Association rules make
    marketing strategy more effective.
  • For example, people who buy item X (say, bread)
    also tend to purchase item Y (butter or jam).
  • A typical example of an association rule is
    Market Basket Analysis.

28
  • Advantages of Unsupervised Learning
  • Unsupervised learning is used for more complex
    tasks as compared to supervised learning because,
    in unsupervised learning, we don't have labeled
    input data.
  • Unsupervised learning is preferable as it is easy
    to get unlabeled data in comparison to labeled
    data.
  • Disadvantages of Unsupervised Learning
  • Unsupervised learning is intrinsically more
    difficult than supervised learning as it does not
    have corresponding output.
  • The result of the unsupervised learning algorithm
    might be less accurate as input data is not
    labeled, and algorithms do not know the exact
    output in advance.

29
What is Reinforcement Learning?
  • Reinforcement Learning is a feedback-based
    Machine learning technique in which an agent
    learns to behave in an environment by performing
    the actions and seeing the results of actions.
  • For each good action, the agent gets positive
    feedback, and for each bad action, the agent gets
    negative feedback or penalty.
  • In reinforcement learning, the agent learns
    automatically from feedback without any labeled
    data, unlike supervised learning.
  • Since there is no labeled data, the agent is
    bound to learn from its experience only.
  • RL solves a specific type of problem where
    decision making is sequential, and the goal is
    long-term, such as game-playing, robotics, etc.

30
  • "Reinforcement learning is a type of machine
    learning method where an intelligent agent
    (computer program) interacts with the environment
    and learns to act within that.
  • It is a core part of Artificial intelligence, and
    all AI agent works on the concept of
    reinforcement learning. Here we do not need to
    pre-program the agent, as it learns from its own
    experience without any human intervention.
  • Example 
  • The agent continues doing these three things
    (take action, change state/remain in the same
    state, and get feedback), and by doing these
    actions, he learns and explores the environment.
  • The agent learns that what actions lead to
    positive feedback or rewards and what actions
    lead to negative feedback penalty. As a positive
    reward, the agent gets a positive point, and as a
    penalty, it gets a negative point.

31
(No Transcript)
32
Key Features of Reinforcement Learning
  • In RL, the agent is not instructed about the
    environment and what actions need to be taken.
  • It is based on a trial-and-error process.
  • The agent takes the next action and changes
    states according to the feedback of the previous
    action.
  • The agent may get a delayed reward.
  • The environment is stochastic, and the agent
    needs to explore it to obtain the maximum
    positive reward.

33
How does Reinforcement Learning Work?
  • To understand the working process of RL, we
    need to consider two main things
  • Environment: It can be anything, such as a room,
    maze, football ground, etc.
  • Agent: An intelligent agent, such as an AI robot.

34
  • Let's take an example of a maze environment that
    the agent needs to explore. Consider the below
    image

35
  • In the above image, the agent is at the very
    first block of the maze. The maze consists of an
    S6 block, which is a wall, S8, a fire pit, and
    S4, a diamond block.
  • The agent cannot cross the S6 block, as it is a
    solid wall. If the agent reaches the S4 block, it
    gets a +1 reward; if it reaches the fire pit, it
    gets a -1 reward point. It can take four actions:
    move up, move down, move left, and move right.
  • The agent can take any path to reach the final
    point, but it needs to do so in the fewest
    possible steps. Suppose the agent follows the
    path S9-S5-S1-S2-S3; then it will get the +1
    reward point.
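A tabular Q-learning agent for a tiny grid world like this maze can be sketched as follows. The 3x3 layout, the cell indices for the goal, pit, and wall, the rewards, and the hyperparameters are all illustrative assumptions, not taken from the slides.

```python
# Tabular Q-learning sketch for a tiny 3x3 grid world.
# All layout choices and hyperparameters below are assumed.
import random

N = 3                       # 3x3 grid, states 0..8
GOAL, PIT, WALL = 2, 4, 5   # assumed cell indices
START = 6                   # assumed start cell (bottom-left)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def step(state, action):
    """Apply an action; bumping into a wall or edge leaves the state unchanged."""
    r, c = divmod(state, N)
    dr, dc = ACTIONS[action]
    nr, nc = r + dr, c + dc
    if not (0 <= nr < N and 0 <= nc < N) or nr * N + nc == WALL:
        return state, 0.0, False
    nxt = nr * N + nc
    if nxt == GOAL:
        return nxt, 1.0, True    # positive feedback (diamond)
    if nxt == PIT:
        return nxt, -1.0, True   # negative feedback (fire pit)
    return nxt, 0.0, False

Q = [[0.0] * 4 for _ in range(N * N)]
alpha, gamma, epsilon = 0.1, 0.9, 0.2  # assumed hyperparameters

for _ in range(2000):                  # episodes
    state = START
    for _ in range(50):                # step cap per episode
        # epsilon-greedy: explore sometimes, otherwise act greedily
        if random.random() < epsilon:
            action = random.randrange(4)
        else:
            action = max(range(4), key=lambda a: Q[state][a])
        nxt, reward, done = step(state, action)
        # Q-learning update from the feedback of the chosen action
        Q[state][action] += alpha * (
            reward + gamma * max(Q[nxt]) - Q[state][action])
        state = nxt
        if done:
            break

best = [max(range(4), key=lambda a: Q[s][a]) for s in range(N * N)]
print("greedy action per state (0=up 1=down 2=left 3=right):", best)
```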

36
Types of Reinforcement learning
  • There are mainly two types of reinforcement
    learning, which are
  • Positive Reinforcement
  • Negative Reinforcement

37
  • Positive Reinforcement
  • Positive reinforcement learning means adding
    something to increase the tendency that the
    expected behavior will occur again.
  • It impacts the behavior of the agent positively
    and increases the strength of the behavior.
  • This type of reinforcement can sustain the
    changes for a long time, but too much positive
    reinforcement may lead to an overload of states
    that can diminish the results.
  • Negative Reinforcement
  • Negative reinforcement learning is the opposite
    of positive reinforcement, as it increases the
    tendency that the specific behavior will occur
    again by avoiding the negative condition.
  • It can be more effective than positive
    reinforcement depending on the situation and
    behavior, but it provides reinforcement only to
    meet the minimum behavior.

38
(No Transcript)
39
(No Transcript)
40
The Curse of Dimensionality
  • Handling high-dimensional data is very difficult
    in practice; this difficulty is commonly known as
    the curse of dimensionality.
  • If the dimensionality of the input dataset
    increases, any machine learning algorithm and
    model becomes more complex.
  • As the number of features increases, the number
    of samples required also increases
    proportionally, and the chance of overfitting
    increases as well.
  • If a machine learning model is trained on
    high-dimensional data, it becomes overfitted and
    results in poor performance.
  • Hence, it is often required to reduce the number
    of features, which can be done with
    dimensionality reduction.

41
  • Problems
  • High-dimensional data is difficult to work with
    since
  • Adding more features can increase the noise, and
    hence the error
  • There usually aren't enough observations to get
    good estimates
  • This causes
  • Increase in running time
  • Overfitting
  • A rapidly growing number of samples required

42
Overfitting and Under fitting in Machine
Learning
  • Before understanding overfitting and
    underfitting, let's understand some basic terms
    that will help to understand this topic well
  • Signal: It refers to the true underlying pattern
    of the data that helps the machine learning model
    to learn from the data.
  • Noise: Noise is unnecessary and irrelevant data
    that reduces the performance of the model.
  • Bias: Bias is a prediction error that is
    introduced in the model due to oversimplifying
    the machine learning algorithm. It is the
    difference between the predicted values and the
    actual values.
  • Variance: If the machine learning model performs
    well with the training dataset but does not
    perform well with the test dataset, then variance
    occurs.

43
Overfitting
  • Overfitting occurs when our machine learning
    model tries to cover all the data points, or more
    than the required data points, present in the
    given dataset.
  • Because of this, the model starts capturing noise
    and inaccurate values present in the dataset, and
    all these factors reduce the efficiency and
    accuracy of the model. The overfitted model has
    low bias and high variance.
  • The chances of overfitting increase the more
    training we provide to our model.
  • It means the more we train our model, the more
    likely we are to end up with an overfitted model.
  • Overfitting is the main problem that occurs in
    supervised learning.

44
  • Example: The concept of overfitting can be
    understood from the below graph of a linear
    regression output

45
  • As we can see from the above graph, the model
    tries to cover all the data points present in the
    scatter plot.
  • It may look efficient, but in reality it is not.
    Because the goal of the regression model is to
    find the best-fit line, and here we have not got
    a best fit, the model will generate prediction
    errors.
  • How to avoid overfitting in the model
  • Both overfitting and underfitting cause
    degraded performance of the machine learning
    model.
  • But the main cause is overfitting, so there are
    some ways by which we can reduce the occurrence
    of overfitting in our model (see the sketch after
    this list).
  • Cross-Validation
  • Training with more data
  • Removing features
  • Early stopping the training
  • Regularization
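Of the remedies listed, regularization is the easiest to demonstrate in code. The sketch below contrasts a plain high-degree polynomial regression with a ridge (L2-regularized) version on noisy synthetic data; the data, the polynomial degree, and the alpha value are assumptions for illustration.

```python
# Minimal sketch: L2 regularization (ridge) as a defense against overfitting.
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(60, 1))
y = np.sin(X).ravel() + rng.normal(0, 0.3, size=60)  # noisy signal

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# A high-degree polynomial invites overfitting; ridge shrinks coefficients.
plain = make_pipeline(PolynomialFeatures(12), LinearRegression()).fit(X_tr, y_tr)
ridge = make_pipeline(PolynomialFeatures(12), Ridge(alpha=1.0)).fit(X_tr, y_tr)

print("plain test MSE:", mean_squared_error(y_te, plain.predict(X_te)))
print("ridge test MSE:", mean_squared_error(y_te, ridge.predict(X_te)))
```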

46
Underfitting
  • Underfitting occurs when our machine learning
    model is not able to capture the underlying trend
    of the data.
  • To avoid overfitting, the feeding of training
    data can be stopped at an early stage, due to
    which the model may not learn enough from the
    training data.
  • As a result, it may fail to find the best fit for
    the dominant trend in the data.
  • In the case of underfitting, the model is not
    able to learn enough from the training data, and
    hence it reduces the accuracy and produces
    unreliable predictions.
  • An underfitted model has high bias and low
    variance.

47
  • How to avoid underfitting
  • By increasing the training time of the model.
  • By increasing the number of features.

48
Linear Regression
  • Linear regression is one of the easiest and most
    popular Machine Learning algorithms. It is a
    statistical method that is used for predictive
    analysis.
  • Linear regression makes predictions for
    continuous/real or numeric variables such
    as sales, salary, age, product price, etc.
  • The linear regression algorithm shows a linear
    relationship between a dependent variable (y) and
    one or more independent variables (x), hence it
    is called linear regression.
  • Since linear regression shows a linear
    relationship, it finds how the value of the
    dependent variable changes according to the value
    of the independent variable.

49
  • The linear regression model provides a sloped
    straight line representing the relationship
    between the variables. Consider the below image

50
  • y = a0 + a1x + e
  • Here,
  • y: Dependent variable (target variable)
  • x: Independent variable (predictor variable)
  • a0: Intercept of the line (gives an additional
    degree of freedom)
  • a1: Linear regression coefficient (scale factor
    applied to each input value)
  • e: Random error
  • The values for x and y variables are training
    datasets for Linear Regression model
    representation.
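To make the equation concrete, here is a minimal NumPy sketch that estimates a0 and a1 by ordinary least squares on synthetic (x, y) pairs; the true coefficients used to generate the data are assumptions for the demo.

```python
# Fitting y = a0 + a1*x + e by ordinary least squares (minimal sketch).
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 50)
y = 2.0 + 0.5 * x + rng.normal(0, 1, 50)   # assumed true a0=2.0, a1=0.5

# Design matrix with a column of ones for the intercept a0
A = np.column_stack([np.ones_like(x), x])
(a0, a1), *_ = np.linalg.lstsq(A, y, rcond=None)

print(f"estimated intercept a0 = {a0:.2f}, slope a1 = {a1:.2f}")
```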

51
Types of Linear Regression
  • Linear regression can be further divided into
    two types of algorithm:
  • Simple Linear Regression
  • If a single independent variable is used to
    predict the value of a numerical dependent
    variable, then such a Linear Regression algorithm
    is called Simple Linear Regression.
  • Multiple Linear regression
  • If more than one independent variable is used to
    predict the value of a numerical dependent
    variable, then such a Linear Regression algorithm
    is called Multiple Linear Regression.

52
(No Transcript)
53
Linear Regression Line
  • A straight line showing the relationship between
    the dependent and independent variables is called
    a regression line. A regression line can show two
    types of relationship
  • Positive Linear Relationship
  • If the dependent variable increases on the Y-axis
    as the independent variable increases on the
    X-axis, then such a relationship is termed a
    positive linear relationship.

54
  • Negative Linear Relationship
  • If the dependent variable decreases on the Y-axis
    as the independent variable increases on the
    X-axis, then such a relationship is called a
    negative linear relationship.

55
Bias Variance tradeoff
  • When we work with a supervised machine learning
    algorithm, the model learns from the training
    data.
  • The model always tries to best estimate the
    mapping function between the output variable(Y)
    and the input variable(X).
  • The estimate of the target function may generate
    a prediction error, which can be divided mainly
    into bias error and variance error. These errors
    can be explained as
  • Bias Error: Bias is a prediction error which is
    introduced in the model due to oversimplifying
    the machine learning algorithm. It is the
    difference between predicted output and actual
    output. There are two types of bias
  • High Bias: If the predicted values are very
    different from the actual values, then it is
    called high bias.

56
  • Due to high bias, an algorithm may miss the
    relevant relationships between the input features
    and target output, which is called underfitting.
  • Low Bias: If the predicted values are close to
    the actual values, then it is called low bias.
  • Variance Error: If the machine learning model
    performs well with the training dataset but does
    not perform well with the test dataset, then
    variance occurs.
  • It can also be defined as an error caused by the
    model's sensitivity to small fluctuations in the
    training dataset.
  • High variance causes overfitting in the machine
    learning model, which means the algorithm models
    noise along with the underlying pattern in the
    data.

57
  • In the machine learning model, we always try to
    have low bias and low variance, and
  • If we try to increase the bias, the variance
    decreases
  • If we try to increase the variance, the bias
    decreases.
  • Hence, finding an optimal balance of bias and
    variance is called the bias-variance trade-off.
    We can illustrate it using the bull's-eye diagram
    given below.

58
(No Transcript)
59
  • There are four cases of bias and variance
  • If there is low bias and low variance, the
    predicted output is mostly close to the desired
    output.
  • If there is low bias and high variance, the model
    is not consistent.
  • If there is high bias and low variance, the model
    is consistent but the predicted results are far
    away from the actual output.
  • If there is high bias and high variance, then the
    model is inconsistent, and its predictions are
    also very different from the actual values. It is
    the worst case of bias and variance.

60
What is ML Testing?
  • During the model development phase, data
    scientists test model performance by comparing
    the model outputs (predicted values) with the
    actual values.
  • One of the techniques used to perform black-box
    testing on ML models is metamorphic testing,
    which attempts to alleviate the test oracle
    problem.

61
Model evaluation in machine learning testing
  • Usually, software testing includes
  • Unit tests. The program is broken down into
    blocks, and each element (unit) is tested
    separately.
  • Regression tests. They cover already tested
    software to see that it doesn't suddenly break.
  • Integration tests. This type of testing observes
    how multiple components of the program work
    together.

62
Cross-Validation in Machine Learning
  • Cross-validation is a technique for validating
    model efficiency by training the model on a
    subset of the input data and testing it on a
    previously unseen subset of the input data. We
    can also say that it is a technique to check how
    a statistical model generalizes to an independent
    dataset.
  • In machine learning, there is always a need to
    test the stability of the model, and we cannot do
    that by fitting and evaluating the model on the
    training dataset alone. For this purpose, we
    reserve a particular sample of the dataset, which
    was not part of the training dataset.
  • After that, we test our model on that sample
    before deployment, and this complete process
    comes under cross-validation.
  • This is something different from the general
    train-test split.
  • Hence the basic steps of cross-validation are
  • Reserve a subset of the dataset as a validation
    set.
  • Provide the training to the model using the
    training dataset.
  • Now, evaluate model performance using the
    validation set. If the model performs well with
    the validation set, perform the further step,
    else check for the issues.
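In scikit-learn, these steps are bundled into helpers such as cross_val_score. A minimal sketch, with the synthetic data and logistic regression model assumed for illustration:

```python
# Minimal cross-validation sketch with scikit-learn.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=200, random_state=0)
model = LogisticRegression(max_iter=1000)

# Each fold is reserved in turn as the validation set
scores = cross_val_score(model, X, y, cv=5)
print("per-fold accuracy:", scores, "mean:", scores.mean())
```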

63
Methods used for Cross-Validation
  • There are some common methods that are used for
    cross-validation. These methods are given below
  • Validation Set Approach
  • Leave-P-out cross-validation
  • Leave one out cross-validation
  • K-fold cross-validation
  • Stratified k-fold cross-validation

64
Validation Set Approach
  • In the validation set approach, we divide our
    input dataset into a training set and a test or
    validation set. Both subsets are given 50% of the
    dataset.
  • But it has one big disadvantage: we are using
    only 50% of the dataset to train our model, so
    the model may fail to capture important
    information in the dataset. It also tends to give
    an underfitted model.

65
Leave-P-out cross-validation
  • In this approach, p data points are left out of
    the training data.
  • This means that if there are n total data points
    in the original input dataset, then n-p data
    points will be used as the training dataset and
    the p data points as the validation set.
  • This complete process is repeated for all
    possible combinations, and the average error is
    calculated to determine the effectiveness of the
    model.
  • A disadvantage of this technique is that it can
    be computationally expensive for large p.

66
Leave-one-out cross-validation (LOOCV)
  • Leave-one-out cross-validation (LOOCV) is an
    exhaustive cross-validation technique. It is a
    special case of LpOCV with p = 1.
  • For a dataset having n rows, the 1st row is
    selected for validation, and the remaining (n-1)
    rows are used to train the model. For the next
    iteration, the 2nd row is selected for validation
    and the rest to train the model. Similarly, the
    process is repeated until all n rows have served
    as the validation set.
  • Both of the above cross-validation techniques are
    types of exhaustive cross-validation. Exhaustive
    cross-validation methods learn and test on all
    possible ways to divide the original sample into
    training and validation sets. They have the same
    pros and cons, discussed below
  • Pros
  • Simple, easy to understand, and implement.
  • Cons
  • The error estimate has low bias but may have high
    variance.
  • The computation time required is high.
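scikit-learn exposes these exhaustive schemes directly as LeaveOneOut and LeavePOut. A brief LOOCV sketch, again on assumed synthetic data:

```python
# LOOCV sketch: n models, each validated on a single held-out row.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut, cross_val_score

X, y = make_classification(n_samples=50, random_state=0)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y,
                         cv=LeaveOneOut())
print("LOOCV mean accuracy:", scores.mean())  # average over n=50 fits
```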

67
(No Transcript)
68
K-Fold Cross-Validation
  • The k-fold cross-validation approach divides the
    input dataset into k groups of samples of equal
    size. These samples are called folds. For each
    learning set, the prediction function uses k-1
    folds, and the remaining fold is used as the
    test set. This approach is a very popular CV
    approach because it is easy to understand, and
    the output is less biased than with other
    methods.
  • The steps for k-fold cross-validation are
  • Split the input dataset into k groups
  • For each group
  • Take one group as the reserve or test data set.
  • Use the remaining groups as the training dataset
  • Fit the model on the training set and evaluate
    the performance of the model using the test set
    (see the sketch after this list).
  • Let's take an example of 5-fold
    cross-validation. The dataset is grouped into 5
    folds. On the 1st iteration, the first fold is
    reserved to test the model, and the rest are used
    to train the model. On the 2nd iteration, the
    second fold is used to test the model, and the
    rest are used to train the model. This process
    continues until each fold has been used as the
    test fold.
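The explicit loop behind these steps can be written with scikit-learn's KFold splitter; the data and model below are assumptions for illustration.

```python
# Manual 5-fold cross-validation loop (sketch).
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=200, random_state=0)
kf = KFold(n_splits=5, shuffle=True, random_state=0)

for i, (train_idx, test_idx) in enumerate(kf.split(X), start=1):
    model = LogisticRegression(max_iter=1000)
    model.fit(X[train_idx], y[train_idx])         # fit on k-1 folds
    acc = accuracy_score(y[test_idx], model.predict(X[test_idx]))
    print(f"fold {i}: accuracy = {acc:.3f}")      # evaluate on held-out fold
```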

69
  • The final accuracy of the model is computed by
    taking the mean accuracy over the k models'
    validation data.
  • LOOCV is a variant of k-fold cross-validation
    where k = n.
  • Pros
  • The estimate has low bias
  • Lower time complexity than exhaustive methods
  • The entire dataset is utilized for both training
    and validation.
  • Cons
  • Not suitable for an imbalanced dataset.

70
(No Transcript)
71
(No Transcript)
72
Stratified k-fold cross-validation
  • The cross-validation techniques discussed above
    may not work well with an imbalanced dataset.
    Stratified k-fold cross-validation solves the
    problem of an imbalanced dataset.
  • In stratified k-fold cross-validation, the
    dataset is partitioned into k groups or folds
    such that the validation data has an equal
    proportion of instances of each target class
    label. This ensures that one particular class is
    not over-represented in the validation or train
    data, especially when the dataset is imbalanced.
  • It can be understood with an example of housing
    prices, where the price of some houses can be
    much higher than that of other houses.
  • To tackle such situations, the stratified k-fold
    cross-validation technique is useful.
  • Pros
  • Works well for an imbalanced dataset.
  • Cons
  • Not suitable for a time series dataset.
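scikit-learn's StratifiedKFold preserves the class proportions in every split. A minimal sketch on an assumed imbalanced dataset (roughly 90/10):

```python
# Stratified k-fold sketch: class ratios preserved in each fold.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold

# Assumed imbalanced data: roughly 90% class 0, 10% class 1
X, y = make_classification(n_samples=200, weights=[0.9, 0.1], random_state=0)

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, test_idx in skf.split(X, y):
    # Each validation fold keeps ~10% positives, mirroring the full dataset
    print("positives in fold:", int(np.sum(y[test_idx])), "of", len(test_idx))
```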

73
  • You may wonder why the same picture appears with
    different headings for stratified and k-fold
    cross-validation, but both types of CV work in a
    similar way. The only difference is that whenever
    a test-train split is done, the proportion of
    classes in the target variable is distributed
    equally among the test and train sets, so the
    test-train split is balanced. Thus the
    disadvantage of k-fold is solved by stratified
    cross-validation.

74
Time Series cross-validation
  • The order of the data is very important for
    time-series-related problems. For a time-related
    dataset, a random or k-fold split of the data
    into train and validation sets may not yield good
    results.
  • For a time-series dataset, the split of data into
    train and validation sets is done according to
    time. This is also referred to as the
    forward-chaining method or rolling
    cross-validation. For a particular iteration, the
    next instance after the train data can be treated
    as validation data (see the sketch below).
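scikit-learn's TimeSeriesSplit implements this forward-chaining idea. A minimal sketch with assumed toy data:

```python
# Forward-chaining (rolling) cross-validation sketch for ordered data.
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

X = np.arange(10).reshape(-1, 1)   # 10 time-ordered observations
tscv = TimeSeriesSplit(n_splits=3)

for train_idx, val_idx in tscv.split(X):
    # Training always precedes validation in time; no shuffling
    print("train:", train_idx, "-> validate:", val_idx)
```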

75
  • As shown in the above diagram, for the 1st
    iteration, the first 3 rows are considered train
    data and the next instance, T4, is validation
    data. The window of train and validation data is
    rolled forward for further iterations.

76
Limitations of Cross-Validation
  • There are some limitations of the
    cross-validation technique, which are given
    below
  • Under ideal conditions, it provides optimal
    output. But for inconsistent data, it may produce
    drastically different results.
  • The data evolves over time, due to which it may
    face differences between the training set and
    validation sets. For example, if we create a
    model for the prediction of stock market values
    and the model is trained on the previous 5 years'
    stock values, the realistic future values for the
    next 5 years may be drastically different, so it
    is difficult to expect correct output in such
    situations.

77
Applications of Cross-Validation
  • This technique can be used to compare the
    performance of different predictive modeling
    methods.
  • It has great scope in the medical research field.
  • It can also be used for the meta-analysis, as it
    is already being used by the data scientists in
    the field of medical statistics.

78
Learning Curve
  • A learning curve is a correlation between a
    learner's performance on a task and the number of
    attempts or the amount of training required to
    complete the task; this can be represented as a
    curve on a graph.

79
  • The diagram below should help you visualize the
    process described so far.
  • In the training set column you can see that we
    constantly increase the size of the training set.
    This causes a slight change in our model f.
  • In the first row, where n = 1 (n is the number of
    training instances), the model fits that single
    training data point perfectly.
  • However, the very same model fits a validation
    set of 20 different data points really badly. So
    the model's error is 0 on the training set, but
    much higher on the validation set.
  • As we increase the training set size, the model
    can no longer fit the training set perfectly.
  • So the training error becomes larger. However,
    the model is trained on more data, so it manages
    to fit the validation set better.
  • Thus, the validation error decreases. To remind
    you, the validation set stays the same across all
    three cases.
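scikit-learn's learning_curve utility automates exactly this experiment: it trains on growing subsets and scores each model on cross-validated validation data. The dataset and model below are assumptions for illustration.

```python
# Learning-curve sketch: training vs. validation score as data grows.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve

X, y = make_classification(n_samples=500, random_state=0)
sizes, train_scores, val_scores = learning_curve(
    LogisticRegression(max_iter=1000), X, y,
    train_sizes=np.linspace(0.1, 1.0, 5), cv=5)

for n, tr, va in zip(sizes, train_scores.mean(axis=1), val_scores.mean(axis=1)):
    print(f"n={n:4d}  train acc={tr:.3f}  validation acc={va:.3f}")
```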

80
(No Transcript)
81
Classification
  • As we know, supervised machine learning
    algorithms can be broadly classified into
    regression and classification algorithms.
  • With regression algorithms, we predict the output
    for continuous values, but to predict categorical
    values, we need classification algorithms.

82
  • Classification is the process of categorizing a
    given set of data into classes. It can be
    performed on both structured and unstructured
    data.
  • The process starts with predicting the class of
    given data points. The classes are often referred
    to as targets, labels or categories.
  • The classification predictive modeling is the
    task of approximating the mapping function from
    input variables to discrete output variables.
  • The main goal is to identify which class/category
    the new data will fall into.
  • Let us try to understand this with a simple
    example.
  • Heart disease detection can be framed as a
    classification problem; this is a binary
    classification since there can be only two
    classes, i.e., has heart disease or does not have
    heart disease. The classifier, in this case,
    needs training data to understand how the given
    input variables are related to the class. And
    once the classifier is trained accurately, it can
    be used to detect whether heart disease is
    present or not for a particular patient.
  • Since classification is a type of supervised
    learning, even the targets are also provided with
    the input data. Let us get familiar with the
    classification in machine learning terminologies.

83
What is the Classification Algorithm?
  • The Classification algorithm is a Supervised
    Learning technique that is used to identify the
    category of new observations on the basis of
    training data.
  • In classification, a program learns from the
    given dataset or observations and then classifies
    new observations into a number of classes or
    groups.
  • Examples: Yes or No, 0 or 1, Spam or Not Spam,
    cat or dog, etc. Classes can be called
    targets/labels or categories.
  • Unlike regression, the output variable of
    Classification is a category, not a value, such
    as "Green or Blue", "fruit or animal", etc.
  • Since the classification algorithm is a
    supervised learning technique, it takes labeled
    input data, which means it contains input with
    the corresponding output.
  • In a classification algorithm, a discrete output
    function (y) is mapped to an input variable (x).

84
  • y = f(x), where y = categorical output
  • The best example of an ML classification
    algorithm is Email Spam Detector.
  • The main goal of the Classification algorithm is
    to identify the category of a given dataset, and
    these algorithms are mainly used to predict the
    output for the categorical data.
  • Classification algorithms can be better
    understood using the below diagram.
  • In the below diagram, there are two classes,
    class A and Class B.
  • These classes have features that are similar to
    each other and dissimilar to other classes.

85
  • The algorithm which implements classification on
    a dataset is known as a classifier. There are two
    types of classification
  • Binary Classifier: If the classification problem
    has only two possible outcomes, then it is called
    a Binary Classifier. Examples: YES or NO, MALE or
    FEMALE, SPAM or NOT SPAM, CAT or DOG, etc. A
    minimal sketch follows below.
  • Multi-class Classifier: If a classification
    problem has more than two outcomes, then it is
    called a Multi-class Classifier. Examples:
    classification of types of crops, classification
    of types of music.
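Here is the minimal binary-classification sketch referred to above; the synthetic two-class data and the logistic regression classifier are illustrative assumptions.

```python
# Binary classification sketch: two possible outcomes (e.g., spam / not spam).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=300, n_classes=2, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("predicted classes:", clf.predict(X_te[:5]))   # discrete outputs
print("test accuracy:", clf.score(X_te, y_te))
```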

86
Learners in Classification Problems
  • In the classification problems, there are two
    types of learners
  • Lazy Learners: A lazy learner first stores the
    training dataset and waits until it receives the
    test dataset. In the lazy learner case,
    classification is done on the basis of the most
    related data stored in the training dataset. It
    takes less time in training but more time for
    predictions. Examples: K-NN algorithm, case-based
    reasoning.
  • Eager Learners: Eager learners develop a
    classification model based on a training dataset
    before receiving a test dataset. Opposite to lazy
    learners, an eager learner takes more time in
    learning, and less time in prediction. Examples:
    decision trees, Naïve Bayes, ANN.

87
Types of ML Classification Algorithms
  • Classification algorithms can be further divided
    into mainly two categories
  • Linear Models
  • Logistic Regression
  • Support Vector Machines
  • Non-linear Models
  • K-Nearest Neighbours
  • Kernel SVM
  • Naïve Bayes
  • Decision Tree Classification
  • Random Forest Classification

88
Error in ML
  • For supervised learning applications in machine
    learning and statistical learning theory,
    generalization error (also known as the
    out-of-sample error or the risk) is a measure of
    how accurately an algorithm is able to predict
    outcome values for previously unseen data.
  • Error-rate is normally measured on the testing
    set only. In this case, it may also be called the
    off-training-set error-rate or OTS error.
  • It may also be called the prediction error-rate,
    or generalization error-rate.

89
  • If error is calculated on the training set, then
    it would be called the training error-rate.
  • For binary classification problems, there are two
    primary types of errors: Type 1 errors (false
    positives) and Type 2 errors (false negatives).
  • It's often possible, through model selection and
    tuning, to decrease one while increasing the
    other, and often one must choose which error type
    is more acceptable.

90
What are the common types of error in machine
learning?
  • Below we will cover the following types of
    error measurements (computed in the sketch that
    follows)
  • Specificity or True Negative Rate (TNR)
  • Precision or Positive Predictive Value (PPV)
  • Recall, Sensitivity, Hit Rate or True Positive
    Rate (TPR)
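These rates all derive from the confusion matrix. A minimal sketch computing them, with assumed toy labels:

```python
# Computing TPR (recall/sensitivity), TNR (specificity) and PPV (precision)
# from a confusion matrix; the labels below are assumed toy values.
from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print("TPR / recall / sensitivity:", tp / (tp + fn))
print("TNR / specificity:        ", tn / (tn + fp))
print("PPV / precision:          ", tp / (tp + fp))
```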

91
Noise
  • Errors in the data are known as noise. Without
    proper handling, data noise can create issues for
    machine learning algorithms, as the algorithm can
    mistake the noise for a pattern and start
    generalizing from it.
  • The image below shows how different types of
    noise can impact datasets

92
What does noise mean in ML?
  • Noise is anything that is spurious and extraneous
    to the original data; it is not intended to be
    present in the first place, but was introduced
    due to a faulty capturing process.

93
How does machine learning deal with noise?
  • The simplest way to handle noisy data is to
    collect more data.
  • The more data you collect, the better you will be
    able to identify the underlying phenomenon that
    is generating the data. This will eventually help
    in reducing the effect of noise.