Chapter 9 - Version Space Algorithms

Chapter Outline

  • Chapter Outline
  • Quick Recap
  • Version Space 
  • List Then Eliminate Algorithm
  • Candidate Elimination Algorithm 
  • Chapter Summary

Quick Recap

  • Quick Recap – FIND-S Algorithm
  • Learning is a Searching Problem
  • Research is defined as a systematic investigation of sources and materials to develop solutions for Real-world Problems and improve the quality (peace and happiness) of human life
  • A Researcher is a person of character, and (s)he should safeguard that character
  • The word Research is made up of
    •  Re and
    •  search 
  • Therefore, in Research, we mainly refine the solution(s) proposed for a Real-world Problem
  • The main steps of a Research Cycle are as follows
    • Step 1: Identify the Real-world Problem 
    • Step 2: Propose Solution (called Solution 01) to solve the Real-world Problem
    • Step 3: List down Strengths and Weaknesses of Solution 01
    • Step 4: Propose Solution (called Solution 02) to 
      •  further strengthen the Strengths of Solution 01
      •  overcome limitations of Solution 01
    • Step 5: List down Strengths and Weaknesses of Solution 02
    • Step 6: Propose Solution (called Solution 03) to 
      •  further strengthen the Strengths of Solution 02
      •  overcome limitations of Solution 02
    • Step 7: Continue this cycle till the Day of Judgment 😊
  • Considering FIND-S Algorithm 
    • Input to Learner (FIND-S Algorithm)
      • Set of Training Examples (D)
      • Set of Functions / Hypothesis (H)
    • Output by Learner (FIND-S Algorithm)
      • A Hypothesis (h) from H which best fits the Training Examples (D)
        • Note that h is an approximation of Target Function
  • Inductive Bias is the set of assumptions needed, in addition to the Training Examples, to deductively justify the Learner's classifications
  • FIND-S Algorithm – Summary
    • Representation of Example
      • Attribute-Value Pair
    • Representation of Hypothesis (h)
      • Conjunction (AND) of Constraints on Attributes
    • Searching Strategy
      • I am not clear about this. Please drop me an email if you know.
    • Training Regime
      • Incremental Method
    • Inductive Bias of FIND-S Algorithm
      • Training Data is error-free 
      • Target Function / Concept is present in the Hypothesis Space (H)
    • Strengths
      • Returns a Model (h), which can be used to make predictions on unseen data
    • Weaknesses
      • Only works on error-free Data
        • However, Real-world Data is noisy 
      • Works on the assumption that the Target Function is present in the Hypothesis Space (H)
        • However, the Target Function may or may not be present in the Hypothesis Space (H), and this may not be known in advance
      • Only returns one hypothesis which best fits the Training Data
        • However, there can be multiple hypotheses that fit the Training Data equally well

Version Space

  • Recap - FIND-S Algorithm
  • Strengths
    • Returns a Model (h), which can be used to make predictions on unseen data
  • Weaknesses
    • Only works on error-free Data
      • However, Real-world Data is noisy
    • Works on the assumption that the Target Function f is present in the Hypothesis Space (H)
      • However, the Target Function f may or may not be present in the Hypothesis Space (H), and this may not be known in advance
    • Only returns one hypothesis which best fits the Training Data
      • However, there can be multiple hypotheses that fit the Training Data equally well
  • Main Problem in FIND-S Algorithm
  • Problem
    • Only returns one hypothesis which best fits the Training Data
      • However, there can be multiple hypotheses that fit the Training Data equally well
  • Proposed Solution
    • Version Space
  • Version Space vs Version Space Algorithm
  • Version Space (VSH,D)
    • The Version Space (VSH,D) contains the set of all hypotheses consistent with the Training Examples
  • Version Space Algorithm
    • A Version Space Algorithm computes the set of all hypotheses consistent with the Training Examples
  • Version Space Algorithms
  • In this Chapter, In sha Allah we will study two Version Space Algorithms 
    • List Then Eliminate Algorithm
    • Candidate Elimination Algorithm
  • Gender Identification Problem
  • In this Chapter, we will use the same Gender Identification Problem presented in
    • Chapter 8 – FIND-S Algorithm
  • Representation of Training Example (d) and Hypothesis (h)
  • Representation of Training Example (d) 
    • Attribute-Value Pair
  • Representation of Hypothesis (h)
    • Conjunction of Constraints on Input Attributes
  • Sample Data – Vector Representation
  • Vector Representation of Examples

    • x1 = <Short, Light, Short, Yes, Yes, Half> +
    • x2 = <Short, Light, Long, Yes, Yes, Half> +
    • x3 = <Tall, Heavy, Long, Yes, Yes, Full> –
    • x4 = <Short, Light, Long, Yes, No, Full> +
    • x5 = <Short, Light, Short, Yes, Yes, Half> –
    • x6 = <Tall, Light, Short, No, Yes, Full> –

  • Training Data – Vector Representation
  • Vector Representation of Training Examples

    • x1 = <Short, Light, Short, Yes, Yes, Half> +
    • x2 = <Short, Light, Long, Yes, Yes, Half> +
    • x3 = <Tall, Heavy, Long, Yes, Yes, Full> –
    • x4 = <Short, Light, Long, Yes, No, Full> +

  • Testing Data – Vector Representation
  • Vector Representation of Test Examples

    • x5 = <Short, Light, Short, Yes, Yes, Half> –
    • x6 = <Tall, Light, Short, No, Yes, Full> –

List Then Eliminate Algorithm

  • Machine Learning Cycle
  • Four phases of a Machine Learning Cycle are
    • Training Phase
      • Build the Model using Training Data
    • Testing Phase
      • Evaluate the performance of Model using Testing Data
    • Application Phase
      • Deploy the Model in the Real-world, to make predictions on Real-time unseen Data
    • Feedback Phase
      • Take Feedback from the Users and Domain Experts to improve the Model
  • Note
  • For List Then Eliminate Algorithm, In sha Allah I will only execute the 
    • Training Phase 😊
  • Hypothesis Space (H)
  • Consider a Hypothesis Space (H) of 15 Hypotheses (h)
    • H = {h1, h2, h3, h4, h5, h6, h7, h8, h9, h10, h11, h12, h13, h14, h15}
  • In sha Allah, in the next Slides, I will try to explain how to use the List Then Eliminate Algorithm to find out
    • All hypotheses that are consistent with the Training Data
      • i.e. the Version Space
  • List Then Eliminate Algorithm
Source: Tom Mitchell's book
  • Training Phase
  • Version Space  ← a list containing every hypothesis in H
    • VSH,D = {h1, h2, h3, h4, h5, h6, h7, h8, h9, h10, h11, h12, h13, h14, h15}
  • Training Phase, Cont…
  • Remove from the Version Space any hypothesis that is inconsistent with a Training Example, i.e. h(x) ≠ c(x)
    • Training Example x1
      • h1, h2, h3 are inconsistent with Training Example x1
      • VSH,D = { h4, h5, h6, h7, h8, h9, h10, h11, h12, h13, h14, h15}
    • Training Example x2
      • h14, h15 are inconsistent with Training Example x2
      • VSH,D = { h4, h5, h6, h7, h8, h9, h10, h11, h12, h13}
    • Training Example x3
      • h6, h9, h12 are inconsistent with Training Example x3
      • VSH,D = { h4, h5, h7, h8, h10, h11, h13}
    • Training Example x4
      • h5, h11 are inconsistent with Training Example x4
      • VSH,D = { h4, h7, h8, h10, h13}
  • Training Phase, Cont…
  • Output the list of hypotheses in VersionSpace
    • The list of hypotheses in VersionSpace are 
      • VSH,D = { h4, h7, h8, h10, h13}
  • Conclusion 
    • The List Then Eliminate Algorithm returned 
      • 5 Hypotheses that are consistent with the Training Data, i.e. the Version Space (VSH,D); a code sketch of this training loop is given below
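To make the loop above concrete, here is a minimal Python sketch of the List Then Eliminate Algorithm, assuming each hypothesis is a callable h(x) that returns a class label and each Training Example is an (x, label) pair; the function and variable names are illustrative, not from Tom Mitchell's book.

```python
# Minimal sketch of the List Then Eliminate Algorithm.
# Assumptions: each hypothesis is a callable h(x) -> label,
# and training_data is a list of (x, label) pairs.

def list_then_eliminate(hypothesis_space, training_data):
    """Return the Version Space: every h in H consistent with D."""
    # VersionSpace <- a list containing every hypothesis in H
    version_space = list(hypothesis_space)
    for x, label in training_data:
        # Remove any hypothesis h that is inconsistent, i.e. h(x) != c(x)
        version_space = [h for h in version_space if h(x) == label]
    # Output the list of hypotheses in VersionSpace
    return version_space
```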
  • Inductive Bias - List Then Eliminate Algorithm
  • Inductive Bias 
    • Inductive Bias is the set of assumptions needed, in addition to the Training Examples, to deductively justify the Learner's classifications
  • Inductive Bias of List Then Eliminate Algorithm
    • Training Data is error-free
    • Target Function / Concept is present in the Hypothesis Space (H)
  • Training Regime – List Then Eliminate Algorithm
  • Incremental Method 
    • See Chapter 2 – Basics of Machine Learning
  • Strengths and Weaknesses – List Then Eliminate Algorithm
  • Strengths 
    • The List Then Eliminate Algorithm overcomes the limitation of the FIND-S Algorithm and 
      • returns multiple hypotheses that fit the Training Data, i.e. the Version Space (VSH,D)
  • Weaknesses 
    • Only works on error-free Data
      • However, Real-world Data is noisy
    • Works on the assumption that the Target Function is present in the Hypothesis Space (H)
      • However, the Target Function f may or may not be present in the Hypothesis Space (H), and this may not be known in advance
    • The List Then Eliminate Algorithm is computationally expensive because it makes a 
      • pairwise comparison of each Training Example with every hypothesis in the Hypothesis Space (H)
  • Why is the List Then Eliminate Algorithm Computationally Expensive?
  • Question
    • Why is the List Then Eliminate Algorithm computationally expensive?
  • Answer
    • List Then Eliminate Algorithm is computationally expensive because
      • Hypothesis Space (H) is represented as a List
    • Representing the Hypothesis Space (H) as a List forces
      • the List Then Eliminate Algorithm to make pairwise comparisons
  • Conclusion 
    • To reduce the computational cost of Version Space Algorithms 
      • Have a more compact representation of Hypothesis Space (H)

Candidate Elimination Algorithm

  • Limitation of List Then Eliminate Algorithm
  • Problem
    • List Then Eliminate Algorithm is computationally expensive because
      • representation of Hypothesis Space (H) is not compact
  • Proposed Solution 
    • Represent the Hypothesis Space (H) as a
      • General to Specific Ordering of Hypotheses
  • Representation of Hypothesis Space (H) in Candidate Elimination Algorithm
  • In the Candidate Elimination Algorithm, the Hypothesis Space (H) is represented as a
    • General to Specific Ordering of Hypotheses


  • Machine Learning Cycle
  • Four phases of a Machine Learning Cycle are
    • Training Phase
      • Build the Model using Training Data
    • Testing Phase
      • Evaluate the performance of Model using Testing Data
    • Application Phase
      • Deploy the Model in the Real-world, to make predictions on Real-time unseen Data
    • Feedback Phase
      • Take Feedback from the Users and Domain Experts to improve the Model
  • Training Phase – Candidate Elimination Algorithm
Source: Tom Mitchell's book
  • Training Phase
  • Representing Version Space (VSH,D)
    • The General Boundary G of the Version Space VSH,D is the set of its maximally general members (or hypotheses)
    • The Specific Boundary S of the Version Space VSH,D is the set of its maximally specific members (or hypotheses)
    • Every member (or hypothesis) of the Version Space VSH,D lies between these two boundaries

  • After observing all the Training Examples, the Candidate Elimination Algorithm will 
    • Output the Version Space (comprising 6 hypotheses)
    • VSH,D  
      • <Short, Light, ?, Yes, ?, ?>
      • <Short, ?, ?, Yes, ?, ?>
      • <Short, Light, ?, ?, ?, ?>
      • <?, Light, ?, Yes, ?, ?>
      • <Short, ?, ?, ?, ?, ?>
      • <?, Light, ?, ?, ?, ?>  
  • Note that all 6 hypotheses in the Version Space (VSH,D) are approximations of the Target Function f
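Each of these hypotheses is a conjunction of constraints in which “?” accepts any value. The short Python sketch below shows how such a hypothesis classifies an instance; the function name matches is an illustrative assumption, not from the chapter.

```python
# Sketch: how a conjunctive hypothesis such as <Short, Light, ?, Yes, ?, ?>
# classifies an instance. '?' accepts any value; otherwise the attribute
# values must match exactly. A match means the instance is Positive.

def matches(hypothesis, instance):
    return all(h == "?" or h == v for h, v in zip(hypothesis, instance))

h = ("Short", "Light", "?", "Yes", "?", "?")
x1 = ("Short", "Light", "Short", "Yes", "Yes", "Half")  # labelled +
print(matches(h, x1))  # True -> predicted Positive
```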
  • Recall

  • Training Data
    • x1= <Short, Light, Short, Yes, Yes, Half> +
    • x2= <Short, Light, Long, Yes, Yes, Half> +
    • x3 = <Tall, Heavy, Long, Yes, Yes, Full> –
    • x4 = <Short, Light, Long, Yes, No, Full> +
  • 6 Models in the Version Space VSH,D
    • <Short, Light, ?, Yes, ?, ?>
    • <Short, ?, ?, Yes, ?, ?>
    • <Short, Light, ?, ?, ?, ?>
    • <?, Light, ?, Yes, ?, ?>
    • <Short, ?, ?, ?, ?, ?>
    • <?, Light, ?, ?, ?, ?> 
  • Models – in the form of Rules

  • In the next phase, i.e. the Testing Phase, we will 
    • Evaluate the performance of the Model(s)
  • Testing Phase
  • Question 
    • How well have the Model(s) learned?
  • Answer
    • Evaluate the performance of the Model(s) on unseen data (or Testing Data)
  • Evaluation Measures
  • Evaluation will be carried out using 
    • Error measure
  • Error
  • Definition
    • Error is defined as the proportion of incorrectly classified Test instances

  • Note
    • Accuracy = 1 – Error
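Both measures are one line of code each. A minimal sketch, assuming the predictions and true labels are parallel lists; the example values anticipate the Single Model results shown later in this chapter:

```python
# Error = proportion of incorrectly classified Test instances;
# Accuracy = 1 - Error. Labels here are '+' / '-' for illustration.

def error(y_true, y_pred):
    wrong = sum(t != p for t, p in zip(y_true, y_pred))
    return wrong / len(y_true)

y_true = ["-", "-"]   # x5 and x6 are both Negative
y_pred = ["+", "-"]   # x5 misclassified, x6 correct
print(error(y_true, y_pred))      # 0.5
print(1 - error(y_true, y_pred))  # Accuracy = 0.5
```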
  • Testing Phase, Cont…
  • Two Approaches for Testing Phase
    • Single Model
    • Ensemble Model
  • Single Model
    • A Single Model is trained on the Training Data and used to make predictions on the Test Data
  • For Candidate Elimination Algorithm, we can
    • Randomly select a Model (or h) from the VSH,D and use it to make predictions on the Test Data
  • Strengths 
    • It is computationally fast and requires less time to make predictions compared to Ensemble Model
  • Weaknesses
    • Error is likely to be higher compared to an Ensemble Model
  • Ensemble Model
    • An Ensemble Model works by training different Models on the same Training Data and using each Model to individually make predictions on the Test Data
  • For Candidate Elimination Algorithm, we can
    • Select all 6 Models (or h) from the VSH,D (Ensemble Model) and use them to make predictions on the Test Data
  • Strengths 
    • Error is likely to be lower compared to a Single Model
  • Weaknesses
    • It is computationally expensive and requires more time to make predictions compared to Single Model
  • Problem
    • How to combine the predictions of different Models to make a Final Prediction?
  • A Possible Solution
    • Use Voting Approach
  • Voting Approach
  • Voting Approach
    • Voting is one of the simplest approaches to combine predictions from different Models 
  • Model Weight
    • In Voting Approach, we may assign
      • Same weight to all Models
      • Different weights to all Models
  • Steps – How the Voting Approach Works
    • Given a Test instance (x)
      • Step 1: Make individual predictions using different Models
      • Step 2: Combine predictions of individual Models
      • Step 3: Final Prediction of x will be the class which has the majority vote
  • Example 1 – Voting Approach
  • Assumption
    • All Models have the same weight i.e. 1
  • Consider a Binary Classification Problem
    • Class 1 = Yes
    • Class 2 = No
  • Five different Models are trained on the same Training Data
  • Question
    • What will be the Final Prediction of Test Instance (x) using Ensemble Model?
  • Using the Voting Approach to Make Predictions on Test Instance (x)
    • Step 1: Make individual predictions using different Models

    • Step 2: “Combine predictions” of “individual” Models
      • No. of Votes for Yes Class = 1 + 1 + 1 = 3
      • No. of Votes for No Class = 1 + 1 = 2
    • Step 3: “Final Prediction of x” will be the “class” which has the majority vote
      • Majority Vote = Yes
    • Final Prediction of x
      • Yes
  • Note
    • For a Binary Classification Problem where all Models have the same weight
      • Use an “odd” number of Models, i.e. 3, 5, 7, 11, etc. 😊
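A minimal sketch of the equal-weight voting in Example 1. The five individual predictions are assumed for illustration; they are consistent with the vote counts in Step 2 (3 votes for Yes, 2 for No):

```python
# Equal-weight majority voting (Example 1).
from collections import Counter

predictions = ["Yes", "No", "Yes", "No", "Yes"]  # one vote per Model (assumed)
votes = Counter(predictions)                     # Counter({'Yes': 3, 'No': 2})
final_prediction = votes.most_common(1)[0][0]
print(final_prediction)  # Yes
```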
  • Example 2 – Voting Approach
  • Assumption
    • All Models have “different” weights
  • Consider a Binary Classification Problem
    • Class 1 = Yes
    • Class 2 = No
  • Five “different” Models are trained on “same” Training Data

  • Question
    • What will be the “Final Prediction” of Test Instance (x) using Ensemble Model?
  • Using the Voting Approach to Make Predictions on Test Instance (x)
    • Step 1: Make individual predictions using different Models

    • Step 2: Combine predictions of individual Models
      • No. of Weighted Votes for Yes Class = 0.2 + 0.6 + 0.4 = 1.2
      • No. of Weighted Votes for No Class = 1.0 + 0.8 = 1.8
    • Step 3: Final Prediction of x will be the class which has the majority vote
      • Majority Vote = No
    • Final Prediction of x
      • No
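A minimal sketch of the weighted voting in Example 2. The individual predictions and per-model weights are assumed for illustration; they are consistent with the weighted vote totals in Step 2 (Yes = 1.2, No = 1.8):

```python
# Weighted voting (Example 2): sum each class's model weights.

predictions = ["Yes", "No", "Yes", "No", "Yes"]   # assumed per-model votes
weights     = [0.2,   1.0,  0.6,   0.8,  0.4]     # assumed per-model weights

totals = {}
for label, w in zip(predictions, weights):
    totals[label] = totals.get(label, 0.0) + w
# totals == {'Yes': 1.2, 'No': 1.8}
final_prediction = max(totals, key=totals.get)
print(final_prediction)  # No
```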
  • Effect of Weights on Models in Making Final Predictions
  • When the Models had the same weights
    • Final Prediction was Yes
  • When the Models had different weights
    • Final Prediction was No
  • Conclusion
    • Model Weight plays an important role in making the Final Prediction
  • Voting vs Voting Classifier
  • Voting
    • Voting is one of the simplest methods to combine predictions from multiple Machine Learning Algorithms
  • Voting Classifier
    • Voting Classifier is not an actual Classifier (Machine Learning Algorithm) but a wrapper for a set of different Machine Learning Algorithms, which are trained and tested on the “same” Data
    • A Voting Classifier combines predictions of individual Machine Learning Algorithms to make Final Predictions on unseen Data
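For reference, scikit-learn ships such a wrapper as sklearn.ensemble.VotingClassifier. A hedged example with illustrative base estimators (any classifiers could be substituted):

```python
# VotingClassifier wraps several estimators and votes on their predictions.
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

clf = VotingClassifier(
    estimators=[("lr", LogisticRegression()),
                ("dt", DecisionTreeClassifier()),
                ("nb", GaussianNB())],
    voting="hard",   # majority vote; pass weights=[...] for weighted voting
)
# Usage (with your own data): clf.fit(X_train, y_train); clf.predict(X_test)
```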
  • Note
  • In sha Allah, in the next Slides, I will first try to show the 
    • Testing Phase, Application Phase and Feedback Phase using
      • Single Model Approach
  • After that, I will try to show the 
    • Testing Phase, Application Phase and Feedback Phase using the
      • Ensemble Model Approach

Single Model Approach

  • Single Model – Testing Phase
  • Randomly select a Single Model from VSH,D
    • Single Model = <Short, Light, ?, Yes, ?, ?>
  • Apply Single Model on Test Data
      • x5 = <Short, Light, Short, Yes, Yes, Half> –
      • x6 = <Tall, Light, Short, No, Yes, Full> –
  • Applying Single Model on x5

  • Prediction returned by Single Model
    • x5 is predicted Positive (Incorrectly Classified Instance)
  • Applying Single Model on x6

  • Prediction returned by Single Model
    • x6 is predicted Negative (Correctly Classified Instance)
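The two test predictions above can be reproduced with the same “?”-wildcard matching rule; a minimal sketch (names illustrative):

```python
# Apply the randomly selected Single Model <Short, Light, ?, Yes, ?, ?>
# to the two Test Examples x5 and x6.

def matches(h, x):
    return all(a == "?" or a == v for a, v in zip(h, x))

single_model = ("Short", "Light", "?", "Yes", "?", "?")
test_data = [
    ("x5", ("Short", "Light", "Short", "Yes", "Yes", "Half"), "-"),
    ("x6", ("Tall", "Light", "Short", "No", "Yes", "Full"), "-"),
]
for name, x, actual in test_data:
    predicted = "+" if matches(single_model, x) else "-"
    print(name, "predicted:", predicted, "actual:", actual)
# x5 predicted: + actual: -  (Incorrectly Classified)
# x6 predicted: - actual: -  (Correctly Classified)
```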
  • Testing Phase – Single Model

  • Single Model - Application Phase
  • We assume that our Single Model 
    • performed well on a large Test Data set and can be deployed in the Real-world
  • Single Model is deployed in the Real-world and now we can make
    • predictions on Real-time Data
  • Steps – Making Predictions on Real-Time Data
  • Step 1: Take Input from User
  • Step 2: Convert User Input into Feature Vector
    • Exactly the same as the Feature Vectors of the Training and Testing Data
  • Step 3: Apply Model on the Feature Vector
  • Step 4: Return Prediction to the User
  • Example - Making Predictions on Real-time Data
  • Step 1: Take input from User

  • Step 2: Convert User Input into Feature Vector
    • Note that the order of Attributes must be exactly the same as that of the Training and Testing Examples

  • Step 3: Apply Single Model on the Feature Vector

  • Step 4: Return Prediction to the User
    • Positive 
  • Note 
    • You can take Input from user, apply Model and return predictions as many times as you like 😊
  • Single Model - Feedback Phase
  • Only Allah is Perfect 😊
  • Take Feedback on your deployed Single Model from
    • Domain Experts and
    • Users

Improve your Single Model based on Feedback 😊

Ensemble Model Approach

  • Ensemble Model – Testing Phase
  • Ensemble Model 
    • Comprises all six hypotheses (Models) in the Version Space VSH,D
  • Below are the six Models used to build the Ensemble Model 
      • Model 1 = <Short, Light, ?, Yes, ?, ?>
      • Model 2 = <Short, ?, ?, Yes, ?, ?>
      • Model 3 = <Short, Light, ?, ?, ?, ?>
      • Model 4 = <?, Light, ?, Yes, ?, ?>
      • Model 5 = <Short, ?, ?, ?, ?, ?>
      • Model 6 = <?, Light, ?, ?, ?, ?>
  • Apply Ensemble Model on Test Data
      • x5 = <Short, Light, Short, Yes, Yes, Half> –
      • x6 = <Tall, Light, Short, No, Yes, Full> –
  • Applying Ensemble Model on x5

  • Note 
    • Final Prediction was calculated based on Majority Vote
  • Prediction returned by Ensemble Model
    • x5 is predicted Positive (Incorrectly Classified Instance)
  • Applying Ensemble Model on x6

  • Note 
    • Final Prediction was calculated based on Majority Vote
  • Prediction returned by Ensemble Model
    • x6 is predicted Negative (Correctly Classified Instance)
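A minimal sketch reproducing the Ensemble Model's two test predictions above: apply all six hypotheses and take the majority vote (names illustrative):

```python
# Ensemble of the six Version Space hypotheses, combined by majority vote.
from collections import Counter

def matches(h, x):
    return all(a == "?" or a == v for a, v in zip(h, x))

models = [
    ("Short", "Light", "?", "Yes", "?", "?"),
    ("Short", "?",     "?", "Yes", "?", "?"),
    ("Short", "Light", "?", "?",   "?", "?"),
    ("?",     "Light", "?", "Yes", "?", "?"),
    ("Short", "?",     "?", "?",   "?", "?"),
    ("?",     "Light", "?", "?",   "?", "?"),
]

def ensemble_predict(x):
    votes = Counter("+" if matches(h, x) else "-" for h in models)
    return votes.most_common(1)[0][0]

x5 = ("Short", "Light", "Short", "Yes", "Yes", "Half")
x6 = ("Tall", "Light", "Short", "No", "Yes", "Full")
print(ensemble_predict(x5))  # + : all six Models vote Positive
print(ensemble_predict(x6))  # - : five of six Models vote Negative
```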
  • Testing Phase – Ensemble Model

  • Ensemble Model – Application Phase
  • We assume that our Ensemble Model 
    • performed well on a large Test Data set and can be deployed in the Real-world
  • Ensemble Model is deployed in Real-world and now we can make
    • predictions on Real-time Data
  • Steps – Making Predictions on Real-time Data
  • Step 1: Take Input from User
  • Step 2: Convert User Input into Feature Vector
    • Exactly the same as the Feature Vectors of the Training and Testing Data
  • Step 3: Apply Ensemble Model on the Feature Vector
  • Step 4: Return Prediction to the User
  • Making Predictions on Real-time Data
  • Step 1: Take input from User

  • Step 2: Convert User Input into Feature Vector
    • Note that the order of Attributes must be exactly the same as that of the Training and Testing Examples

  • Step 3: Apply Ensemble Model on the Feature Vector

  • Note 
    • Final Prediction was calculated based on Majority Vote
  • Step 4: Return Prediction to the User
    • Ensemble Model predicts the unseen example as
      • Positive 
  • Note 
    • You can take Input from user, apply Model and return predictions as many times as you like 😊
  • Ensemble Model – Feedback Phase
  • Only Allah is Perfect 😊
  • Take Feedback on your deployed Ensemble Model from
    • Domain Experts and
    • Users

Improve your Ensemble Model based on Feedback

Discussion – Candidate Elimination Algorithm

  • Inductive Bias - Candidate Elimination Algorithm
  • Inductive Bias 
    • Inductive Bias is the set of assumptions needed, in addition to the Training Examples, to deductively justify the Learner's classifications
  • Inductive Bias of Candidate Elimination Algorithm
    • Training Data is error-free
    • Target Function / Concept is present in the Hypothesis Space (H)
  • Training Regime – Candidate Elimination Algorithm
  • Incremental Method 
    • See Chapter 2 – Basics of Machine Learning
  • Strengths and Weaknesses – Candidate Elimination Algorithm
  • Strengths 
    • Candidate Elimination Algorithm overcomes the limitation of List Then Eliminate Algorithm and 
      • provides a more compact representation of the Version Space
    • Version Space returned by Candidate Elimination Algorithm can be used to make predictions on unseen Data using either a
      • Single Model or
      • Ensemble Model
  • Weaknesses 
    • Only works on error-free Data
      • However, Real-world Data is noisy
    • Works on the assumption that the Target Function f is present in the Hypothesis Space (H)
      • However, the Target Function f may or may not be present in the Hypothesis Space (H), and this may not be known in advance
    • Only works when the values of Attributes are Categorical
      • However, in the Real-world many Machine Learning Problems have
        • Attributes / Features with Numeric values
    • Only works for simple representations of Data
      • However, in the Real-world many Machine Learning Problems have complex representations (e.g. Face Detection, Face Recognition, etc.)

TODO and Your Turn

TODO Task 1

  • Task 1
    • Consider the Titanic Dataset with the following Attributes
      • Gender: Male, Female
      • Ticket Class: Upper, Middle, Lower
      • Parent/Child Aboard: Zero, One, Two, Three
      • Embarked: Cherbourg, Queenstown, Southampton
      • Survival: No, Yes
  • We obtained the following Sample Data

  • Sample Data was split into Training and Testing Data in a 
    • Train-Test Split Ratio of 67%-33%
  • Training Data 

  • Testing Data

  • Note
    • Consider the Candidate Elimination Algorithm when answering the questions given below
    • Your answer should be 
      • Well Justified
  • Questions
    • Write down the Input and Output for the above Machine Learning Problem
    • How is a Training Example represented?
    • How should the Hypothesis (h) be represented?
    • Execute the Machine Learning Cycle
      • Execute the Testing and Application Phases using a
        • Single Model and
        • Ensemble Model
    • Write down the observations you made during the execution of the Machine Learning Cycle
Your Turn Tasks

Your Turn Task 1

  • Task 1
    • Select a Machine Learning Problem (similar to: Titanic – Machine Learning from Disaster) and answer the questions given below.
    • Note
      • Consider the Candidate Elimination Algorithm in answering all the questions.
    • Questions
      • Write down the Input and Output for the selected Machine Learning Problem
      • How is a Training Example represented?
      • How should the Hypothesis (h) be represented?
      • Calculate the size of the Instance Space, Concept Space, Syntactically Distinct Hypotheses and Semantically Distinct Hypotheses
      • Execute the Machine Learning Cycle
        • Execute the Testing and Application Phases using a
          • Single Model and
          • Ensemble Model
      • Write down the observations you made during the execution of the Machine Learning Cycle

Chapter Summary

  • Chapter Summary
  • Problem – FIND-S Algorithm 
    • Only returns one hypothesis which best fits the Training Data
      • However, there can be multiple hypotheses that fit the Training Data equally well
  • Proposed Solution
    • Version Space
  • The Version Space (VSH,D) contains the set of all hypotheses consistent with the Training Examples
  • A Version Space Algorithm computes the set of all hypotheses consistent with the Training Examples
  • Two Algorithms which return a Version Space are
    • List Then Eliminate Algorithm
    • Candidate Elimination Algorithm 
  • List Then Eliminate Algorithm – Summary
    • Representation of Example
      • Attribute-Value Pair
    • Representation of Hypothesis (h)
      • Conjunction (AND) of Constraints on Attributes
    • Searching Strategy
      • I am not clear about this. Please drop me an email if you know. Jazak Allah Khair
    • Training Regime
      • Incremental Method
    • Inductive Bias of the List Then Eliminate Algorithm 
      • Training Data is error-free 
      • Target Function / Concept is present in the Hypothesis Space (H)
    • Strengths 
      • The List Then Eliminate Algorithm overcomes the limitation of the FIND-S Algorithm and 
        • returns multiple hypotheses that fit the Training Data, i.e. the Version Space (VSH,D)
    • Weaknesses 
      • Only works on error-free Data
        • However, Real-world Data is noisy 
      • Works on the assumption that the Target Function is present in the Hypothesis Space (H)
        • However, the Target Function may or may not be present in the Hypothesis Space (H), and this may not be known in advance
      • The List Then Eliminate Algorithm is computationally expensive because it makes a 
        • pairwise comparison of each Training Example with every hypothesis in the Hypothesis Space (H) 
  • The List Then Eliminate Algorithm is computationally expensive because
    • the Hypothesis Space (H) is represented as a List, which forces 
      • the List Then Eliminate Algorithm to make pairwise comparisons 
  • To reduce the computational cost of Version Space Algorithms 
    • Have a more compact representation of Hypothesis Space (H) 
      • e.g. the Candidate Elimination Algorithm
  • Candidate Elimination Algorithm – Summary
    • Representation of Example
      • Attribute-Value Pair
    • Representation of Hypothesis (h)
      • Conjunction (AND) of Constraints on Attributes
    • Searching Strategy
      • I am not clear about this. Please drop me an email if you know. Jazak Allah Khair
    • Training Regime
      • Incremental Method
    • Inductive Bias of Candidate Elimination Algorithm 
      • Training Data is error-free 
      • Target Function / Concept is present in the Hypothesis Space (H)
    • Strengths
      • Candidate Elimination Algorithm overcomes the limitation of List Then Eliminate Algorithm and 
        • provides a more compact representation of the Version Space
      • Version Space returned by Candidate Elimination Algorithm can be used to make predictions on unseen Data using either a
        • Single Model or
        • Ensemble Model
    • Weaknesses 
      • Only works on error-free Data
        • However, Real-world Data is noisy 
      • Works on the assumption that the Target Function is present in the Hypothesis Space (H)
        • However, the Target Function may or may not be present in the Hypothesis Space (H), and this may not be known in advance
      • Only works when the values of Attributes are Categorical 
        • However, in the Real-world many Machine Learning Problems have 
          • Attributes / Features with Numeric values
      • Only works for simple representations of Data
        • However, in the Real-world many Machine Learning Problems have complex representations (e.g. Face Detection, Face Recognition, etc.)
  • Single Model
    • A Single Model is trained on the Training Data and used to make predictions on the Test Data
    • Strengths 
      • It is computationally fast and requires less time to make predictions compared to Ensemble Model
    • Weaknesses
      • Error is likely to be higher compared to an Ensemble Model
  • Ensemble Model
    • An Ensemble Model works by training different Models on the same Training Data and using each Model to individually make predictions on the Test Data
    • Strengths 
      • Error is likely to be lower compared to a Single Model
    • Weaknesses
      • It is computationally expensive and requires more time to make predictions compared to a Single Model
  • Voting Approach is one of the simplest approaches to combine predictions from different Models 
    • In Voting Approach, we may assign
      •  Same weight to all Models
      •  Different weights to all Models
    •  Model Weight plays an important role in making the Final Prediction 
  • Voting Approach works as follows
    • Given a Test instance x
      • Step 1: Make individual predictions using different Models
      • Step 2: Combine predictions of individual Models
      • Step 3: Final Prediction of x will be the class which has the majority vote 
  • A Voting Classifier is not an actual Classifier (Machine Learning Algorithm) but a wrapper for a set of different Machine Learning Algorithms, which are trained and tested on the same Data
    • A Voting Classifier combines predictions of individual Machine Learning Algorithms to make Final Predictions on unseen Data

In Next Chapter

  • In sha Allah, in the next Chapter, I will present
    • Decision Tree Learning