# Chapter 9 - Version Space Algorithms

### Chapter Outline

- Quick Recap
- Version Space
- List Then Eliminate Algorithm
- Candidate Elimination Algorithm
- Chapter Summary

## Quick Recap

- Quick Recap – FIND-S Algorithm

**Learning is a Searching Problem**

**Research is defined as a systematic investigation of sources and materials to develop solutions for Real-world Problems and to improve the quality (peace and happiness) of human life**

**A Researcher is a person of character, and (s)he should safeguard that character**

**The word Research is made up of Re and search**

**Therefore, in Research, we mainly refine the solution(s) proposed for a Real-world Problem**

**The main steps of a Research Cycle are as follows**

- **Step 1:** **Identify** the Real-world Problem
- **Step 2:** **Propose** a Solution (called Solution 01) to solve the Real-world Problem
- **Step 3:** **List down** the Strengths and Weaknesses of Solution 01
- **Step 4:** Propose a Solution (called Solution 02) to
  - further **strengthen the Strengths** of Solution 01
  - **overcome the limitations** of Solution 01
- **Step 5:** **List down** the Strengths and Weaknesses of Solution 02
- **Step 6:** Propose a Solution (called Solution 03) to
  - further **strengthen the Strengths** of Solution 02
  - **overcome the limitations** of Solution 02
- **Step 7:** Continue this **cycle** till the Day of Judgment

**Considering the FIND-S Algorithm**

**Input to Learner (FIND-S Algorithm)**

- Set of Training Examples (D)
- Set of Functions / Hypotheses (H)

**Output by Learner (FIND-S Algorithm)**

- A Hypothesis (h) from H which **best fits** the Training Examples (D)
- Note that h is an **approximation** of the Target Function

**Inductive Bias is the set of assumptions needed in addition to the Training Examples to deductively justify the Learner's Classifications**

**FIND-S Algorithm – Summary**

- **Representation of Example:** Attribute-Value Pairs
- **Representation of Hypothesis (h):** Conjunction (AND) of Constraints on Attributes
- **Searching Strategy:** Specific to General search (FIND-S starts with the most specific hypothesis and minimally generalizes it to cover each positive example)
- **Training Regime:** Incremental Method
- **Inductive Bias of FIND-S Algorithm**
  - Training Data is **error-free**
  - Target Function / Concept is **present** in the Hypothesis Space (H)

**Strengths**

- Returns a Model (h), which can be used to **make predictions** on unseen data

**Weaknesses**

- Only works on **error-free** Data; however, Real-world Data is **noisy**
- Works on the **assumption** that the Target Function is **present** in the Hypothesis Space (H); however, the Target Function **may or may not be** in the Hypothesis Space (H), and this **may or may not be known**
- Only returns **one hypothesis** which **best fits** the Training Data; however, there can be **multiple hypotheses** which best fit the Training Data

## Version Space

- Recap – Main Problem in FIND-S Algorithm

**Problem**

- FIND-S only returns **one hypothesis** which **best fits** the Training Data
- However, there can be **multiple hypotheses** which **best fit** the Training Data
- (FIND-S also assumes error-free Data and that the Target Function is present in the Hypothesis Space (H), as summarized above)

**Proposed Solution**

- Version Space

- Version Space vs Version Space Algorithm

**Version Space (VS_{H,D})**

- The Version Space (VS_{H,D}) contains the set of all hypotheses consistent with the Training Examples

**Version Space Algorithm**

- A Version Space Algorithm computes the set of all hypotheses consistent with the Training Examples

- Version Space Algorithms

**In this Chapter, In sha Allah we will study two Version Space Algorithms**

**List Then Eliminate Algorithm**

**Candidate Elimination Algorithm**

- Gender Identification Problem

**In this Chapter, we will use the same Gender Identification Problem presented in**

**Chapter 8 – FIND-S Algorithm**

- Representation of Training Example (d) and Hypothesis (h)

**Representation of Training Example (d)**

**Attribute-Value Pair**

**Representation of Hypothesis (h)**

**Conjunction of Constraints on Input Attributes**

- Sample Data – Vector Representation

**Vector Representation of Examples**

- x1 = <Short, Light, Short, Yes, Yes, Half> +
- x2 = <Short, Light, Long, Yes, Yes, Half> +
- x3 = <Tall, Heavy, Long, Yes, Yes, Full> –
- x4 = <Short, Light, Long, Yes, No, Full> +
- x5 = <Short, Light, Short, Yes, Yes, Half> –
- x6 = <Tall, Light, Short, No, Yes, Full> –

- Training Data – Vector Representation

**Vector Representation of Training Examples**

- x1 = <Short, Light, Short, Yes, Yes, Half> +
- x2 = <Short, Light, Long, Yes, Yes, Half> +
- x3 = <Tall, Heavy, Long, Yes, Yes, Full> –
- x4 = <Short, Light, Long, Yes, No, Full> +

- Testing Data – Vector Representation

**Vector Representation of Test Examples**

- x5 = <Short, Light, Short, Yes, Yes, Half> –
- x6 = <Tall, Light, Short, No, Yes, Full> –

## List Then Eliminate Algorithm

- Machine Learning Cycle

**The four phases of a Machine Learning Cycle are**

**Training Phase**

- **Build the Model** using the Training Data

**Testing Phase**

- **Evaluate the performance of the Model** using the Testing Data

**Application Phase**

- **Deploy the Model in the Real-world**, to **make predictions** on Real-time unseen Data

**Feedback Phase**

- Take Feedback from the **Users** and **Domain Experts** to **improve the Model**

- Note

**For List Then Eliminate Algorithm, In sha Allah I will only execute the****Training Phase****😊**

- Hypothesis Space (H)

**Consider a Hypothesis Space (H) of 15 Hypotheses (h)**

**H = {h_{1}, h_{2}, h_{3}, h_{4}, h_{5}, h_{6}, h_{7}, h_{8}, h_{9}, h_{10}, h_{11}, h_{12}, h_{13}, h_{14}, h_{15}}**

**In sha Allah, in the next Slides, I will try to explain how to use the List Then Eliminate Algorithm to find**

**all hypotheses that are consistent with the Training Data, i.e. the Version Space**

- List Then Eliminate Algorithm

- Training Phase

**Version Space ← a list containing every hypothesis in** *H*

**VS_{H,D} = {h_{1}, h_{2}, h_{3}, h_{4}, h_{5}, h_{6}, h_{7}, h_{8}, h_{9}, h_{10}, h_{11}, h_{12}, h_{13}, h_{14}, h_{15}}**

- Training Phase, Cont…

**Remove from the Version Space any hypothesis that is inconsistent with a Training Example, i.e. h(x) ≠ c(x)**

**Training Example x1**

- h_{1}, h_{2}, h_{3} are **inconsistent** with Training Example x1
- VS_{H,D} = {h_{4}, h_{5}, h_{6}, h_{7}, h_{8}, h_{9}, h_{10}, h_{11}, h_{12}, h_{13}, h_{14}, h_{15}}

**Training Example x2**

- h_{14}, h_{15} are **inconsistent** with Training Example x2
- VS_{H,D} = {h_{4}, h_{5}, h_{6}, h_{7}, h_{8}, h_{9}, h_{10}, h_{11}, h_{12}, h_{13}}

**Training Example x3**

- h_{6}, h_{9}, h_{12} are **inconsistent** with Training Example x3
- VS_{H,D} = {h_{4}, h_{5}, h_{7}, h_{8}, h_{10}, h_{11}, h_{13}}

**Training Example x4**

- h_{5}, h_{11} are **inconsistent** with Training Example x4
- VS_{H,D} = {h_{4}, h_{7}, h_{8}, h_{10}, h_{13}}

- Training Phase, Cont…

**Output the list of hypotheses in the** *VersionSpace*

**The list of hypotheses in the** *VersionSpace* **is**

**VS_{H,D} = {h_{4}, h_{7}, h_{8}, h_{10}, h_{13}}**

**Conclusion**

**The List Then Eliminate Algorithm returned**

**5 Hypotheses that are consistent with the Training Data, i.e. the Version Space (VS_{H,D})**
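The Training Phase just executed can be sketched in a few lines of Python. This is a minimal sketch under an assumed encoding: a hypothesis is a tuple of attribute constraints where '?' matches any value (the chapter's h_{1}...h_{15} are abstract, so the three hypotheses in `H` below are illustrative, not the slide's 15).

```python
# Sketch of the List Then Eliminate Algorithm.
# A hypothesis is a tuple of constraints; '?' matches any attribute value.

def matches(h, x):
    """Hypothesis h predicts Positive for x iff every constraint accepts x."""
    return all(c == '?' or c == v for c, v in zip(h, x))

def list_then_eliminate(H, D):
    """Keep only the hypotheses consistent with every training example."""
    vs = list(H)  # start with every hypothesis in H
    for x, label in D:
        vs = [h for h in vs if matches(h, x) == label]
    return vs

# Training data from the chapter (+ = True, - = False)
D = [
    (('Short', 'Light', 'Short', 'Yes', 'Yes', 'Half'), True),
    (('Short', 'Light', 'Long', 'Yes', 'Yes', 'Half'), True),
    (('Tall', 'Heavy', 'Long', 'Yes', 'Yes', 'Full'), False),
    (('Short', 'Light', 'Long', 'Yes', 'No', 'Full'), True),
]

# A toy 3-hypothesis space (illustrative only)
H = [
    ('Short', 'Light', '?', 'Yes', '?', '?'),
    ('Tall', '?', '?', '?', '?', '?'),
    ('?', 'Light', '?', '?', '?', '?'),
]

# Keeps the two hypotheses that agree with all four training examples
print(list_then_eliminate(H, D))
```

The pairwise cost the chapter criticizes is visible here: every training example is checked against every surviving hypothesis.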

- Inductive Bias - List Then Eliminate Algorithm

**Inductive Bias**

- Inductive Bias is the **set of assumptions** needed **in addition** to the Training Examples to **deductively justify** the Learner's Classifications

**Inductive Bias of List Then Eliminate Algorithm**

- Training Data is **error-free**
- Target Function / Concept is **present** in the Hypothesis Space (H)

- Training Regime- List Then Eliminate Algorithm

**Incremental Method**

**See****Chapter 2 – Basics of Machine Learning**

- Strengths and Weaknesses – List Then Eliminate Algorithm

**Strengths**

- The List Then Eliminate Algorithm overcomes the limitation of the FIND-S Algorithm and returns the multiple hypotheses which best fit the Training Data, i.e. the Version Space (VS_{H,D})

**Weaknesses**

- Only works on error-free Data; however, Real-world Data is noisy
- Works on the assumption that the Target Function is present in the Hypothesis Space (H); however, the Target Function f may or may not be in the Hypothesis Space (H), and this may or may not be known
- The List Then Eliminate Algorithm is computationally expensive because it makes a pairwise comparison of each Training Example with every hypothesis in the Hypothesis Space (H)

- Why List Then Eliminate Algorithm is Computationally Expensive?

**Question**

- Why is the List Then Eliminate Algorithm computationally expensive?

**Answer**

- The List Then Eliminate Algorithm is computationally expensive because the Hypothesis Space (H) is **represented** as a **List**
- Representing the Hypothesis Space (H) as a List **forces** the List Then Eliminate Algorithm to make **pairwise comparisons**

**Conclusion**

- To **reduce** the **computational cost** of Version Space Algorithms, use a **more compact representation** of the **Hypothesis Space (H)**

## Candidate Elimination Algorithm

- Limitation of List Then Eliminate Algorithm

**Problem**

- The List Then Eliminate Algorithm is computationally expensive because the representation of the Hypothesis Space (H) is **not compact**

**Proposed Solution**

- Represent the Hypothesis Space (H) as a **General to Specific Ordering of Hypotheses**

- Representation of Hypothesis Space (H) in Candidate Elimination Algorithm

**In the Candidate Elimination Algorithm, the Hypothesis Space (H) is represented as a**

**General to Specific Ordering of Hypotheses**

- Machine Learning Cycle

**Recall the four phases of a Machine Learning Cycle: Training Phase, Testing Phase, Application Phase, and Feedback Phase (described above)**

- Training Phase – Candidate Elimination Algorithm

- Training Phase

**Representing the Version Space (VS_{H,D})**

- The General Boundary G of the Version Space VS_{H,D} is the set of its maximally general hypotheses
- The Specific Boundary S of the Version Space VS_{H,D} is the set of its maximally specific hypotheses
- Every hypothesis in the Version Space VS_{H,D} lies between these two boundaries

**After observing all the Training Examples, the Candidate Elimination Algorithm will**

**output the Version Space (comprising 6 hypotheses)**

**VS_{H,D}**

**<Short, Light, ?, Yes, ?, ?>**

**<Short, ?, ?, Yes, ?, ?>**

**<Short, Light, ?, ?, ?, ?>**

**<?, Light, ?, Yes, ?, ?>**

**<Short, ?, ?, ?, ?, ?>**

**<?, Light, ?, ?, ?, ?>**

**Note that all 6 hypotheses in the Version Space (VS_{H,D}) are approximations of the Target Function f**

**Recall**

**Training Data**

- x1 = <Short, Light, Short, Yes, Yes, Half> +
- x2 = <Short, Light, Long, Yes, Yes, Half> +
- x3 = <Tall, Heavy, Long, Yes, Yes, Full> –
- x4 = <Short, Light, Long, Yes, No, Full> +

**6 Models in the Version Space VS_{H,D}**

- <Short, Light, ?, Yes, ?, ?>
- <Short, ?, ?, Yes, ?, ?>
- <Short, Light, ?, ?, ?, ?>
- <?, Light, ?, Yes, ?, ?>
- <Short, ?, ?, ?, ?, ?>
- <?, Light, ?, ?, ?, ?>
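As a quick sanity check (a sketch using the same assumed tuple-with-'?' encoding of hypotheses, not part of the algorithm itself), we can verify programmatically that every one of the 6 Models above is consistent with all four Training Examples:

```python
# Sanity check: each of the 6 Version Space models should classify every
# training example correctly. Hypotheses are tuples; '?' matches any value.

def matches(h, x):
    """Hypothesis h predicts Positive for x iff every constraint accepts x."""
    return all(c == '?' or c == v for c, v in zip(h, x))

D = [  # training data from the chapter (+ = True, - = False)
    (('Short', 'Light', 'Short', 'Yes', 'Yes', 'Half'), True),
    (('Short', 'Light', 'Long', 'Yes', 'Yes', 'Half'), True),
    (('Tall', 'Heavy', 'Long', 'Yes', 'Yes', 'Full'), False),
    (('Short', 'Light', 'Long', 'Yes', 'No', 'Full'), True),
]

VS = [  # the 6 models output by the Candidate Elimination Algorithm
    ('Short', 'Light', '?', 'Yes', '?', '?'),
    ('Short', '?', '?', 'Yes', '?', '?'),
    ('Short', 'Light', '?', '?', '?', '?'),
    ('?', 'Light', '?', 'Yes', '?', '?'),
    ('Short', '?', '?', '?', '?', '?'),
    ('?', 'Light', '?', '?', '?', '?'),
]

consistent = [h for h in VS if all(matches(h, x) == y for x, y in D)]
print(len(consistent))  # 6 -> every model agrees with the training data
```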

**Models – in the form of Rules**

**In the next phase i.e. Testing Phase, we will**

**Evaluate the performance of the Model(s)**

- Testing Phase

**Question**

- How well have the Model(s) learned?

**Answer**

- Evaluate the performance of the Model(s) on unseen data (i.e. the Testing Data)

- Evaluation Measures

**Evaluation will be carried out using**

**Error measure**

- Error

**Definition**

**Error is defined as the proportion of incorrectly classified Test instances**

**Note**

**Accuracy = 1 − Error**
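As a small illustration of these two measures (using the Single Model results shown later in this chapter, where x5 is misclassified and x6 is classified correctly):

```python
# Error = proportion of incorrectly classified test instances;
# Accuracy = 1 - Error.

def error(predictions, actuals):
    wrong = sum(p != a for p, a in zip(predictions, actuals))
    return wrong / len(actuals)

# Predictions on the two test examples x5 and x6
predicted = ['Positive', 'Negative']
actual    = ['Negative', 'Negative']   # both test examples are labelled -

e = error(predicted, actual)
print(e, 1 - e)  # 0.5 0.5  (1 of 2 instances misclassified)
```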

- Testing Phase, Cont…

**Two Approaches for Testing Phase**

**Single Model**

**Ensemble Model**

**Single Model**

**A Single Model is trained on the Training Data and used to make predictions on the Test Data**

**For Candidate Elimination Algorithm, we can**

**Randomly select** a **Model (or h)** from the **VS_{H,D}** and **use** it to **make predictions on the Test Data**

**Strengths**

**It is****computationally fast****and requires****less time****to****make predictions****compared to Ensemble Model**

**Weaknesses**

**Error is****likely****to be****high****compared to Ensemble Model**

**Ensemble Model**

**An Ensemble Model works by training different Models on the same Training Data and using each Model to individually make predictions on the Test Data**

**For Candidate Elimination Algorithm, we can**

**Select all 6 Models (or h)** from the **VS_{H,D}** (Ensemble Model) and **use** them to **make predictions** on the Test Data

**Strengths**

**Error is****likely****to be****low****compared to Single Model**

**Weaknesses**

**It is computationally expensive and requires more time to make predictions compared to Single Model**

**Problem**

**How to combine the predictions of different Models into a Final Prediction?**

**A Possible Solution**

**Use Voting Approach**

- Voting Approach

**Voting Approach**

- Voting is one of the **simplest** approaches to **combine predictions** from **different** Models

**Model Weight**

- In the Voting Approach, we may assign
  - the **same** weight to all Models, or
  - **different** weights to the Models

**Steps – How the Voting Approach Works**

- Given a Test Instance (x)
- Step 1: Make **individual predictions** using the **different** Models
- Step 2: **Combine the predictions** of the **individual** Models
- Step 3: The **Final Prediction for x** will be the **class** which has the **majority vote**

- Example 1 – Voting Approach

**Assumption**

- All Models have the **same** weight, i.e. 1

**Consider a Binary Classification Problem**

- Class 1 = Yes
- Class 2 = No
- Five **different** Models are trained on the **same** Training Data

**Question**

- What will be the **Final Prediction** for Test Instance (x) using the Ensemble Model?

**Using the Voting Approach to Make Predictions on Test Instance (x)**

- Step 1: Make **individual** predictions using the **different** Models
- Step 2: **Combine the predictions** of the **individual** Models
  - No. of Votes for the Yes Class = 1 + 1 + 1 = 3
  - No. of Votes for the No Class = 1 + 1 = 2
- Step 3: The **Final Prediction for x** will be the **class** which has the **majority vote**
  - Majority Vote = Yes

**Final Prediction for x**

- Yes

**Note**

- For a Binary Classification Problem with **equal** weights, use an **odd** number of Models, i.e. 3, 5, 7, 11, etc. 😊

- Example 2 – Voting Approach

**Assumption**

- All Models have **different** weights

**Consider a Binary Classification Problem**

- Class 1 = Yes
- Class 2 = No
- Five **different** Models are trained on the **same** Training Data

**Question**

- What will be the **Final Prediction** for Test Instance (x) using the Ensemble Model?

**Using the Voting Approach to Make Predictions on Test Instance (x)**

- Step 1: Make **individual** predictions using the **different** Models
- Step 2: **Combine the predictions** of the **individual** Models
  - No. of Weighted Votes for the Yes Class = 0.2 + 0.6 + 0.4 = 1.2
  - No. of Weighted Votes for the No Class = 1.0 + 0.8 = 1.8
- Step 3: The **Final Prediction for x** will be the **class** which has the **majority vote**
  - Majority Vote = No

**Final Prediction for x**

- No
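Both examples can be reproduced with a small weighted-vote helper. Note that the per-model predictions below are an assumption: the slides only give the vote totals, so any assignment of predictions that produces the same totals (3 Yes / 2 No unweighted; 1.2 Yes / 1.8 No weighted) behaves identically.

```python
from collections import defaultdict

def vote(predictions, weights=None):
    """Weighted majority vote over a list of class predictions."""
    weights = weights or [1] * len(predictions)
    totals = defaultdict(float)
    for p, w in zip(predictions, weights):
        totals[p] += w
    return max(totals, key=totals.get)

preds = ['Yes', 'No', 'Yes', 'Yes', 'No']   # assumed per-model predictions

# Example 1: equal weights -> Yes has 3 votes, No has 2 -> 'Yes'
print(vote(preds))

# Example 2: weights 0.2, 1.0, 0.6, 0.4, 0.8
#            -> Yes: 0.2 + 0.6 + 0.4 = 1.2, No: 1.0 + 0.8 = 1.8 -> 'No'
print(vote(preds, [0.2, 1.0, 0.6, 0.4, 0.8]))
```

The same predictions flip from Yes to No once the weights change, which is exactly the conclusion drawn in the next slide.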

- Effect of Weights on Models in Making Final Predictions

**When the Models had the same weights**

- Final Prediction was **Yes**

**When the Models had different weights**

- Final Prediction was **No**

**Conclusion**

- **Model Weight** plays an **important role** in making the **Final Prediction**

- Voting vs Voting Classifier

**Voting**

- Voting is one of the **simplest** methods to **combine predictions** from **multiple** Machine Learning Algorithms

**Voting Classifier**

- A Voting Classifier is **not an actual Classifier** (Machine Learning Algorithm) but a **wrapper** around a **set of different Machine Learning Algorithms**, which are trained and tested on the **same** Data
- A Voting Classifier **combines the predictions** of the **individual** Machine Learning Algorithms to make **Final Predictions** on **unseen** Data
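A minimal sketch of such a wrapper in plain Python. The class name and the three toy one-attribute "models" below are illustrative assumptions, standing in for trained Machine Learning Algorithms:

```python
from collections import defaultdict

class VotingClassifier:
    """Wrapper that delegates to a set of models and combines their
    predictions by a (optionally weighted) majority vote."""

    def __init__(self, models, weights=None):
        self.models = models
        self.weights = weights or [1] * len(models)

    def predict(self, x):
        totals = defaultdict(float)
        for model, w in zip(self.models, self.weights):
            totals[model(x)] += w          # each model casts a weighted vote
        return max(totals, key=totals.get)  # class with the majority vote

# Three toy "models", each looking at one attribute of the feature vector
m1 = lambda x: 'Yes' if x[0] == 'Short' else 'No'
m2 = lambda x: 'Yes' if x[1] == 'Light' else 'No'
m3 = lambda x: 'No'

clf = VotingClassifier([m1, m2, m3])
print(clf.predict(('Short', 'Light', 'Short', 'Yes', 'Yes', 'Half')))  # Yes (2-1)
```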

- Note

**In sha Allah, in the next Slides, I will first try to show the**

**Testing Phase, Application Phase and Feedback Phase using the**

**Single Model Approach**

**After that, I will try to show the**

**Testing Phase, Application Phase and Feedback Phase using the**

**Ensemble Model Approach**

### Single Model Approach

- Single Model – Testing Phase

**Randomly select a Single Model from VS_{H,D}**

- Single Model = <Short, Light, ?, Yes, ?, ?>

**Apply the Single Model on the Test Data**

- x5 = <Short, Light, Short, Yes, Yes, Half> –
- x6 = <Tall, Light, Short, No, Yes, Full> –

**Applying the Single Model on x5**

- Prediction returned by the Single Model
- x5 is predicted Positive (Incorrectly Classified Instance)

**Applying the Single Model on x6**

- Prediction returned by the Single Model
- x6 is predicted Negative (Correctly Classified Instance)
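The two test predictions above can be reproduced directly (a sketch, again encoding hypotheses as tuples with '?' wildcards):

```python
# Apply the randomly selected Single Model to the two test examples.

def matches(h, x):
    """Hypothesis h predicts Positive for x iff every constraint accepts x."""
    return all(c == '?' or c == v for c, v in zip(h, x))

single_model = ('Short', 'Light', '?', 'Yes', '?', '?')
x5 = ('Short', 'Light', 'Short', 'Yes', 'Yes', 'Half')  # actual label: Negative
x6 = ('Tall', 'Light', 'Short', 'No', 'Yes', 'Full')    # actual label: Negative

print(matches(single_model, x5))  # True  -> predicted Positive (incorrect)
print(matches(single_model, x6))  # False -> predicted Negative (correct)
```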

- Testing Phase – Single Model

- Single Model - Application Phase

**We assume that our Single Model performed well on large Test Data and can be deployed in the Real-world**

**The Single Model is deployed in the Real-world, and now we can make predictions on Real-time Data**

- Steps – Making Predictions on Real-Time Data

**Step 1: Take Input from the User**

**Step 2: Convert the User Input into a Feature Vector**

- **Exactly the same** as the **Feature Vectors** of the Training and Testing Data

**Step 3: Apply the Model on the Feature Vector**

**Step 4: Return the Prediction to the User**
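The four steps can be sketched as follows. The attribute names in `order` are assumptions for illustration (the chapter only shows value vectors); what matters is that the order matches the Training and Testing Feature Vectors. The hard-coded dict stands in for interactive `input()` calls so the sketch runs as-is.

```python
# Real-time prediction sketch. Attribute names are illustrative assumptions;
# only their ORDER matters, and it must match the training feature vectors.

def matches(h, x):
    """Hypothesis h predicts Positive for x iff every constraint accepts x."""
    return all(c == '?' or c == v for c, v in zip(h, x))

single_model = ('Short', 'Light', '?', 'Yes', '?', '?')

# Step 1: take input from the user (hard-coded here instead of input())
user_input = {'Height': 'Short', 'Weight': 'Light', 'Hair': 'Long',
              'Attr4': 'Yes', 'Attr5': 'No', 'Attr6': 'Full'}

# Step 2: convert the user input into a feature vector, in training order
order = ('Height', 'Weight', 'Hair', 'Attr4', 'Attr5', 'Attr6')
x = tuple(user_input[a] for a in order)

# Step 3: apply the Model;  Step 4: return the prediction to the user
prediction = 'Positive' if matches(single_model, x) else 'Negative'
print(prediction)  # Positive
```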

- Example - Making Predictions on Real-time Data

**Step 1: Take input from the User**

**Step 2: Convert the User Input into a Feature Vector**

- Note that the **order** of Attributes must be **exactly the same** as that of the Training and Testing Examples

**Step 3: Apply the Single Model on the Feature Vector**

**Step 4: Return the Prediction to the User**

- Positive

**Note**

- You can take Input from the user, apply the Model and return predictions as many times as you like 😊

- Single Model - Feedback Phase

**Only Allah is Perfect****😊**

**Take Feedback on your deployed Single Model from**

- Domain Experts and
- Users

**Improve your Single Model based on the Feedback** 😊

### Ensemble Model Approach

- Ensemble Model – Testing Phase

**Ensemble Model**

- Comprises **all six hypotheses (Models)** in the Version Space **VS_{H,D}**

**Below are the six Models used to make the Ensemble Model**

- Model 1 = <Short, Light, ?, Yes, ?, ?>
- Model 2 = <Short, ?, ?, Yes, ?, ?>
- Model 3 = <Short, Light, ?, ?, ?, ?>
- Model 4 = <?, Light, ?, Yes, ?, ?>
- Model 5 = <Short, ?, ?, ?, ?, ?>
- Model 6 = <?, Light, ?, ?, ?, ?>

**Apply the Ensemble Model on the Test Data**

- x5 = <Short, Light, Short, Yes, Yes, Half> –
- x6 = <Tall, Light, Short, No, Yes, Full> –

**Applying the Ensemble Model on x5**

- The **Final Prediction** is calculated based on the **Majority Vote**
- Prediction returned by the Ensemble Model: x5 is predicted Positive (Incorrectly Classified Instance)

**Applying the Ensemble Model on x6**

- The **Final Prediction** is calculated based on the **Majority Vote**
- Prediction returned by the Ensemble Model: x6 is predicted Negative (Correctly Classified Instance)
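A sketch of the Ensemble Model's majority vote over the six Models (same assumed tuple encoding as before):

```python
from collections import Counter

def matches(h, x):
    """Hypothesis h predicts Positive for x iff every constraint accepts x."""
    return all(c == '?' or c == v for c, v in zip(h, x))

VS = [  # the six models in the Version Space
    ('Short', 'Light', '?', 'Yes', '?', '?'),
    ('Short', '?', '?', 'Yes', '?', '?'),
    ('Short', 'Light', '?', '?', '?', '?'),
    ('?', 'Light', '?', 'Yes', '?', '?'),
    ('Short', '?', '?', '?', '?', '?'),
    ('?', 'Light', '?', '?', '?', '?'),
]

def ensemble_predict(x):
    """Each model votes Positive/Negative; the majority class wins."""
    votes = Counter('Positive' if matches(h, x) else 'Negative' for h in VS)
    return votes.most_common(1)[0][0]

x5 = ('Short', 'Light', 'Short', 'Yes', 'Yes', 'Half')  # actual: Negative
x6 = ('Tall', 'Light', 'Short', 'No', 'Yes', 'Full')    # actual: Negative
print(ensemble_predict(x5))  # Positive (all 6 models vote Positive; incorrect)
print(ensemble_predict(x6))  # Negative (5 of 6 models vote Negative; correct)
```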

- Testing Phase – Ensemble Model

- Ensemble Model – Application Phase

**We assume that our Ensemble Model performed well on large Test Data and can be deployed in the Real-world**

**The Ensemble Model is deployed in the Real-world, and now we can make predictions on Real-time Data**

- Steps – Making Predictions on Real-time Data

**Step 1: Take Input from the User**

**Step 2: Convert the User Input into a Feature Vector**

- **Exactly the same** as the **Feature Vectors** of the Training and Testing Data

**Step 3: Apply the Ensemble Model on the Feature Vector**

**Step 4: Return the Prediction to the User**

- Making Predictions on Real-time Data

**Step 1: Take input from the User**

**Step 2: Convert the User Input into a Feature Vector**

- Note that the **order** of Attributes must be **exactly the same** as that of the Training and Testing Examples

**Step 3: Apply the Ensemble Model on the Feature Vector**

- The **Final Prediction** is calculated based on the **Majority Vote**

**Step 4: Return the Prediction to the User**

- The Ensemble Model **predicts** the unseen example as **Positive**

**Note**

- You can take Input from the user, apply the Model and return predictions as many times as you like 😊

- Ensemble Model – Feedback Phase

**Only Allah is Perfect****😊**

**Take Feedback on your deployed Ensemble Model from**

- Domain Experts and
- Users

**Improve your Ensemble Model based on the Feedback**

### Discussion – Candidate Elimination Algorithm

- Inductive Bias - Candidate Elimination Algorithm

**Inductive Bias**

- Inductive Bias is the **set of assumptions** needed **in addition** to the Training Examples to **deductively justify** the Learner's Classifications

**Inductive Bias of Candidate Elimination Algorithm**

- Training Data is **error-free**
- Target Function / Concept is **present** in the Hypothesis Space (H)

- Training Regime- Candidate Elimination Algorithm

**Incremental Method**

**See****Chapter 3 – Basics of Machine Learning**

- Strengths and Weaknesses – Candidate Elimination Algorithm

**Strengths**

- The Candidate Elimination Algorithm **overcomes the limitation** of the List Then Eliminate Algorithm and provides a **more compact** representation of the Version Space
- The Version Space returned by the Candidate Elimination Algorithm can be used to **make predictions** on **unseen** Data using either a Single Model or an Ensemble Model

**Weaknesses**

- Only works on **error-free** Data; however, Real-world Data is **noisy**
- Works on the **assumption** that the Target Function f is **present** in the Hypothesis Space (H); however, the Target Function f **may or may not be** in the Hypothesis Space (H), and this **may or may not be known**
- Only works when the **values of Attributes** are **Categorical**; however, in the Real-world many Machine Learning Problems have Attributes / Features with **Numeric** values
- Only works for **simple** representations of Data; however, in the Real-world many Machine Learning Problems have **complex** representations (e.g. Face Detection, Face Recognition, etc.)

### TODO and Your Turn

**TODO Task 1**

**Task 1**

**Consider the Titanic Dataset with the following Attributes**

**Gender: Male, Female**

**Ticket Class: Upper, Middle, Lower**

**Parent/Child Aboard: Zero, One, Two, Three**

**Embarked: Cherbourg, Queenstown, Southampton**

**Survival: No, Yes**

**We obtained the following Sample Data**

**Sample Data was split into Training and Testing Data in a**

**Train-Test Split Ratio of 67%-33%**

**Training Data**

**Testing Data**

**Note**

**Consider Candidate Elimination Algorithm when answering questions given below**

**Your answer should be**

**Well Justified**

**Questions**

- Write down the Input and Output for the above Machine Learning Problem
- How is a **Training Example** **represented**?
- How should a **Hypothesis (h)** be **represented**?
- Execute the Machine Learning Cycle
- Execute the Testing and Application Phases using a
  - Single Model and
  - Ensemble Model
- Write down the **observations** you made during the execution of the Machine Learning Cycle

**Your Turn Task 1**

**Task 1**

**Select a Machine Learning Problem (similar to: Titanic – Machine Learning from Disaster) and answer the questions given below.**

**Note**

**Consider Candidate Elimination Algorithm in answering all the questions.**

**Questions**

- Write down the Input and Output for the selected Machine Learning Problem
- How is a **Training Example** **represented**?
- How should a **Hypothesis (h)** be **represented**?
- Calculate the Size of the Instance Space, Concept Space, Syntactically Distinct Hypotheses and Semantically Distinct Hypotheses
- Execute the Machine Learning Cycle
- Execute the Testing and Application Phases using a
  - Single Model and
  - Ensemble Model
- Write down the **observations** you made during the execution of the Machine Learning Cycle

## Chapter Summary

- Chapter Summary

**Problem – FIND-S Algorithm**

- Only returns **one hypothesis** which **best fits** the Training Data
- However, there can be **multiple hypotheses** which **best fit** the Training Data

**Proposed Solution**

- Version Space

**Version Space**

- The Version Space (**VS_{H,D}**) contains the **set of all hypotheses consistent** with the Training Examples
- A **Version Space Algorithm** **computes** the set of all hypotheses consistent with the Training Examples

**Two Algorithms which return a Version Space are**

- List Then Eliminate Algorithm
- Candidate Elimination Algorithm

**List Then Eliminate Algorithm – Summary**

**Representation of Example**

- Attribute-Value Pairs

**Representation of Hypothesis (h)**

- Conjunction (AND) of Constraints on Attributes

**Searching Strategy**

- Exhaustive search (every hypothesis in H is checked against every Training Example)

**Training Regime**

- Incremental Method

**Inductive Bias of List Then Eliminate Algorithm**

- Training Data is **error-free**
- Target Function / Concept is **present** in the Hypothesis Space (H)

**Strengths**

- The List Then Eliminate Algorithm **overcomes the limitation** of the FIND-S Algorithm and returns the **multiple hypotheses** which **best fit** the Training Data, i.e. the Version Space (**VS_{H,D}**)

**Weaknesses**

- Only works on **error-free** Data; however, Real-world Data is **noisy**
- Works on the **assumption** that the Target Function is **present** in the Hypothesis Space (H); however, the Target Function **may or may not be** in the Hypothesis Space (H), and this **may or may not be known**
- The List Then Eliminate Algorithm is **computationally expensive** because it makes a **pairwise comparison** of each Training Example with **every hypothesis in the Hypothesis Space (H)**
- Representing the Hypothesis Space (H) as a **List** is what **forces** the List Then Eliminate Algorithm to make these **pairwise comparisons**

**To reduce the computational cost of Version Space Algorithms**

- Use a **more compact representation** of the **Hypothesis Space (H)**, e.g. the Candidate Elimination Algorithm

**Candidate Elimination Algorithm – Summary**

**Representation of Example**

- Attribute-Value Pairs

**Representation of Hypothesis (h)**

- Conjunction (AND) of Constraints on Attributes

**Searching Strategy**

- Bidirectional search over the General to Specific Ordering of Hypotheses (the Specific Boundary S is generalized and the General Boundary G is specialized)

**Training Regime**

- Incremental Method

**Inductive Bias of Candidate Elimination Algorithm**

- Training Data is **error-free**
- Target Function / Concept is **present** in the Hypothesis Space (H)

**Strengths**

- The Candidate Elimination Algorithm **overcomes the limitation** of the List Then Eliminate Algorithm and provides a **more compact** representation of the Version Space
- The Version Space returned by the Candidate Elimination Algorithm can be used to **make predictions** on **unseen** Data using either a Single Model or an Ensemble Model

**Weaknesses**

- Only works on **error-free** Data; however, Real-world Data is **noisy**
- Works on the **assumption** that the Target Function is **present** in the Hypothesis Space (H); however, the Target Function **may or may not be** in the Hypothesis Space (H), and this **may or may not be known**
- Only works when the **values of Attributes** are **Categorical**; however, in the Real-world many Machine Learning Problems have Attributes / Features with **Numeric** values
- Only works for **simple** representations of Data; however, in the Real-world many Machine Learning Problems have **complex** representations (e.g. Face Detection, Face Recognition, etc.)

**Single Model**

- A Single Model is trained on the Training Data and used to make predictions on the Test Data

**Strengths**

- It is **computationally fast** and requires **less time** to **make predictions** compared to an Ensemble Model

**Weaknesses**

- Error is **likely** to be **high** compared to an Ensemble Model

**Ensemble Model**

- An Ensemble Model works by training **different** Models on the **same** Training Data and using each Model to **individually** make predictions on the Test Data

**Strengths**

- Error is **likely** to be **low** compared to a Single Model

**Weaknesses**

- It is computationally expensive and requires more time to make predictions compared to a Single Model

**The Voting Approach is one of the simplest approaches to combine predictions from different Models**

- In the Voting Approach, we may assign
  - the **same** weight to all Models, or
  - **different** weights to the Models
- **Model Weight** plays an **important role** in making the **Final Prediction**

**The Voting Approach works as follows**

- Given a Test Instance x
- Step 1: Make **individual predictions** using the **different** Models
- Step 2: **Combine the predictions** of the **individual** Models
- Step 3: The **Final Prediction for x** will be the **class** which has the **majority vote**

**Voting Classifier**

- A Voting Classifier is **not an actual Classifier** (Machine Learning Algorithm) but a **wrapper** around a **set of different Machine Learning Algorithms**, which are trained and tested on the **same** Data
- A Voting Classifier **combines the predictions** of the **individual** Machine Learning Algorithms to make **Final Predictions** on **unseen** Data

### In Next Chapter

- In Next Chapter

- In Sha Allah, in the next Chapter, I will present

- Decision Tree Learning