Chapter 7 - Concept Learning and Hypothesis Representation
Chapter Outline
- Chapter Outline
- Quick Recap
- Basics of Concept Learning
- Instance Space, Training Data, Concept Space and Hypothesis Space
- Hypothesis (h) Representation
- Hypothesis Space (H) Representation
- Chapter Summary
Quick Recap
- Quick Recap – Treating a Problem as a Machine Learning Problem - Step by Step Examples
- A Real-world Problem can be treated as a Machine Learning Problem using the following Step by Step approach
- Step 1: Decide the Learning Setting
- Step 2: Obtain Sample Data
- Step 3: Understand and Pre-process Sample Data
- Step 4: Represent Sample Data in Machine Understandable Format
- Step 5: Select Suitable Machine Learning Algorithms
- Step 6: Split Sample Data into Training Data and Testing Data
- Step 7: Select Suitable Evaluation Measure(s)
- Step 8: Execute First Two Phases of Machine Learning Cycle
- Training Phase
- Testing Phase
- Step 9: Analyze Results
- Step 10: Execute 3rd and 4th Phases of Machine Learning Cycle
- Application Phase
- Feedback Phase
- Step 11: Based on Feedback
- Go to Step 1 and Repeat all the Steps
- Three main Learning Settings are
- Supervised Learning
- Unsupervised Learning
- Semi-supervised Learning
- What Type of Data should be obtained depends upon
- Learning Setting you selected in Step 1
- Two Main Choices to Obtain Sample Data are
- Use Existing Corpora / Datasets
- Develop your Own Corpora / Datasets
- If (Corpora / Datasets Exist for your Research Problem)
- Then
- Use existing Corpora / Datasets
- Else
- You will need to develop your own Corpora / Datasets
- Very often, Machine Learning Algorithms understand Data represented in the form of
- Attribute-Value Pair
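As a minimal sketch, one instance can be stored as Attribute-Value Pairs, e.g. as a Python dictionary (the attribute names and values below are illustrative, not from a real dataset):

```python
# One hypothetical instance represented as Attribute-Value Pairs.
student = {
    "Height": "Tall",    # Attribute "Height", Value "Tall"
    "Weight": "Normal",
    "Gender": "Male",
}

# Machine Learning libraries typically consume many such instances at once,
# e.g. as a list of dictionaries or a table with one column per Attribute.
for attribute, value in student.items():
    print(f"{attribute} = {value}")
```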
- We should consider the following main points when choosing suitable Machine Learning Algorithms for your Machine Learning Problem
- Type of Machine Learning Problem
- Number of Parameters
- Size of Training and Testing Data
- Number of Features
- Training and Testing Time
- Accuracy
- Speed and Accuracy in Application Phase
- Machine Learning Algorithms are designed to solve specific Machine Learning Problems
- Two Important Points to Know
- Complete and correct understanding of the Type of Machine Learning Problem you are trying to solve using Machine Learning Algorithms
- In previous studies, what Machine Learning Algorithms have proven to be most effective for the Type of Machine Learning Problem you are solving?
- Good Starting Points for Classification Problems
- Feature-based ML Algorithms
- For Textual Data
- Random Forest
- Support Vector Machine
- Logistic Regression
- Naïve Bayes
- Gradient Boost
- For Image / Video Data
- Support Vector Machine
- Regular Neural Networks
- Logistic Regression
- Naive Bayes
- Extreme Learning Machines
- Random Forest
- Extreme Gradient Boost
- Type II Approximate Reasoning
- For Audio Data
- Connectionist Temporal Classification
- Deep Learning ML Algorithms
- For Textual Data
- Recurrent Neural Networks (RNN)
- Long Short-Term Memory (LSTM)
- BI-LSTM
- Gated Recurrent Units (GRU)
- BI-GRU
- For Image / Video Data
- Convolutional Neural Networks (most popular)
- For Audio Data
- Recurrent Neural Networks (RNN)
- Good Starting Points for Regression Problems
- Feature-based ML Algorithms
- Linear Regression
- Regression Trees
- Lasso Regression
- Multivariate Regression
- Good Starting Points for Sequence to Sequence Problems
- For Textual Data
- Recurrent Neural Networks (RNN)
- Long Short-Term Memory (LSTM)
- BI-LSTM
- Gated Recurrent Units (GRU)
- BI-GRU
- For Image / Video Data
- Convolutional Neural Networks
- For Audio Data
- Recurrent Neural Networks (RNN)
- Good Starting Points for Unsupervised Learning Problems
- Feature-based ML Algorithms
- For Textual Data
- K-Means
- Agglomerative Hierarchical Clustering
- Mean-Shift Clustering Algorithm
- DBSCAN – Density-Based Spatial Clustering of Applications with Noise
- EM using GMM – Expectation-Maximization (EM) Clustering using Gaussian Mixture Models (GMM)
- For Image / Video Data
- K-Means
- Fuzzy C Means
- Deep Learning ML Algorithms
- For Image / Video Data
- Generative Adversarial Networks
- Auto-encoders
- Good Starting Points for Semi-supervised Learning Problems
- Feature based ML Algorithms
- Label Spreading Algorithm
- Question
- Which Machine Learning Algorithm(s) is best for a specific Machine Learning Problem?
- Answer
- Apply all available Machine Learning Algorithms and see which performs best 😊
- Problem
- It requires a lot of effort, time and resources to
- Apply all available Machine Learning Algorithms and find the best one
- A Possible Solution
- Start with Good Starting Points 😊
- Machine Learning Experts say that following Machine Learning Algorithms are Good Starting Points
- Feature based ML Algorithms
- For Structured / Unstructured / Semi-structured Data
- Support Vector Machine
- Logistic Regression
- Deep Learning ML Algorithms
- For Textual Data
- Recurrent Neural Network (RNN)
- For Image / Video Data
- Convolutional Neural Network (CNN)
- For Audio Data
- Recurrent Neural Network (RNN)
- A ML Algorithm’s behavior is affected by
- No. of Parameters
- Size of Training Data
- Size of Training Data plays a very important role in the Selection of Suitable ML Algorithms
- Feature-based ML Algorithms
- Feature-based ML Algorithms (a.k.a. Classical ML Algorithms) can be accurately trained, even if the Training Data is small
- Deep Learning ML Algorithms
- To accurately train Deep Learning Algorithms, a huge amount of Training Data is required
- Size of Testing Data
- Size of Testing Data plays a very important role when evaluating a Machine Learning Algorithm
- To deploy a Model in Real-world (Application Phase), it should fulfill the following two conditions
- Model should perform well (Condition 01) on large Test Data (Condition 02)
- The Number of Features used to train a Model has a significant impact on the performance of the Model
- Selection of most discriminating Features is important to get good results
- In Text / Image / Video / Genetic Corpora / Datasets
- Number of Features is very high compared to the No. of Instances in a Corpus / Dataset
- Two popular and widely used approaches to reduce Number of Features in a Corpus / Dataset are
- Feature Reduction
- Feature Reduction (a.k.a. Dimensionality Reduction) is a process which transforms Features into a lower dimension
- Feature Selection
- Feature Selection is the process of selecting the most discriminating (or important) subset of Features (excluding redundant or irrelevant Features) from the Original Set of Features (without changing them)
- Popular Methods for Feature Reduction are
- Principal Component Analysis
- Generalized Discriminant Analysis
- Auto-encoders
- Non-negative Matrix Factorization
- Popular Methods for Feature Selection are
- Wrapper Methods
- Filter Methods
- Feature Extraction
- Creates new Features
- Feature Selection
- Selects a subset of Features from the Original Set of Features
- Given a Corpus / Dataset
- First carry out
- Feature Extraction then
- Feature Reduction / Feature Selection
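The difference between the two can be sketched with plain NumPy (the feature matrix below is made up): Feature Selection keeps a subset of the original columns unchanged (here, the two with the highest variance, a simple filter method), while Feature Reduction transforms the features into a lower dimension (here, PCA computed via SVD).

```python
import numpy as np

# Toy feature matrix: 6 instances x 4 features (values are made up).
X = np.array([
    [1.0, 0.1, 10.0, 5.0],
    [2.0, 0.1, 12.0, 5.1],
    [3.0, 0.2, 11.0, 4.9],
    [4.0, 0.1,  9.0, 5.0],
    [5.0, 0.2, 13.0, 5.1],
    [6.0, 0.1, 10.0, 4.9],
])

# Feature Selection: keep the k original columns with the highest variance;
# the chosen Features themselves are unchanged.
k = 2
variances = X.var(axis=0)
top_k = np.argsort(variances)[-k:]
X_selected = X[:, top_k]

# Feature Reduction: project onto k principal components (PCA via SVD);
# the new Features are transformed combinations of the original ones.
X_centered = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
X_reduced = X_centered @ Vt[:k].T

print(X_selected.shape, X_reduced.shape)  # both (6, 2)
```

Either way, the Model is then trained on 2 Features instead of 4.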
- Training and Testing Time mainly depends upon two main factors
- Size of Training and Testing Data
- Target Accuracy
- Training Time of Deep Learning ML Algorithms is quite high compared to Feature based ML Algorithms
- The Target Accuracy may differ from Machine Learning Problem to Machine Learning Problem
- Speed and Accuracy requirements in Application Phase, may vary from Machine Learning Problem to Machine Learning Problem
- Standard Practice for Splitting Sample Data
- Use a Train-Test Split Ratio of
- 67% – 33%
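A minimal sketch of the 67% – 33% split over 100 hypothetical instances (indices stand in for real examples; a fixed seed keeps the split reproducible). Libraries such as scikit-learn provide equivalent helpers.

```python
import random

# Hypothetical Sample Data: 100 labelled instances (indices stand in for real examples).
sample = list(range(100))

random.seed(42)          # fixed seed so the split is reproducible
random.shuffle(sample)   # shuffle before splitting to avoid ordering bias

split = int(len(sample) * 0.67)   # 67% for training
train_data = sample[:split]
test_data = sample[split:]

print(len(train_data), len(test_data))  # 67 33
```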
- Selection of Suitable Evaluation Measure(s) is important to
- correctly evaluate the performance of a Model
- Selection of Suitable Evaluation Measure(s) mainly depends on
- Type of Machine Learning Problem
- Some of the most popular and widely used Evaluation Measures for Classification Problems are
- Baseline Accuracy (a.k.a. Most Common Categorization (MCC))
- Accuracy
- True Negative Rate
- False Positive Rate
- False Negative Rate
- Recall or True Positive Rate or Sensitivity
- Precision (a.k.a. Positive Predictive Value)
- F1
- Area Under the Curve (AUC)
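Most of these measures can be computed from the four confusion-matrix counts; a sketch with made-up binary labels:

```python
# Hand-computed evaluation measures for a hypothetical binary classifier
# (the labels below are made up for illustration).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 1, 0, 0, 1, 0, 1, 0]

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

accuracy  = (tp + tn) / len(y_true)
recall    = tp / (tp + fn)          # True Positive Rate / Sensitivity
precision = tp / (tp + fp)
f1        = 2 * precision * recall / (precision + recall)

print(accuracy, precision, recall, f1)
```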
- Some of the most popular and widely used Evaluation Measures for Regression Problems are
- Mean Absolute Error (MAE)
- Mean Squared Error (MSE)
- Root Mean Squared Error (RMSE)
- R2 or Coefficient of Determination
- Adjusted R2
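A sketch computing MAE, MSE, RMSE and R2 from made-up predictions:

```python
import math

# Hypothetical regression predictions (values are made up for illustration).
y_true = [3.0, 2.0, 4.0, 5.0]
y_pred = [2.5, 2.0, 4.5, 4.0]

n = len(y_true)
mae  = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
mse  = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
rmse = math.sqrt(mse)

mean_true = sum(y_true) / n
ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))   # residual sum of squares
ss_tot = sum((t - mean_true) ** 2 for t in y_true)           # total sum of squares
r2 = 1 - ss_res / ss_tot   # Coefficient of Determination

print(mae, mse, rmse, r2)
```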
- Some of the most popular and widely used Evaluation Measures for Sequence-to-Sequence Problems are
- ROUGE (Recall-Oriented Understudy for Gisting Evaluation)
- BLEU (Bi-Lingual Evaluation Understudy)
- METEOR (Metric for Evaluation of Translation with Explicit Ordering)
- Recall the Equation
- Data=Model+Error
- Training Phase
- Use Training Data to build the Model
- Testing Phase
- Use Testing Data to evaluate the performance of the Model
- i.e. calculate Error in the Model
- When analyzing results, remember the Machine Learning Assumption
- In Application Phase
- Deploy the Model in the Real-world to make predictions on unseen data
- In Feedback Phase
- Take Feedback from
- Domain Experts
- Users of the ML system
- Based on Feedback from Domain Experts and Users
- Improve your Model
- In this Chapter, we treated (Step by Step) following four Real-world Problems as Machine Learning Problems
- GPA Prediction Problem
- Emotion Prediction Problem
- Text Summarization Problem
- Machine Translation Problem
Basics of Concept Learning
- Recap
- Recall that
- Most of the Machine Learning Algorithms use Inductive Learning Approach to learn
- Concept Learning is a major subclass of Inductive Learning
- Concept Learning – Psychology
- Definition
- Concept Learning refers to the human ability
- to learn categories for Real-world Objects and
- to recognize new instances of those categories
- Purpose
- To completely and correctly understand the characteristics and behaviors of Real-world Objects falling in different categories and use them according to instructions of Allah (to get maximum benefits)
- Importance
- Without Concept Learning, it would be difficult to learn the category of a Real-world Object and use it efficiently and properly
- Applications
- Concept learning helps to speed up the learning process because
- If you completely and correctly understand the characteristics and behaviors of a category of a Real-world Object, then
- new (unseen) Real-world Objects falling in the same category will have similar characteristics and behavior
- Object – Machine Learning
- Definition
- In Machine Learning, a Real-world Object is a material thing that can be seen and touched
- Two Main Properties of a Real-world Object
- Characteristics / Attributes / Features
- Behavior
- Importance - Characteristics and Behavior of a Real-world Object
- Characteristics and Behavior of a Real-world Object help us to
- learn to categorize Real-world Objects (or distinguish one Real-world Object from another)
- Rule of Thumb
- Similar Real-world Objects will have Similar Characteristics and Behavior and vice versa
- Example
- Consider the following two Real-world Objects
- Real-world Object 1 – Book
- Characteristics
- Author(s), Title, No. of Pages, Price
- Behavior
- Open, Read, Close
- Real-world Object 2 – Human
- Characteristics
- Height, Weight, Hair Length, Beard, Scarf
- Behavior
- Walk, Talk, Eat, Run
- Observation
- Characteristics and Behavior of Real-world Object 1 (Book) are entirely different from those of Real-world Object 2 (Human)
- Conclusion
- Book and Human are two different Real-world Objects and will fall in different categories
- Example 1 – Categorizing Real-world Objects
- Given
- A Real-world Object
- Instructor of Machine Learning Course
- Categories
- Human vs Non-Human
- Goal
- Learn to categorize Real-world Object (Instructor) into one of the two categories
- Category / Class 1 = Human
- Category / Class 2 = Non-Human
- Observation
- Characteristics and Behavior of Instructor (Real-world Object) are
- similar to Human Category and
- dissimilar to Non-Human Category
- Conclusion
- Instructor (Real-world Object) is categorized as
- Human
- Example 2 – Categorizing Real-world Objects
- Given
- A Real-world Object
- Car
- Categories
- Human vs Non-Human
- Goal
- Learn to categorize Real-world Object (Car) into one of the two categories
- Category / Class 1 = Human
- Category / Class 2 = Non-Human
- Observation
- Characteristics and Behavior of Car (Real-world Object) are
- similar to Non-Human Category and
- dissimilar to Human Category
- Conclusion
- Car (Real-world Object) is categorized as
- Non-Human
- Representing Real-world Objects in Machines
- A Major Problem
- Machine is dumb and cannot understand Real-world Objects (as they are)
- Solution
- Change the representation of Real-world Objects to a Format which Machine Learning Algorithms can understand
- Example 1 – Representing Real-world Objects in Machines
- Real-world Object
- Student
- Few Characteristics of a Student
- Student ID
- Student Name
- Gender
- Degree Program
- CGPA
- Question
- How to represent COMSATS University, Lahore Campus students (Real-world Objects) in a Machine?
- A Possible Solution
- Use Attribute-Value Pair
- Represent Characteristics of a Student (Real-world Object) using a set of Attributes
- Representation of 3 COMSATS Students in Machine (using a Database Table called Tbl-Student)
- Real-world Object Representation in Object-Oriented Programming (OOP)
- In Object-Oriented Programming (OOP)
- Characteristics are represented using Variables and
- Behavior is represented using Functions / Methods
- Concept Learning – Machine Learning
- Definition
- Concept Learning refers to a machine’s ability
- to learn categories for Real-world Objects and
- to recognize new instances of those categories
- Much learning involves acquiring general Concepts from specific examples
- Question
- Why use specific examples to learn general Concepts?
- Answer
- It is practically impossible to collect all examples of a Concept
- Example
- Machine Learning Problem
- Gender Identification
- Two Main Situations in Collecting Examples to Learn the Concept of Gender Identification
- Ideal Situation
- All examples (all human beings in the world) are collected to learn the Concept of Gender Identification
- Realistic Situation
- Only a small subset of examples (or Sample) is collected to learn the Concept of Gender Identification
- In practice, we are always in the Realistic Situation
- Concept Learning – Formal Definition
- Definition
- Given
- Set of Training Examples (D) and
- Set of Functions / Hypothesis (H)
- Infer
- A Boolean-valued Function using the Set of Training Examples
- Possible Output Values of a Boolean-valued Function
- A Boolean-valued Function has two possible Output values
- For example
- 1 (Yes)
- 0 (No)
- Recall
- In a Binary Classification Problem, there are
- Two possible Output values
- Therefore, Concept Learning is also known as
- Binary Classification
- Positive Examples vs Negative Examples
- Positive Example
- In Concept Learning, if Output is 1 / True / Yes, then it is called a Positive Example
- Negative Example
- In Concept Learning, if Output is 0 / False / No, then it is called a Negative Example
- Example – Learning Boolean-valued Functions
- Concept
- For a given Real-world Object
- Identify if it is a Human or not?
- Input
- Object
- Categories / Classes
- Category / Class 1 = Human (True)
- Category / Class 2 = Non-Human (False)
- Output
- True
- If Input (Real-world Object) is Human
- False
- If Input (Real-world Object) is Non-Human
- Example 1
- Input
- Car
- Output
- False
- Example 2
- Input
- Student
- Output
- True
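The two examples above can be sketched as a Boolean-valued Function; the set of Human objects here is a hypothetical stand-in for the knowledge a learner would actually acquire:

```python
# A minimal sketch of a Concept as a Boolean-valued Function.
# HUMAN_OBJECTS is a hypothetical stand-in for learned knowledge.
HUMAN_OBJECTS = {"Student", "Instructor"}

def is_human(real_world_object: str) -> bool:
    """Target Concept c: Object -> {True, False} (Human vs Non-Human)."""
    return real_world_object in HUMAN_OBJECTS

print(is_human("Car"))      # False (Example 1)
print(is_human("Student"))  # True  (Example 2)
```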
- Concept as a Function
- Recall
- Much learning involves
- Learning Input-Output Functions
- Therefore, to learn any Concept, the solution is to
- Learn a Function
- Conclusion
- A Concept is a Function, which we want to learn
TODO and Your Turn
Todo Tasks
Your Turn Tasks
Todo Tasks
TODO Task 1
- Task 1
- In Sufia (صوفیا), there are four main Silsila Alia (سلسلہ)
- Chistia (چشتیہ)
- Qadria (قادریہ)
- Naqshbandia (نقشبندیہ )
- Suharwardia (سهروردیہ )
- Some of the main branches of Chistia Silsila Alia are
- Chistia Owaisia
- Chistia Sabria (Hazrat Ali Ahmed Sabir Sb R.A.)
- Chistia Nizamia (Hazrat Khawaja Nizam-ud-Din Aulia R.A.)
- Note
- Your answer should be
- Well Justified
- Questions
- What is the main goal of all these Silsila Alia?
- Chistia Silsila Alia was started by Hazrat Khawaja Moen-ud-Din Chisti Ajmeri R.A.
- Write down the names of great personalities who started the other Silsila Alia mentioned in the Task?
- Write down Characteristics of each Silsila Alia?
- Write down similarities and differences in all Silsila Alia mentioned in the Task?
- Search if there are branches in other Silsila Alia (similar to Chistia Silsila Alia)?
- Find out common patterns in branches of Silsila Alia?
Your Turn Tasks
Your Turn Task 1
- Task 1
- Select a Task (similar to Silsila Alia in Sufia) and answer the questions below
- Questions
- Write down the Characteristics and Behavior of each Real-world Object / Concept present in your selected Task?
- Identify similar and different Real-world Objects / Concepts present in your selected Task?
- Identify common patterns?
Instance Space, Training Data, Concept Space and Hypothesis Space
- Instance Space (X)
- Definition
- Instance Space (X) is the set of all possible combinations of Input Attribute values
- In Machine Learning, Instance Space is same as
- Population in Statistics
- Symbols
- X
- represents Instance Space
- |X|
- represents Size of Instance Space
- Formula to Compute |X|
- |X| = (No. of Values of Input Attribute 1) x (No. of Values of Input Attribute 2) x (No. of Values of Input Attribute 3) x … x (No. of Values of Input Attribute n-1) x (No. of Values of Input Attribute n)
- where n represents
- Total Number of Input Attributes
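The formula can be sketched in code; the two attribute sets below match Examples 1 and 2 that follow:

```python
from math import prod

# Size of the Instance Space is the product of the number of possible
# values of each Input Attribute.
example1 = {"Height": ["Short", "Medium", "Tall"],
            "Weight": ["Normal", "Heavy"]}
example2 = {"Height": ["Short", "Medium", "Tall"],
            "Weight": ["Small", "Normal", "Heavy"],
            "Scarf":  ["Yes", "No"]}

def instance_space_size(attributes):
    return prod(len(values) for values in attributes.values())

print(instance_space_size(example1))  # 3 x 2 = 6
print(instance_space_size(example2))  # 3 x 3 x 2 = 18
```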
- Example 1 – Computing Size of Instance Space
- Concept to be Learned
- Gender Identification
- Input
- Human
- Output
- Gender of the Human
- Representation of Input i.e. Human
- Set of 2 Input Attributes
- Input Attribute 1
- Height
- Input Attribute 2
- Weight
- Possible Values of Input Attributes
- Values – Input Attribute 1 (Height)
- Short
- Medium
- Tall
- Values – Input Attribute 2 (Weight)
- Normal
- Heavy
- Representation of Output i.e. Gender of a Human
- Set of 1 Output Attribute
- Output Attribute
- Gender
- Possible Values of Output Attribute
- Values – Output Attribute (Gender)
- Male
- Female
- Note
- Gender Identification is a Concept Learning Problem
- Therefore, the Output Attribute (Gender) has only
- 2 possible Output values
- Computation
- |X| = (Values of Height) x (Values of Weight) = 3 x 2 = 6
- Example 2 – Computing Size of Instance Space
- Concept to be Learned
- Gender Identification
- Input
- Human
- Output
- Gender of the Human
- Representation of Input i.e. Human
- Set of 3 Input Attributes
- Input Attribute 1
- Height
- Input Attribute 2
- Weight
- Input Attribute 3
- Scarf
- Possible Values of Input Attributes
- Values – Input Attribute 1 (Height)
- Short
- Medium
- Tall
- Values – Input Attribute 2 (Weight)
- Small
- Normal
- Heavy
- Values – Input Attribute 3 (Scarf)
- Yes
- No
- Representation of Output i.e. Gender of a Human
- Set of 1 Output Attributes
- Output Attribute
- Gender
- Possible Values of Output Attribute
- Values – Output Attribute (Gender)
- Male
- Female
- Note
- Gender Identification is a Concept Learning problem
- Therefore, the Output Attribute (Gender) has only
- 2 possible Output values
- Computation
- |X| = (Values of Height) x (Values of Weight) x (Values of Scarf) = 3 x 3 x 2 = 18
- Training Data (D)
- Sample Data
- Subset of Instance Space (X)
- Size of Sample Data is always less than the Size of Instance Space
- In Machine Learning, Sample Data is same as
- Sample in Statistics
- Sample Data is split into
- Training Data (D)
- Testing Data
- Formally, each d ∈ D is composed of
- <di, c(di)>
- di represents the Input
- c(di) represents the Output
- Why a Concept Cannot be Completely Learned?
- Recall
- Concept Learning is a major subclass of Inductive Learning and
- Inductive Learning Approach has Scope of Error
- Why a Concept Cannot be Completely Learned?
- To completely learn a Concept, we need
- entire Instance Space (X)
- Problem
- Practically impossible to acquire the entire Instance Space (X)
- What we have is Training Data (D)
- D is a subset of X
- A Machine Learning Model learns from Training Data
- i.e. Machine Learning Model learns from incomplete data
- Therefore, a Concept cannot be completely learned
- Concept Space (C)
- Definition
- Concept Space (C) is the set of all possible combinations of Output values
- Symbols
- C
- represents Concept Space
- |C|
- represents Size of Concept Space
- Formula to Compute |C|
- |C| = 2^|X|
- where |X| represents the Size of Instance Space
- Examples – Computing Size of Concept Space
- Example 1
- Considering a Concept Learning Problem in which
- |X| = 4
- |C| = 2^|X| = 2^4 = 16
- Example 2
- Considering a Concept Learning Problem in which
- |X| = 9
- |C| = 2^|X| = 2^9 = 512
- Example 3
- Considering a Concept Learning Problem in which
- |X| = 96
- |C| = 2^|X| = 2^96 = 79,228,162,514,264,337,593,543,950,336
- Note
- In these three examples
- As |X| increases, the value of |C| exponentially increases and
- For |X| = 96, the value of |C| is very huge
- Real-world Corpora / Datasets may comprise a large number of Input Attributes, which will result in a
- very huge Concept Space
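A quick check of the three examples above (Python integers are arbitrary-precision, so 2^96 is computed exactly):

```python
# Size of the Concept Space grows as 2 to the power |X|;
# the three |X| values below match the examples above.
def concept_space_size(instance_space_size):
    return 2 ** instance_space_size

for x in (4, 9, 96):
    print(x, concept_space_size(x))
```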
- Recall
- Learning is a Searching Problem
- A Major Problem
- The bigger the |C|, the more challenging it becomes to search the Concept Space
- How to Search a Huge Concept Space?
- Problem with a Huge Concept Space
- The computational cost and time required to search a huge Concept Space will be very high
- Question
- How to efficiently and quickly search a huge Concept Space to find a Concept / Function which best fits the Training Data?
- A Possible Solution
- Reduce the size of Concept Space
- Hypothesis Space (H)
- Definition
- The reduced Concept Space (C) is called the Hypothesis Space (H)
- Question
- How to reduce the size of Concept Space (C)?
- A Possible Solution
- Introduce search bias
- Example – Using Search Bias to Reduce Size of Concept Space
- Consider the following searching problem
- There are a total of 40 students (25 Male and 15 Female) in Machine Learning class. Adeel is the topper of the class with a CGPA of 3.86. Our goal is to automatically search and find out the topper of the class.
- Target Concept / Function to be Searched (or Learned)
- Topper of Machine Learning Class
- i.e. Adeel (CGPA = 3.86)
- |C| = Size of Machine Learning Class = 40
- Question
- How to introduce search bias to reduce the size of Concept Space?
- A Possible Solution
- Introduce Gender Bias
- Choice 1 – Gender Bias
- Exclude Males from Concept Space
- |HFemales| = 40 – 25 = 15
- Note that Hypothesis Space (HFemales) only contains Female students of the Machine Learning class
- Choice 2 – Gender Bias
- Exclude Females from Concept Space
- |HMales| = 40 – 15 = 25
- Note that Hypothesis Space (HMales) only contains Male students of the Machine Learning class
- Example – Using Search Bias to Reduce Size of Concept Space, Cont…
- Now we have three spaces
- Concept Space
- |C| = 40
- Hypotheses Space only containing Female students
- |HFemales| = 15
- Hypotheses Space only containing Male students
- |HMales| = 25
- Example – Using Search Bias to Reduce Size of Concept Space, Cont…
- Target Concept / Function to be Searched (or Learned)
- Topper of Machine Learning Class
- i.e. Adeel (CGPA = 3.86)
- Searching Target Function / Concept in |C|
- Outcome of Search
- We will definitely find the Target Function / Concept in Concept Space (C)
- Reason
- We are searching the entire Concept Space (C)
- Therefore, we cannot miss the Target Function / Concept
- Searching Target Function / Concept in HFemales
- Outcome of Search
- We will not find the Target Function / Concept in Hypothesis Space (HFemales)
- Reason
- We are searching a reduced Concept Space (HFemales), which excludes Adeel (a Male student)
- Therefore, we may / may not find the Target Function / Concept in a Hypothesis Space
- Searching Target Function / Concept in HMales
- Outcome of Search
- We will find the Target Function / Concept in Hypothesis Space (HMales)
- Reason
- We are searching a reduced Concept Space (HMales), which still contains Adeel (a Male student)
- Therefore, we may / may not find the Target Function / Concept in a Hypothesis Space
- Conclusion
- Searching a Concept Space
- We will definitely find the Target Function / Concept
- Searching a Hypothesis Space
- We may / may not find the Target Function / Concept and
- this may / may not be known to us
- Concept Space vs Hypothesis Space
- Searching a Concept Space
- Strengths
- We will definitely find the Target Function / Concept
- Weaknesses
- Computational cost is high
- Amount of Time required for searching is high
- Searching a Hypothesis Space
- Strengths
- Computational cost is low
- Amount of Time required for searching is small
- Weaknesses
- We may / may not find the Target Function / Concept and this may / may not be known to us
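The class-topper example can be sketched as follows; apart from Adeel (CGPA 3.86), the student records are hypothetical and the class is shrunk to four students:

```python
# Sketch of the class-topper search under a gender bias.
# Apart from Adeel (CGPA 3.86, from the example), records are hypothetical.
students = [
    {"name": "Adeel", "gender": "Male",   "cgpa": 3.86},
    {"name": "Sana",  "gender": "Female", "cgpa": 3.70},
    {"name": "Bilal", "gender": "Male",   "cgpa": 3.40},
    {"name": "Hira",  "gender": "Female", "cgpa": 3.55},
]

def topper(space):
    """Return the highest-CGPA student in a search space (None if empty)."""
    return max(space, key=lambda s: s["cgpa"]) if space else None

concept_space = students                                      # search everything
h_males   = [s for s in students if s["gender"] == "Male"]    # bias: exclude Females
h_females = [s for s in students if s["gender"] == "Female"]  # bias: exclude Males

print(topper(concept_space)["name"])  # Adeel -- the full space never misses
print(topper(h_males)["name"])        # Adeel -- this biased space still contains him
print(topper(h_females)["name"])      # Sana  -- target missed
```

Note that searching HFemales still returns a "topper", just the wrong one, and nothing in the result itself reveals the miss; that is the risk of search bias.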
TODO and Your Turn
Todo Tasks
Your Turn Tasks
Todo Tasks
TODO Task 2
- Task 1
- Titanic sank in 1912 (some passengers survived and remaining died). Kaggle launched a competition titled: Titanic: Machine Learning from Disaster
- Below are some of the Attributes (with their Possible Values) taken from Titanic Project Dataset
- Gender: Male, Female
- Ticket Class: Upper, Middle, Lower
- Survival: No, Yes
- Parent/Child Aboard: Zero, One, Two, Three
- Embarked: Cherbourg, Queenstown, Southampton
- Goal of a Machine Learning Algorithm is to predict
- Whether a Passenger Survived or Not?
- Note
- Your answer should be
- Well Justified
- Questions
- Write down Input and Output for the above discussed Machine Learning Problem?
- Compute the Size of Instance Space
- Compute the Size of Concept Space
- Can we say that this Machine Learning Problem is a Binary Classification Problem? Explain.
- Can you completely learn the above-discussed Concept?
- Task 2
- Rizwan wants to search for Adeel in a group of 300 people. There are a total of 3 groups: (1) Young – contains 100 people, (2) Middle – contains 125 people and (3) Old – contains 75 people. Note that Adeel falls in the group of young people.
- Questions
- What is the Target Function / Concept?
- What is the Size of Concept Space?
- How can Rizwan introduce search bias to reduce the Size of Concept Space?
- How many Search Spaces will be made after introducing search bias?
- Discuss each Search Space separately and identify in which Search Spaces you will find the Target Function / Concept and in which you will miss it
Your Turn Tasks
Your Turn Task 2
- Task 1
- Select a Machine Learning Problem (similar to Titanic: Machine Learning from Disaster) and answer the questions given below
- Whether a Passenger Survived or Not?
- Questions
- Write down Input and Output for your selected Machine Learning Problem?
- Compute the Size of Instance Space
- Compute the Size of Concept Space
- Is your selected Machine Learning Problem a Binary Classification Problem? Explain.
- Can you completely learn the Target Function / Concept presented in your Machine Learning Problem?
- Task 2
- Select a scenario (similar to the one given in TODO Task 2) and answer the questions given below.
- Questions
- What is the Target Function / Concept?
- What is the Size of Concept Space?
- How can you introduce search bias to reduce the Size of Concept Space?
- How many Search Spaces will be made after introducing search bias?
- Discuss each Search Space separately and identify in which Search Spaces you will find the Target Function / Concept and in which you will miss it
Hypothesis (h) Representation
- Machine is Dumb
- Recall
- Input to Learner
- Set of Training Examples (D)
- Set of Functions / Hypothesis (H)
- Machine is Dumb
- Therefore, we will need to represent
- Both Training Example (d) and Hypothesis (h) in a Format, which
- Machine can understand
- Representation of Training Example (d)
- Very often, a Training Example (d) is represented as
- Attribute-Value Pair
- Values of Attributes can be mainly categorized as
- Categorical
- Numeric
- Examples
- Following Machine Learning Algorithms represent a Training Example (d) in the form of Attribute-Value Pair
- FIND-S Algorithm
- List Then Eliminate Algorithm
- Candidate Elimination Algorithm
- Decision Tree Learning Algorithms
- For example, ID3, J48, Random Forest
- Artificial Neural Networks
- For example, Multi-Layer Perceptron
- Bayesian Learning Algorithms
- For example, Naïve Bayes
- Instance-based Learning Algorithms
- For example, k-NN
- Representation of Hypothesis (h)
- Representation of Hypothesis (h) may vary from Machine Learning Algorithm to Machine Learning Algorithm
- Examples
- Representation of Hypothesis (h) for FIND-S, List Then Eliminate and Candidate Elimination Algorithms is
- Conjunction (AND) of Constraints on Attributes
- Representation of Hypothesis (h) for ID3 Algorithms is
- Disjunction (OR) of Conjunction (AND) of Constraints on Attributes
- Representation of Hypothesis (h) for Regular Neural Network Algorithms is
- Combination of Weights between Units
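The conjunction-of-constraints representation used by FIND-S-style algorithms can be sketched as a tuple, where '?' accepts any value for that attribute (the attribute order <Height, Weight> follows the Gender Identification example discussed in this chapter):

```python
# A hypothesis as a conjunction (AND) of constraints on attributes,
# FIND-S style: '?' accepts any value for that attribute.
# (The 'accept nothing' null constraint is omitted in this sketch.)
def satisfies(hypothesis, instance):
    """True if the instance meets every constraint in the hypothesis."""
    return all(c == "?" or c == v for c, v in zip(hypothesis, instance))

h = ("?", "Normal")   # "any Height AND Weight = Normal"

print(satisfies(h, ("Short", "Normal")))  # True
print(satisfies(h, ("Tall", "Heavy")))    # False
```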
- Example – Representation of Training Example and Hypothesis
- In sha Allah, in this example, I will consider the Gender Identification Problem and will try to explain the representation of both
- Training Example (d) and Hypothesis (h)
- Example – Representation of Training Example and Hypothesis, Cont…
- Concept to be Learned
- Gender Identification
- Input
- Human
- Output
- Gender of the Human
- Representation of Example
- Recall
- Example = Input + Output
- Representation of Input i.e. Human
- Set of 2 Input Attributes
- Input Attribute 1
- Height
- Input Attribute 2
- Weight
- Values of Input Attributes
- Values – Input Attribute 1 (Height)
- Short
- Medium
- Tall
- Values – Input Attribute 2 (Weight)
- Small
- Normal
- Heavy
- Representation of Output i.e. Gender of a Human
- Set of 1 Output Attributes
- Output Attribute
- Gender
- Values of Output Attribute
- Values – Output Attribute (Gender)
- Yes (Female)
- No (Male)
- Note
- Example (both Input and Output) are represented as
- Attribute-Value Pair
- Computing Size of Instance Space (X)
- |X| = (Values of Height) x (Values of Weight) = 3 x 3 = 9
- Instance Space (X)
- X = {x1, x2, x3, x4, x5, x6, x7, x8, x9}
- Let’s see each example separately, considering the following order of Input Attributes
- <Height, Weight>
- x1 = <Short, Small>
- x2 = <Short, Normal>
- x3 = <Short, Heavy>
- x4 = <Medium, Small>
- x5 = <Medium, Normal>
- x6 = <Medium, Heavy>
- x7 = <Tall, Small>
- x8 = <Tall, Normal>
- x9 = <Tall, Heavy>
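The nine instances can be generated mechanically as the cross product of the two attribute value sets; a minimal Python sketch (attribute values taken from the slides above):

```python
from itertools import product

# Possible values of the two Input Attributes (from the slides)
heights = ["Short", "Medium", "Tall"]
weights = ["Small", "Normal", "Heavy"]

# Instance Space (X): every combination <Height, Weight>
X = list(product(heights, weights))
for i, x in enumerate(X, start=1):
    print(f"x{i} = <{x[0]}, {x[1]}>")

print(len(X))  # 9 instances
```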
- Sample Data
- Assume we obtained Sample Data of 6 instances
- Sample Data Split
- Sample Data is split into
- Training Data – 2 / 3 of Sample Data
- Testing Data – 1 / 3 of Sample Data
- Formally, each d ∈ D is composed of di, c(di)
- di represents the Input
- c(di) represents the Output
- Training Data (D)
- Testing Data
- Note
- We have completed the Representation of Training Examples (D) in the form of
- Attribute-Value Pair
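As a sketch, the Training Data (D) can be held as a list of (input, output) Attribute-Value pairs; the specific labels below are illustrative only, since the actual training table is not reproduced here:

```python
# Each training example d is a pair (d_i, c(d_i)):
# d_i is the input <Height, Weight>, c(d_i) is the output Gender label.
# The labels below are hypothetical, not taken from the slides.
D = [
    (("Short", "Small"),  "Yes"),  # Yes = Female
    (("Short", "Normal"), "Yes"),
    (("Tall",  "Heavy"),  "No"),   # No = Male
    (("Tall",  "Normal"), "No"),
]
for d_i, c_d_i in D:
    print(d_i, "->", c_d_i)
```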
- In sha Allah, in next slides we will discuss the
- Representation of Hypothesis (h)
- Concept Space (C)
- Concept Space (C)
- Representation of Concepts
- Two Main Representations
- Conjunction (AND) of Input Attributes
- Disjunction (OR) of Conjunction (AND) of Input Attributes
- Note a Concept can also be written in the form of
- Rules (if-then)
- Example 1 - Concept Representation as Conjunction (AND) of Input Attributes
- Concept
- C = < Height = Short AND Weight = Normal>
- Concept in the form of Rules
- Concept in Tabular Form
- Example 2 - Concept Representation as Conjunction (AND) of Input Attributes
- Concept
- C = < Height = Medium AND Weight = Normal>
- Concept in the form of Rules
- Concept in Tabular Form
- Example 3 - Concept Representation as Conjunction (AND) of Input Attributes
- Concept
- C = < Height = Short AND Weight = Heavy>
- Concept in the form of Rules
- Concept in Tabular Form
- Observations - Concept Representation as Conjunction (AND) of Input Attributes
- In all three examples given on previous slides, Output is Yes only for
- One combination of Height and Weight
- Example 1 - Concept Representation as Disjunction (OR) of Conjunction (AND) of Input Attributes
- Concept
- C = < Height = Short AND Weight = Small OR
Height =Medium AND Weight = Small OR
Height = Short AND Weight = Normal OR
Height = Medium AND Weight = Normal OR
Height = Short AND Weight = Heavy >
- Concept in the form of Rules
- Concept in Tabular Form
- Example 2 - Concept Representation as Disjunction (OR) of Conjunction (AND) of Input Attributes
- Concept
- C = < Height = Short AND Weight = Small OR
Height = Short AND Weight = Normal OR
Height = Medium AND Weight = Normal OR
Height = Short AND Weight = Heavy >
- Concept in the form of Rules
- Concept in Tabular Form
- Example 3 - Concept Representation as Disjunction (OR) of Conjunction (AND) of Input Attributes
- Concept
- C = < Height = Medium AND Weight = Normal OR
Height = Short AND Weight = Heavy >
- Concept in the form of Rules
- Concept in Tabular Form
- Observations - Concept Representation as Disjunction (OR) of Conjunction (AND) of Input Attributes
- In all three examples given on previous slides, Output is Yes for
- Multiple combinations of Height and Weight
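A concept written as a Disjunction (OR) of Conjunctions (AND) amounts to a set of permitted attribute combinations; a small Python sketch using the concept of Example 3 above (the function name `concept_output` is mine):

```python
# A Concept as a Disjunction (OR) of Conjunctions (AND) of attribute values:
# the Output is Yes exactly for the listed <Height, Weight> combinations.
# This particular concept is Example 3 from the slides.
concept = [
    ("Medium", "Normal"),
    ("Short", "Heavy"),
]

def concept_output(height, weight):
    """Return 'Yes' if any conjunction in the disjunction matches."""
    return "Yes" if (height, weight) in concept else "No"

print(concept_output("Medium", "Normal"))  # Yes
print(concept_output("Tall", "Heavy"))     # No
```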
- Comparison of Concept Representations
- Question
- Which of the two Concept Representations is more powerful?
- Conjunction (AND) of Input Attributes
- Disjunction (OR) of Conjunction (AND) of Input Attributes
- Answer
- Disjunction (OR) of Conjunction (AND) of Input Attributes
- Reason
- We can represent more complex Concepts using Disjunction (OR) of Conjunction (AND) of Input Attributes compared to Conjunction (AND) of Input Attributes
- Learning Is a Searching Problem
- Recall
- Learner has to search the Concept Space / Hypothesis Space using the Set of Training Examples (D) to find a Concept / Hypothesis / Function, which best fits the Training Data
- In our Gender Identification Problem
- |C| = 512
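The figure |C| = 512 follows directly from the formula |C| = 2^|X|, since each of the 9 instances can independently be labelled Yes or No; a one-line check in Python:

```python
# Size of the Concept Space: each of the |X| instances can be labelled
# Yes or No independently, so |C| = 2 ** |X|.
size_X = 3 * 3          # 3 Height values x 3 Weight values = 9
size_C = 2 ** size_X
print(size_C)           # 512
```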
- Hypothesis Space (H)
- The reduced Concept Space (C) is called the Hypothesis Space (H)
- To reduce the size of Concept Space (C)
- introduce search bias
- Most Common Bias
- uses the AND (Conjunction) relationship between Attributes
- In other words, Hypothesis Space (H) uses the
- Conjunction (AND) of Attributes
- i.e. h = <Height, Weight>
- Representation of Hypothesis (h)
- Question
- How can a Hypothesis (h) be represented?
- Answer
- There are many possible representations of a Hypothesis (h)
- Representation of Hypothesis (h)
- In our Gender Identification Problem, a Hypothesis (h) is represented as
- Conjunction (AND) of Constraints on Input Attributes
- Each constraint can be
- A specific value, e.g.
- Weight = Small
- A don’t care value (any of the Possible Values), e.g.
- Weight = ?
- No value allowed (Null Hypothesis Ø), e.g.
- Weight = Ø
- Constraints on Input Attributes
- Consider the Gender Identification Problem
- We have two Input Attributes
- Height
- Weight
- Constraints – Height Attribute
- No value allowed
- Ø
- A specific value
- Short, Medium, Tall
- Any value allowed
- ?
- Total No. of Constraints – Height Attribute
- Ø, Short, Medium, Tall, ?
- Constraints – Weight Attribute
- No value allowed
- Ø
- A specific value
- Small, Normal, Heavy
- Any value allowed
- ?
- Total No. of Constraints – Weight Attribute
- Ø, Small, Normal, Heavy, ?
- Examples – Hypothesis (h)
- In our Gender Identification Problem
- h = <Height, Weight>
- Some of the Hypotheses (h) for our Gender Identification Problem are as follows
- h = <Height, Weight>
- h1 = <Short, Small>
- h2 = <Tall, Heavy>
- h3 = <Ø, Small>
- h4 = <?, ?>
- Example (d) vs Hypothesis (h)
- An Example (d) is
- Vector of Attribute Values
- Examples
- d = Height, Weight, Gender
- d1 = Short, Small, Yes
- d2 = Tall, Heavy, No
- Note
- In d1 and d2
- Small, Short, Heavy, Tall, Yes and No are
- Attribute Values
- A hypothesis (h) is
- Vector of Constraints on Input Attributes
- Examples
- h = <Height, Weight>
- h1 = <Short, Small >
- h2 = <Tall, Heavy>
- h3 = <Ø, Small>
- h4 = <?, ?>
- Note
- In h1 , h2 and h3
- Small, Short, Heavy and Tall are
- Constraints on Input Attributes
- Hypothesis in the form of Rules
- Hypothesis (h) can be written in the form of Rules
- Example 1
- h = <Height, Weight>
- h1 = <Short, Small>
- Example 2
- h = <Height, Weight>
- h4 = <?, ?>
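Applying a conjunctive hypothesis to an instance can be sketched in Python as follows ('?' matches any value, Ø matches no value; the function name `satisfies` is mine, not from the slides):

```python
# A conjunctive hypothesis h = <Height constraint, Weight constraint>.
# '?' matches any value, 'Ø' matches no value, otherwise exact match.
NULL = "Ø"

def satisfies(h, x):
    """Return True (Positive) if instance x satisfies hypothesis h."""
    return all(
        hc != NULL and (hc == "?" or hc == xv)
        for hc, xv in zip(h, x)
    )

print(satisfies(("Short", "Small"), ("Short", "Small")))  # True
print(satisfies(("?", "?"), ("Tall", "Heavy")))           # True
print(satisfies((NULL, "Small"), ("Short", "Small")))     # False
```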
- Reduction in Size of Concept Space (C)
- Hypothesis (h) is represented as
- Conjunction (AND) of Constraints on Input Attributes
- Constraints on attributes include
- Ø – means that there is no value for which the Hypothesis (h) will be 1 (Yes)
- A specific value – means that the Attribute must take that particular value
- ? – means that any value of the Attribute is acceptable, i.e. Hypothesis (h) = 1 (Yes) for all values of that Attribute
- In Gender Identification Problem
- There are two Input Attributes
- Height
- Weight
- Constraints on Height Attribute (total 5 constraints)
- Ø, Short, Medium, Tall, ?
- Constraints on Weight Attribute (total 5 constraints)
- Ø, Small, Normal, Heavy, ?
- |H| = No. of Constraints on Weight Attribute x No. of Constraints on Height Attribute
- |H| = 5 x 5 = 25
- Recall
- |C| = 512 and
- |H| = 25
- Conclusion
- Introduction of search bias reduced the size of Concept Space (C)
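The syntactic count of 25 is simply the product of the constraint counts per attribute; a quick Python check:

```python
# Syntactically distinct hypotheses: each attribute can be constrained to
# Ø, ?, or one of its specific values.
height_constraints = ["Ø", "?", "Short", "Medium", "Tall"]   # 5 constraints
weight_constraints = ["Ø", "?", "Small", "Normal", "Heavy"]  # 5 constraints

size_H = len(height_constraints) * len(weight_constraints)
print(size_H)  # 25 syntactically distinct hypotheses, versus |C| = 512
```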
- Syntactically Distinct Hypothesis vs Semantically Distinct Hypothesis
- Consider the following two print statements
- In C Programming Language
- printf("Allah loves those who serve humanity");
- In Python Programming Language
- print("Allah loves those who serve humanity")
- Both statements are
- Semantically the same but
- Syntactically distinct
- Syntactically Distinct Hypothesis vs Semantically Distinct Hypothesis
- No. of Syntactically Distinct Hypothesis = 25
- h = <Height, Weight>
- h1 = < Ø, Ø >
- h2 = < Ø, Small >
- h3 = < Ø, Normal >
- h4 = < Ø, Heavy >,
- h5 = < Ø, ? >,
- h6 = < Short, Ø >,
- h7 = < Short, Small >,
- h8 = < Short, Normal >,
- h9 = < Short, Heavy >,
- h10 = < Short, ? >,
- h11 = < Medium, Ø >,
- h12 = < Medium, Small >,
- h13 = < Medium, Normal >,
- h14 = < Medium, Heavy >,
- h15 = < Medium, ? >,
- h16 = < Tall, Ø >,
- h17 = < Tall, Small >,
- h18 = < Tall, Normal >,
- h19 = < Tall, Heavy >,
- h20 = < Tall, ? >,
- h21 = < ?, Ø >,
- h22 = < ?, Small >,
- h23 = < ?, Normal >,
- h24 = < ?, Heavy >,
- h25 = < ?, ? >
- Syntactically Distinct Hypothesis vs Semantically Distinct Hypothesis
- There are redundancies within these 25 hypotheses
- Caused by Ø
- Note
- Whenever there is Ø in any of the Input Attributes and we are considering Conjunctions (AND)
- Output will always be 0
- The following Hypotheses are semantically the same
- h = <Height, Weight>
- h1 = < Ø, Ø >
- h2 = < Ø, Small >
- h3 = < Ø, Normal >,
- h4 = < Ø, Heavy >,
- h5 = < Ø, ? >,
- h6 = < Short, Ø >,
- h11 = < Medium, Ø >,
- h16 = < Tall, Ø >,
- h21 = < ?, Ø >
- Syntactically Distinct Hypothesis vs Semantically Distinct Hypothesis, Cont…
- No. of Semantically Distinct Hypothesis = 1 + (Small, Normal, Heavy, ?) x (Short, Medium, Tall, ?)
- No. of Semantically Distinct Hypothesis = 1 + 4 x 4 = 17
- h = <Height, Weight>
- h1 = < Ø, Ø >
- h2 = < Short, Small >
- h3 = < Short, Normal >,
- h4 = < Short, Heavy >,
- h5 = < Short, ? >,
- h6 = < Medium, Small >,
- h7 = < Medium, Normal >,
- h8 = < Medium, Heavy >,
- h9 = < Medium, ? >,
- h10 = < Tall, Small >,
- h11 = < Tall, Normal >,
- h12 = < Tall, Heavy >,
- h13 = < Tall, ? >,
- h14 = < ?, Small >,
- h15 = < ?, Normal >,
- h16 = < ?, Heavy >,
- h17 = < ?, ? >
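The collapse from 25 syntactically distinct to 17 semantically distinct hypotheses can be verified by treating every hypothesis that contains Ø as equivalent to <Ø, Ø>; a Python sketch (the helper name `canonical` is mine):

```python
from itertools import product

NULL, ANY = "Ø", "?"
height_constraints = [NULL, ANY, "Short", "Medium", "Tall"]
weight_constraints = [NULL, ANY, "Small", "Normal", "Heavy"]

def canonical(h):
    # Any hypothesis containing Ø classifies every instance as Negative,
    # so all such hypotheses collapse to the single hypothesis <Ø, Ø>.
    return (NULL, NULL) if NULL in h else h

syntactic = list(product(height_constraints, weight_constraints))
semantic = {canonical(h) for h in syntactic}

print(len(syntactic))  # 25
print(len(semantic))   # 17 = 1 + 4 * 4
```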
- Conclusion - Reduction in Size of Concept Space
- To summarize
- |C| = 512
- |H| = 25 (Syntactically Distinct Hypothesis)
- |H| = 17 (Semantically Distinct Hypothesis)
- Conclusion
- Search bias has reduced the size of Concept Space from 512 to 17, which is a
- Huge reduction
- Advantage of Reduction
- A Learner can quickly search with low computational cost
- A space of 17 Hypothesis (h) instead of 512 Hypothesis (h)
- Disadvantage of Reduction
- The Target Function / Concept may / may not be present in the Hypothesis Space (H) and this may / may not be known
TODO and Your Turn
Todo Tasks
Your Turn Tasks
Todo Tasks
TODO Task 3
- Task
- Consider the Titanic Dataset with the following Attributes
- Gender: Male, Female
- Ticket Class: Upper, Middle, Lower
- Survival: No, Yes
- Goal of the Learner is to predict
- Whether a Passenger Survived or Not?
- Note
- Your answer should be
- Well Justified
- Questions
- Write down Input and Output for the Machine Learning Problem?
- Write down all instances in Instance Space in a Table
- Graphically represent the Concept Space (similar to what I did for the Gender Identification Problem in the Chapter)
- Write 4 Concepts from Concept Space in the form of: (1) Rules, (2) Table, considering
- Conjunction (AND) of Constraints on Attributes (2 Concepts)
- Disjunction (OR) of Conjunction (AND) of Constraints on Attributes (2 Concepts)
- Suppose we introduce search bias as Conjunction (AND) of Constraints on Attributes
- Calculate the Size of Instance Space, Concept Space, Syntactically Distinct Hypothesis and Semantically Distinct Hypothesis
- Write down Syntactically Distinct Hypothesis
- Write down Semantically Distinct Hypothesis
Your Turn Tasks
Your Turn Task 3
- Task
- Select a Machine Learning Problem (similar to the Titanic Dataset given in TODO)
- Questions
- Write down Input and Output for the selected Machine Learning Problem?
- Write down all instances in Instance Space in a Table
- Graphically represent the Concept Space (similar to what I did for the Gender Identification Problem in the Chapter)
- Write 4 Concepts from Concept Space in the form of: (1) Rules, (2) Table, considering
- Conjunction (AND) of Constraints on Attributes (2 Concepts)
- Disjunction (OR) of Conjunction (AND) of Constraints on Attributes (2 Concepts)
- Suppose we introduce search bias as Conjunction (AND) of Constraints on Attributes
- Calculate the Size of Instance Space, Concept Space, Syntactically Distinct Hypothesis and Semantically Distinct Hypothesis
- Write down Syntactically Distinct Hypothesis
- Write down Semantically Distinct Hypothesis
Hypothesis Space (H) Representation
- Learning is a Searching Problem
- In Machine Learning, Concept Learning can be formulated as
- Problem of searching through a predefined space of potential hypotheses (H) (Hypothesis Space) for the Hypothesis (h) that best fits the Training Data (D)
- Representation of Hypothesis Space (H)
- Hypothesis Space (H)
- Hypothesis Space (H) is a predefined space of Hypothesis (h)
- Question
- How can we represent a Hypothesis Space (H) for a Learner?
- Answer
- There are many possible representations of Hypothesis Space (H)
- Representation of Hypothesis Space (H) Cont...
- A simple representation of Hypothesis Space (H) is
- General-to-Specific Ordering of Hypotheses
- Hypothesis Space (H) Representation vs Hypothesis (h) Representation
- Hypothesis (h) Representation
- Conjunction of Constraints on Input Attributes
- Hypothesis Space (H) Representation
- General-to-Specific Ordering of Hypotheses
- Note
- Representation of a Hypothesis (h) is entirely different from the Representation of the Hypothesis Space (H)
- General-to-Specific Ordering of Hypotheses
- Consider the Gender Identification Problem discussed in the previous Section
- Hypothesis (h) Representation
- Most Specific Hypothesis (h)
- h = <∅, ∅>
- The Output of Most Specific Hypothesis (h) will always be 0 / False / No
- i.e. it will classify all instances as Negative
- Most General Hypothesis (h)
- h = <?, ?>
- The Output of the Most General Hypothesis (h) will always be 1 / True / Yes
- i.e. it will classify all instances as Positive
- General-to-Specific Ordering of Hypotheses, Cont...
- If we represent Hypothesis Space (H) as General-to-Specific Ordering of Hypotheses then
- Specific Boundary = Most Specific Hypothesis (h)
- Remaining Hypotheses will lie here
- General Boundary = Most General Hypothesis (h)
- General and Specific Hypothesis
- A hypothesis (hi) is more general than or equal to another hypothesis (hj) if
- hi puts fewer constraints than hj and therefore classifies
- more instances as Positive
- Example - General and Specific Hypothesis
- Consider the following two instances
- x1 = <Short, Heavy>
- x2 = <Tall, Heavy>
- Consider the following three Hypotheses
- h1 = <Short, Heavy>
- h2 = <Tall, Heavy>
- h3 = <?, Heavy>
- Let’s see which Hypothesis is more general than the other Hypotheses
- Apply h1 on x1 and x2
- x1 is classified as Positive
- x2 is classified as Negative
- Apply h2 on x1 and x2
- x1 is classified as Negative
- x2 is classified as Positive
- Apply h3 on x1 and x2
- x1 is classified as Positive
- x2 is classified as Positive
- Conclusion
- h3 is more general than h1 and h2 because
- it puts fewer constraints compared to h1 and h2, and
- it classified more instances as Positive compared to h1 and h2
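Over a finite Instance Space, the more-general-than-or-equal-to relation can be checked by brute force: hi is at least as general as hj if every instance hj classifies as Positive, hi does too. A Python sketch (function names are mine, not from the slides):

```python
NULL, ANY = "Ø", "?"

def satisfies(h, x):
    """True if instance x satisfies conjunctive hypothesis h."""
    return all(hc != NULL and (hc == ANY or hc == xv) for hc, xv in zip(h, x))

def more_general_or_equal(hi, hj, instances):
    """hi >=g hj: every instance hj classifies Positive, hi does too.
    Checked by brute force over a finite instance space."""
    return all(satisfies(hi, x) for x in instances if satisfies(hj, x))

# The three hypotheses and the full Instance Space from the example
h1 = ("Short", "Heavy")
h2 = ("Tall", "Heavy")
h3 = (ANY, "Heavy")
X = [(h, w) for h in ("Short", "Medium", "Tall")
            for w in ("Small", "Normal", "Heavy")]

print(more_general_or_equal(h3, h1, X))  # True: h3 covers everything h1 covers
print(more_general_or_equal(h1, h3, X))  # False
```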
TODO and Your Turn
Todo Tasks
Your Turn Tasks
Todo Tasks
TODO Task 4
- Task 1
- Consider the Titanic Dataset with the following Attributes
- Gender: Male, Female
- Ticket Class: Upper, Middle, Lower
- Survival: No, Yes
- Goal of the Learner is to predict
- Whether a Passenger Survived or Not?
- Note
- Your answer should be
- Well Justified
- Questions
- Write down Input and Output for the Machine Learning Problem?
- Considering Conjunction (AND) of Constraints on Attributes
- How will you represent your Hypothesis Space (H)?
- Task 2
- Give 2 examples in which one hypothesis is more general than or equal to other hypothesis
Your Turn Tasks
Your Turn Task 4
- Task 1
- Select a Machine Learning Problem (similar to the Titanic Dataset in TODO Task)
- Questions
- Write down Input and Output for the selected Machine Learning Problem?
- Considering Conjunction (AND) of Constraints on Attributes
- How will you represent your Hypothesis Space (H)?
- Task 2
- Give 2 examples in which one hypothesis is more general than or equal to another hypothesis
Chapter Summary
- Chapter Summary
- Most of the Machine Learning Algorithms use Inductive Learning Approach to learn
- Concept Learning is a major subclass of Inductive Learning
- In Psychology, Concept Learning refers to the human ability
- to learn categories for Real-world Objects and
- to recognize new instances of those categories
- In Machine Learning, a Real-world Object is a material thing that can be seen and touched
- Two Main Properties of a Real-world Object are
- Characteristics / Attributes / Features
- Behavior
- Characteristics and Behavior of a Real-world Object help us to
- learn to categorize Real-world Objects (or distinguish one Real-world Object from another)
- Similar Real-world Objects will have Similar Characteristics and Behavior and vice versa
- Problem
- A Machine is dumb and cannot understand Real-world Objects (as they are)
- Solution
- Change the representation of Real-world Objects into a Format which Machine Learning Algorithms can understand
- In Machine Learning, Concept Learning refers to a machine’s ability
- to learn categories for Real-world Objects and
- to recognize new instances of those categories
- Concept Learning – Formal Definition
- Given
- Set of Training Examples (D) and
- Set of Functions / Hypothesis (H)
- Infer
- A Boolean-valued Function using the Set of Training Examples
- Concept Learning is also known as Binary Classification
- In Concept Learning
- if Output is 1 / True / Yes, then it is called a Positive Example
- if Output is 0 / False / No, then it is called a Negative Example
- A Concept is a Function, which we want to learn
- Instance Space (X) is the set of all possible combinations of Input Attribute values
- In Machine Learning, Instance Space is same as
- Population in Statistics
- Sample Data
- Subset of Instance Space (X)
- Size of Sample Data is always less than the Size of Instance Space
- In Machine Learning, Sample Data is same as
- Sample in Statistics
- Sample Data is split into
- Training Data (D)
- Testing Data
- Formally, each d ∈ D is composed of
- di, c(di)
- di represents the Input
- c(di) represents the Output
- Concept Space (C) is the set of all possible combinations of Output values
- Symbols
- Formula to Compute |C|
- |C| = 2^|X|
- where |X| represents the Size of Instance Space
- The computational cost and time required to search a huge Concept Space will be very high
- The reduced Concept Space (C) is called the Hypothesis Space (H)
- Most Common Approach to reduce the size of Concept Space is to
- Introduce search bias
- Most Common Bias
- uses the AND (Conjunction) relationship between Attributes
- Searching a Concept Space
- Strengths
- We will definitely find the Target Function / Concept
- Weaknesses
- Computational cost is high
- Amount of Time required for searching is high
- Searching a Hypothesis Space
- Strengths
- Computational cost is low
- Amount of Time required for searching is small
- Weaknesses
- We may / may not find the Target Function / Concept and
- this may / may not be known to us
- Very often, a Training Example (d) is represented as
- Attribute-Value Pair
- Values of Attributes can be mainly categorized as
- Categorical
- Numeric
- Examples
- Following Machine Learning Algorithms represent a Training Example (d) in the form of Attribute-Value Pair
- FIND-S Algorithm
- List Then Eliminate Algorithm
- Candidate Elimination Algorithm
- Decision Tree Learning Algorithms
- For example, ID3, J48, Random Forest
- Artificial Neural Networks
- For example, Multi-Layer Perceptron
- Bayesian Learning Algorithms
- For example, Naïve Bayes
- Instance-based Learning Algorithms
- For example, k-NN
- Representation of Hypothesis (h) may vary from Machine Learning Algorithm to Machine Learning Algorithm
- Examples
- Representation of Hypothesis (h) for FIND-S, List Then Eliminate and Candidate Elimination Algorithms is
- Conjunction (AND) of Constraints on Attributes
- Representation of Hypothesis (h) for the ID3 Algorithm is
- Disjunction (OR) of Conjunction (AND) of Constraints on Attributes
- Representation of Hypothesis (h) for Regular Neural Network Algorithms is
- Combination of Weights between Units
- Hypothesis Space (H)
- Hypothesis Space (H) is a predefined space of Hypothesis (h)
- There are many possible Representations of Hypothesis Space (H)
- A simple Representation of Hypothesis Space (H) is
- General-to-Specific Ordering of Hypotheses
In Next Chapter
- In Next Chapter
- In Sha Allah, in the next Chapter, I will present the
- FIND-S Algorithm