# Chapter 13 – Instance-Based Learning

### Chapter Outline


**Quick Recap**

**Basics of Instance-based Learning**

**k-Nearest Neighbor (k-NN) Algorithm**

**Applying k-NN Algorithm on Numeric Data – A Step by Step Example**

**Applying k-NN Algorithm on Categorical Data – A Step by Step Example**

**Chapter Summary**

## Quick Recap

- Quick Recap – Bayesian Learning

**Alhamdulillah, up till now we have studied the following Machine Learning Algorithms**

**FIND-S Algorithm**

**List Then Eliminate Algorithm**

**Candidate Elimination Algorithm**

**ID3 Algorithm**

**Perceptron Algorithm (Two Layer ANN)**

**Regular Neural Network (Multi-layer ANN)**

**Major Problems**

**Problem 1**

**These ML Algorithms do not allow us to combine prior knowledge (or past experience) with our Training Examples when learning a Concept**

**Problem 2**

**These ML Algorithms either accept or reject a Hypothesis (h), i.e. take a Binary Decision**

**Instead of assigning a probability to each Hypothesis (h) in the Hypothesis Space (H)**

**A Possible Solution**

**Bayesian Learning Algorithms**

**Naïve Bayes – Summary**

**Representation of Training Examples (D)**

**Attribute-Value Pair**

**Representation of Hypothesis (h)**

**Probability Distribution**

**Searching Strategy**

**I am not clear about Search Strategy of Naïve Bayes Algorithm. Please drop me an email if you know it.**

**Training Regime**

**Batch Method**

**Strengths**

**Naïve Bayes Algorithm allows us to combine prior knowledge (or past experience) with our Training Examples when learning a Concept**

**Naïve Bayes Algorithm assigns a probability to each Hypothesis (h) in the Hypothesis Space (H)**

**Instead of simply accepting or rejecting a Hypothesis (h)**

**Naïve Bayes Algorithm can handle noisy Data**

**Naïve Bayes Algorithm shows good Accuracy on Testing Data even if the Set of Training Examples (D) is rather small**

**Naïve Bayes Algorithm can avoid Overfitting**

**Naïve Bayes Algorithm is used in Deep Learning Algorithms, which allows Deep Learning Algorithms to learn from small datasets / corpora**

**Naïve Bayes Algorithm can make Probabilistic Predictions**

**Instead of making a Binary Prediction, i.e. 0 or 1**

**Weaknesses**

**Naïve Bayes Algorithm assumes that Attributes / Features are independent of each other**

**However, in the Real world, Attributes / Features depend on each other**

**Bayesian Learning Algorithms are Probabilistic Machine Learning Algorithms**

**Bayesian Learning uses Bayes' Theorem to determine the Conditional Probability of a Hypothesis (h) given a Set of Training Examples (D)**

**Bayesian Learning Algorithms allow us to combine prior knowledge (or past experience) with our Set of Training Examples (D)**

**Bayesian Learning Algorithms are suitable for the following Machine Learning Problems**

**When you want to combine prior knowledge (or past experience) with your Set of Training Examples (D)**

**When you want to assign a probability to each Hypothesis (h) in the Hypothesis Space (H)**

**Instead of simply accepting or rejecting a Hypothesis (h)**

**When you have noisy Data**

**When you have small Training Data**

**In recent years, Bayesian Learning Algorithms are used in Deep Learning Algorithms, which allows Deep Learning Algorithms to learn from small datasets / corpora**

**When you want to make Probabilistic Predictions**

**Instead of making a Binary Prediction, i.e. 0 or 1**

**A Brute Force Bayesian Learning Algorithm works as follows**

**Step 1: Compute Posterior Probability for all the Hypotheses (h) in the Hypothesis Space (H)**

**Step 2: Output the Most Probable Hypothesis (a.k.a. Maximum A Posteriori (MAP) Hypothesis, h_MAP)**

**Bayes Optimal Classifier works as follows to calculate the Most Probable Classification of an unseen instance x**

**Step 1: Calculate Posterior Probability for every Hypothesis h ∈ H**

**Step 2: Calculate the prediction of each Hypothesis (h) for each new instance**

**Step 3: Use the combination of Steps 1 and 2 to classify each new instance**

**Note that the Prediction returned by h_MAP(x) is not always the Most Probable Classification**

**To reduce the computation cost in finding the Most Probable Classification using Bayes Optimal Classifier, the Gibbs Algorithm works as follows**

**Step 1: Choose one Hypothesis (h) at random, according to P(h|D)**

**Step 2: Use this h to classify new unseen instances**

**Naïve Bayes Algorithm is a highly practical Bayesian Learning Algorithm**

**Its Performance is comparable to Artificial Neural Networks and Decision Tree Learning Algorithms on some tasks**

**Naïve Bayes is very good for**

**Text Classification**

## Basics of Instance-based Learning

- Summary - Machine Learning Algorithms Studied So Far

**Machine Learning Algorithms studied so far are**

**FIND-S Algorithm**

**List Then Eliminate Algorithm**

**Candidate Elimination Algorithm**

**ID3 Algorithm (Decision Tree Learning)**

**Perceptron (Two Layer Artificial Neural Network)**

**Regular Neural Network (Multi-layer Artificial Neural Network)**

**Naïve Bayes Algorithm (Bayesian Learning Algorithm)**

**How do these ML Algorithms Work?**

**Given**

**Set of Training Examples (D)**

**Set of Functions / Hypothesis (H)**

**Training Phase**

**Build an explicit representation of the Target Function (called the approximated Target Function) by**

**Searching the Hypothesis Space (H) to find a Hypothesis h (approximated Target Function) which best fits the Set of Training Examples (D)**

**Testing Phase**

**Use the explicit representation (approximated Target Function) to**

**classify unseen instances**

**The approximated Target Function is applied to an unseen instance and predicts the Target Classification**

**Note**

**These Machine Learning Algorithms are also called Eager Machine Learning Algorithms**

- Instance-based Machine Learning Algorithms

**How do Instance-based ML Algorithms Work?**

**Given**

**Set of Training Examples (D)**

**Training Phase**

**Simply store the Set of Training Examples (D)**

**Testing Phase**

**To classify a new unseen instance x**

**Step 1: Compare unseen instance x with all the stored Training Examples**

**Step 2: Depending on the relationship of unseen instance x with the stored Training Examples**

**Predict the Target Classification for unseen instance x**

**Note**

**These Machine Learning Algorithms are also called Lazy Machine Learning Algorithms**

- Eager ML Algorithms vs. Lazy ML Algorithms

**Eager Machine Learning Algorithms**

**Machine Learning Algorithms which build an explicit representation of the Target Function as Training Examples are presented are called Eager Machine Learning Algorithms**

**Lazy Machine Learning Algorithms**

**Machine Learning Algorithms which defer processing until a new unseen instance must be classified are called Lazy Machine Learning Algorithms**

- Basic Idea - Instance-based Classifiers

**Basic Idea**

**If there is Long Hair like a Female, and a Head Covered with a scarf like a Female, then it's probably a Female**


## K-Nearest Neighbor (k-NN) Algorithm

- K-Nearest Neighbor (k-NN) Algorithm

**k-NN Algorithm is the grand-daddy of Instance-based Machine Learning Algorithms**

**Question**

**What do we mean by k nearest neighbors in the k-NN Algorithm?**

**Answer**

**The k nearest neighbors of an unseen instance x are the Training Examples that have the k smallest distances to unseen instance x**

- Example – k-Nearest Neighbors

- K-Nearest Neighbor (k-NN) Algorithm Cont…

**Representation of Training Examples in k-NN Algorithm**

**Attribute-Value Pair**

**Attribute Values can be**

**Categorical**

**Numeric**

**Question**

**How to Compute the Relationship between unseen instance x and the stored Training Examples (D)?**

**Answer**

**Situation 1**

**Attribute Values are Categorical**

**Similarity / Distance (or relationship) between unseen instance x and the stored Training Examples can be computed using the following measures**

**Jaccard Similarity Coefficient**

**Hamming Distance etc.**

**Situation 2**

**Attribute Values are Numeric**

**Similarity / Distance (or relationship) between unseen instance x and the stored Training Examples can be computed using the following measures**

**Euclidean Distance**

**Cosine Similarity etc.**

**Note**

**For Similarity / Distance measures normalized to the Range [0 – 1]**

**Similarity = 1 – Distance**

**Distance = 1 – Similarity**

**k-Nearest Neighbor (k-NN) Algorithm requires three things**

**Set of Training Examples (D)**

**A Distance Metric to compute the distance between unseen instance x and the Set of stored Training Examples**

**Value of k**

**The Number of Nearest Neighbors to consider when classifying unseen instance x**

**In the k-Nearest Neighbor (k-NN) Algorithm, the value of k should be**

**Carefully chosen**

**If k is too small**

**The Algorithm becomes Sensitive to noise points (Training Examples)**

**If k is too large**

**The Neighborhood may include points (Training Examples) from other Classes**

**Given**

**Set of 15 Training Examples**

**Distance Metric = Euclidean Distance**

**k = 3**

**Steps – Classify an unseen instance x using k-NN Algorithm**

**Step 1: Compute Euclidean Distance between unseen instance x and all 15 Training Examples**

**Step 2: Identify the 3 nearest neighbors (as k = 3)**

**Step 3: Use the Class labels of the k nearest neighbors to determine the Class label of unseen instance x (e.g., by taking a Majority Vote)**
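The three steps above can be sketched in Python. This is a minimal illustration, not a library routine: `knn_classify` is an illustrative name, and the tiny two-attribute dataset below is hypothetical (the 15-example table is not reproduced here).

```python
from collections import Counter
import math

def euclidean(p, q):
    # Straight-line distance between two numeric feature vectors
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

def knn_classify(x, training_examples, k=3):
    """Classify unseen instance x by majority vote of its k nearest neighbors.

    training_examples is a list of (feature_vector, class_label) pairs.
    """
    # Step 1: compute the distance from x to every stored Training Example
    distances = [(euclidean(x, features), label)
                 for features, label in training_examples]
    # Step 2: identify the k nearest neighbors
    neighbors = sorted(distances, key=lambda d: d[0])[:k]
    # Step 3: majority vote over the neighbors' Class labels
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical stored Training Examples: two numeric attributes, two classes
D = [((1.0, 1.0), "Male"), ((1.1, 0.9), "Male"),
     ((0.1, 0.2), "Female"), ((0.0, 0.1), "Female"), ((0.2, 0.0), "Female")]
print(knn_classify((0.1, 0.1), D, k=3))  # → Female
```

Note that the Training Phase is just the construction of the list `D`; all the work happens at classification time, which is why k-NN is a Lazy Machine Learning Algorithm.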

- Scaling Issue – k-NN Algorithm

**Scaling Issue**

**Attributes may have to be scaled to prevent distance measures from being dominated by one of the Attributes**

**Example – Scaling Issue**

**Height of a person may vary from**

**1.5m to 1.8m**

**Weight of a person may vary from**

**90lb to 300lb**

**Income of a person may vary from Rs. 10,000 to Rs. 1,000,000**

## Applying k-NN Algorithm on Numeric Data – A Step by Step Example

- Gender Identification Problem

**Machine Learning Problem**

**Gender Identification**

**Input**

**Human**

**Output**

**Gender of a Human**

**Task**

**Given a Human (Input), predict the Gender of the Human (Output)**

**Treated as**

**Instance-based Learning Problem**

- Representation of Examples

**Example = Input + Output**

**Input**

**Human**

**Output**

**Gender**

**Representation of Input and Output**

**Attribute-Value Pair**

- Representation of Input

**Input is represented as a**

**Set of 3 Input Attributes**

**Height**

**Weight**

**Salary**

- Representation of Output

**Output is represented as a**

**Set of 1 Output Attribute**

**Gender**

- Sample Data

**We obtained a Sample Data of 7 Examples**

- Scaling Issue in Sample Data

**In our Sample Data**

**Values of the Salary Attribute are very high compared to the other two Attributes (Height and Weight)**

**The High values of the Salary Attribute will dominate the other two Attributes in computing distance**

**Therefore, we need to scale / normalize the values of all three Attributes into a**

**Range of [0 – 1]**

**Question**

**How to scale / normalize Attribute Values?**

**Answer**

**There are many possible solutions**

**A simple and widely used approach is**

**Min-Max Scaling**

- Min-Max Scaling Approach

**Min-Max Scaling Approach scales / normalizes a value in the**

**Range of [0 – 1]**

**Formula**

**Xnorm = (X – Xmin) / (Xmax – Xmin)**

**Where X is the Value of Attribute (A) for an instance**

**Xmin is the minimum Value of Attribute (A) among all instances in the Sample Data**

**Xmax is the maximum Value of Attribute (A) among all instances in the Sample Data**

**Xnorm is the scaled / normalized value of Attribute (A) for the instance**
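As a short sketch, the formula can be applied to one attribute column at a time. The `min_max_scale` helper is illustrative; the values below are the Height column of the Sample Data (E1–E7), as used in the next subsection.

```python
def min_max_scale(values):
    """Min-Max Scaling: Xnorm = (X - Xmin) / (Xmax - Xmin),
    which maps every value of the attribute into the range [0, 1]."""
    x_min, x_max = min(values), max(values)
    return [(x - x_min) / (x_max - x_min) for x in values]

# Height values of Examples E1..E7 from the Sample Data
heights = [6.0, 4.8, 5.2, 5.3, 5.0, 5.7, 5.0]
scaled = min_max_scale(heights)
print([round(v, 2) for v in scaled])
```

The minimum (4.8) maps to 0.00 and the maximum (6.0) maps to 1.00; every other value lands strictly in between, so no single attribute can dominate the distance computation.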

- Applying Min-Max Scaling on Height Attribute

**Heightmin = 4.8**

**Heightmax = 6.0**

**Scaling / Normalizing Value of Height for Example E1**

**Xnorm = (6.0 – 4.8) / (6.0 – 4.8) = 1.2 / 1.2 = 1.00**

**Scaling / Normalizing Value of Height for Example E2**

**Xnorm = (4.8 – 4.8) / (6.0 – 4.8) = 0 / 1.2 = 0.00**

**Scaling / Normalizing Value of Height for Example E3**

**Xnorm = (5.2 – 4.8) / (6.0 – 4.8) = 0.4 / 1.2 = 0.33**

**Scaling / Normalizing Value of Height for Example E4**

**Xnorm = (5.3 – 4.8) / (6.0 – 4.8) = 0.5 / 1.2 = 0.42**

**Scaling / Normalizing Value of Height for Example E5**

**Xnorm = (5.0 – 4.8) / (6.0 – 4.8) = 0.2 / 1.2 = 0.16**

**Scaling / Normalizing Value of Height for Example E6**

**Xnorm = (5.7 – 4.8) / (6.0 – 4.8) = 0.9 / 1.2 = 0.75**

**Scaling / Normalizing Value of Height for Example E7**

**Xnorm = (5.0 – 4.8) / (6.0 – 4.8) = 0.2 / 1.2 = 0.16**

- Applying Min-Max Scaling on Weight Attribute

**Weightmin = 55**

**Weightmax = 150**

**Scaling / Normalizing Value of Weight for Example E1**

**Xnorm = (100 – 55) / (150 – 55) = 45 / 95 = 0.47**

**Scaling / Normalizing Value of Weight for Example E2**

**Xnorm = (70 – 55) / (150 – 55) = 15 / 95 = 0.16**

**Scaling / Normalizing Value of Weight for Example E3**

**Xnorm = (80 – 55) / (150 – 55) = 25 / 95 = 0.26**

**Scaling / Normalizing Value of Weight for Example E4**

**Xnorm = (150 – 55) / (150 – 55) = 95 / 95 = 1.00**

**Scaling / Normalizing Value of Weight for Example E5**

**Xnorm = (75 – 55) / (150 – 55) = 20 / 95 = 0.21**

**Scaling / Normalizing Value of Weight for Example E6**

**Xnorm = (125 – 55) / (150 – 55) = 70 / 95 = 0.74**

**Scaling / Normalizing Value of Weight for Example E7**

**Xnorm = (55 – 55) / (150 – 55) = 0 / 95 = 0.00**

- Applying Min-Max Scaling on Salary Attribute

**Salarymin = 20000**

**Salarymax = 100000**

**Scaling / Normalizing Value of Salary for Example E1**

**Xnorm = (20000 – 20000) / (100000 – 20000) = 0 / 80000 = 0.00**

**Scaling / Normalizing Value of Salary for Example E2**

**Xnorm = (30000 – 20000) / (100000 – 20000) = 10000 / 80000 = 0.12**

**Scaling / Normalizing Value of Salary for Example E3**

**Xnorm = (80000 – 20000) / (100000 – 20000) = 60000 / 80000 = 0.75**

**Scaling / Normalizing Value of Salary for Example E4**

**Xnorm = (100000 – 20000) / (100000 – 20000) = 80000 / 80000 = 1.00**

**Scaling / Normalizing Value of Salary for Example E5**

**Xnorm = (50000 – 20000) / (100000 – 20000) = 30000 / 80000 = 0.37**

**Scaling / Normalizing Value of Salary for Example E6**

**Xnorm = (40000 – 20000) / (100000 – 20000) = 20000 / 80000 = 0.25**

**Scaling / Normalizing Value of Salary for Example E7**

**Xnorm = (60000 – 20000) / (100000 – 20000) = 40000 / 80000 = 0.50**

- Sample Data after Scaling / Normalization

**Note**

**Values of all three Attributes are scaled / normalized in a**

**Range of [0 – 1]**

- Split the Sample Data

**We split the Sample Data using the Random Split Approach into**

**Training Data – 2 / 3 of the Sample Data**

**Testing Data – 1 / 3 of the Sample Data**

- Training Data

- Testing Data

- Machine Learning Cycle

**Four phases of a Machine Learning Cycle are**

**Training Phase**

**Build the Model using Training Data**

**Testing Phase**

**Evaluate the performance of the Model using Testing Data**

**Application Phase**

**Deploy the Model in the Real-world, to make predictions on Real-time Unseen Data**

**Feedback Phase**

**Take Feedback from the Users and Domain Experts to improve the Model**

- Training Phase

**In the Training Phase, k-NN Algorithm will**

**Store the Set of 5 Training Examples**

**Recall**

**Training Data**

**Model**

**Stored Set of 5 Training Examples**

**In the next Phase i.e. Testing Phase, Insha Allah, we will**

**Evaluate the performance of the Model**

- Testing Phase

**Question**

**How well has the Model learned?**

**Answer**

**Evaluate the performance of the Model on unseen data (or Testing Data)**

- Evaluation Measures

**Evaluation will be carried out using**

**Error measure**

- Error

**Definition**

**Error is defined as the proportion of incorrectly classified Test instances**

**Formula**

**Error = Number of incorrectly classified Test instances / Total Number of Test instances**

**Note**

**Accuracy = 1 – Error**
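The Error and Accuracy measures can be sketched in a few lines. `error_rate` is an illustrative helper, and the two label lists below are hypothetical, not the chapter's actual test results.

```python
def error_rate(predicted, actual):
    """Error = proportion of incorrectly classified Test instances."""
    wrong = sum(1 for p, a in zip(predicted, actual) if p != a)
    return wrong / len(actual)

# Hypothetical predictions vs. true labels for two Test Examples
predicted = ["Male", "Female"]
actual = ["Male", "Male"]

err = error_rate(predicted, actual)
print("Error =", err, "Accuracy =", 1 - err)  # → Error = 0.5 Accuracy = 0.5
```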

- Evaluate Model

**Given**

**Model**

**Set of 5 Training Examples**

**Distance Measure**

**Euclidean Distance**

**k Nearest Neighbors**

**k = 3**

**Classify Test Examples (unseen instances) using k-NN Algorithm**

**For each Test Example (unseen instance) E**

**Step 1: Compute Euclidean Distance between unseen instance E and all 5 Training Examples**

**Step 2: Identify the 3 nearest neighbors (as k = 3)**

**Step 3: Use the Class labels of the 3 nearest neighbors to determine the Class label of unseen instance E (e.g., by taking a Majority Vote)**

- Euclidean Distance

**Definition**

**The Euclidean Distance between two Points (Examples) is the square root of the sum of the squared differences between their Attribute Values**

**Formula**

**d (p, q) = √((p1 – q1)² + (p2 – q2)² + … + (pn – qn)²)**

**Where p represents Point 01 (Example 01)**

**q represents Point 02 (Example 02)**
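Applying the formula to Examples E1 and E2, before and after scaling (values taken from the Sample Data and the normalized table above), shows how the unscaled Salary attribute swamps the distance:

```python
import math

def euclidean(p, q):
    # d(p, q) = sqrt((p1 - q1)^2 + ... + (pn - qn)^2)
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

# E1 and E2 as (Height, Weight, Salary), unnormalized
e1, e2 = (6.0, 100, 20000), (4.8, 70, 30000)
print(euclidean(e1, e2))   # ≈ 10000.04, dominated by the Salary difference

# The same two Examples after Min-Max Scaling
e1n, e2n = (1.00, 0.47, 0.00), (0.00, 0.16, 0.12)
print(euclidean(e1n, e2n))  # ≈ 1.05, every attribute now contributes
```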

- Example 1 – Computing Euclidean Distance for Unnormalized Data

**Consider the following two (Unnormalized) Training Examples**

**Euclidean Distance between Training Examples E1 and E2 is:**

- Example 2 – Computing Euclidean Distance for Normalized Data

**Consider the following two (Normalized) Training Examples**

**Euclidean Distance between Instances E1 and E2 is**

- Comparison of Normalized and Unnormalized Data

**Considering the previous two Examples**

**Euclidean Distance for Unnormalized Data**

**10000.04**

**Euclidean Distance for Normalized Data**

**1.05**

**Conclusion**

**Normalization / Scaling has a significant impact on the**

**Calculation of Euclidean Distance**

- Predicting Class of Test Example E6 using k-NN Algorithm

**Step 1: Compute Euclidean Distance between Test Example E6 and all 5 Training Examples**

**Computing Euclidean Distance between Training Example E1 and Test Example E6**

**Euclidean Distance between Training Example E1 and Test Example E6**

**Computing Euclidean Distance between Training Example E2 and Test Example E6**

**Euclidean Distance between Training Example E2 and Test Example E6**

**Computing Euclidean Distance between Training Example E3 and Test Example E6**

**Euclidean Distance between Training Example E3 and Test Example E6**

**Computing Euclidean Distance between Training Example E4 and Test Example E6**

**Euclidean Distance between Training Example E4 and Test Example E6**

**Computing Euclidean Distance between Training Example E5 and Test Example E6**

**Euclidean Distance between Training Example E5 and Test Example E6**

- Step 2: Identify k-Nearest Neighbors

**Identify 3-nearest neighbors (as k = 3)**

**Distance of Test Example E6 from all 5 Training Examples**

**d (E6, E1) = 0.44**

**d (E6, E2) = 0.95**

**d (E6, E3) = 0.81**

**d (E6, E4) = 0.48**

**d (E6, E5) = 0.80**

**3-Nearest Neighbors**

**Male (Votes) = 2**

**Female (Votes) = 1**

**Prediction of Test Example E6 (using Majority Vote)**

**Male**

- Remember

**The best way to learn anything is to**

**Do It Yourself**😊

**Whenever you want to learn any concept, Learn it**

**Step by Step**

**Simple to Complex**

**Using a Detailed Example**

- Predicting Class of Test Example E7 using k-NN Algorithm

**Step 1: Compute Euclidean Distance between Test Example E7 and all 5 Training Examples**

**Computing Euclidean Distance between Training Example E1 and Test Example E7**

**Euclidean Distance between Training Example E1 and Test Example E7**

**Computing Euclidean Distance between Training Example E2 and Test Example E7**

**Euclidean Distance between Training Example E2 and Test Example E7**

**Computing Euclidean Distance between Training Example E3 and Test Example E7**

**Euclidean Distance between Training Example E3 and Test Example E7**

**Computing Euclidean Distance between Training Example E4 and Test Example E7**

**Euclidean Distance between Training Example E4 and Test Example E7**

**Computing Euclidean Distance between Training Example E5 and Test Example E7**

**Euclidean Distance between Training Example E5 and Test Example E7**

- Step 2: Identify k-Nearest Neighbors

**Identify 3-nearest neighbors (as k = 3)**

**Distance of Test Example E7 from all 5 Training Examples**

**d (E7, E1) = 1.08**

**d (E7, E2) = 0.43**

**d (E7, E3) = 0.39**

**d (E7, E4) = 1.26**

**d (E7, E5) = 0.24**

**3-Nearest Neighbors**

**Male (Votes) = 0**

**Female (Votes) = 3**

**Prediction of Test Example E7 (using Majority Vote)**

**Female**

- Computing Error

- Application Phase

**We assume that our Model**

**performed well on large Test Data and can be deployed in the Real-world**

**The Model is deployed in the Real-world and now we can make**

**predictions on Real-time Data**

- Steps – Making Predictions on Real-time Data

**Step 1: Take Input from User**

**Step 2: Convert User Input into a Feature Vector**

**Exactly the same as the Feature Vectors of the Training and Testing Data**

**Step 3: Convert User Input into a Normalized Feature Vector**

**Step 4: Compute Euclidean Distance between the Unseen Example and all 5 Training Examples**

**Step 5: Identify the k Nearest Neighbors**

**Step 6: Use the Class labels of the k nearest neighbors to determine the Class label of unseen instance E (e.g., by taking a Majority Vote)**
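Steps 2 to 6 can be strung together as one sketch. The attribute minima and maxima come from the Sample Data; the five normalized Training Examples are the scaled values of E1–E5 computed above, but their Gender labels are an assumption inferred from the worked example (the split tables are not reproduced here), so treat them as illustrative.

```python
import math
from collections import Counter

def predict_gender(user_input, training, mins, maxs, k=3):
    """Steps 2-6: build a feature vector in the same attribute order as the
    Training Examples, Min-Max normalize it with the training minima/maxima,
    then classify it by majority vote of the k nearest neighbors."""
    # Steps 2-3: normalized feature vector (Height, Weight, Salary)
    x = [(v - lo) / (hi - lo) for v, lo, hi in zip(user_input, mins, maxs)]
    # Step 4: Euclidean Distance to every stored Training Example
    dists = [(math.sqrt(sum((a - b) ** 2 for a, b in zip(x, feats))), label)
             for feats, label in training]
    # Step 5: the k nearest neighbors
    neighbors = sorted(dists, key=lambda d: d[0])[:k]
    # Step 6: majority vote over their Class labels
    return Counter(label for _, label in neighbors).most_common(1)[0][0]

# Min / Max of each attribute, taken from the Sample Data
mins, maxs = (4.8, 55, 20000), (6.0, 150, 100000)
# Normalized Training Examples E1..E5; labels are assumed for illustration
training = [((1.00, 0.47, 0.00), "Male"),
            ((0.00, 0.16, 0.12), "Female"),
            ((0.33, 0.26, 0.75), "Female"),
            ((0.42, 1.00, 1.00), "Male"),
            ((0.16, 0.21, 0.37), "Female")]

print(predict_gender((5.2, 70, 30000), training, mins, maxs))  # → Female
```

The key point is that the user input is normalized with the training minima and maxima, never with statistics computed from the new input itself.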

- Making Predictions on Real-time Data

**Step 1: Take input from User**

**Step 2: Convert User Input into a Feature Vector**

**Note that the order of Attributes must be exactly the same as that of the Training and Testing Examples**

**Step 3: Convert User Input into a Normalized Feature Vector**

**Note that the order of Attributes must be exactly the same as that of the Training and Testing Examples**

**Scaling / Normalizing Value of Height for Unseen Example**

**Xnorm = (5.2 – 4.8) / (6.0 – 4.8) = 0.4 / 1.2 = 0.33**

**Scaling / Normalizing Value of Weight for Unseen Example**

**Xnorm = (70 – 55) / (150 – 55) = 15 / 95 = 0.16**

**Scaling / Normalizing Value of Salary for Unseen Example**

**Xnorm = (30000 – 20000) / (100000 – 20000) = 10000 / 80000 = 0.12**

**Step 4: Compute Euclidean Distance between Unseen Example and all 5 Training Examples**

**Computing Euclidean Distance between Training Example E1 and the Unseen Example**

**Euclidean Distance between Training Example E1 and the Unseen Example**

**Computing Euclidean Distance between Training Example E2 and the Unseen Example**

**Euclidean Distance between Training Example E2 and the Unseen Example**

**Computing Euclidean Distance between Training Example E3 and the Unseen Example**

**Euclidean Distance between Training Example E3 and the Unseen Example**

**Computing Euclidean Distance between Training Example E4 and the Unseen Example**

**Euclidean Distance between Training Example E4 and the Unseen Example**

**Computing Euclidean Distance between Training Example E5 and the Unseen Example**

**Euclidean Distance between Training Example E5 and the Unseen Example**

**Step 5: Identify k Nearest Neighbors**

**Identify 3-nearest neighbors (as k = 3)**

**Distance of the Unseen Example from all 5 Training Examples**

**d (E, E1) = 0.74**

**d (E, E2) = 0.34**

**d (E, E3) = 0.63**

**d (E, E4) = 0.85**

**d (E, E5) = 0.31**

**3-Nearest Neighbors**

**Male (Votes) = 0**

**Female (Votes) = 3**

**Step 6: Use the Class labels of the k nearest neighbors to determine the Class label of unseen instance E (e.g., by taking a Majority Vote)**

**Unseen Example E is predicted as**

**Female**

**Note**

**You can take Input from the user, apply the Model, and return predictions as many times as you like 😊**

- Feedback Phase

**Only Allah is Perfect****😊**

**Take Feedback on your deployed Model from**

**Domain Experts and**

**Users**

**Improve your Model based on Feedback 😊**

### TODO and Your Turn


Todo Tasks

**TODO Task 1**

**Task 1**

**Consider the Titanic Dataset with the following Attributes**

**Gender (Numeric)**

**Ticket Class (Numeric)**

**Parent/Child Aboard (Numeric)**

**Annual Income (Numeric)**

**Survival: No, Yes**

**We obtained the following Sample Data**

**Sample Data was split into Training and Testing Data in a**

**Train-Test Split Ratio of 67%-33%**

**Training Data**

**Testing Data**

**Note**

**Consider k-NN Algorithm when answering questions given below**

**Your answer should be**

**Well Justified**

**Questions**

**Write down the Input and Output for the above Machine Learning Problem**

**How is a Training Example represented?**

**Execute the Machine Learning Cycle on the Normalized Sample Data using Euclidean Distance and Min-Max Scaling**

**Classify Test Examples using k = 1, 3 and 5**

**What is the impact of changing the value of k on the classification of Test Examples?**

**Execute the Machine Learning Cycle on the Normalized Sample Data using the Cosine Similarity Measure and Min-Max Scaling**

**Classify Test Examples using k = 1, 3 and 5**

**What is the impact of changing the value of k on the classification of Test Examples?**

**Compare the results obtained (on Test Data) using the Cosine Similarity Measure and Euclidean Distance**

Your Turn Tasks

**Your Turn Task 1**

**Task 1**

**Select a Machine Learning Problem (similar to: Titanic – Machine Learning from Disaster) and answer the questions given below.**

**Note**

**Consider k-NN Algorithm in answering all the questions.**

**Questions**

**Write down the Input and Output for the above Machine Learning Problem**

**How is a Training Example represented?**

**Execute the Machine Learning Cycle on the Normalized Sample Data using Euclidean Distance and Min-Max Scaling**

**Classify Test Examples using k = 1, 3 and 5**

**What is the impact of changing the value of k on the classification of Test Examples?**

**Execute the Machine Learning Cycle on the Normalized Sample Data using the Cosine Similarity Measure and Min-Max Scaling**

**Classify Test Examples using k = 1, 3 and 5**

**What is the impact of changing the value of k on the classification of Test Examples?**

**Compare the results obtained (on Test Data) using the Cosine Similarity Measure and Euclidean Distance**

## Applying k-NN Algorithm on Categorical Data – A Step by Step Example

- Gender Identification Problem

**Machine Learning Problem**

**Gender Identification**

**Input**

**Human**

**Output**

**Gender of a Human**

**Task**

**Given a Human (Input), predict the Gender of the Human (Output)**

**Treated as**

**Instance-based Learning Problem**

- Representation of Examples

**Example = Input + Output**

**Input**

**Human**

**Output**

**Gender**

**Representation of Input and Output**

**Attribute-Value Pair**

- Representation of Input

**Input is represented as a**

**Set of 6 Input Attributes**

**Height**

**Weight**

**HairLength**

**HeadCovered**

**WearingChain**

**ShirtSleeves**

- Representation of Output

**Output is represented as a**

**Set of 1 Output Attribute**

**Gender**

- TIP

**To become a****great****person**

**Always****respect**

**Children**

**Women**

**Old People**

**Olama.e.Karam**

- Sample Data

**We obtained a Set of 7 Examples**

- Split the Sample Data

**We split the Sample Data using the Random Split Approach into**

**Training Data – 2 / 3 of the Sample Data**

**Testing Data – 1 / 3 of the Sample Data**

- Training Data

- Testing Data

- Machine Learning Cycle

**Four phases of a Machine Learning Cycle are**

**Training Phase**

**Build the Model using Training Data**

**Testing Phase**

**Evaluate the performance of the Model using Testing Data**

**Application Phase**

**Deploy the Model in the Real-world, to make predictions on Real-time Unseen Data**

**Feedback Phase**

**Take Feedback from the Users and Domain Experts to improve the Model**

- Training Phase

**In the Training Phase, k-NN Algorithm will**

**Store the Set of 5 Training Examples**

**Recall**

**Training Data**

**Model**

**Stored Set of 5 Training Examples**

**In the next Phase i.e. Testing Phase, Insha Allah, we will**

**Evaluate the performance of the Model**

- Testing Phase

**Question**

**How well has the Model learned?**

**Answer**

**Evaluate the performance of the Model on unseen data (or Testing Data)**

- Evaluation Measures

**Evaluation will be carried out using**

**Error measure**

- Error

**Definition**

**Error is defined as the proportion of incorrectly classified Test instances**

**Formula**

**Error = Number of incorrectly classified Test instances / Total Number of Test instances**

**Note**

**Accuracy = 1 – Error**

- Evaluate Model

**Given**

**Model**

**Set of 5 Training Examples**

**Distance Measure**

**Hamming Distance**

**k Nearest Neighbors**

**k = 3**

**Classify Test Examples (unseen instances) using k-NN Algorithm**

**For each Test Example (unseen instance) E**

**Step 1: Compute Hamming Distance between unseen instance E and all 5 Training Examples**

**Step 2: Identify the 3 nearest neighbors (as k = 3)**

**Step 3: Use the Class labels of the 3 nearest neighbors to determine the Class label of unseen instance E (e.g., by taking a Majority Vote)**

- Hamming Distance

**Definition**

**The Hamming Distance between two instances with the same number of Attributes is the Number of Attributes for which**

**the Attribute-Values of the two instances are different**

**Example**

**Hamming Distance between Instances E1 and E2 is**

**d (E1, E2) = 1**
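A minimal sketch of Hamming Distance over categorical attribute values; the two instances below use hypothetical values for four of the chapter's attributes, chosen so that they differ in exactly one position, as in the example above.

```python
def hamming_distance(e1, e2):
    """Count the attribute positions at which the two instances differ.
    Both instances must have the same number of attributes."""
    assert len(e1) == len(e2), "instances must have the same attributes"
    return sum(1 for a, b in zip(e1, e2) if a != b)

# Hypothetical instances: (HairLength, HeadCovered, WearingChain, ShirtSleeves)
e1 = ("Long", "Yes", "Yes", "Full")
e2 = ("Long", "Yes", "Yes", "Half")
print(hamming_distance(e1, e2))  # → 1
```

Unlike Euclidean Distance, no scaling is needed here: every attribute contributes either 0 or 1 to the total, so no single attribute can dominate.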

- Predicting Class of Test Example E6 using k-NN Algorithm

**Step 1: Compute Hamming Distance between Test Example E6 and all 5 Training Examples**

**Computing Hamming Distance between Training Example E1 and Test Example E6**

**Hamming Distance between Training Example E1 and Test Example E6**

**d (E6, E1) = 3**

**Computing Hamming Distance between Training Example E2 and Test Example E6**

**Hamming Distance between Training Example E2 and Test Example E6**

**d (E6, E2) = 4**

**Computing Hamming Distance between Training Example E3 and Test Example E6**

**Hamming Distance between Training Example E3 and Test Example E6**

**d (E6, E3) = 3**

**Computing Hamming Distance between Training Example E4 and Test Example E6**

**Hamming Distance between Training Example E4 and Test Example E6**

**d (E6, E4) = 4**

**Computing Hamming Distance between Training Example E5 and Test Example E6**

**Hamming Distance between Training Example E5 and Test Example E6**

**d (E6, E5) = 4**

- Step 2: Identify k-Nearest Neighbors

**Identify 3-nearest neighbors (as k = 3)**

**Distance of Test Example E6 from all 5 Training Examples**

**d (E6, E1) = 3**

**d (E6, E2) = 4**

**d (E6, E3) = 3**

**d (E6, E4) = 4**

**d (E6, E5) = 4**

**3-Nearest Neighbors**

**Male (Votes) = 1**

**Female (Votes) = 2**

- Step 3: Use Class labels of 3 nearest neighbors to determine the Class label of unseen instance E (e.g., by taking Majority Vote)

**Prediction of Test Example E6 (using Majority Vote)**

**Female**
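Steps 1 to 3 above can be combined into a short Python sketch. The distances are those computed for E6 in Step 1; the class labels of the five training examples are assumed here for illustration, and ties in distance are broken by example order:

```python
from collections import Counter

def knn_predict(distances, labels, k=3):
    """Majority vote over the k training examples with the smallest distances."""
    nearest = sorted(range(len(distances)), key=lambda i: distances[i])[:k]
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Distances of Test Example E6 to E1..E5, as computed in Step 1
distances = [3, 4, 3, 4, 4]
# Hypothetical class labels of the five training examples
labels = ["Female", "Female", "Male", "Female", "Male"]
print(knn_predict(distances, labels, k=3))  # → Female
```

With these assumed labels the 3 nearest neighbors (E1, E3, E2) vote Female twice and Male once, so the majority vote returns Female.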

- Remember

**The best way to learn anything is to**

**Do It Yourself** 😊

**Whenever you want to learn any concept, Learn it**

**Step by Step**

**Simple to Complex**

**Using a Detailed Example**

- Predicting Class of Test Example E7 using k-NN Algorithm

**Step 1: Compute Hamming Distance between Test Example E7 and all 5 Training Examples**

**Computing Hamming Distance between Training Example E1 and Test Example E7**

**d (E7, E1) = 3**

**Computing Hamming Distance between Training Example E2 and Test Example E7**

**d (E7, E2) = 2**

**Computing Hamming Distance between Training Example E3 and Test Example E7**

**d (E7, E3) = 4**

**Computing Hamming Distance between Training Example E4 and Test Example E7**

**d (E7, E4) = 2**

**Computing Hamming Distance between Training Example E5 and Test Example E7**

**d (E7, E5) = 4**

- Step 2: Identify k-Nearest Neighbors

**Identify 3-nearest neighbors (as k = 3)**

**Distance of Test Example E7 from all 5 Training Examples**

**d (E7, E1) = 3**

**d (E7, E2) = 2**

**d (E7, E3) = 4**

**d (E7, E4) = 2**

**d (E7, E5) = 4**

**3-Nearest Neighbors**

**Male (Vote) = 0**

**Female (Vote) = 3**

**Prediction of Test Example E7 (using Majority Vote)**

**Female**
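The same majority-vote sketch applies to E7, using the distances computed in Step 1. As before, the class labels of the five training examples are assumed for illustration:

```python
from collections import Counter

def knn_predict(distances, labels, k=3):
    """Majority vote over the k training examples with the smallest distances."""
    nearest = sorted(range(len(distances)), key=lambda i: distances[i])[:k]
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]

# Distances of Test Example E7 to E1..E5, as computed in Step 1
distances = [3, 2, 4, 2, 4]
# Hypothetical class labels of the five training examples
labels = ["Female", "Female", "Male", "Female", "Male"]
print(knn_predict(distances, labels, k=3))  # → Female
```

Here the 3 nearest neighbors (E2, E4, E1) all vote Female, so the prediction is unanimous.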

- Computing Error

- Application Phase

**We assume that our Model performed well on large Test Data and can be deployed in the Real-world**

**The Model is deployed in the Real-world and now we can make predictions on Real-time Data**

- Steps – Making Predictions on Real-time Data

**Step 1: Take Input from User**

**Step 2: Convert User Input into a Feature Vector**

**Exactly the same as the Feature Vectors of Training and Testing Data**

**Step 3: Compute Hamming Distance between Unseen Example and all 5 Training Examples**

**Step 4: Identify k Nearest Neighbors**

**Step 5: Use Class labels of k nearest neighbors to determine the Class label of unseen instance E (e.g., by taking Majority Vote)**
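The five steps above can be sketched end to end in Python. Everything here is illustrative: the attribute names, their order, and the training data are hypothetical stand-ins for the real dataset:

```python
from collections import Counter

def hamming(a, b):
    """Number of attribute positions where two feature vectors differ."""
    return sum(x != y for x, y in zip(a, b))

def predict(user_vector, training_data, k=3):
    """Steps 3-5: compute distances to all training examples, then majority vote."""
    ranked = sorted(training_data, key=lambda ex: hamming(user_vector, ex[0]))
    return Counter(label for _, label in ranked[:k]).most_common(1)[0][0]

# Steps 1-2: user input converted to a feature vector with the same
# attribute order as the training examples (all values hypothetical)
training_data = [
    (("Short", "Black", "Brown"), "Male"),
    (("Tall", "Blonde", "Blue"), "Female"),
    (("Tall", "Black", "Brown"), "Male"),
    (("Short", "Blonde", "Blue"), "Female"),
    (("Tall", "Blonde", "Brown"), "Female"),
]
user_vector = ("Short", "Blonde", "Blue")
print(predict(user_vector, training_data, k=3))  # → Female
```

The model is just the stored `training_data` list, so this `predict` function can be called as many times as new user inputs arrive.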

- Making Predictions on Real-time Data

**Step 1: Take input from User**

**Step 2: Convert User Input into a Feature Vector**

**Note that the order of Attributes must be exactly the same as that of the Training and Testing Examples**

- Making Predictions on Real-time Data

**Step 3: Compute Hamming Distance between Unseen Example and all 5 Training Examples**

**Computing Hamming Distance between Training Example E1 and Unseen Example**

**d (E, E1) = 1**

**Computing Hamming Distance between Training Example E2 and Unseen Example**

**d (E, E2) = 0**

**Computing Hamming Distance between Training Example E3 and Unseen Example**

**d (E, E3) = 2**

**Computing Hamming Distance between Training Example E4 and Unseen Example**

**d (E, E4) = 2**

**Computing Hamming Distance between Training Example E5 and Unseen Example**

**d (E, E5) = 2**

- Step 4: Identify k-Nearest Neighbors

**Identify 3-nearest neighbors (as k = 3)**

**Distance of Unseen Example from all 5 Training Examples**

**d (E, E1) = 1**

**d (E, E2) = 0**

**d (E, E3) = 2**

**d (E, E4) = 2**

**d (E, E5) = 2**

**3-Nearest Neighbors**

**Male (Vote) = 1**

**Female (Vote) = 2**

- Step 5: Use Class labels of 3 nearest neighbors to determine the Class label of unseen instance E (e.g., by taking Majority Vote)

**The Unseen Example is predicted as**

**Female**

**Note**

**You can take Input from the user, apply the Model and return predictions as many times as you like** 😊

- Feedback Phase

**Only Allah is Perfect** 😊

**Take Feedback on your deployed Model from**

**Domain Experts and**

**Users**

**Improve your Model based on the Feedback** 😊

- Strengths and Weaknesses – k-NN Algorithm

**Strengths**

**Very simple implementation**

**Robust with respect to the search space**

**For example, Classes don’t have to be linearly separable**

**k-NN Algorithm can be easily updated with new Training Examples (Online Learning) at very little cost**

**Need to tune only two main parameters**

**Distance Metric and**

**Value of k**

**Weaknesses**

**Testing each instance is expensive**

**Possible Solution**

**R-tree or kd-tree approaches can be used to speed up the identification of k nearest neighbors**
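For numeric attributes, a k-d tree indexes the training points once so that each query inspects only a fraction of them instead of all n. A minimal sketch using SciPy's `cKDTree` (assumes SciPy is installed; the data here is random and purely illustrative — note that k-d trees work on numeric/metric spaces, so categorical Hamming data needs a different index structure):

```python
import numpy as np
from scipy.spatial import cKDTree

# 1000 hypothetical training examples with 4 numeric attributes
rng = np.random.default_rng(seed=0)
train = rng.random((1000, 4))

tree = cKDTree(train)                       # build the index once
dist, idx = tree.query(rng.random(4), k=3)  # 3 nearest neighbors of one query
print(idx)                                  # indices of the 3 nearest training examples
```

After the indices are found, the majority vote over their class labels proceeds exactly as in the brute-force version.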

**Sensitive to noisy or irrelevant Attributes, which can result in less meaningful Distance scores**

**Possible Solution**

**Scaling and/or Feature Selection can be used to resolve this issue**

**Sensitivity to highly unbalanced datasets**

**Possible Solution**

**Use a balanced dataset for k-NN Algorithm**

- Offline Learning vs. Online Learning

**Offline Learning**

**You learn a Concept from a static dataset**

**Online Learning**

**You learn a Concept from data and keep on updating your Hypothesis (h) as more data becomes available for the same Concept**
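Because k-NN's "training" amounts to storing the examples themselves, Online Learning for k-NN is just an append. A minimal sketch with hypothetical examples:

```python
# k-NN stores its training examples verbatim, so Online Learning is
# simply appending new (feature vector, label) pairs as they arrive
memory = [(("Short", "Black"), "Male")]  # hypothetical initial example

def update(feature_vector, label):
    memory.append((feature_vector, label))  # O(1) per new Training Example

update(("Tall", "Blonde"), "Female")
print(len(memory))  # → 2
```

This is why the slide above lists cheap online updates as a strength of k-NN: no model has to be retrained when new data arrives.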

### TODO and Your Turn


Todo Tasks

**TODO Task 2**

**Task 1**

**Consider the Titanic Dataset with the following Attributes**

**Gender: Male, Female**

**Ticket Class: Upper, Middle, Lower**

**Parents/Children Aboard: Zero, One, Two, Three**

**Embarked: Cherbourg, Queenstown, Southampton**

**Survival: No, Yes**

**We obtained the following Sample Data**

**Sample Data was split into Training and Testing Data in a Train-Test Split Ratio of 80%-20%**

**Training Data**

**Testing Data**

**Note**

**Consider k-NN Algorithm when answering questions given below**

**Your answer should be**

**Well Justified**

**Questions**

**Write down the Input and Output for the above Machine Learning Problem.**

**How is a Training Example represented?**

**Execute the Machine Learning Cycle using the Hamming Distance measure.**

**Classify Test Examples using k = 1, 3 and 5**

**What is the impact of changing the value of k on the classification of Test Examples?**

Your Turn Tasks

**Your Turn Task 2**

**Task 1**

**Select a Machine Learning Problem (similar to: Titanic – Machine Learning from Disaster) and answer the questions given below.**

**Note**

**Consider k-NN Algorithm in answering all the questions.**

**Questions**

**Write down the Input and Output for the above Machine Learning Problem.**

**How is a Training Example represented?**

**Execute the Machine Learning Cycle using the Hamming Distance measure.**

**Classify Test Examples using k = 1, 3 and 5**

**What is the impact of changing the value of k on the classification of Test Examples?**

## Chapter Summary

- Chapter Summary

**Machine Learning Algorithms****studied so far****are**

**FIND-S Algorithm**

**List Then Eliminate Algorithm**

**Candidate Elimination Algorithm**

**ID3 Algorithm (Decision Tree Learning)**

**Perceptron (Two Layer Artificial Neural Network)**

**Regular Neural Network (Multi-layer Artificial Neural Network)**

**Naïve Bayes Algorithm (Bayesian Learning Algorithm)**

**The above-mentioned ML Algorithms are called Eager Methods**

**How Do Eager ML Algorithms Work?**

**Given**

**Set of Training Examples (D)**

**Set of Functions / Hypotheses (H)**

**Training Phase**

## In Next Chapter

- In Next Chapter

- In Sha Allah, in the next Chapter, I will present

- Evaluating Hypothesis (Models)