Chapter 15 - Book Main Findings, Conclusion and Future Work
Chapter Outline
- Chapter Outline
- Quick Recap
- Main Findings and Conclusion of the Book
- Future Work and Feedback
Quick Recap
- Quick Recap – Evaluating Hypothesis (Models)
- To completely and correctly learn any task, follow the Learning Cycle
- The four main phases of a Learning Cycle to completely and correctly learn any task are
- Training Phase
- Testing Phase
- Application Phase
- Feedback Phase
- The main goal of building a Model (Training / Learning) is to use it in doing Real-world Tasks with good Accuracy
- No one can perfectly predict that a Model which performs well in the Training Phase will also perform well in the Real-world
- However, before deploying a Model in the Real-world, it is important to know
- How well will it perform on unseen Real-time Data?
- To judge / estimate the performance of a Model (or h) in Real-world (Application Phase)
- Evaluate the Model (or h) on large Test Data (Testing Phase)
- If (Model Performance = Good AND Test Data = Large)
- Then
Use the Model in Real-world
- Else
Refine (re-train) the Model
- Recall – Machine Learning Assumption
- If a Model (or h) performs well on large Test Data, it will also perform well on unseen Real-world Data
- Again, this is an assumption; we are not 100% sure that a Model which performs well on large Test Data will definitely perform well on Real-world Data
- Therefore, it is useful to take continuous Feedback on deployed Model (Feedback Phase) and keep on improving it
- Two main advantages of Evaluating Hypothesis / Model are
- We get the answer to an important question i.e.
- Should we rely on predictions of Hypothesis / Model when deployed in Real-world?
- Machine Learning Algorithms may rely on Evaluation to refine Hypothesis (h)
- When we evaluate a Hypothesis h, we want to know
- How accurately will it classify future unseen instances?
- i.e. Estimate of Error (EoE)
- How accurate is our Estimate of Error (EoE)?
- I.e. what Margin of Error (±? %) is associated with our Estimate of Error (EoE)? (we call it Error in Estimate of Error (EoE))
- True Error
- Error computed on entire Population
- Sample Error
- Error computed on Sample Data
- Since we cannot acquire entire Population
- Therefore, we cannot calculate True Error
- Calculate Sample Error in such a way that
- Sample Error estimates True Error well
- Statistical Theory tells us that Sample Error can estimate True Error well if the following two conditions are fulfilled
- Condition 01 – the n instances in Sample S are drawn
- independently of one another
- independently of h
- according to Probability Distribution D
- Condition 02
- n ≥ 30
- Estimate of Sample Error (EoSE) can be calculated as follows
- Step 1: Randomly select a Representative Sample S from the Population
- Step 2: Calculate Sample Error on Representative Sample S drawn in Step 1
- Estimate cannot be perfect and will contain Error
- Therefore, calculate Error in Estimate of Sample Error (EoSE)
- Error in Estimate of Sample Error (EoSE) can be calculated using Confidence Interval
- The Most Probable Value of True Error is Sample Error
- With approximately N% Probability (Confidence Level), True Error lies in the interval
- error_S(h) ± z_N × √( error_S(h) × (1 − error_S(h)) / n )
- where n represents size of Sample S
- error_S(h) represents Sample Error
- z_N represents the constant corresponding to the chosen Confidence Level (e.g., z_N = 1.96 for 95%)
- The choice of Confidence Level depends on the field of study
- Generally, the most common Confidence Level used by Researchers is 95%
- Our goal is to have a
- Small Interval with High Confidence
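The confidence-interval calculation above can be sketched in a few lines of Python. This is a minimal illustration with made-up numbers (12 errors out of 240 test instances); z = 1.96 is the standard constant for a 95% Confidence Level.

```python
import math

def confidence_interval(sample_error, n, z=1.96):
    """Approximate N% Confidence Interval for the True Error, given the
    Sample Error measured on n test instances (requires n >= 30).
    z = 1.96 corresponds to the common 95% Confidence Level."""
    margin = z * math.sqrt(sample_error * (1 - sample_error) / n)
    return (sample_error - margin, sample_error + margin)

# Toy example: a Model misclassifies 12 of 240 test instances.
low, high = confidence_interval(sample_error=12 / 240, n=240)
print(f"True Error lies in [{low:.3f}, {high:.3f}] with ~95% confidence")
```

Note how the interval shrinks as n grows: a larger Test Set gives a Small Interval with High Confidence, which is exactly the stated goal.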
- The two main diseases in Machine Learning are
- Overfitting
- Underfitting
- The condition when a Machine Learning Algorithm tries to remember all the Training Examples from the Training Data (Rote Learning) is known as Overfitting of the Model (h)
- Overfitting happens when our
- Model (h) has lot of features or
- Model (h) is too complex
- The condition when a Machine Learning Algorithm cannot properly learn the correlations between Attributes / Features is known as Underfitting of the Model (h)
- Underfitting happens when our
- Model misses the trends or patterns in the Training Data and cannot generalize well even for the Training Examples
- To overcome the problems of Overfitting and Underfitting, we
- Use Train-Test Split Approach
- Train-Test Split Approach, splits the Sample Data into two sets: (1) Train Set and (2) Test Set
- Two main variations of Train-Test Split Approach are
- Random Split Approach
- Class Balanced Split Approach
- Class Balanced Split Approach should be preferred over Random Split Approach
- Train-Test Split Ratio determines what percentage of the Sample Data will be used as Train Set and what percentage of the Sample Data will be used as Test Set
- The Train-Test Split Ratio may vary from Machine Learning Problem to Machine Learning Problem
- e.g. 70%-30%, 80%-20%, 90%-10% etc.
- Most Common Train-Test Split Ratio
- Use 2 / 3 of Sample Data as Train Set
- Use 1 / 3 of Sample Data as Test Set
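The Class Balanced Split Approach can be sketched as follows. This is a simplified illustration, not a production implementation: the function name `class_balanced_split` and the toy data are our own, and the sketch simply splits each class separately so the 2/3 - 1/3 ratio is preserved per class.

```python
import random
from collections import defaultdict

def class_balanced_split(examples, labels, test_fraction=1/3, seed=42):
    """Class Balanced Split: split each class separately so that the
    Train Set and Test Set preserve the class proportions of the Sample Data."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for x, y in zip(examples, labels):
        by_class[y].append((x, y))
    train, test = [], []
    for items in by_class.values():
        rng.shuffle(items)                      # random selection within each class
        cut = int(len(items) * test_fraction)   # 1/3 of this class goes to Test Set
        test.extend(items[:cut])
        train.extend(items[cut:])
    return train, test

# Toy unbalanced data: 9 instances of class "A", 3 of class "B"
X = list(range(12))
y = ["A"] * 9 + ["B"] * 3
train, test = class_balanced_split(X, y)
print(len(train), len(test))  # 8 4 -- the 2/3 - 1/3 ratio holds for each class
```

A plain Random Split on the same data could, by chance, put all three "B" instances in the Test Set, which is why the Class Balanced Split Approach is preferred.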
- Train-Test Split Approach provides high variance in Estimate of Sample Error since
- Changing which examples happen to be in the Train Set can significantly change the Sample Error
- To overcome the problem of high variance in Estimate of Sample Error calculated using Train-Test Split Approach, we
- Use K-fold Cross Validation Approach
- K-fold Cross Validation Approach works as follows
- Step 1: Split Train Set into K equal folds (or partitions)
- Step 2: Use one of the folds (kth fold) as the Test Set and union of remaining folds (k – 1 folds) as Training Set
- Step 3: Calculate Error of Model (h)
- Step 4: Repeat Steps 2 and 3, choosing Train and Test Sets from different folds, to calculate the Error K times
- Step 5: Calculate Average Error
- Important Note
- In each fold, there must be at least 30 instances
- All K folds must be disjoint i.e. an instance appearing in one fold must not appear in any other fold
- Empirical studies have shown that the best value for
- K is 10
- K-fold Cross Validation Approach is a better estimator of Error since
- All data is used for both Training and Testing
- K-fold Cross Validation Approach is computationally expensive since
- we have to repeat Training and Testing Phases K-times
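Steps 1 - 5 of the K-fold Cross Validation Approach can be sketched as below. This is a minimal sketch: `error_fn` is a hypothetical placeholder standing in for "train a Model on the Train Set and measure its Error on the held-out fold", and the closing example just verifies the fold bookkeeping with a dummy error function.

```python
def k_fold_indices(n, k=10):
    """Step 1: split indices 0..n-1 into k disjoint folds.
    Every instance appears in exactly one fold."""
    folds = [[] for _ in range(k)]
    for i in range(n):
        folds[i % k].append(i)
    return folds

def cross_validate(n, k, error_fn):
    """Steps 2-5: use each fold once as the Test Set, the union of the
    remaining k-1 folds as the Train Set, and average the k Errors.
    error_fn(train_idx, test_idx) is a placeholder for training a Model
    and calculating its Error on the held-out fold."""
    folds = k_fold_indices(n, k)
    errors = []
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        errors.append(error_fn(train_idx, test_idx))
    return sum(errors) / k  # Step 5: Average Error

# Dummy error function: returns the fraction of data held out per fold.
avg = cross_validate(n=300, k=10, error_fn=lambda tr, te: len(te) / 300)
print(avg)  # each of the 10 folds holds 30 of the 300 instances -> 0.1
```

With n = 300 and K = 10, each fold holds 30 instances, satisfying the "at least 30 instances per fold" rule above.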
- It is suitable to use Train-Test Split Approach in the following situations
- When Training Time is Very Large
- Organizing International Competitions
- Having Very Huge Sample Data
- It is suitable to use K-fold Cross Validation Approach in the following situations
- When Training Time is Not Very Large
- Having Sample Data that is Not Very Huge
- To compare various Machine Learning Algorithms, following things must be same
- Train Set
- Test Set
- Evaluation Measure
- Evaluation Methodology
- Important Note
- If any of the above things is not the same then it will
- Not be a valid comparison
- Some of the most popular and widely used Evaluation Measures for Classification Problems are
- Baseline Accuracy
- Accuracy
- Precision
- Recall
- F1
- Area Under the Curve (AUC)
- Baseline Accuracy (a.k.a. Majority Class Categorization (MCC)) is calculated by assigning the label of Majority Class to all the Test Instances
- BA provides a simple baseline to compare proposed Machine Learning Algorithms
- Accuracy is defined as the proportion of correctly classified Test instances
- Accuracy = 1 – Error
- Accuracy evaluation measure is more suitable to use for evaluation of Machine Learning Algorithms when we have
- Balanced Data
- Two main limitations of Accuracy measure are
- Accuracy fails to accurately evaluate a Machine Learning Algorithm when Test Data is highly unbalanced
- Accuracy ignores possibility of different misclassification costs
- To overcome the limitations of Accuracy measure, we use
- Confusion Matrix
- A Confusion Matrix is a table used to describe the performance of a Classification Model (or Classifier) on a Set of Test Examples (Test Data), whose Actual Values (or True Values) are known
- Some of the main advantages of the Confusion Matrix are
- Confusion Matrix allows us to visualize the performance of a Model / Classifier
- Confusion Matrix allows us to separately get insights into the Errors made for each Class
- Confusion Matrix gives insights to both
- Errors made by a Model / Classifier and
- Types of Errors made by a Model / Classifier
- Confusion Matrix allows us to compute many different Evaluation Measures including
- Baseline Accuracy
- Accuracy
- True Positive Rate (or Recall)
- True Negative Rate
- False Positive Rate
- False Negative Rate
- Precision
- F1
- Recall or True Positive Rate (TPR) or Sensitivity is the proportion of Positive cases that were correctly classified
- False Positive Rate (FPR) is the proportion of Negative cases that were incorrectly classified as Positive
- True Negative Rate (TNR) or Specificity is defined as the proportion of Negative cases that were classified correctly
- False Negative Rate (FNR) is the proportion of Positive cases that were incorrectly classified as Negative
- Precision (P) is the proportion of the predicted Positive cases that were correct
- F-measure is the Harmonic Mean of Precision and Recall
- F_β = ((1 + β²) × Precision × Recall) / ((β² × Precision) + Recall)
- where β controls the relative weight assigned to Precision and Recall
- When we assign the same weight to Precision and Recall, i.e. β = 1, the F-measure becomes the F1-measure
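All of the Evaluation Measures above can be computed from the four cells of a binary Confusion Matrix. The sketch below uses made-up cell counts (TP = 40, FP = 10, FN = 20, TN = 30) purely for illustration; the function name is our own.

```python
def classification_metrics(tp, fp, fn, tn, beta=1.0):
    """Compute the standard Evaluation Measures from the four cells of a
    binary Confusion Matrix (True/False Positives and Negatives)."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    recall = tp / (tp + fn)        # True Positive Rate / Sensitivity
    tnr = tn / (tn + fp)           # True Negative Rate / Specificity
    fpr = fp / (fp + tn)           # False Positive Rate
    fnr = fn / (fn + tp)           # False Negative Rate
    precision = tp / (tp + fp)
    # F-measure: weighted Harmonic Mean of Precision and Recall (beta=1 gives F1)
    f_beta = ((1 + beta**2) * precision * recall) / (beta**2 * precision + recall)
    return {"Accuracy": accuracy, "Recall": recall, "TNR": tnr,
            "FPR": fpr, "FNR": fnr, "Precision": precision, "F": f_beta}

m = classification_metrics(tp=40, fp=10, fn=20, tn=30)
print(f"Accuracy={m['Accuracy']:.2f} Precision={m['Precision']:.2f} "
      f"Recall={m['Recall']:.2f} F1={m['F']:.3f}")
```

Note that Accuracy (0.70) hides the asymmetry that Precision (0.80) and Recall (0.67) reveal — which is exactly why the Confusion Matrix is preferred on unbalanced data.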
- When considering Misclassification Costs
- Misclassifying (or incorrectly predicting) Positive cases may be more or less costly than misclassifying Negative cases
- Before Training your Model, you must be very clear about the Misclassification Costs, otherwise
- Your Model will fail to perform well in Real-world (i.e. Application Phase)
- The problem of considering different misclassification costs can be handled using
- ROC Curves
- Precision-Recall Curves
- ROC Curve summarizes the trade-off between the True Positive Rate (TPR) and False Positive Rate (FPR) for a Classifier / Model using different Probability Thresholds
- ROC Curves are more suitable to use when we have Balanced Data
- Precision-Recall Curve summarizes the trade-off between Precision and Recall (or True Positive Rate) for a Classifier / Model using different Probability Thresholds
- Precision-Recall Curves are more suitable to use when we have Highly Unbalanced Data
- Area Under the ROC Curve (AUC) is defined as the probability that a randomly chosen Positive instance is ranked above a randomly selected Negative one
- AUC tells how capable the Model is of distinguishing between Classes
- AUC score is computed using True Positive Rate (TPR) and False Positive Rate (FPR)
- Range of AUC Score
- Range of AUC Score is [0 – 1]
- 0 means All Predictions of Model are Wrong
- 1 means All Predictions of Model are Correct
- What is a good AUC Score?
- AUC Score = 0.5
- suggests No Discrimination (i.e., the Model does not have the ability to differentiate Positive instances from Negative ones)
- AUC Score = 0.7 to 0.8
- considered Acceptable
- AUC Score = 0.8 to 0.9
- considered Excellent
- AUC Score = Greater Than 0.9
- considered Outstanding
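The probabilistic definition of AUC above can be implemented directly: compare every Positive instance's score with every Negative instance's score and count how often the Positive is ranked higher. This is a small sketch with made-up scores; real implementations integrate under the ROC curve, but both views give the same number.

```python
def auc_score(labels, scores):
    """AUC as defined above: the probability that a randomly chosen
    Positive instance (label 1) receives a higher score than a randomly
    chosen Negative one (label 0); ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: 3 Positive and 2 Negative instances with predicted scores.
labels = [1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.3, 0.4, 0.2]
auc = auc_score(labels, scores)
print(auc)  # 5 of the 6 positive-negative pairs are ranked correctly -> ~0.833
```

An AUC of about 0.83 would fall in the "Excellent" band of the scale above.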
- Some of the most popular and widely used Evaluation Measures for Regression Problems are
- Mean Absolute Error (MAE)
- Mean Squared Error (MSE)
- Root Mean Squared Error (RMSE)
- R2 or Coefficient of Determination
- Adjusted R2
- Absolute Error (AE) is the difference between the Actual Value and the Predicted Value
- Mean Absolute Error (MAE) is the average of all Absolute Errors
- Square Error (SE) is the Square of difference between the Actual Value and the Predicted Value
- Mean Square Error (MSE) is the average of all Square Errors
- Root Mean Square Error (RMSE) is the Square root of the Mean Square Error
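The three Regression measures defined above can be computed directly from their definitions. The sketch below uses made-up Actual and Predicted values purely for illustration.

```python
import math

def regression_errors(actual, predicted):
    """MAE, MSE and RMSE as defined above."""
    abs_errors = [abs(a - p) for a, p in zip(actual, predicted)]   # Absolute Errors
    sq_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]  # Square Errors
    mae = sum(abs_errors) / len(abs_errors)   # average of all Absolute Errors
    mse = sum(sq_errors) / len(sq_errors)     # average of all Square Errors
    rmse = math.sqrt(mse)                     # square root of the MSE
    return mae, mse, rmse

mae, mse, rmse = regression_errors(actual=[3.0, 5.0, 2.0],
                                   predicted=[2.5, 5.5, 4.0])
print(f"MAE={mae:.2f} MSE={mse:.2f} RMSE={rmse:.3f}")
```

Note how the single large error (2.0 on the third instance) inflates MSE and RMSE far more than MAE: squaring penalizes large errors, which is the main practical difference between the measures.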
- Some of the most popular and widely used Evaluation Measures for Sequence-to-Sequence Problems are
- ROUGE (Recall-Oriented Understudy for Gisting Evaluation)
- BLEU (Bi-Lingual Evaluation Understudy)
- METEOR (Metric for Evaluation of Translation with Explicit Ordering)
- ROUGE is a de facto standard to automatically evaluate the performance of Text Summarization Systems
- Normally, we calculate
- ROUGE-1
- ROUGE-2
- ROUGE-L
- Average F1 scores are reported for ROUGE-1, ROUGE-2 and ROUGE-L metrics
- BLEU is a de facto standard to automatically evaluate the performance of Machine Translation Systems
- Normally, we calculate
- BLEU-1
- BLEU-2
- BLEU-3
- BLEU-4
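To make the idea behind these measures concrete, here is a deliberately simplified ROUGE-1 sketch: unigram overlap between a system summary and a reference summary, reported as an F1 score. This is our own toy illustration, not the official ROUGE implementation (which adds stemming, stopword handling, and multi-reference support); ROUGE-2 and BLEU-n work analogously on n-gram overlaps.

```python
from collections import Counter

def rouge_1_f1(candidate, reference):
    """Simplified ROUGE-1: unigram (single-word) overlap between a
    candidate summary and a reference summary, as an F1 score."""
    cand = Counter(candidate.split())
    ref = Counter(reference.split())
    overlap = sum((cand & ref).values())  # clipped matching word counts
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())  # matches / candidate length
    recall = overlap / sum(ref.values())      # matches / reference length
    return 2 * precision * recall / (precision + recall)

score = rouge_1_f1("the cat sat on the mat", "the cat is on the mat")
print(f"ROUGE-1 F1 = {score:.3f}")
```

Reporting the average of such F1 scores over a whole test collection is what "Average F1 scores are reported for ROUGE-1, ROUGE-2 and ROUGE-L" refers to above.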
Main Findings and Conclusion of the Book
- Finding 1 - Life = Technical Skills (15%) + Human Engineering (85%)
- To achieve excellence in Technical Skills, learn Technical Skills by following the best learning methodology, i.e.,
- DO IT YOURSELF
- To achieve excellence in Human Engineering, never ever compromise on 3 pillars of Human Engineering
- Truth
- Honesty
- Justice
- Finding 2 – Learn to Write
- My Respected Teacher advised me
آپ لکھنا سیکھیں‘ لکھنے سے انسانی دماغ میں اضافہ ہوتا ہے‘ آپ میں سوچنے سمجھنے کی اہلیت بڑھتی ہے اور سوچنے سمجھنے والے لوگ کبھی غریب نہیں رہتے۔
- (Translation: Learn to write. Writing enhances the human brain. It increases your ability to think and understand. People who think and understand are never poor.)
- To enhance your Intellectual Thinking, first completely and correctly understand the Real-world Problem and then plan, design, and develop high-quality Python Software to solve the given Real-world Problem
- Finding 3 – Completely and Correctly Solving any Real-world Problem
- To completely and correctly solve any Real-world Problem, follow the following five-step process
- Plan – in Mind
- Design – on Paper
- Execute – at Prototype Level
- Execute – at Full Scale
- Feedback – from audience and domain experts for further improvement
- Finding 4 – Nothing to Lose in Life: Stay Happy and Motivated 😊
- Advice of My Respected Teacher
- When you do any Real-world Task,
- Put Your 100% Effort with Sincerity without bothering about the Results
- 100% Effort with Sincerity ==> Getting Desired Results
- It will Double Your Confidence
- 100% Effort with Sincerity ==> NOT Getting Desired Results
- It will Double Your Experience
- Therefore, Nothing to Lose 😊
- Keep Smiling and Stay Motivated 😊
- Finding 5 – Completely and Correctly Execute Machine Learning Cycle
- To develop high-quality Models, it is important to completely and correctly execute the Machine Learning Cycle
- Finding 6 – Making Decisions based on Most Suitable Solutions
- There can be multiple solutions (Machine Learning Algorithms) against a Real-world Problem to improve quality of human life
- Decisions change our lives; identifying the Most Suitable Solutions for various Real-world Problems will improve the overall quality of our lives In Sha Allah
- Finding 7 – Completely and Correctly Understand Core Concepts of Machine Learning
- To plan, design, and develop high-quality Python Software, it is important to completely and correctly understand core concepts of Machine Learning and how they work together to build high-quality Models, including
- Understand Data
- Transforming Data into a format which Machine Learning Algorithms can understand and learn from
- Select Most Suitable Machine Learning Algorithms for a given Machine Learning Problem
- Use Most Suitable Evaluation Measures to evaluate the performance of Models (before deploying them in the Real-world)
- Finding 8 – Use a Template-based Approach to solve any Real-world Problem
- There is a tradeoff between Speed and Accuracy
- To solve Real-world Problems with both Speed and Accuracy, plan, design and develop a Template-based Approach, using a combination of the following approaches
- Systematic Thinking Approach
- Identifying Most Suitable Solution Approach
- Completeness and Correctness Approach
- Divide and Conquer Approach
- Half-Cooked Approach
- Structured Thinking Approach
- Inverted Triangle Approach
- 100% Effort with Sincerity Approach
Conclusion of the Book
- Conclusion of the Book
- Completely and correctly executing the Machine Learning Cycle enables us to
- easily and quickly solve complex Real-world Problems, by developing high-quality Machine Learning Models, to improve the quality of human life (for sake of Allah)
- Authors – Technical Skills
- Considering Technical Skills, to summarize in one sentence what we have learned from this Book is
- How to Tell an Interesting, Connected and Coherent Story in a Book
- Authors – Human Engineering
- Considering Human Engineering, to summarize in one sentence what we have learned from this Book is
گرُو کی ہر بات گرُہوتی ہے، گرُکو نہیں گرُو کو پکڑ۔
(Translation: Every word of a Mentor is a gem. Don’t focus on gems, firmly hold the Mentor)
Future Work and Feedback
- Future Work
- Future Work 1
- Plan, Design, and Develop an Online Course on Machine Learning
- Future Work 2
- Plan, Design, and Write a Book on
- Deep Learning
- Humble Request for Feedback
- Recall
- To completely and correctly solve any Real-world Problem, follow the following five-step process
- Plan – in Mind
- Design – on Paper
- Execute – at Prototype Level
- Execute – at Full Scale
- Feedback – from audience and domain experts for further improvement
Chapter Summary
- Chapter Summary
- Main Findings of this Book
- Finding 1 – Life = Technical Skills (15%) + Human Engineering (85%)
- Finding 2 – Learn to Write
- Finding 3 – To Completely and Correctly Solve any Real-world Problem, use the following five-step process
- Plan – in Mind
- Design – on Paper
- Execute – at Prototype Level
- Execute – at Full Scale
- Feedback – from audience and domain experts for further improvement
- Finding 4 – To be successful in life, keep putting 100% Effort with Sincerity without bothering about the Results
- Finding 5 – To develop high-quality Machine Learning Models, it is important to completely and correctly execute the Machine Learning Cycle
- Finding 6 – Decisions change our lives, and the identification of the Most Suitable Solutions for various Real-world Problems will improve the overall quality of our lives In Sha Allah
- Finding 7 – To plan, design, and develop high-quality Models, it is important to completely and correctly understand core concepts of Machine Learning and how they work together
- Finding 8 – To solve Real-world Problems with both Speed and Accuracy, plan, design and develop a Template-based Approach, using a combination of the following approaches
- Systematic Thinking Approach
- Identifying Most Suitable Solution Approach
- Completeness and Correctness Approach
- Divide and Conquer Approach
- Half-Cooked Approach
- Structured Thinking Approach
- Inverted Triangle Approach
- 100% Effort with Sincerity Approach
- Conclusion of the Book
- Completely and correctly executing the Machine Learning Cycle enables us to
- easily and quickly solve complex Real-world Problems, by developing high-quality Machine Learning Models, to improve the quality of human life (for sake of Allah)