
[DevDay2019] How do I test AI models? - By Minh Hoang, Senior QA Engineer at KMS

Minh Hoang shares the tools he used, the key metrics in AI testing, and how to evaluate an AI model.

Published in: Technology

  1. 1. How I Test the AI Model (DevDay 2019, April 06, 2019)
  2. 2. I'm Minh Hoang: a tester, a member of the Technology team, fond of new technology, and a challenge-taker.
  3. 3. Objectives: Share the tools used, the key metrics in AI testing, and how to evaluate an AI model.
  4. 4. Agenda
      1. About A.I
         • What is machine learning
         • Myths & facts about AI
         • Myths & facts about chatbots
      2. How I test the AI model
         • The right metrics for evaluating the ML model
         • How we test the FAQ model
         • Demo
      3. Take away
      4. References
         • Tools & libraries
  5. 5. About A.I
  6. 6. What Is Machine Learning? Machine learning is the subfield of computer science that gives computers the ability to learn without being explicitly programmed.
  7. 7. Myths And Facts About A.I
      • Myth: Artificial intelligence and machine learning will wipe out all the jobs.
        Fact: A.I is no different from other technological advances in that it helps humans become more effective and processes more efficient.
      • Myth: “Cognitive AI” technologies are able to understand and solve new problems the way the human brain can.
        Fact: “Cognitive” technologies can’t solve problems they weren’t designed to solve.
      • Myth: You need a Ph.D. to work in machine learning and data science.
        Fact: Nowadays, plenty of documentation and tutorials on the Internet help people approach machine learning step by step.
  8. 8. What Is a Chatbot? A computer program designed to simulate conversation with human users, especially over the Internet.
  9. 9. Myths And Facts About Chatbots
      • Myth: Chatbots have only been around for a short while.
        Fact: ELIZA, one of the best-known chatbot therapists, was created about 50 years ago.
      • Myth: Text or voice is the only way to interact with bots.
        Fact: Chatbot platforms also let users interact through graphical interfaces or graphical widgets, and recent chatbot platforms follow this development approach.
      • Myth: All chatbot platforms use AI.
        Fact: Not all chatbot platforms use AI. Most are rule-based and follow a simple, autonomous process, something along the lines of a decision tree.
  10. 10. How We Test The AI Model
  11. 11. The Right Metric For Evaluating ML Models
      Regression
         • MSPE
         • MSAE
         • R Square
         • Adjusted R Square
      Classification
         • Precision-Recall
         • ROC-AUC
         • Accuracy
         • Log-Loss
      Unsupervised Models
         • Rand Index
         • Mutual Information
      Others
         • CV Error
         • Heuristic methods to find K
         • BLEU Score (NLP)
  12. 12. Commonly Used Metrics In Classification: Confusion Matrix
                            Actual positive                   Actual negative
      Predicted positive    True positive                     False positive (Type I error)
      Predicted negative    False negative (Type II error)    True negative
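  13. 13. Commonly Used Metrics In Classification: Accuracy
      • Percentage of total items classified correctly
      • Formula: Accuracy = (TP + TN) / (TP + TN + FP + FN), where TP, TN, FP and FN are the true positive, true negative, false positive and false negative counts from the confusion matrix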
  14. 14. Commonly Used Metrics In Classification: Recall / Sensitivity / TPR (True Positive Rate)
      • Number of items correctly identified as positive out of all actual positives
      • Formula: Recall = TP / (TP + FN), with TP the true positives and FN the false negatives
  15. 15. Commonly Used Metrics In Classification: Precision
      • Number of items correctly identified as positive out of total items identified as positive
      • Formula: Precision = TP / (TP + FP), with TP the true positives and FP the false positives
  16. 16. Commonly Used Metrics In Classification: F1 Score
      • It is the harmonic mean of precision and recall
      • Formula: F1 = 2 × Precision × Recall / (Precision + Recall)
      Precision   Recall   F1
      1           1        1
      0.1         0.1      0.1
      0.5         0.5      0.5
      1           0.1      0.182
      0.3         0.8      0.436
      0.8         0.3      0.436
  17. 17. What Is the FAQ Model?
  18. 18. The Process to Test the FAQ Model
      Prepare test data
         • Crawl FAQ data
         • Generate questions from the FAQ data
      Run test
         • Train the model with the FAQ data
         • Run the test
      Analyze result
         • Pre-process the raw result
         • Calculate the classification metrics to evaluate the AI model
         • Visualize the metrics
      Model result
         • Select the threshold value
  19. 19. How We Define the Test Data Set
      • Collect FAQ question data (manually and automatically)
      • Use NLTK to generate new question data (NLG)
      • Add self-defined question data
  20. 20. How We Evaluate the AI Model: Train with domain X and run the test defined for domain X.
  21. 21. How We Analyze the Result
      • Pre-process the raw result.
      • Calculate the classification metrics to evaluate the AI model.
      • Visualize the metrics.
  22. 22. Demo
  23. 23. Take Away
  24. 24. Take Away
      • Know the main metrics for evaluating an ML model.
      • Know how to test a classification AI model.
      • Whether working on ___ projects (AI, blockchain, VR, etc.) is difficult depends on your self-learning skills and adaptability.
      • Use automation to reduce the time and effort needed to prepare test data.
  25. 25. Tools & Libraries
  26. 26. Tools & Libraries
      • API: requests and Postman.
      • AI/ML: nltk, difflib, plot.ly, pandas and numpy.
  27. 27. Question & Answer
