5/22/19 Demetris Trihinas
trihinas.d@unic.ac.cy
ACCA ML Panel | Nicosia, May 2019
Department of Computer Science
Machine Learning
Panel Opening Comments
Demetris Trihinas
Department of Computer Science
AILab @ University of Nicosia
trihinas.d@unic.ac.cy
Full-Time Faculty Member
University of Nicosia
“Designing and developing scalable and self-adaptive tools for data
management, exploration and visualization”
@dtrihinas
http://dtrihinas.info
https://ailab.unic.ac.cy/
https://www.slideshare.net/DemetrisTrihinas
What is NOT Machine Learning
• Any question you can ask and get an immediate and concrete answer to (e.g., from a database or spreadsheet):
• How many sofa models are currently in stock?
• How many sofas did we sell in Germany last month?
• Which of our customers bought a sofa worth more than 500 euros this quarter?
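To make the contrast concrete, here is a minimal sketch of how such questions are answered by plain filtering and counting, with nothing learned from data. The records, names, and helper functions are invented for illustration and are not from the talk.

```python
from datetime import date

# Hypothetical records, invented purely for illustration.
stock = [
    {"model": "Oslo", "units": 4},
    {"model": "Riga", "units": 0},
    {"model": "Bern", "units": 7},
]
sales = [
    {"customer": "A. Meyer", "country": "Germany", "price_eur": 650, "day": date(2019, 4, 3)},
    {"customer": "B. Rossi", "country": "Italy", "price_eur": 420, "day": date(2019, 4, 9)},
    {"customer": "C. Braun", "country": "Germany", "price_eur": 480, "day": date(2019, 4, 21)},
    {"customer": "D. Novak", "country": "Croatia", "price_eur": 900, "day": date(2019, 5, 2)},
]


def models_in_stock(stock):
    """Count the sofa models that have at least one unit in stock."""
    return sum(1 for item in stock if item["units"] > 0)


def sofas_sold(sales, country, year, month):
    """Count the sales recorded for one country in one calendar month."""
    return sum(
        1
        for s in sales
        if s["country"] == country and s["day"].year == year and s["day"].month == month
    )


def customers_above(sales, min_price_eur, start, end):
    """List customers who bought a sofa above a price within a date range."""
    return sorted(
        {
            s["customer"]
            for s in sales
            if s["price_eur"] > min_price_eur and start <= s["day"] <= end
        }
    )


in_stock = models_in_stock(stock)
sold_in_germany = sofas_sold(sales, "Germany", 2019, 4)
big_spenders = customers_above(sales, 500, date(2019, 4, 1), date(2019, 6, 30))
print(in_stock, sold_in_germany, big_spenders)
```

Each answer is exact and repeatable: the same query on the same data always returns the same result, which is why no model is needed.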
The Machine Learning Process
[Diagram: the machine learning pipeline]
Training: Data and Labels → Feature Engineering → ML Algorithm → Statistical Model
Labeled feature vectors (first feature: wheels):
Bike: <2, 170, 35, 169, 51, 38, …>
Bike: <2, 119, 28, 210, 52, 02, …>
Car: <4, 13, 157, 90, 178, 145, …>
Car: <4, 12, 170, 82, 193, 145, …>
Testing: Data → Feature Engineering → Statistical Model → Inferencing
<4, 18, 200, 64, 170, 141, …> → It’s a… Car
Finding “patterns” from features
[Equation garbled in extraction: an output written as a function of two arguments]
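The pipeline above can be sketched in a few lines. This is only an illustration, not the method used in the talk: it assumes a nearest-centroid classifier as the "ML algorithm" and uses only the six feature values printed on the slide (the elided values are left out, and `02` is written as `2`). The names `train` and `predict` are hypothetical.

```python
import math

# Feature vectors as printed on the slide, limited to the six values shown.
# The slide elides the remaining values; they are not reconstructed here.
TRAINING = [
    ((2, 170, 35, 169, 51, 38), "Bike"),
    ((2, 119, 28, 210, 52, 2), "Bike"),
    ((4, 13, 157, 90, 178, 145), "Car"),
    ((4, 12, 170, 82, 193, 145), "Car"),
]


def train(examples):
    """Learn one centroid (mean feature vector) per label."""
    grouped = {}
    for features, label in examples:
        grouped.setdefault(label, []).append(features)
    return {
        label: tuple(sum(column) / len(column) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def predict(model, features):
    """Return the label whose centroid lies closest to the new vector."""
    return min(model, key=lambda label: math.dist(features, model[label]))


# Training phase: data and labels go in, a statistical model comes out.
model = train(TRAINING)

# Inferencing phase: the model labels a vector it has never seen.
prediction = predict(model, (4, 18, 200, 64, 170, 141))
print(prediction)  # Car
```

The "pattern" found here is simply that cars and bikes cluster around different average feature values; real algorithms find far richer patterns, but the train-then-infer shape is the same.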
ML Modeling is an Approximation...
[Diagram: the same pipeline; each inference now carries a confidence score]
Training: Data and Labels → Feature Engineering → ML Algorithm → Statistical Model
Testing: Data → Feature Engineering → Statistical Model → Inferencing
<4, 18, 200, 64, 170, 141, …> → It’s a Car, I’m 0.88 sure
<3, 22, 23, 31, 101, 205, …> → It’s a Car, I’m 0.71 sure
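One way to see why a model answers with a degree of certainty rather than a fact is to attach a score to each prediction. The sketch below assumes a nearest-centroid classifier and an inverse-distance score, both chosen for illustration; it uses only the six feature values shown on the slides, so its scores will not match the 0.88 and 0.71 quoted above.

```python
import math

# Six visible feature values per example; the elided values are not used.
TRAINING = [
    ((2, 170, 35, 169, 51, 38), "Bike"),
    ((2, 119, 28, 210, 52, 2), "Bike"),
    ((4, 13, 157, 90, 178, 145), "Car"),
    ((4, 12, 170, 82, 193, 145), "Car"),
]


def train(examples):
    """Learn one centroid (mean feature vector) per label."""
    grouped = {}
    for features, label in examples:
        grouped.setdefault(label, []).append(features)
    return {
        label: tuple(sum(column) / len(column) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def predict_with_confidence(model, features):
    """Return the nearest label and its share of the inverse-distance weight."""
    distances = {label: math.dist(features, c) for label, c in model.items()}
    best = min(distances, key=distances.get)
    if distances[best] == 0:
        return best, 1.0  # the vector coincides with a centroid
    weights = {label: 1.0 / d for label, d in distances.items()}
    return best, weights[best] / sum(weights.values())


model = train(TRAINING)
label_a, conf_a = predict_with_confidence(model, (4, 18, 200, 64, 170, 141))
label_b, conf_b = predict_with_confidence(model, (3, 22, 23, 31, 101, 205))
print(label_a, round(conf_a, 2))
print(label_b, round(conf_b, 2))
```

Both vectors are labeled Car, but the second sits further from anything the model has seen, so its score is lower. The score measures closeness to the training data, not truth.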
It’s Also Called “Learning” Because…
[Diagram: the same pipeline with a feedback loop, Evaluate and Retrain, from Inferencing back to Data and Labels]
Training: Data and Labels → Feature Engineering → ML Algorithm → Statistical Model
Testing: Data → Feature Engineering → Statistical Model → Inferencing
<4, 18, 200, 64, 170, 141, …> → It’s a… Car, 0.88
<3, 22, 23, 31, 101, 205, …> → It’s a… Car, 0.71
<3, 22, 23, 31, 101, 205, …> → It’s a… tricycle, 0.93
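The evaluate-and-retrain loop can be sketched as follows. This is an illustration under stated assumptions: a nearest-centroid classifier stands in for the model, a human reviewer is assumed to have relabeled the second vector as a tricycle, and only the six visible feature values are used.

```python
import math

# Six visible feature values per example; the elided values are not used.
TRAINING = [
    ((2, 170, 35, 169, 51, 38), "Bike"),
    ((2, 119, 28, 210, 52, 2), "Bike"),
    ((4, 13, 157, 90, 178, 145), "Car"),
    ((4, 12, 170, 82, 193, 145), "Car"),
]


def train(examples):
    """Learn one centroid (mean feature vector) per label."""
    grouped = {}
    for features, label in examples:
        grouped.setdefault(label, []).append(features)
    return {
        label: tuple(sum(column) / len(column) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def predict(model, features):
    """Return the label whose centroid lies closest to the new vector."""
    return min(model, key=lambda label: math.dist(features, model[label]))


odd_vehicle = (3, 22, 23, 31, 101, 205)

# First pass: the model only knows bikes and cars, so it answers "Car".
model = train(TRAINING)
before = predict(model, odd_vehicle)

# Evaluate: a reviewer spots the mistake and supplies the correct label.
# Retrain: the corrected example joins the training data.
corrected = TRAINING + [(odd_vehicle, "Tricycle")]
model = train(corrected)
after = predict(model, odd_vehicle)

print(before, "->", after)  # Car -> Tricycle
```

The model could not have answered "Tricycle" the first time because that label did not exist in its training data; it changes its answer only because a person evaluated the result and fed the correction back.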
How Do Robots See Us
Training is of Utmost Importance
• More training data is good, but fitting the model too closely to it leads to overfitting (irrelevant details are modeled).
• Algorithms are not racist, do not hold prejudice or apply
stereotypes… yes, but what happened to Amazon?
Memorizing the answers is NOT Learning
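A small sketch of the difference between memorizing and learning. The "memorizer" stores every training vector with its label, which is an extreme case of overfitting; the "learner" keeps only a simple rule on the first feature, assumed here to be the wheel count as the earlier diagram suggests. Both models and their names are hypothetical, and only the six visible feature values are used.

```python
# Six visible feature values per example; the elided values are not used.
TRAIN = [
    ((2, 170, 35, 169, 51, 38), "Bike"),
    ((2, 119, 28, 210, 52, 2), "Bike"),
    ((4, 13, 157, 90, 178, 145), "Car"),
    ((4, 12, 170, 82, 193, 145), "Car"),
]
UNSEEN = [((4, 18, 200, 64, 170, 141), "Car")]


def fit_memorizer(examples):
    """Store every training vector with its label (a lookup table)."""
    return dict(examples)


def fit_wheel_rule(examples):
    """Keep only the average of the first feature (wheels) per label."""
    wheels = {}
    for features, label in examples:
        wheels.setdefault(label, []).append(features[0])
    return {label: sum(values) / len(values) for label, values in wheels.items()}


def accuracy(predict, examples):
    """Fraction of examples for which the prediction matches the label."""
    return sum(predict(f) == label for f, label in examples) / len(examples)


table = fit_memorizer(TRAIN)
rule = fit_wheel_rule(TRAIN)


def memorizer(features):
    # Answers only for vectors it has seen exactly; otherwise returns None.
    return table.get(features)


def learner(features):
    # Picks the label whose average wheel count is closest.
    return min(rule, key=lambda label: abs(features[0] - rule[label]))


print("memorizer:", accuracy(memorizer, TRAIN), accuracy(memorizer, UNSEEN))
print("learner:  ", accuracy(learner, TRAIN), accuracy(learner, UNSEEN))
```

Both score perfectly on the training data, which is why training accuracy alone says little. Only the unseen example separates the model that memorized from the one that generalized.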
Beware of the “Trainer”…
https://www.businessinsider.com/amazon-built-ai-to-hire-people-discriminated-against-women-2018-10
Data Preprocessing
• Preprocessing significantly improves ML performance and result quality.
• Is 1% more error tolerable if the computation “promises” to run for 10 minutes instead of 5 hours?
• Data Reduction: remove insignificant model dimensions.
• Data Cleaning: remove incomplete and “dirty” data.
Do not “influence” result
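The two preprocessing steps can be sketched as below. The rows are invented for illustration, "incomplete" is taken to mean a missing value, and "insignificant" is approximated by a low-variance filter, which is only one of several possible reduction techniques and not necessarily the one meant in the talk.

```python
import statistics

# Hypothetical rows: (wheels, weight, constant sensor reading).
rows = [
    (2, 170.0, 1.0),
    (2, None, 1.0),  # incomplete row: one value is missing
    (4, 13.0, 1.0),
    (4, 12.0, 1.0),
]


def clean(rows):
    """Data cleaning: drop rows that contain a missing value."""
    return [row for row in rows if all(value is not None for value in row)]


def reduce_dimensions(rows, min_variance):
    """Data reduction: keep only columns whose variance reaches a threshold."""
    columns = list(zip(*rows))
    keep = [
        index
        for index, column in enumerate(columns)
        if statistics.pvariance(column) >= min_variance
    ]
    return [tuple(row[index] for index in keep) for row in rows], keep


cleaned = clean(rows)
reduced, kept_columns = reduce_dimensions(cleaned, min_variance=0.01)
print(len(rows), "rows ->", len(cleaned), "rows")
print("kept columns:", kept_columns)
print(reduced)
```

The third column never changes, so it cannot help tell examples apart and is dropped. Fewer rows and columns mean less computation, at the risk of discarding something that mattered, which is the trade-off the slide asks about.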
What is YOUR Role?
• Ask good questions – a model is based on a hypothesis.
• Provide training data – BEWARE training can lead to bias.
• Assess the quality of results – retraining lets the model “learn”.
• TEST, TEST and TEST again.
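"Test again" can be made routine with repeated hold-out evaluation. The sketch below assumes leave-one-out testing with a nearest-centroid classifier, both chosen for illustration, over the five labeled vectors shown on the earlier slides (six visible values each). A data set this small is a toy; the point is the procedure.

```python
import math

# The five labeled vectors from the earlier slides, six visible values each.
LABELED = [
    ((2, 170, 35, 169, 51, 38), "Bike"),
    ((2, 119, 28, 210, 52, 2), "Bike"),
    ((4, 13, 157, 90, 178, 145), "Car"),
    ((4, 12, 170, 82, 193, 145), "Car"),
    ((4, 18, 200, 64, 170, 141), "Car"),
]


def train(examples):
    """Learn one centroid (mean feature vector) per label."""
    grouped = {}
    for features, label in examples:
        grouped.setdefault(label, []).append(features)
    return {
        label: tuple(sum(column) / len(column) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def predict(model, features):
    """Return the label whose centroid lies closest to the new vector."""
    return min(model, key=lambda label: math.dist(features, model[label]))


def leave_one_out_accuracy(examples):
    """Hold out each example in turn, train on the rest, test on the held-out one."""
    hits = 0
    for index, (features, label) in enumerate(examples):
        rest = examples[:index] + examples[index + 1:]
        model = train(rest)
        hits += predict(model, features) == label
    return hits / len(examples)


score = leave_one_out_accuracy(LABELED)
print(score)
```

Every example is tested by a model that never saw it, so the score reflects how the approach handles new data rather than how well it remembers old data.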
[Figure: The Ability Matrix]
Insights are generated by humans, not machines!
Questions?
Demetris Trihinas
Department of Computer Science
AILab @ University of Nicosia
trihinas.d@unic.ac.cy
