Testing Artificial Intelligence: Where Do You Start?
Igor Khrol, Minsk
Who is speaking?
● Igor Khrol
● 12 years in the industry
● Engineer, team lead, manager, trainer, consultant
● Python, Scala, C#, Java and more
● www.khroliz.com
Predicting survival
Trained model
Passenger information
Logistic regression
Passenger information → Features → Multiply by weights → Apply the logistic function
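The pipeline on this slide can be sketched in a few lines of Python. This is only an illustration: the feature encoding, the weights and the bias below are hypothetical values, not numbers from the talk.

```python
import math

def sigmoid(z: float) -> float:
    """Logistic function: maps any real number into the interval (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # this branch avoids overflow for large negative z
    return e / (1.0 + e)

def predict_proba(features, weights, bias=0.0):
    """Multiply the features by the weights, sum, apply the logistic function."""
    z = bias + sum(x * w for x, w in zip(features, weights))
    return sigmoid(z)

# Hypothetical passenger encoded as [is_female, age / 100, ticket_class]
passenger = [1.0, 0.29, 1.0]
# Hypothetical weights; a trained model learns these from data
weights = [2.5, -1.0, -0.8]

p = predict_proba(passenger, weights, bias=0.2)
survived = p >= 0.5  # threshold the probability to get a yes/no answer
```

The output `p` is read as the estimated probability of survival; the 0.5 threshold is a common default rather than a fixed rule.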
How do we test it?
The expected result is unclear
Fitting the weight coefficients
Error function
Mean error
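The formulas from these slides did not survive as text. As a sketch, assuming the usual error function for logistic regression (log loss, also called cross-entropy), the per-example error and the mean error could look like this; the function names are illustrative.

```python
import math

def log_loss_single(y_true: int, p: float, eps: float = 1e-15) -> float:
    """Error for one example: -log(p) if the label is 1, -log(1 - p) if it is 0."""
    p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
    return -math.log(p) if y_true == 1 else -math.log(1.0 - p)

def mean_log_loss(y_true, probs) -> float:
    """Mean error over the whole dataset."""
    return sum(log_loss_single(y, p) for y, p in zip(y_true, probs)) / len(y_true)

# Made-up labels and predicted probabilities
labels = [1, 0, 1, 0]
confident_right = [0.9, 0.1, 0.8, 0.2]
confident_wrong = [0.1, 0.9, 0.2, 0.8]

good = mean_log_loss(labels, confident_right)
bad = mean_log_loss(labels, confident_wrong)
```

A model that is confidently wrong is penalized much more heavily than one that is confidently right, which is what makes this a useful quantity to minimize.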
Finding the best coefficients
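One common way to search for the coefficients is gradient descent on the mean error. The sketch below assumes that technique (the slides do not say which optimizer was used) and runs on made-up toy data with a single feature centered around zero.

```python
import math

def sigmoid(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.1, epochs=1000):
    """Batch gradient descent on the mean log loss; returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this example: predicted probability minus label
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        # Step against the averaged gradient
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

# Toy data (made up): the label is 1 when the feature is positive
X = [[-2.5], [-1.5], [-0.5], [0.5], [1.5], [2.5]]
y = [0, 0, 0, 1, 1, 1]

w, b = fit_logistic(X, y)
predictions = [int(sigmoid(b + w[0] * x[0]) >= 0.5) for x in X]
```

The learning rate `lr` and the number of `epochs` are arbitrary choices for this toy example; real libraries use more robust solvers.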
Is the problem solved?
Result on all the data
Training and test sets
Train on ⅔ of the data
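A minimal sketch of such a split, assuming the rows are shuffled first and then cut at the ⅔ mark. The function name and the fixed seed are illustrative choices.

```python
import random

def train_test_split(rows, train_fraction=2 / 3, seed=42):
    """Shuffle a copy of the rows, then cut it into training and test parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Stand-in for the passenger records: any list of rows works
passengers = list(range(90))
train, test = train_test_split(passengers)
```

The model is fitted on `train` only, and its quality is then reported on `test`, which it has never seen.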
Can we do better?
78.3% instead of the earlier 77.6%
Improve further?..
Two problems in machine learning
Underfitted   Overfitted
OK
Regularization
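As a sketch of what regularization does, the function below adds an L2 penalty on the weights to the mean log loss. It assumes the convention where the parameter C is the inverse of the regularization strength (the one scikit-learn's LogisticRegression uses); the slides do not spell out the formula, so treat the exact scaling as an assumption.

```python
import math

def mean_log_loss(y_true, probs, eps=1e-15):
    total = 0.0
    for y, p in zip(y_true, probs):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -math.log(p) if y == 1 else -math.log(1.0 - p)
    return total / len(y_true)

def regularized_loss(y_true, probs, weights, C=1.0):
    """Mean log loss plus an L2 penalty; a smaller C means a stronger penalty."""
    penalty = sum(w * w for w in weights) / (2.0 * C * len(y_true))
    return mean_log_loss(y_true, probs) + penalty

# Made-up values: the same predictions and weights under two settings of C
labels = [1, 0]
probs = [0.9, 0.1]
big_weights = [8.0, -6.0]

loose = regularized_loss(labels, probs, big_weights, C=100.0)
strict = regularized_loss(labels, probs, big_weights, C=0.01)
```

With a small C, large weights become expensive, which pushes the model toward simpler solutions and counteracts overfitting; with a very large C the penalty almost disappears.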
How do we choose the parameter C?
Validation set
train | test
train | validation
Cross-validation
train | test
train | validation
train | validation | train
validation | train
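The fold layout in the diagram can be generated with a small helper. This is a sketch without shuffling, and `k_fold_indices` is an illustrative name rather than a library function.

```python
def k_fold_indices(n_samples, k=3):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size

folds = list(k_fold_indices(9, k=3))
```

Each candidate value of C is trained and scored once per fold, and the scores are averaged. The folds are taken from the training part only, so the test set stays untouched until the final check.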
What is the problem?
● 0: good can, 1: defective can
● Defect rate: 0.01%
● The classifier always returns 0
Accuracy: 99.99%!
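The arithmetic behind that number is easy to reproduce. The sketch assumes a batch of 100,000 cans with 10 defective ones, which matches the 0.01% defect rate.

```python
# 100,000 cans, 10 defective: label 1 is a defective can, 0 is a good one
y_true = [1] * 10 + [0] * 99_990
# A "classifier" that always answers 0 (good can)
y_pred = [0] * len(y_true)

correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
accuracy = correct / len(y_true)

# How many defective cans were actually caught
defects_caught = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
```

Accuracy comes out at 0.9999 even though not a single defective can is detected, so accuracy alone says little on heavily imbalanced data.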
Precision & Recall
100,000 cans, 10 of them defective (0.01%).
The classifier flagged 20 cans as defective.
#   tp   fp   fn   precision       recall
1    1   19    9   1/20 = 0.05     1/10 = 0.1
2    5   15    5   5/20 = 0.25     5/10 = 0.5
3   10   10    0   10/20 = 0.5     10/10 = 1
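The two columns follow directly from the counts: precision = tp / (tp + fp) and recall = tp / (tp + fn). A short sketch that recomputes the table rows (the helper names are illustrative):

```python
def precision(tp, fp):
    """Share of the cans flagged as defective that really are defective."""
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(tp, fn):
    """Share of the truly defective cans that were flagged."""
    return tp / (tp + fn) if (tp + fn) else 0.0

# (tp, fp, fn) for classifiers 1-3 from the table
rows = {1: (1, 19, 9), 2: (5, 15, 5), 3: (10, 10, 0)}
scores = {k: (precision(tp, fp), recall(tp, fn)) for k, (tp, fp, fn) in rows.items()}
```

The guards for a zero denominator cover the always-0 classifier, which flags nothing and therefore has tp + fp = 0.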
Precision & Recall
100,000 cans, 10 of them defective (0.01%).
#   tp   fp   fn   precision       recall
2    5   15    5   5/20 = 0.25     5/10 = 0.5
4    6   19    4   6/25 = 0.24     6/10 = 0.6
F1 score
#   tp   fp   fn   precision       recall         F1 score   F0.5 score   F2 score
2    5   15    5   5/20 = 0.25     5/10 = 0.5     0.333333   0.277778     0.416667
4    6   19    4   6/25 = 0.24     6/10 = 0.6     0.342857   0.272727     0.461538
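These scores come from the general F-beta formula, F_beta = (1 + beta²) · precision · recall / (beta² · precision + recall), where beta = 1 gives F1. A sketch that recomputes the table (the function name is illustrative):

```python
def f_beta(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall.

    beta < 1 weights precision more heavily, beta > 1 weights recall more.
    """
    b2 = beta * beta
    denom = b2 * precision + recall
    return (1 + b2) * precision * recall / denom if denom else 0.0

# Precision and recall of classifiers 2 and 4 from the table
classifiers = {2: (0.25, 0.5), 4: (0.24, 0.6)}
table = {
    k: tuple(f_beta(p, r, beta) for beta in (1.0, 0.5, 2.0))
    for k, (p, r) in classifiers.items()
}
```

Classifier 4 wins on F1 and F2, while classifier 2 wins on F0.5, so the choice of beta encodes whether missed defects or false alarms cost more.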
Finally… What to read?
● www.coursera.org/learn/machine-learning/home/welcome
● www.youtube.com/watch?v=T_YWBGApUgs&t=21524s
● www.eecs.tufts.edu/~dsculley/papers/ml_test_score.pdf
● kaggle.com
● ods.ai
Thank you!
Questions?
Igor Khrol
khroliz@gmail.com
www.khroliz.com
