SlideShare a Scribd company logo
1 of 37
Train with Python,
predict with C++
Pavel Filonov
+
+ +
+
+
+
+
About me
Pavel Filonov
Research Development Team Lead at
Kaspersky Lab
01
CoreHard. Train with Python, predict with C++
Machine Learning everywhere!
02
CoreHard. Train with Python, predict with C++
• Mobile
• Embedded
• Automotive
• Desktops
• Games
• Finance
• Etc.
Image from [1]
Machine Learning face the real world
03
CoreHard. Train with Python, predict with C++
Image from [2]
CRISP-DM
04
CoreHard. Train with Python, predict with C++
Image from [3]
Dream team
05
CoreHard. Train with Python, predict with C++
Developer Data Scientist
Dream team – synergy way
06
CoreHard. Train with Python, predict with C++
Developer Data Scientist Research Developer
Dream team – process way
07
CoreHard. Train with Python, predict with C++
Developer Data Scientist
Communications
Machine learning sample cases
08
CoreHard. Train with Python, predict with C++
1. Energy efficiency prediction
2. Intrusion detection system
3. Image classification
Buildings Energy Efficiency
09
CoreHard. Train with Python, predict with C++
ref: [4]
● Input attributes
○ Relative Compactness
○ Surface Area
○ Wall Area
○ etc.
● Outcomes
○ Heating Load
Regression problem
10
CoreHard. Train with Python, predict with C++
Regression problem
11
CoreHard. Train with Python, predict with C++
Regression problem
12
CoreHard. Train with Python, predict with C++
Quality metric
13
CoreHard. Train with Python, predict with C++
●
Baseline model
14
CoreHard. Train with Python, predict with C++
●
class Predictor {
public:
using features = std::vector<double>;
virtual ~Predictor() {};
virtual double predict(const features&) const = 0;
};
class MeanPredictor: public Predictor {
public:
MeanPredictor(double mean);
double predict(const features&) const override { return mean_; }
protected:
double mean_;
};
Linear regression
15
CoreHard. Train with Python, predict with C++
●
class LinregPredictor: public Predictor {
public:
LinregPredictor(const std::vector<double>&);
double predict(const features& feat) const override {
assert(feat.size() + 1 == coef_.size());
return std::inner_product(feat.begin(), feat.end(), ++coef_.begin(),
coef_.front());
}
protected:
std::vector<double> coef_;
};
Polynomial regression
16
CoreHard. Train with Python, predict with C++
●
class PolyPredictor: public LinregPredictor {
public:
using LinregPredictor::LinregPredictor;
double predict(const features& feat) const override {
features poly_feat{feat};
const auto m = feat.size();
poly_feat.reserve(m*(m+1)/2);
for (size_t i = 0; i < m; ++i) {
for (size_t j = i; j < m; ++j) {
poly_feat.push_back(feat[i]*feat[j]);
}
}
return LinregPredictor::predict(poly_feat);
}
};
Integration testing
17
CoreHard. Train with Python, predict with C++
● you always have a lot of data for testing
● use python model output as expected values
● beware of floating point arithmetic problems
TEST(LinregPredictor, compare_to_python) {
auto predictor = LinregPredictor{coef};
double y_pred_expected = 0.0;
std::ifstream test_data{"../train/test_data_linreg.csv"};
while (read_features(test_data, features)) {
test_data >> y_pred_expected;
auto y_pred = predictor.predict(features);
EXPECT_NEAR(y_pred_expected, y_pred, 1e-4);
}
}
Intrusion detection system
18
CoreHard. Train with Python, predict with C++
● input - network traffic features
○ protocol_type
○ connection duration
○ src_bytes
○ dst_bytes
○ etc.
● Output
○ normal
○ network attack
ref: [5]
Classification problem
19
CoreHard. Train with Python, predict with C++
Quality metrics
20
CoreHard. Train with Python, predict with C++
● Receive operation characteristics (ROC) curve
Baseline model
21
CoreHard. Train with Python, predict with C++
● always predict most frequent class
● ROC area under the curve = 0.5
Logistic regression
22
CoreHard. Train with Python, predict with C++
●
Logistic regression
23
CoreHard. Train with Python, predict with C++
● easy to implement
template<typename T>
auto sigma(T z) {
return 1/(1 + std::exp(-z));
}
class LogregClassifier: public BinaryClassifier {
public:
float predict_proba(const features_t& feat) const override {
auto z = std::inner_product(feat.begin(), feat.end(),
++coef_.begin(), coef_.front());
return sigma(z);
}
protected:
std::vector<float> coef_;
};
Gradient boosting
24
CoreHard. Train with Python, predict with C++
● de facto standard universal method
● multiple well known C++ implementations with python bindings
○ XGBoost
○ LigthGBM
○ CatBoost
● each implementation has its own custom model format
CatBoost
25
CoreHard. Train with Python, predict with C++
● C API and C++ wrapper
● own build system (ymake)
class CatboostClassifier: public BinaryClassifier {
public:
CatboostClassifier(const std::string& modepath);
~CatboostClassifier() override;
double predict_proba(const features_t& feat) const override {
double result = 0.0;
if (!CalcModelPredictionSingle(model_, feat.data(), feat.size(),
nullptr, 0, &result, 1)) {
throw std::runtime_error{"CalcModelPredictionFlat error message:" +
GetErrorString()};
}
return result;
}
private:
ModelCalcerHandle* model_;
}
CatBoost
26
CoreHard. Train with Python, predict with C++
● ROC-AUC = 0.9999
Image classification
27
CoreHard. Train with Python, predict with C++
● Handwritten digits recognizer – MNIST
● input – gray-scale pixels 28x28
● output – digit on picture (0, 1, … 9)
Multilayer perceptron
28
CoreHard. Train with Python, predict with C++
Image from: [6]
Quality metrics
29
CoreHard. Train with Python, predict with C++
●
Multilayer perceptron
30
CoreHard. Train with Python, predict with C++
●
auto MlpClassifier::predict_proba(const features_t& feat) const {
VectorXf x{feat.size()};
auto o1 = sigmav(w1_ * x);
auto o2 = softmax(w2_ * o1);
return o2;
}
Multilayer perceptron
31
CoreHard. Train with Python, predict with C++
●
auto MlpClassifier::predict_proba(const features_t& feat) const {
VectorXf x{feat.size()};
auto o1 = sigmav(w1_ * x);
auto o2 = softmax(w2_ * o1);
return o2;
}
Convolutional networks
31
CoreHard. Train with Python, predict with C++
● State of the Art algorithms in image processing
● a lot of C++ implementation with python bindings
○ TensorFlow
○ Caffe
○ MXNet
○ CNTK
Tensorflow
31
CoreHard. Train with Python, predict with C++
● C++ API
● Bazel build system
● Hint – prebuild C API
Conclusion
31
CoreHard. Train with Python, predict with C++
● Don’t be fear of the ML
● Try simpler things first
● Get benefits from different languages
References
02
CoreHard. Train with Python, predict with C++
1. Andrew Ng, Machine Learning – coursera
2. How to Get the Right Creative Requirements From Your Client
3. The Forgotten Step in CRISP-DM and ASUM-DM Methodologies
4. Energy efficiency Data Set
5. KDD Cup 1999
6. MNIST training with Multi Layer Perceptron
7. Code samples
+
+
+
+
+
+
+
Many thanks!
Let’s talk
Pavel Filonov Pavel.Filonov@kaspersky.com +7(966)077-32-80
+

More Related Content

What's hot

Runtime Code Generation and Data Management for Heterogeneous Computing in Java
Runtime Code Generation and Data Management for Heterogeneous Computing in JavaRuntime Code Generation and Data Management for Heterogeneous Computing in Java
Runtime Code Generation and Data Management for Heterogeneous Computing in Java
Juan Fumero
 
Firefox content
Firefox contentFirefox content
Firefox content
Ubald Agi
 

What's hot (20)

Arrays
ArraysArrays
Arrays
 
Brief Introduction to Cython
Brief Introduction to CythonBrief Introduction to Cython
Brief Introduction to Cython
 
Egor Bogatov - .NET Core intrinsics and other micro-optimizations
Egor Bogatov - .NET Core intrinsics and other micro-optimizationsEgor Bogatov - .NET Core intrinsics and other micro-optimizations
Egor Bogatov - .NET Core intrinsics and other micro-optimizations
 
Games no Windows (FATEC 2015)
Games no Windows (FATEC 2015)Games no Windows (FATEC 2015)
Games no Windows (FATEC 2015)
 
C graphics programs file
C graphics programs fileC graphics programs file
C graphics programs file
 
UGC-NET, GATE and all IT Companies Interview C++ Solved Questions PART - 2
UGC-NET, GATE and all IT Companies Interview C++ Solved Questions PART - 2UGC-NET, GATE and all IT Companies Interview C++ Solved Questions PART - 2
UGC-NET, GATE and all IT Companies Interview C++ Solved Questions PART - 2
 
analog clock C#
analog clock C#analog clock C#
analog clock C#
 
Part - 2 Cpp programming Solved MCQ
Part - 2  Cpp programming Solved MCQ Part - 2  Cpp programming Solved MCQ
Part - 2 Cpp programming Solved MCQ
 
Value-Based Manufacturing Optimisation in Serverless Clouds for Industry 4.0
Value-Based Manufacturing Optimisation in Serverless Clouds for Industry 4.0Value-Based Manufacturing Optimisation in Serverless Clouds for Industry 4.0
Value-Based Manufacturing Optimisation in Serverless Clouds for Industry 4.0
 
Operator overloading2
Operator overloading2Operator overloading2
Operator overloading2
 
C++ via C#
C++ via C#C++ via C#
C++ via C#
 
Early Results of OpenMP 4.5 Portability on NVIDIA GPUs & CPUs
Early Results of OpenMP 4.5 Portability on NVIDIA GPUs & CPUsEarly Results of OpenMP 4.5 Portability on NVIDIA GPUs & CPUs
Early Results of OpenMP 4.5 Portability on NVIDIA GPUs & CPUs
 
C programs Set 2
C programs Set 2C programs Set 2
C programs Set 2
 
Computer Engineering (Programming Language: Swift)
Computer Engineering (Programming Language: Swift)Computer Engineering (Programming Language: Swift)
Computer Engineering (Programming Language: Swift)
 
Tech Talks @NSU: DLang: возможности языка и его применение
Tech Talks @NSU: DLang: возможности языка и его применениеTech Talks @NSU: DLang: возможности языка и его применение
Tech Talks @NSU: DLang: возможности языка и его применение
 
Runtime Code Generation and Data Management for Heterogeneous Computing in Java
Runtime Code Generation and Data Management for Heterogeneous Computing in JavaRuntime Code Generation and Data Management for Heterogeneous Computing in Java
Runtime Code Generation and Data Management for Heterogeneous Computing in Java
 
Firefox content
Firefox contentFirefox content
Firefox content
 
C++ file
C++ fileC++ file
C++ file
 
Exploiting vectorization with ISPC
Exploiting vectorization with ISPCExploiting vectorization with ISPC
Exploiting vectorization with ISPC
 
Android Developer Days: Increasing performance of big arrays processing on An...
Android Developer Days: Increasing performance of big arrays processing on An...Android Developer Days: Increasing performance of big arrays processing on An...
Android Developer Days: Increasing performance of big arrays processing on An...
 

Similar to C++ Corehard Autumn 2018. Обучаем на Python, применяем на C++ - Павел Филонов

Matúš Cimerman: Building AI data pipelines using PySpark, PyData Bratislava M...
Matúš Cimerman: Building AI data pipelines using PySpark, PyData Bratislava M...Matúš Cimerman: Building AI data pipelines using PySpark, PyData Bratislava M...
Matúš Cimerman: Building AI data pipelines using PySpark, PyData Bratislava M...
GapData Institute
 
Marek Suplata Projects
Marek Suplata ProjectsMarek Suplata Projects
Marek Suplata Projects
guest14f12f
 

Similar to C++ Corehard Autumn 2018. Обучаем на Python, применяем на C++ - Павел Филонов (20)

C++ lab assignment
C++ lab assignmentC++ lab assignment
C++ lab assignment
 
Yufeng Guo | Coding the 7 steps of machine learning | Codemotion Madrid 2018
Yufeng Guo |  Coding the 7 steps of machine learning | Codemotion Madrid 2018 Yufeng Guo |  Coding the 7 steps of machine learning | Codemotion Madrid 2018
Yufeng Guo | Coding the 7 steps of machine learning | Codemotion Madrid 2018
 
Cluj.py Meetup: Extending Python in C
Cluj.py Meetup: Extending Python in CCluj.py Meetup: Extending Python in C
Cluj.py Meetup: Extending Python in C
 
Story of static code analyzer development
Story of static code analyzer developmentStory of static code analyzer development
Story of static code analyzer development
 
Matúš Cimerman: Building AI data pipelines using PySpark, PyData Bratislava M...
Matúš Cimerman: Building AI data pipelines using PySpark, PyData Bratislava M...Matúš Cimerman: Building AI data pipelines using PySpark, PyData Bratislava M...
Matúš Cimerman: Building AI data pipelines using PySpark, PyData Bratislava M...
 
Debugging and Profiling C++ Template Metaprograms
Debugging and Profiling C++ Template MetaprogramsDebugging and Profiling C++ Template Metaprograms
Debugging and Profiling C++ Template Metaprograms
 
JS Responsibilities
JS ResponsibilitiesJS Responsibilities
JS Responsibilities
 
Static analysis of C++ source code
Static analysis of C++ source codeStatic analysis of C++ source code
Static analysis of C++ source code
 
Static analysis of C++ source code
Static analysis of C++ source codeStatic analysis of C++ source code
Static analysis of C++ source code
 
Machine-learning based performance heuristics for Runtime CPU/GPU Selection i...
Machine-learning based performance heuristics for Runtime CPU/GPU Selection i...Machine-learning based performance heuristics for Runtime CPU/GPU Selection i...
Machine-learning based performance heuristics for Runtime CPU/GPU Selection i...
 
ParallelProgrammingBasics_v2.pdf
ParallelProgrammingBasics_v2.pdfParallelProgrammingBasics_v2.pdf
ParallelProgrammingBasics_v2.pdf
 
426 lecture 4: AR Developer Tools
426 lecture 4: AR Developer Tools426 lecture 4: AR Developer Tools
426 lecture 4: AR Developer Tools
 
Approaching (almost) Any Machine Learning Problem (kaggledays dubai)
Approaching (almost) Any Machine Learning Problem (kaggledays dubai)Approaching (almost) Any Machine Learning Problem (kaggledays dubai)
Approaching (almost) Any Machine Learning Problem (kaggledays dubai)
 
How much performance can you get out of Javascript? - Massimiliano Mantione -...
How much performance can you get out of Javascript? - Massimiliano Mantione -...How much performance can you get out of Javascript? - Massimiliano Mantione -...
How much performance can you get out of Javascript? - Massimiliano Mantione -...
 
Chapel Comes of Age: a Language for Productivity, Parallelism, and Performance
Chapel Comes of Age: a Language for Productivity, Parallelism, and PerformanceChapel Comes of Age: a Language for Productivity, Parallelism, and Performance
Chapel Comes of Age: a Language for Productivity, Parallelism, and Performance
 
C++20 the small things - Timur Doumler
C++20 the small things - Timur DoumlerC++20 the small things - Timur Doumler
C++20 the small things - Timur Doumler
 
What&rsquo;s new in Visual C++
What&rsquo;s new in Visual C++What&rsquo;s new in Visual C++
What&rsquo;s new in Visual C++
 
Static code analysis: what? how? why?
Static code analysis: what? how? why?Static code analysis: what? how? why?
Static code analysis: what? how? why?
 
Marek Suplata Projects
Marek Suplata ProjectsMarek Suplata Projects
Marek Suplata Projects
 
Using Deep Learning for Computer Vision Applications
Using Deep Learning for Computer Vision ApplicationsUsing Deep Learning for Computer Vision Applications
Using Deep Learning for Computer Vision Applications
 

More from corehard_by

C++ CoreHard Autumn 2018. Обработка списков на C++ в функциональном стиле - В...
C++ CoreHard Autumn 2018. Обработка списков на C++ в функциональном стиле - В...C++ CoreHard Autumn 2018. Обработка списков на C++ в функциональном стиле - В...
C++ CoreHard Autumn 2018. Обработка списков на C++ в функциональном стиле - В...
corehard_by
 
C++ CoreHard Autumn 2018. Debug C++ Without Running - Anastasia Kazakova
C++ CoreHard Autumn 2018. Debug C++ Without Running - Anastasia KazakovaC++ CoreHard Autumn 2018. Debug C++ Without Running - Anastasia Kazakova
C++ CoreHard Autumn 2018. Debug C++ Without Running - Anastasia Kazakova
corehard_by
 
C++ CoreHard Autumn 2018. Text Formatting For a Future Range-Based Standard L...
C++ CoreHard Autumn 2018. Text Formatting For a Future Range-Based Standard L...C++ CoreHard Autumn 2018. Text Formatting For a Future Range-Based Standard L...
C++ CoreHard Autumn 2018. Text Formatting For a Future Range-Based Standard L...
corehard_by
 

More from corehard_by (20)

C++ CoreHard Autumn 2018. Создание пакетов для открытых библиотек через conan...
C++ CoreHard Autumn 2018. Создание пакетов для открытых библиотек через conan...C++ CoreHard Autumn 2018. Создание пакетов для открытых библиотек через conan...
C++ CoreHard Autumn 2018. Создание пакетов для открытых библиотек через conan...
 
C++ CoreHard Autumn 2018. Что должен знать каждый C++ программист или Как про...
C++ CoreHard Autumn 2018. Что должен знать каждый C++ программист или Как про...C++ CoreHard Autumn 2018. Что должен знать каждый C++ программист или Как про...
C++ CoreHard Autumn 2018. Что должен знать каждый C++ программист или Как про...
 
C++ CoreHard Autumn 2018. Actors vs CSP vs Tasks vs ... - Евгений Охотников
C++ CoreHard Autumn 2018. Actors vs CSP vs Tasks vs ... - Евгений ОхотниковC++ CoreHard Autumn 2018. Actors vs CSP vs Tasks vs ... - Евгений Охотников
C++ CoreHard Autumn 2018. Actors vs CSP vs Tasks vs ... - Евгений Охотников
 
C++ CoreHard Autumn 2018. Знай свое "железо": иерархия памяти - Александр Титов
C++ CoreHard Autumn 2018. Знай свое "железо": иерархия памяти - Александр ТитовC++ CoreHard Autumn 2018. Знай свое "железо": иерархия памяти - Александр Титов
C++ CoreHard Autumn 2018. Знай свое "железо": иерархия памяти - Александр Титов
 
C++ CoreHard Autumn 2018. Информационная безопасность и разработка ПО - Евген...
C++ CoreHard Autumn 2018. Информационная безопасность и разработка ПО - Евген...C++ CoreHard Autumn 2018. Информационная безопасность и разработка ПО - Евген...
C++ CoreHard Autumn 2018. Информационная безопасность и разработка ПО - Евген...
 
C++ CoreHard Autumn 2018. Заглядываем под капот «Поясов по C++» - Илья Шишков
C++ CoreHard Autumn 2018. Заглядываем под капот «Поясов по C++» - Илья ШишковC++ CoreHard Autumn 2018. Заглядываем под капот «Поясов по C++» - Илья Шишков
C++ CoreHard Autumn 2018. Заглядываем под капот «Поясов по C++» - Илья Шишков
 
C++ CoreHard Autumn 2018. Ускорение сборки C++ проектов, способы и последстви...
C++ CoreHard Autumn 2018. Ускорение сборки C++ проектов, способы и последстви...C++ CoreHard Autumn 2018. Ускорение сборки C++ проектов, способы и последстви...
C++ CoreHard Autumn 2018. Ускорение сборки C++ проектов, способы и последстви...
 
C++ CoreHard Autumn 2018. Метаклассы: воплощаем мечты в реальность - Сергей С...
C++ CoreHard Autumn 2018. Метаклассы: воплощаем мечты в реальность - Сергей С...C++ CoreHard Autumn 2018. Метаклассы: воплощаем мечты в реальность - Сергей С...
C++ CoreHard Autumn 2018. Метаклассы: воплощаем мечты в реальность - Сергей С...
 
C++ CoreHard Autumn 2018. Что не умеет оптимизировать компилятор - Александр ...
C++ CoreHard Autumn 2018. Что не умеет оптимизировать компилятор - Александр ...C++ CoreHard Autumn 2018. Что не умеет оптимизировать компилятор - Александр ...
C++ CoreHard Autumn 2018. Что не умеет оптимизировать компилятор - Александр ...
 
C++ CoreHard Autumn 2018. Кодогенерация C++ кроссплатформенно. Продолжение - ...
C++ CoreHard Autumn 2018. Кодогенерация C++ кроссплатформенно. Продолжение - ...C++ CoreHard Autumn 2018. Кодогенерация C++ кроссплатформенно. Продолжение - ...
C++ CoreHard Autumn 2018. Кодогенерация C++ кроссплатформенно. Продолжение - ...
 
C++ CoreHard Autumn 2018. Concurrency and Parallelism in C++17 and C++20/23 -...
C++ CoreHard Autumn 2018. Concurrency and Parallelism in C++17 and C++20/23 -...C++ CoreHard Autumn 2018. Concurrency and Parallelism in C++17 and C++20/23 -...
C++ CoreHard Autumn 2018. Concurrency and Parallelism in C++17 and C++20/23 -...
 
C++ CoreHard Autumn 2018. Обработка списков на C++ в функциональном стиле - В...
C++ CoreHard Autumn 2018. Обработка списков на C++ в функциональном стиле - В...C++ CoreHard Autumn 2018. Обработка списков на C++ в функциональном стиле - В...
C++ CoreHard Autumn 2018. Обработка списков на C++ в функциональном стиле - В...
 
C++ CoreHard Autumn 2018. Asynchronous programming with ranges - Ivan Čukić
C++ CoreHard Autumn 2018. Asynchronous programming with ranges - Ivan ČukićC++ CoreHard Autumn 2018. Asynchronous programming with ranges - Ivan Čukić
C++ CoreHard Autumn 2018. Asynchronous programming with ranges - Ivan Čukić
 
C++ CoreHard Autumn 2018. Debug C++ Without Running - Anastasia Kazakova
C++ CoreHard Autumn 2018. Debug C++ Without Running - Anastasia KazakovaC++ CoreHard Autumn 2018. Debug C++ Without Running - Anastasia Kazakova
C++ CoreHard Autumn 2018. Debug C++ Without Running - Anastasia Kazakova
 
C++ CoreHard Autumn 2018. Полезный constexpr - Антон Полухин
C++ CoreHard Autumn 2018. Полезный constexpr - Антон ПолухинC++ CoreHard Autumn 2018. Полезный constexpr - Антон Полухин
C++ CoreHard Autumn 2018. Полезный constexpr - Антон Полухин
 
C++ CoreHard Autumn 2018. Text Formatting For a Future Range-Based Standard L...
C++ CoreHard Autumn 2018. Text Formatting For a Future Range-Based Standard L...C++ CoreHard Autumn 2018. Text Formatting For a Future Range-Based Standard L...
C++ CoreHard Autumn 2018. Text Formatting For a Future Range-Based Standard L...
 
Исключительная модель памяти. Алексей Ткаченко ➠ CoreHard Autumn 2019
Исключительная модель памяти. Алексей Ткаченко ➠ CoreHard Autumn 2019Исключительная модель памяти. Алексей Ткаченко ➠ CoreHard Autumn 2019
Исключительная модель памяти. Алексей Ткаченко ➠ CoreHard Autumn 2019
 
Как помочь и как помешать компилятору. Андрей Олейников ➠ CoreHard Autumn 2019
Как помочь и как помешать компилятору. Андрей Олейников ➠  CoreHard Autumn 2019Как помочь и как помешать компилятору. Андрей Олейников ➠  CoreHard Autumn 2019
Как помочь и как помешать компилятору. Андрей Олейников ➠ CoreHard Autumn 2019
 
Автоматизируй это. Кирилл Тихонов ➠ CoreHard Autumn 2019
Автоматизируй это. Кирилл Тихонов ➠  CoreHard Autumn 2019Автоматизируй это. Кирилл Тихонов ➠  CoreHard Autumn 2019
Автоматизируй это. Кирилл Тихонов ➠ CoreHard Autumn 2019
 
Статичный SQL в С++14. Евгений Захаров ➠ CoreHard Autumn 2019
Статичный SQL в С++14. Евгений Захаров ➠  CoreHard Autumn 2019Статичный SQL в С++14. Евгений Захаров ➠  CoreHard Autumn 2019
Статичный SQL в С++14. Евгений Захаров ➠ CoreHard Autumn 2019
 

Recently uploaded

Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptxHarnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
FIDO Alliance
 
Tales from a Passkey Provider Progress from Awareness to Implementation.pptx
Tales from a Passkey Provider  Progress from Awareness to Implementation.pptxTales from a Passkey Provider  Progress from Awareness to Implementation.pptx
Tales from a Passkey Provider Progress from Awareness to Implementation.pptx
FIDO Alliance
 
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
TrustArc
 
Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...
Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...
Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...
panagenda
 

Recently uploaded (20)

Using IESVE for Room Loads Analysis - UK & Ireland
Using IESVE for Room Loads Analysis - UK & IrelandUsing IESVE for Room Loads Analysis - UK & Ireland
Using IESVE for Room Loads Analysis - UK & Ireland
 
Working together SRE & Platform Engineering
Working together SRE & Platform EngineeringWorking together SRE & Platform Engineering
Working together SRE & Platform Engineering
 
Generative AI Use Cases and Applications.pdf
Generative AI Use Cases and Applications.pdfGenerative AI Use Cases and Applications.pdf
Generative AI Use Cases and Applications.pdf
 
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptxHarnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
 
Vector Search @ sw2con for slideshare.pptx
Vector Search @ sw2con for slideshare.pptxVector Search @ sw2con for slideshare.pptx
Vector Search @ sw2con for slideshare.pptx
 
Design and Development of a Provenance Capture Platform for Data Science
Design and Development of a Provenance Capture Platform for Data ScienceDesign and Development of a Provenance Capture Platform for Data Science
Design and Development of a Provenance Capture Platform for Data Science
 
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdfLinux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
 
JavaScript Usage Statistics 2024 - The Ultimate Guide
JavaScript Usage Statistics 2024 - The Ultimate GuideJavaScript Usage Statistics 2024 - The Ultimate Guide
JavaScript Usage Statistics 2024 - The Ultimate Guide
 
Choosing the Right FDO Deployment Model for Your Application _ Geoffrey at In...
Choosing the Right FDO Deployment Model for Your Application _ Geoffrey at In...Choosing the Right FDO Deployment Model for Your Application _ Geoffrey at In...
Choosing the Right FDO Deployment Model for Your Application _ Geoffrey at In...
 
Tales from a Passkey Provider Progress from Awareness to Implementation.pptx
Tales from a Passkey Provider  Progress from Awareness to Implementation.pptxTales from a Passkey Provider  Progress from Awareness to Implementation.pptx
Tales from a Passkey Provider Progress from Awareness to Implementation.pptx
 
Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)
Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)
Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)
 
ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...
ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...
ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...
 
The Metaverse: Are We There Yet?
The  Metaverse:    Are   We  There  Yet?The  Metaverse:    Are   We  There  Yet?
The Metaverse: Are We There Yet?
 
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
 
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdfIntroduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
 
Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...
Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...
Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...
 
Top 10 CodeIgniter Development Companies
Top 10 CodeIgniter Development CompaniesTop 10 CodeIgniter Development Companies
Top 10 CodeIgniter Development Companies
 
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdfSimplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
 
How we scaled to 80K users by doing nothing!.pdf
How we scaled to 80K users by doing nothing!.pdfHow we scaled to 80K users by doing nothing!.pdf
How we scaled to 80K users by doing nothing!.pdf
 
WebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM PerformanceWebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM Performance
 

C++ Corehard Autumn 2018. Обучаем на Python, применяем на C++ - Павел Филонов

  • 1. Train with Python, predict with C++ Pavel Filonov + + + + + + +
  • 2. About me Pavel Filonov Research Development Team Lead at Kaspersky Lab 01 CoreHard. Train with Python, predict with C++
  • 3. Machine Learning everywhere! 02 CoreHard. Train with Python, predict with C++ • Mobile • Embedded • Automotive • Desktops • Games • Finance • Etc. Image from [1]
  • 4. Machine Learning face the real world 03 CoreHard. Train with Python, predict with C++ Image from [2]
  • 5. CRISP-DM 04 CoreHard. Train with Python, predict with C++ Image from [3]
  • 6. Dream team 05 CoreHard. Train with Python, predict with C++ Developer Data Scientist
  • 7. Dream team – synergy way 06 CoreHard. Train with Python, predict with C++ Developer Data Scientist Research Developer
  • 8. Dream team – process way 07 CoreHard. Train with Python, predict with C++ Developer Data Scientist Communications
  • 9. Machine learning sample cases 08 CoreHard. Train with Python, predict with C++ 1. Energy efficiency prediction 2. Intrusion detection system 3. Image classification
  • 10. Buildings Energy Efficiency 09 CoreHard. Train with Python, predict with C++ ref: [4] ● Input attributes ○ Relative Compactness ○ Surface Area ○ Wall Area ○ etc. ● Outcomes ○ Heating Load
  • 11. Regression problem 10 CoreHard. Train with Python, predict with C++
  • 12. Regression problem 11 CoreHard. Train with Python, predict with C++
  • 13. Regression problem 12 CoreHard. Train with Python, predict with C++
  • 14. Quality metric 13 CoreHard. Train with Python, predict with C++ ●
  • 15. Baseline model 14 CoreHard. Train with Python, predict with C++ ● class Predictor { public: using features = std::vector<double>; virtual ~Predictor() {}; virtual double predict(const features&) const = 0; }; class MeanPredictor: public Predictor { public: MeanPredictor(double mean); double predict(const features&) const override { return mean_; } protected: double mean_; };
  • 16. Linear regression 15 CoreHard. Train with Python, predict with C++ ● class LinregPredictor: public Predictor { public: LinregPredictor(const std::vector<double>&); double predict(const features& feat) const override { assert(feat.size() + 1 == coef_.size()); return std::inner_product(feat.begin(), feat.end(), ++coef_.begin(), coef_.front()); } protected: std::vector<double> coef_; };
  • 17. Polynomial regression 16 CoreHard. Train with Python, predict with C++ ● class PolyPredictor: public LinregPredictor { public: using LinregPredictor::LinregPredictor; double predict(const features& feat) const override { features poly_feat{feat}; const auto m = feat.size(); poly_feat.reserve(m*(m+1)/2); for (size_t i = 0; i < m; ++i) { for (size_t j = i; j < m; ++j) { poly_feat.push_back(feat[i]*feat[j]); } } return LinregPredictor::predict(poly_feat); } };
  • 18. Integration testing 17 CoreHard. Train with Python, predict with C++ ● you always have a lot of data for testing ● use python model output as expected values ● beware of floating point arithmetic problems TEST(LinregPredictor, compare_to_python) { auto predictor = LinregPredictor{coef}; double y_pred_expected = 0.0; std::ifstream test_data{"../train/test_data_linreg.csv"}; while (read_features(test_data, features)) { test_data >> y_pred_expected; auto y_pred = predictor.predict(features); EXPECT_NEAR(y_pred_expected, y_pred, 1e-4); } }
  • 19. Intrusion detection system 18 CoreHard. Train with Python, predict with C++ ● input - network traffic features ○ protocol_type ○ connection duration ○ src_bytes ○ dst_bytes ○ etc. ● Output ○ normal ○ network attack ref: [5]
  • 20. Classification problem 19 CoreHard. Train with Python, predict with C++
  • 21. Quality metrics 20 CoreHard. Train with Python, predict with C++ ● Receive operation characteristics (ROC) curve
  • 22. Baseline model 21 CoreHard. Train with Python, predict with C++ ● always predict most frequent class ● ROC area under the curve = 0.5
  • 23. Logistic regression 22 CoreHard. Train with Python, predict with C++ ●
  • 24. Logistic regression 23 CoreHard. Train with Python, predict with C++ ● easy to implement template<typename T> auto sigma(T z) { return 1/(1 + std::exp(-z)); } class LogregClassifier: public BinaryClassifier { public: float predict_proba(const features_t& feat) const override { auto z = std::inner_product(feat.begin(), feat.end(), ++coef_.begin(), coef_.front()); return sigma(z); } protected: std::vector<float> coef_; };
  • 25. Gradient boosting 24 CoreHard. Train with Python, predict with C++ ● de facto standard universal method ● multiple well known C++ implementations with python bindings ○ XGBoost ○ LigthGBM ○ CatBoost ● each implementation has its own custom model format
  • 26. CatBoost 25 CoreHard. Train with Python, predict with C++ ● C API and C++ wrapper ● own build system (ymake) class CatboostClassifier: public BinaryClassifier { public: CatboostClassifier(const std::string& modepath); ~CatboostClassifier() override; double predict_proba(const features_t& feat) const override { double result = 0.0; if (!CalcModelPredictionSingle(model_, feat.data(), feat.size(), nullptr, 0, &result, 1)) { throw std::runtime_error{"CalcModelPredictionFlat error message:" + GetErrorString()}; } return result; } private: ModelCalcerHandle* model_; }
  • 27. CatBoost 26 CoreHard. Train with Python, predict with C++ ● ROC-AUC = 0.9999
  • 28. Image classification 27 CoreHard. Train with Python, predict with C++ ● Handwritten digits recognizer – MNIST ● input – gray-scale pixels 28x28 ● output – digit on picture (0, 1, … 9)
  • 29. Multilayer perceptron 28 CoreHard. Train with Python, predict with C++ Image from: [6]
  • 30. Quality metrics 29 CoreHard. Train with Python, predict with C++ ●
  • 31. Multilayer perceptron 30 CoreHard. Train with Python, predict with C++ ● auto MlpClassifier::predict_proba(const features_t& feat) const { VectorXf x{feat.size()}; auto o1 = sigmav(w1_ * x); auto o2 = softmax(w2_ * o1); return o2; }
  • 32. Multilayer perceptron 31 CoreHard. Train with Python, predict with C++ ● auto MlpClassifier::predict_proba(const features_t& feat) const { VectorXf x{feat.size()}; auto o1 = sigmav(w1_ * x); auto o2 = softmax(w2_ * o1); return o2; }
  • 33. Convolutional networks 31 CoreHard. Train with Python, predict with C++ ● State of the Art algorithms in image processing ● a lot of C++ implementation with python bindings ○ TensorFlow ○ Caffe ○ MXNet ○ CNTK
  • 34. Tensorflow 31 CoreHard. Train with Python, predict with C++ ● C++ API ● Bazel build system ● Hint – prebuild C API
  • 35. Conclusion 31 CoreHard. Train with Python, predict with C++ ● Don’t be fear of the ML ● Try simpler things first ● Get benefits from different languages
  • 36. References 02 CoreHard. Train with Python, predict with C++ 1. Andrew Ng, Machine Learning – coursera 2. How to Get the Right Creative Requirements From Your Client 3. The Forgotten Step in CRISP-DM and ASUM-DM Methodologies 4. Energy efficiency Data Set 5. KDD Cup 1999 6. MNIST training with Multi Layer Perceptron 7. Code samples
  • 37. + + + + + + + Many thanks! Let’s talk Pavel Filonov Pavel.Filonov@kaspersky.com +7(966)077-32-80 +