Popular Data Science Tools
Angel Marchev 2.0
Kaloyan Haralampiev
What did we do?
• Over 3000 members
• Cooperation with communities and universities from Europe and Asia
• More than 50 countries
• 25 real cases in 3 Datathons
• 63 superb solutions
• Students and experts with up to 20 years of experience
• 4 years
• Areas: Machine Learning, NLP, Data enrichment, Computer Vision and AI
• Working with SMEs and big companies
Hack the Fake News Datathon
What do we dare to do?
[Timeline: 2014, Nov 2015, 2016, 2017, 2018; month markers Feb, Mar, Apr, May, Jun, Sep]
• Our first event
• Two projects with 30 volunteers
• Big data conference
• The first online #Datathon2018
• Academia #Datathon
• Hack the News #Datathon
• #Datathon2018 v2
• First Datathon in CEE
• Over 50 meetups
• Participation in 8 conferences
• 2 workshops
Past #Datathon2018
• 144 participants
• 39 teams, 9 cases
• 493 chat rooms
• 16563 messages exchanged
• 32 mentors and industry experts
• 24 countries
• The #Datathon2018 participants managed to solve all cases
• Great fun, with more than 4 fun sessions
• A lot of beer and pizza was consumed
• 38 quality data solutions at the end
• Great results, challenging even for the companies
Impressions
Milena Yankova, Head of Research & Innovation
Shashank Shekhar, Manager - Data Sciences
Agamemnon Baltagiannis, Principal Data Scientist
Tomislav Križan, CPO, Member of the Board
“The results of our case are impressive and have further motivated our R&D department to explore more opportunities and apply some of the team results that worked on it.”
“The best thing about this Datathon was its global footprint. I was amazed by the sheer enthusiasm that the participants demonstrated. The resilience and adaptability shown by a lot of them in providing a working solution to real life problems made this Datathon a huge success.”
“Thank you all for this great weekend. It was a fantastic challenge and I am happy that I saw deep technical work from all the participants! I will be always here to support the DSS community”
“From all finalists we did see good and novel approach... also those who didn't arrive to finals, were also really close ... so good job to all teams!”
“The teams solutions were well documented in CRISP-DM Methodology at Datathon 2018 organized by DSS, in which Kaufland was proud to participate”
About The Tools
Introduction
• It is impossible to cover all tools,
• so we limited the coverage to the tools we use ourselves
• Still, the task is hard, due to:
– Various types of tools (noise in the input data)
– Many criteria (so a multi-dimensional problem)
– Tools for many purposes (overlapping categories)
• Hmmmm..!? Sounds like an ideal case for multi-dimensional scaling (MDS)
• SO LET’S GO FULL NERDY ON IT
MDS Map
Features:
Application
• Statistics
• Econometrics
• Data mining
Workflow
• Console
• Menus
• Nodes
• Online
License
• Free
• Non-free
[Figure: MDS map of the tools. Legend: relatedness, popularity, interactivity; license: free / non-free. Tool groups: “The All-Stars”, “The Classics”, “The User-Friendlies”, “The On-liners”]
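To make the MDS idea concrete, here is a minimal sketch of how such a map could be computed. It is not the authors' actual procedure: the slides do not say which MDS variant, distance measure, or feature weights were used, so classical (Torgerson) MDS on Euclidean distances is an assumption, as is the binary 0/1 encoding and the names `FEATURES`, `TOOLS` and `classical_mds`. The feature values for the eight example tools are taken from the tool slides of this deck; popularity and interactivity are left out because the deck gives no numbers for them.

```python
import numpy as np

# Hypothetical binary encoding of the deck's features (an assumption):
# application (3 columns), workflow (4 columns), license (1 column).
FEATURES = ["statistics", "econometrics", "data_mining",
            "menus", "console", "nodes", "online", "free"]

# Values transcribed from the tool slides; only a subset of tools is shown.
TOOLS = {
    "Excel Data Analysis": [1, 0, 0, 1, 0, 0, 0, 0],
    "PSPP":                [1, 0, 0, 1, 0, 0, 0, 1],
    "Gretl":               [0, 1, 0, 1, 1, 0, 0, 1],
    "Python":              [1, 1, 1, 0, 1, 0, 0, 1],
    "R (+R studio)":       [1, 1, 1, 0, 1, 0, 0, 1],
    "KNIME":               [1, 0, 1, 0, 0, 1, 0, 1],
    "Orange":              [0, 0, 1, 0, 0, 1, 0, 1],
    "Google Colab":        [0, 0, 1, 0, 0, 0, 1, 1],
}


def classical_mds(X, n_components=2):
    """Classical (Torgerson) MDS on Euclidean distances between rows of X."""
    X = np.asarray(X, dtype=float)
    # Squared Euclidean distance between every pair of tools.
    diff = X[:, None, :] - X[None, :, :]
    d2 = (diff ** 2).sum(axis=-1)
    n = d2.shape[0]
    # Double centering turns squared distances into a Gram matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ d2 @ centering
    # eigh returns eigenvalues in ascending order; keep the largest ones.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_components]
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale


coords = classical_mds(list(TOOLS.values()))
for name, (x, y) in zip(TOOLS, coords):
    print(f"{name:22s} {x:+.2f} {y:+.2f}")
```

Tools with identical feature rows (here Python and R) land on the same point, which is the behaviour one would expect from a map meant to group related tools.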
The Classics
Excel Data Analysis
• Application: Statistical analysis
• Interface: Menus and windows
• Price: Licensed
• Pros: Availability (almost everybody has Excel)
• Cons: Works with selected cells, not with variable names
IBM SPSS Statistics
• Application: Statistical analysis; Econometric analysis
• Interface: Menus and windows; Command console
• Price: Licensed
• Pros: Very large set of analyses
• Cons: Non-interactive
PSPP
• Application: Statistical analysis
• Interface: Menus and windows
• Price: Free
• Pros: “Free” SPSS Statistics
• Cons: Relatively small set of analyses; Non-interactive
eViews
• Application: Econometric analysis
• Interface: Menus and windows; Command console
• Price: Licensed
• Pros: Efficient calculations
• Cons: Data import issues
Gretl
• Application: Econometric analysis
• Interface: Menus and windows; Command console
• Price: Free
• Pros: Hansl (localized user manual)
• Cons: Limit to the volume of data
The All-stars
Python
• Application: Statistical analysis; Econometric analysis; Data mining
• Interface: Command console
• Price: Free
• Pros: Global community developing libraries
• Cons:
R (+R studio)
• Application: Statistical analysis; Econometric analysis; Data mining
• Interface: Command console
• Price: Free
• Pros: Global community developing libraries
• Cons: A slightly weird language
Jupyter Notebook
• Application: Data mining
• Interface: Online platform
• Price: Free
• Pros: Industry standard for Data Science
• Cons:
MatLab
• Application: Statistical analysis; Econometric analysis
• Interface: Command console
• Price: Licensed
• Pros: Great documentation, parallel computing
• Cons: Expensive
The User-friendlies
JASP
• Application: Statistical analysis
• Interface: Menus and windows
• Price: Free
• Pros: Interactive
• Cons: Relatively small set of analyses
Weka
• Application: Statistical analysis; Data mining
• Interface: Graphical stream/workflow
• Price: Free
• Pros: One of the original revolutionaries
• Cons: outdated and clumsy
Rapid Miner
• Application: Statistical analysis; Data mining
• Interface: Graphical stream/workflow
• Price: Licensed
• Pros: Probably the most intuitive interface
• Cons:
KNIME
• Application: Statistical analysis; Data mining
• Interface: Graphical stream/workflow
• Price: Free
• Pros: Interactive
• Cons: Relatively small set of analyses
Orange
• Application: Data mining
• Interface: Graphical stream/workflow
• Price: Free
• Pros: Interactive
• Cons: Relatively small set of analyses
IBM SPSS Modeler
• Application: Econometric analysis; Data mining
• Interface: Graphical stream/workflow
• Price: Licensed
• Pros: Uses resources well
• Cons: Not user-friendly when dealing with many features
MatLab Classification Learner
• Application: Data mining
• Interface: Graphical stream/workflow
• Price: Licensed
• Pros: Part of the MatLab environment
• Cons: Still under development to include more models
The On-liners
Microsoft Azure
• Application: Data mining
• Interface: Online platform
• Price: Licensed
• Pros: Many tools already available
• Cons: Could be a little hard to set up
IBM Watson Studio
• Application: Data mining
• Interface: Online platform
• Price: Licensed
• Pros: Brand new
• Cons: Still some compatibility issues
Amazon ML
• Application: Data mining
• Interface: Online platform
• Price: Licensed
• Pros: Integrated with AWS S3 and could work in real time
• Cons: Still under development to include more models
Google Colab
• Application: Data mining
• Interface: Online platform
• Price: Free
• Pros: GPU computation via TensorFlow
• Cons: Sessions limited to 12 hours at a time
Selecting the right tool
Selection tree
• What type of problem do you solve? (Application)
• What type of interface would be suitable? (Workflow)
• Licensed or non-licensed? (Price)
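Application | Workflow | Price | Software
Statistical analysis | Menus and windows | Licensed | Excel Data Analysis, IBM SPSS Statistics
Statistical analysis | Menus and windows | Free | PSPP, JASP
Statistical analysis | Command console | Licensed | MatLab, IBM SPSS Statistics
Statistical analysis | Command console | Free | R (+ R Studio), Python
Statistical analysis | Graphical stream/workflow | Licensed | Rapid Miner
Statistical analysis | Graphical stream/workflow | Free | KNIME, Weka
Econometric analysis | Menus and windows | Licensed | eViews, IBM SPSS Statistics
Econometric analysis | Menus and windows | Free | Gretl
Econometric analysis | Command console | Licensed | eViews, IBM SPSS Statistics, MatLab
Econometric analysis | Command console | Free | Gretl, R (+ R Studio), Python
Econometric analysis | Graphical stream/workflow | Licensed | IBM SPSS Modeler
Data mining | Command console | Licensed | Matlab
Data mining | Command console | Free | R (+ R Studio), Python
Data mining | Graphical stream/workflow | Licensed | IBM SPSS Modeler, Rapid Miner, Matlab Classification App
Data mining | Graphical stream/workflow | Free | Orange, KNIME, Weka
Data mining | Online platform | Licensed | IBM Watson Studio, Microsoft Azure, Amazon ML
Data mining | Online platform | Free | Google Colab, Jupyter Notebook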
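The three selection questions can be read as a simple lookup. The sketch below is only an illustration: the dictionary transcribes the rows of the selection table, while the names `SELECTION` and `select_tools` are hypothetical and not part of the original deck.

```python
# Rows of the selection table, keyed by (application, workflow, price).
SELECTION = {
    ("Statistical analysis", "Menus and windows", "Licensed"): ["Excel Data Analysis", "IBM SPSS Statistics"],
    ("Statistical analysis", "Menus and windows", "Free"): ["PSPP", "JASP"],
    ("Statistical analysis", "Command console", "Licensed"): ["MatLab", "IBM SPSS Statistics"],
    ("Statistical analysis", "Command console", "Free"): ["R (+ R Studio)", "Python"],
    ("Statistical analysis", "Graphical stream/workflow", "Licensed"): ["Rapid Miner"],
    ("Statistical analysis", "Graphical stream/workflow", "Free"): ["KNIME", "Weka"],
    ("Econometric analysis", "Menus and windows", "Licensed"): ["eViews", "IBM SPSS Statistics"],
    ("Econometric analysis", "Menus and windows", "Free"): ["Gretl"],
    ("Econometric analysis", "Command console", "Licensed"): ["eViews", "IBM SPSS Statistics", "MatLab"],
    ("Econometric analysis", "Command console", "Free"): ["Gretl", "R (+ R Studio)", "Python"],
    ("Econometric analysis", "Graphical stream/workflow", "Licensed"): ["IBM SPSS Modeler"],
    ("Data mining", "Command console", "Licensed"): ["Matlab"],
    ("Data mining", "Command console", "Free"): ["R (+ R Studio)", "Python"],
    ("Data mining", "Graphical stream/workflow", "Licensed"): ["IBM SPSS Modeler", "Rapid Miner", "Matlab Classification App"],
    ("Data mining", "Graphical stream/workflow", "Free"): ["Orange", "KNIME", "Weka"],
    ("Data mining", "Online platform", "Licensed"): ["IBM Watson Studio", "Microsoft Azure", "Amazon ML"],
    ("Data mining", "Online platform", "Free"): ["Google Colab", "Jupyter Notebook"],
}


def select_tools(application, workflow, price):
    """Answer the three selection questions; empty list if no tool matches."""
    return SELECTION.get((application, workflow, price), [])


# Example: a free, menu-driven tool for econometric analysis.
print(select_tools("Econometric analysis", "Menus and windows", "Free"))
```

A combination that does not appear in the table, such as a menu-driven data mining tool, simply returns an empty list.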
Q & A
• angel.marchev@datasciencesociety.net
• k_haralampiev@phls.uni-sofia.bg

More Related Content

What's hot

Neo4j GraphDay Seattle- Sept19- graphs are ai
Neo4j GraphDay Seattle- Sept19-  graphs are aiNeo4j GraphDay Seattle- Sept19-  graphs are ai
Neo4j GraphDay Seattle- Sept19- graphs are ai
Neo4j
 
Graph database & neo4j
Graph database & neo4jGraph database & neo4j
Graph database & neo4j
Sandip Jadhav
 
Building Personalized Data Products with Dato
Building Personalized Data Products with DatoBuilding Personalized Data Products with Dato
Building Personalized Data Products with Dato
Turi, Inc.
 
Building Custom
Machine Learning Algorithms
with Apache SystemML
Building Custom
Machine Learning Algorithms
with Apache SystemMLBuilding Custom
Machine Learning Algorithms
with Apache SystemML
Building Custom
Machine Learning Algorithms
with Apache SystemML
sparktc
 
The Graph Revolution
The Graph RevolutionThe Graph Revolution
The Graph Revolution
Peter Presnell
 
Getting Started With Dato - August 2015
Getting Started With Dato - August 2015Getting Started With Dato - August 2015
Getting Started With Dato - August 2015
Turi, Inc.
 
ETL – Everything you need to know
ETL – Everything you need to knowETL – Everything you need to know
ETL – Everything you need to know
Adi Polak
 
Bi 2.0 hadoop everywhere
Bi 2.0   hadoop everywhereBi 2.0   hadoop everywhere
Bi 2.0 hadoop everywhere
Dmitry Tolpeko
 
Using H2O for Mobile Transaction Forecasting & Anomaly Detection - Capital One
Using H2O for Mobile Transaction Forecasting & Anomaly Detection - Capital OneUsing H2O for Mobile Transaction Forecasting & Anomaly Detection - Capital One
Using H2O for Mobile Transaction Forecasting & Anomaly Detection - Capital One
Sri Ambati
 
Production-Ready BIG ML Workflows - from zero to hero
Production-Ready BIG ML Workflows - from zero to heroProduction-Ready BIG ML Workflows - from zero to hero
Production-Ready BIG ML Workflows - from zero to hero
Daniel Marcous
 
Driver vs Driverless AI - Mark Landry, Competitive Data Scientist and Product...
Driver vs Driverless AI - Mark Landry, Competitive Data Scientist and Product...Driver vs Driverless AI - Mark Landry, Competitive Data Scientist and Product...
Driver vs Driverless AI - Mark Landry, Competitive Data Scientist and Product...
Sri Ambati
 
Recommendations @ Rakuten Group
Recommendations @ Rakuten GroupRecommendations @ Rakuten Group
Recommendations @ Rakuten Group
recsysfr
 

What's hot (12)

Neo4j GraphDay Seattle- Sept19- graphs are ai
Neo4j GraphDay Seattle- Sept19-  graphs are aiNeo4j GraphDay Seattle- Sept19-  graphs are ai
Neo4j GraphDay Seattle- Sept19- graphs are ai
 
Graph database & neo4j
Graph database & neo4jGraph database & neo4j
Graph database & neo4j
 
Building Personalized Data Products with Dato
Building Personalized Data Products with DatoBuilding Personalized Data Products with Dato
Building Personalized Data Products with Dato
 
Building Custom
Machine Learning Algorithms
with Apache SystemML
Building Custom
Machine Learning Algorithms
with Apache SystemMLBuilding Custom
Machine Learning Algorithms
with Apache SystemML
Building Custom
Machine Learning Algorithms
with Apache SystemML
 
The Graph Revolution
The Graph RevolutionThe Graph Revolution
The Graph Revolution
 
Getting Started With Dato - August 2015
Getting Started With Dato - August 2015Getting Started With Dato - August 2015
Getting Started With Dato - August 2015
 
ETL – Everything you need to know
ETL – Everything you need to knowETL – Everything you need to know
ETL – Everything you need to know
 
Bi 2.0 hadoop everywhere
Bi 2.0   hadoop everywhereBi 2.0   hadoop everywhere
Bi 2.0 hadoop everywhere
 
Using H2O for Mobile Transaction Forecasting & Anomaly Detection - Capital One
Using H2O for Mobile Transaction Forecasting & Anomaly Detection - Capital OneUsing H2O for Mobile Transaction Forecasting & Anomaly Detection - Capital One
Using H2O for Mobile Transaction Forecasting & Anomaly Detection - Capital One
 
Production-Ready BIG ML Workflows - from zero to hero
Production-Ready BIG ML Workflows - from zero to heroProduction-Ready BIG ML Workflows - from zero to hero
Production-Ready BIG ML Workflows - from zero to hero
 
Driver vs Driverless AI - Mark Landry, Competitive Data Scientist and Product...
Driver vs Driverless AI - Mark Landry, Competitive Data Scientist and Product...Driver vs Driverless AI - Mark Landry, Competitive Data Scientist and Product...
Driver vs Driverless AI - Mark Landry, Competitive Data Scientist and Product...
 
Recommendations @ Rakuten Group
Recommendations @ Rakuten GroupRecommendations @ Rakuten Group
Recommendations @ Rakuten Group
 

Similar to Data science tools - A.Marchev and K.Haralampiev

Maintainable Machine Learning Products
Maintainable Machine Learning ProductsMaintainable Machine Learning Products
Maintainable Machine Learning Products
Andrew Musselman
 
Building a Scalable and reliable open source ML Platform with MLFlow
Building a Scalable and reliable open source ML Platform with MLFlowBuilding a Scalable and reliable open source ML Platform with MLFlow
Building a Scalable and reliable open source ML Platform with MLFlow
GoDataDriven
 
Meetup 21/9/2017 - Image Recogonition: onmisbaar voor een slimme stad?
Meetup 21/9/2017 - Image Recogonition: onmisbaar voor een slimme stad?Meetup 21/9/2017 - Image Recogonition: onmisbaar voor een slimme stad?
Meetup 21/9/2017 - Image Recogonition: onmisbaar voor een slimme stad?
Digipolis Antwerpen
 
GraphLab Conference 2014 Keynote - Carlos Guestrin
GraphLab Conference 2014 Keynote - Carlos GuestrinGraphLab Conference 2014 Keynote - Carlos Guestrin
GraphLab Conference 2014 Keynote - Carlos Guestrin
Turi, Inc.
 
Platform for Big Data Analytics and Visual Analytics: CSIRO use cases. Februa...
Platform for Big Data Analytics and Visual Analytics: CSIRO use cases. Februa...Platform for Big Data Analytics and Visual Analytics: CSIRO use cases. Februa...
Platform for Big Data Analytics and Visual Analytics: CSIRO use cases. Februa...
Tomasz Bednarz
 
Relationships Matter: Using Connected Data for Better Machine Learning
Relationships Matter: Using Connected Data for Better Machine LearningRelationships Matter: Using Connected Data for Better Machine Learning
Relationships Matter: Using Connected Data for Better Machine Learning
Neo4j
 
Neo4j GraphDay Seattle- Sept19- Connected data imperative
Neo4j GraphDay Seattle- Sept19- Connected data imperativeNeo4j GraphDay Seattle- Sept19- Connected data imperative
Neo4j GraphDay Seattle- Sept19- Connected data imperative
Neo4j
 
IT_Tools_in_Research.ppt
IT_Tools_in_Research.pptIT_Tools_in_Research.ppt
IT_Tools_in_Research.ppt
DrKRanjithSinghCompu
 
Data Discovery and Metadata
Data Discovery and MetadataData Discovery and Metadata
Data Discovery and Metadata
markgrover
 
Think Big | Enterprise Artificial Intelligence
Think Big | Enterprise Artificial IntelligenceThink Big | Enterprise Artificial Intelligence
Think Big | Enterprise Artificial Intelligence
Data Science Milan
 
Introducción al Machine Learning Automático
Introducción al Machine Learning AutomáticoIntroducción al Machine Learning Automático
Introducción al Machine Learning Automático
Sri Ambati
 
2014 01-ticosa
2014 01-ticosa2014 01-ticosa
2014 01-ticosa
Pharo
 
DataCanvas: Big Data Analytic Flow in Cloud
DataCanvas: Big Data Analytic Flow in CloudDataCanvas: Big Data Analytic Flow in Cloud
DataCanvas: Big Data Analytic Flow in Cloud
Lei Fang
 
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...
Sri Ambati
 
Data Workflows for Machine Learning - Seattle DAML
Data Workflows for Machine Learning - Seattle DAMLData Workflows for Machine Learning - Seattle DAML
Data Workflows for Machine Learning - Seattle DAML
Paco Nathan
 
Engage 2020 - Domino Application Strategy: Key insights for successful modern...
Engage 2020 - Domino Application Strategy: Key insights for successful modern...Engage 2020 - Domino Application Strategy: Key insights for successful modern...
Engage 2020 - Domino Application Strategy: Key insights for successful modern...
panagenda
 
CollabSphere 2020 - ANA101 - Domino Application Strategy Key insights for suc...
CollabSphere 2020 - ANA101 - Domino Application Strategy Key insights for suc...CollabSphere 2020 - ANA101 - Domino Application Strategy Key insights for suc...
CollabSphere 2020 - ANA101 - Domino Application Strategy Key insights for suc...
panagenda
 
Keynote at Converge 2019
Keynote at Converge 2019Keynote at Converge 2019
Keynote at Converge 2019
Travis Oliphant
 
Market research of the analytics tools
Market research of the analytics toolsMarket research of the analytics tools
Market research of the analytics tools
Konstantin Prokhorov, MBA, MSc
 
Algorithm Marketplace and the new "Algorithm Economy"
Algorithm Marketplace and the new "Algorithm Economy"Algorithm Marketplace and the new "Algorithm Economy"
Algorithm Marketplace and the new "Algorithm Economy"
Diego Oppenheimer
 

Similar to Data science tools - A.Marchev and K.Haralampiev (20)

Maintainable Machine Learning Products
Maintainable Machine Learning ProductsMaintainable Machine Learning Products
Maintainable Machine Learning Products
 
Building a Scalable and reliable open source ML Platform with MLFlow
Building a Scalable and reliable open source ML Platform with MLFlowBuilding a Scalable and reliable open source ML Platform with MLFlow
Building a Scalable and reliable open source ML Platform with MLFlow
 
Meetup 21/9/2017 - Image Recogonition: onmisbaar voor een slimme stad?
Meetup 21/9/2017 - Image Recogonition: onmisbaar voor een slimme stad?Meetup 21/9/2017 - Image Recogonition: onmisbaar voor een slimme stad?
Meetup 21/9/2017 - Image Recogonition: onmisbaar voor een slimme stad?
 
GraphLab Conference 2014 Keynote - Carlos Guestrin
GraphLab Conference 2014 Keynote - Carlos GuestrinGraphLab Conference 2014 Keynote - Carlos Guestrin
GraphLab Conference 2014 Keynote - Carlos Guestrin
 
Platform for Big Data Analytics and Visual Analytics: CSIRO use cases. Februa...
Platform for Big Data Analytics and Visual Analytics: CSIRO use cases. Februa...Platform for Big Data Analytics and Visual Analytics: CSIRO use cases. Februa...
Platform for Big Data Analytics and Visual Analytics: CSIRO use cases. Februa...
 
Relationships Matter: Using Connected Data for Better Machine Learning
Relationships Matter: Using Connected Data for Better Machine LearningRelationships Matter: Using Connected Data for Better Machine Learning
Relationships Matter: Using Connected Data for Better Machine Learning
 
Neo4j GraphDay Seattle- Sept19- Connected data imperative
Neo4j GraphDay Seattle- Sept19- Connected data imperativeNeo4j GraphDay Seattle- Sept19- Connected data imperative
Neo4j GraphDay Seattle- Sept19- Connected data imperative
 
IT_Tools_in_Research.ppt
IT_Tools_in_Research.pptIT_Tools_in_Research.ppt
IT_Tools_in_Research.ppt
 
Data Discovery and Metadata
Data Discovery and MetadataData Discovery and Metadata
Data Discovery and Metadata
 
Think Big | Enterprise Artificial Intelligence
Think Big | Enterprise Artificial IntelligenceThink Big | Enterprise Artificial Intelligence
Think Big | Enterprise Artificial Intelligence
 
Introducción al Machine Learning Automático
Introducción al Machine Learning AutomáticoIntroducción al Machine Learning Automático
Introducción al Machine Learning Automático
 
2014 01-ticosa
2014 01-ticosa2014 01-ticosa
2014 01-ticosa
 
DataCanvas: Big Data Analytic Flow in Cloud
DataCanvas: Big Data Analytic Flow in CloudDataCanvas: Big Data Analytic Flow in Cloud
DataCanvas: Big Data Analytic Flow in Cloud
 
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...
 
Data Workflows for Machine Learning - Seattle DAML
Data Workflows for Machine Learning - Seattle DAMLData Workflows for Machine Learning - Seattle DAML
Data Workflows for Machine Learning - Seattle DAML
 
Engage 2020 - Domino Application Strategy: Key insights for successful modern...
Engage 2020 - Domino Application Strategy: Key insights for successful modern...Engage 2020 - Domino Application Strategy: Key insights for successful modern...
Engage 2020 - Domino Application Strategy: Key insights for successful modern...
 
CollabSphere 2020 - ANA101 - Domino Application Strategy Key insights for suc...
CollabSphere 2020 - ANA101 - Domino Application Strategy Key insights for suc...CollabSphere 2020 - ANA101 - Domino Application Strategy Key insights for suc...
CollabSphere 2020 - ANA101 - Domino Application Strategy Key insights for suc...
 
Keynote at Converge 2019
Keynote at Converge 2019Keynote at Converge 2019
Keynote at Converge 2019
 
Market research of the analytics tools
Market research of the analytics toolsMarket research of the analytics tools
Market research of the analytics tools
 
Algorithm Marketplace and the new "Algorithm Economy"
Algorithm Marketplace and the new "Algorithm Economy"Algorithm Marketplace and the new "Algorithm Economy"
Algorithm Marketplace and the new "Algorithm Economy"
 

More from Data Science Society

[Data Meetup] Data Science in Finance - Factor Models in Finance
[Data Meetup] Data Science in Finance - Factor Models in Finance[Data Meetup] Data Science in Finance - Factor Models in Finance
[Data Meetup] Data Science in Finance - Factor Models in Finance
Data Science Society
 
[Data Meetup] Data Science in Finance - Building a Quant ML pipeline
[Data Meetup] Data Science in Finance -  Building a Quant ML pipeline[Data Meetup] Data Science in Finance -  Building a Quant ML pipeline
[Data Meetup] Data Science in Finance - Building a Quant ML pipeline
Data Science Society
 
[Data Meetup] Data Science in Journalism - Tanbih, QCRI and MIT
[Data Meetup] Data Science in Journalism - Tanbih, QCRI and MIT[Data Meetup] Data Science in Journalism - Tanbih, QCRI and MIT
[Data Meetup] Data Science in Journalism - Tanbih, QCRI and MIT
Data Science Society
 
Computer Vision in Real Estate
Computer Vision in Real EstateComputer Vision in Real Estate
Computer Vision in Real Estate
Data Science Society
 
ML in Proptech - Concept to Production
ML in Proptech  -  Concept to ProductionML in Proptech  -  Concept to Production
ML in Proptech - Concept to Production
Data Science Society
 
Lessons Learned: Linked Open Data implemented in 2 Use Cases
Lessons Learned: Linked Open Data implemented in 2 Use CasesLessons Learned: Linked Open Data implemented in 2 Use Cases
Lessons Learned: Linked Open Data implemented in 2 Use Cases
Data Science Society
 
AI methods for localization in noisy environment
AI methods for localization in noisy environment AI methods for localization in noisy environment
AI methods for localization in noisy environment
Data Science Society
 
Object Identification and Detection Hackathon Solution
Object Identification and Detection Hackathon Solution Object Identification and Detection Hackathon Solution
Object Identification and Detection Hackathon Solution
Data Science Society
 
Data Science for Open Innovation in SMEs and Large Corporations
Data Science for Open Innovation in SMEs and Large CorporationsData Science for Open Innovation in SMEs and Large Corporations
Data Science for Open Innovation in SMEs and Large Corporations
Data Science Society
 
Air Pollution in Sofia - Solution through Data Science by Kiwi team
Air Pollution in Sofia - Solution through Data Science by Kiwi teamAir Pollution in Sofia - Solution through Data Science by Kiwi team
Air Pollution in Sofia - Solution through Data Science by Kiwi team
Data Science Society
 
Machine Learning in Astrophysics
Machine Learning in AstrophysicsMachine Learning in Astrophysics
Machine Learning in Astrophysics
Data Science Society
 
#AcademiaDatathon Finlists' Solution of Crypto Datathon Case
#AcademiaDatathon Finlists' Solution of Crypto Datathon Case#AcademiaDatathon Finlists' Solution of Crypto Datathon Case
#AcademiaDatathon Finlists' Solution of Crypto Datathon Case
Data Science Society
 
Coreference Extraction from Identric’s Documents - Solution of Datathon 2018
Coreference Extraction from Identric’s Documents - Solution of Datathon 2018Coreference Extraction from Identric’s Documents - Solution of Datathon 2018
Coreference Extraction from Identric’s Documents - Solution of Datathon 2018
Data Science Society
 
DNA Analytics - What does really goes into Sausages - Datathon2018 Solution
DNA Analytics - What does really goes into Sausages - Datathon2018 SolutionDNA Analytics - What does really goes into Sausages - Datathon2018 Solution
DNA Analytics - What does really goes into Sausages - Datathon2018 Solution
Data Science Society
 
Relationships between research tasks and data structure (basic methods and a...
Relationships between research tasks and data structure (basic  methods and a...Relationships between research tasks and data structure (basic  methods and a...
Relationships between research tasks and data structure (basic methods and a...
Data Science Society
 
Problems of Application of Machine Learning in the CRM - panel
Problems of Application of Machine Learning in the CRM - panel Problems of Application of Machine Learning in the CRM - panel
Problems of Application of Machine Learning in the CRM - panel
Data Science Society
 
Disruptive as Usual: New Technologies and Data Value Professor Severino Mereg...
Disruptive as Usual: New Technologies and Data Value Professor Severino Mereg...Disruptive as Usual: New Technologies and Data Value Professor Severino Mereg...
Disruptive as Usual: New Technologies and Data Value Professor Severino Mereg...
Data Science Society
 
Intelligent Question Answering Using the Wisdom of the Crowd, Preslav Nakov
Intelligent Question Answering Using the Wisdom of the Crowd, Preslav NakovIntelligent Question Answering Using the Wisdom of the Crowd, Preslav Nakov
Intelligent Question Answering Using the Wisdom of the Crowd, Preslav Nakov
Data Science Society
 
Master class Hristo Hadjitchonev - Aubg
Master class Hristo Hadjitchonev - Aubg Master class Hristo Hadjitchonev - Aubg
Master class Hristo Hadjitchonev - Aubg
Data Science Society
 
Open Data reveals corruption practices - case from Datathon 2017
Open Data reveals corruption practices - case from Datathon 2017Open Data reveals corruption practices - case from Datathon 2017
Open Data reveals corruption practices - case from Datathon 2017
Data Science Society
 

More from Data Science Society (20)

[Data Meetup] Data Science in Finance - Factor Models in Finance
[Data Meetup] Data Science in Finance - Factor Models in Finance[Data Meetup] Data Science in Finance - Factor Models in Finance
[Data Meetup] Data Science in Finance - Factor Models in Finance
 
[Data Meetup] Data Science in Finance - Building a Quant ML pipeline
[Data Meetup] Data Science in Finance -  Building a Quant ML pipeline[Data Meetup] Data Science in Finance -  Building a Quant ML pipeline
[Data Meetup] Data Science in Finance - Building a Quant ML pipeline
 
[Data Meetup] Data Science in Journalism - Tanbih, QCRI and MIT
[Data Meetup] Data Science in Journalism - Tanbih, QCRI and MIT[Data Meetup] Data Science in Journalism - Tanbih, QCRI and MIT
[Data Meetup] Data Science in Journalism - Tanbih, QCRI and MIT
 
Computer Vision in Real Estate
Computer Vision in Real EstateComputer Vision in Real Estate
Computer Vision in Real Estate
 
ML in Proptech - Concept to Production
ML in Proptech  -  Concept to ProductionML in Proptech  -  Concept to Production
ML in Proptech - Concept to Production
 
Lessons Learned: Linked Open Data implemented in 2 Use Cases
Lessons Learned: Linked Open Data implemented in 2 Use CasesLessons Learned: Linked Open Data implemented in 2 Use Cases
Lessons Learned: Linked Open Data implemented in 2 Use Cases
 
AI methods for localization in noisy environment
AI methods for localization in noisy environment AI methods for localization in noisy environment
AI methods for localization in noisy environment
 
Object Identification and Detection Hackathon Solution
Object Identification and Detection Hackathon Solution Object Identification and Detection Hackathon Solution
Object Identification and Detection Hackathon Solution
 
Data Science for Open Innovation in SMEs and Large Corporations
Data Science for Open Innovation in SMEs and Large CorporationsData Science for Open Innovation in SMEs and Large Corporations
Data Science for Open Innovation in SMEs and Large Corporations
 
Air Pollution in Sofia - Solution through Data Science by Kiwi team
Air Pollution in Sofia - Solution through Data Science by Kiwi teamAir Pollution in Sofia - Solution through Data Science by Kiwi team
Air Pollution in Sofia - Solution through Data Science by Kiwi team
 
Machine Learning in Astrophysics
Machine Learning in AstrophysicsMachine Learning in Astrophysics
Machine Learning in Astrophysics
 
#AcademiaDatathon Finlists' Solution of Crypto Datathon Case
#AcademiaDatathon Finlists' Solution of Crypto Datathon Case#AcademiaDatathon Finlists' Solution of Crypto Datathon Case
#AcademiaDatathon Finlists' Solution of Crypto Datathon Case
 
Coreference Extraction from Identric’s Documents - Solution of Datathon 2018
Coreference Extraction from Identric’s Documents - Solution of Datathon 2018Coreference Extraction from Identric’s Documents - Solution of Datathon 2018
Coreference Extraction from Identric’s Documents - Solution of Datathon 2018
 
DNA Analytics - What does really goes into Sausages - Datathon2018 Solution
DNA Analytics - What does really goes into Sausages - Datathon2018 SolutionDNA Analytics - What does really goes into Sausages - Datathon2018 Solution
DNA Analytics - What does really goes into Sausages - Datathon2018 Solution
 
Relationships between research tasks and data structure (basic methods and a...
Relationships between research tasks and data structure (basic  methods and a...Relationships between research tasks and data structure (basic  methods and a...
Relationships between research tasks and data structure (basic methods and a...
 
Problems of Application of Machine Learning in the CRM - panel
Problems of Application of Machine Learning in the CRM - panel Problems of Application of Machine Learning in the CRM - panel
Problems of Application of Machine Learning in the CRM - panel
 
Disruptive as Usual: New Technologies and Data Value Professor Severino Mereg...
Disruptive as Usual: New Technologies and Data Value Professor Severino Mereg...Disruptive as Usual: New Technologies and Data Value Professor Severino Mereg...
Disruptive as Usual: New Technologies and Data Value Professor Severino Mereg...
 
Intelligent Question Answering Using the Wisdom of the Crowd, Preslav Nakov
Intelligent Question Answering Using the Wisdom of the Crowd, Preslav NakovIntelligent Question Answering Using the Wisdom of the Crowd, Preslav Nakov
Intelligent Question Answering Using the Wisdom of the Crowd, Preslav Nakov
 
Master class Hristo Hadjitchonev - Aubg
Master class Hristo Hadjitchonev - Aubg Master class Hristo Hadjitchonev - Aubg
Master class Hristo Hadjitchonev - Aubg
 
Open Data reveals corruption practices - case from Datathon 2017
Open Data reveals corruption practices - case from Datathon 2017Open Data reveals corruption practices - case from Datathon 2017
Open Data reveals corruption practices - case from Datathon 2017
 



Data science tools - A.Marchev and K.Haralampiev

  • 1. Popular Data Science Tools Angel Marchev 2.0 Kaloyan Haralampiev
  • 3. What we did?
      – Over 3000 members
      – Cooperation with communities and universities from Europe and Asia
      – More than 50 countries
      – 25 real cases in 3 Datathons
      – 63 superb solutions
      – Students and experts with up to 20 years of experience
      – 4 years
      – Area of Machine Learning, NLP, Data enrichment, Computer Vision and AI
      – Working with SMEs and big companies
  • 4. What we dare to?
      – [Timeline figure, Nov 2014 to 2018. Events shown: Our first event; Two projects with 30 volunteers; Big data conference; First Datathon in CEE; Academia #Datathon; Hack the News #Datathon; The First online #Datathon2018; #Datathon2018 v2; Hack the Fake News Datathon. Month labels on the figure: Feb, Mar, Apr, May, Jun, Sep.]
      – Over 50 meetups
      – Participation in 8 conferences
      – 2 workshops
  • 5. Past #Datathon2018
      – 144 participants
      – 39 teams, 9 cases
      – 493 chat rooms
      – 16563 messages exchanged
      – 32 mentors and industry experts
      – 24 countries
      – The #Datathon2018 participants managed to solve all cases
      – There was great fun with more than 4 fun sessions
      – A lot of beer and pizza was consumed
      – 38 quality Data solutions at the end
      – Great results, challenging even for the companies
  • 6. Impressions
      – Milena Yankova, Head of Research & Innovation
      – Shashank Shekhar, Manager - Data Sciences
      – Agamemnon Baltagiannis, Principal Data Scientist
      – Tomislav Križan, CPO, Member of the Board
      – “The results of our case are impressive and have further motivated our R&D department to explore more opportunities and apply some of the team results that worked on it.”
      – “The best thing about this Datathon was its global footprint. I was amazed by the sheer enthusiasm that the participants demonstrated. The resilience and adaptability shown by a lot of them in providing a working solution to real life problems made this Datathon a huge success.”
      – “Thank you all for this great weekend. It was a fantastic challenge and I am happy that I saw deep technical work from all the participants! I will be always here to support the DSS community”
      – “From all finalists we did see good and novel approach... also those who didn't arrive to finals, were also really close ... so good job to all teams!”
      – “The teams solutions were well documented in CRISP-DM Methodology at Datathon 2018 organized by DSS, in which Kaufland was proud to participate”
  • 8. Introduction
      – It is impossible to cover all tools, so we narrowed the list to the ones we use
      – The task is still hard, due to:
          – Various types of tools (noise in the input data)
          – Many criteria (so a multi-dimensional problem)
          – Tools for many purposes (overlapping categories)
      – Hmmmm..!? Sounds like an ideal case for Multi-dimensional scaling (MDS)
      – SO LET’S GO FULL NERDY ON IT
  • 9. MDS Map
      – Features:
          – Application: Statistics; Econometrics; Data mining
          – Workflow: Console; Menus; Nodes; Online
          – License: Free; Non-free
      – [Map figure. Legend entries: Relatedness; Popularity; Interactivity; Free / Non-free]
  • 10. MDS Map [figure with four labelled clusters]: “The All-Stars”; “The Classics”; “The User-Friendlies”; “The On-liners”
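MDS takes a matrix of pairwise dissimilarities between items as its input. The slides do not say which dissimilarity measure was used, so the sketch below is only an illustration: it encodes a handful of the tools with the feature names from slide 9 and computes Jaccard distances between them. The choice of Jaccard distance, the `TOOLS` dictionary and both function names are assumptions made for this example, not the authors' method.

```python
# Illustrative only: feature sets follow the Application / Workflow / License
# categories listed on the slides; the Jaccard measure is an assumption.
TOOLS = {
    "Excel Data Analysis": {"statistics", "menus", "non-free"},
    "PSPP": {"statistics", "menus", "free"},
    "Gretl": {"econometrics", "menus", "console", "free"},
    "Python": {"statistics", "econometrics", "data mining", "console", "free"},
    "Google Colab": {"data mining", "online", "free"},
}


def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b| for two feature sets."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def dissimilarity_matrix(tools):
    """Return (names, square matrix) of pairwise Jaccard distances."""
    names = list(tools)
    matrix = [
        [jaccard_distance(tools[row], tools[col]) for col in names]
        for row in names
    ]
    return names, matrix


names, D = dissimilarity_matrix(TOOLS)
for name, row in zip(names, D):
    print(f"{name:20s}", " ".join(f"{value:.2f}" for value in row))
```

A matrix like `D` is what an MDS routine would then project to two dimensions, so that tools sharing many features land close together on the map.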
  • 12. Excel Data Analysis
      – Application: Statistical analysis
      – Interface: Menus and windows
      – Price: Licensed
      – Pros: Availability (almost everybody has Excel)
      – Cons: Works with selected cells, not with variable names
  • 13. IBM SPSS Statistics
      – Application: Statistical analysis; Econometric analysis
      – Interface: Menus and windows; Command console
      – Price: Licensed
      – Pros: Very large set of analyses
      – Cons: Non-interactive
  • 14. PSPP
      – Application: Statistical analysis
      – Interface: Menus and windows
      – Price: Free
      – Pros: “Free” SPSS Statistics
      – Cons: Relatively small set of analyses; Non-interactive
  • 15. eViews
      – Application: Econometric analysis
      – Interface: Menus and windows; Command console
      – Price: Licensed
      – Pros: Efficient calculations
      – Cons: Data import issues
  • 16. Gretl
      – Application: Econometric analysis
      – Interface: Menus and windows; Command console
      – Price: Free
      – Pros: Hansl (localized user manual)
      – Cons: Limit on the volume of data
  • 18. Python
      – Application: Statistical analysis; Econometric analysis; Data mining
      – Interface: Command console
      – Price: Free
      – Pros: Global community developing libraries
      – Cons: (none listed)
  • 19. R (+ RStudio)
      – Application: Statistical analysis; Econometric analysis; Data mining
      – Interface: Command console
      – Price: Free
      – Pros: Global community developing libraries
      – Cons: A slightly weird language
  • 20. Jupyter Notebook
      – Application: Data mining
      – Interface: Online platform
      – Price: Free
      – Pros: Industry standard for Data Science
      – Cons: (none listed)
  • 21. MATLAB
      – Application: Statistical analysis; Econometric analysis
      – Interface: Command console
      – Price: Licensed
      – Pros: Great documentation, parallel computing
      – Cons: Expensive
  • 23. JASP
      – Application: Statistical analysis
      – Interface: Menus and windows
      – Price: Free
      – Pros: Interactive
      – Cons: Relatively small set of analyses
  • 24. Weka
      – Application: Statistical analysis; Data mining
      – Interface: Graphical stream/workflow
      – Price: Free
      – Pros: One of the original revolutionaries
      – Cons: Outdated and clumsy
  • 25. Rapid Miner
      – Application: Statistical analysis; Data mining
      – Interface: Graphical stream/workflow
      – Price: Licensed
      – Pros: Probably the most intuitive interface
      – Cons: (none listed)
  • 26. KNIME
      – Application: Statistical analysis; Data mining
      – Interface: Graphical stream/workflow
      – Price: Free
      – Pros: Interactive
      – Cons: Relatively small set of analyses
  • 27. Orange
      – Application: Data mining
      – Interface: Graphical stream/workflow
      – Price: Free
      – Pros: Interactive
      – Cons: Relatively small set of analyses
  • 28. IBM SPSS Modeler
      – Application: Econometric analysis; Data mining
      – Interface: Graphical stream/workflow
      – Price: Licensed
      – Pros: Uses resources well
      – Cons: Not user-friendly when dealing with many features
  • 29. MATLAB Classification Learner
      – Application: Data mining
      – Interface: Graphical stream/workflow
      – Price: Licensed
      – Pros: Part of the MATLAB environment
      – Cons: Still under development to include more models
  • 31. Microsoft Azure
      – Application: Data mining
      – Interface: Online platform
      – Price: Licensed
      – Pros: Many tools already available
      – Cons: Can be a little hard to set up
  • 32. IBM Watson Studio
      – Application: Data mining
      – Interface: Online platform
      – Price: Licensed
      – Pros: Brand new
      – Cons: Still some computability issues
  • 33. Amazon ML
      – Application: Data mining
      – Interface: Online platform
      – Price: Licensed
      – Pros: Integrated with AWS S3 and can work in real time
      – Cons: Still under development to include more models
  • 34. Google Colab
      – Application: Data mining
      – Interface: Online platform
      – Price: Free
      – Pros: GPU computation via TensorFlow
      – Cons: Sessions limited to 12 hours at a time
  • 36. Selection tree
      – What type of problem do you solve? (Application)
      – What type of interface would be suitable? (Workflow)
      – Licensed or non-licensed? (Price)
      – Tree (Application > Workflow > Price: Software):
          – Statistical analysis
              – Menus and windows
                  – Licensed: Excel Data Analysis; IBM SPSS Statistics
                  – Free: PSPP; JASP
              – Command console
                  – Licensed: MATLAB; IBM SPSS Statistics
                  – Free: R (+ RStudio); Python
              – Graphical stream/workflow
                  – Licensed: Rapid Miner
                  – Free: KNIME; Weka
          – Econometric analysis
              – Menus and windows
                  – Licensed: eViews; IBM SPSS Statistics
                  – Free: Gretl
              – Command console
                  – Licensed: eViews; IBM SPSS Statistics; MATLAB
                  – Free: Gretl; R (+ RStudio); Python
              – Graphical stream/workflow
                  – Licensed: IBM SPSS Modeler
          – Data mining
              – Command console
                  – Licensed: MATLAB
                  – Free: R (+ RStudio); Python
              – Graphical stream/workflow
                  – Licensed: IBM SPSS Modeler; Rapid Miner; MATLAB Classification App
                  – Free: Orange; KNIME; Weka
              – Online platform
                  – Licensed: IBM Watson Studio; Microsoft Azure; Amazon ML
                  – Free: Google Colab; Jupyter Notebook
  • 37. Q & A • angel.marchev@datasciencesociety.net k_haralampiev@phls.uni-sofia.bg
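The selection tree on slide 36 amounts to a three-key lookup: application, workflow and price lead to a short list of tools. As a rough sketch, it could be written as a dictionary keyed by those three answers. The entries are transcribed from the slide; the names `SELECTION_TREE` and `suggest_tools` are hypothetical and introduced only for this example.

```python
# Entries transcribed from the selection tree slide; the data structure and
# function name are illustrative assumptions, not part of the original deck.
STAT, ECON, DM = "Statistical analysis", "Econometric analysis", "Data mining"
MENUS, CONSOLE = "Menus and windows", "Command console"
GRAPH, ONLINE = "Graphical stream/workflow", "Online platform"
LICENSED, FREE = "Licensed", "Free"

SELECTION_TREE = {
    (STAT, MENUS, LICENSED): ["Excel Data Analysis", "IBM SPSS Statistics"],
    (STAT, MENUS, FREE): ["PSPP", "JASP"],
    (STAT, CONSOLE, LICENSED): ["MATLAB", "IBM SPSS Statistics"],
    (STAT, CONSOLE, FREE): ["R (+ RStudio)", "Python"],
    (STAT, GRAPH, LICENSED): ["Rapid Miner"],
    (STAT, GRAPH, FREE): ["KNIME", "Weka"],
    (ECON, MENUS, LICENSED): ["eViews", "IBM SPSS Statistics"],
    (ECON, MENUS, FREE): ["Gretl"],
    (ECON, CONSOLE, LICENSED): ["eViews", "IBM SPSS Statistics", "MATLAB"],
    (ECON, CONSOLE, FREE): ["Gretl", "R (+ RStudio)", "Python"],
    (ECON, GRAPH, LICENSED): ["IBM SPSS Modeler"],
    (DM, CONSOLE, LICENSED): ["MATLAB"],
    (DM, CONSOLE, FREE): ["R (+ RStudio)", "Python"],
    (DM, GRAPH, LICENSED): [
        "IBM SPSS Modeler", "Rapid Miner", "MATLAB Classification App",
    ],
    (DM, GRAPH, FREE): ["Orange", "KNIME", "Weka"],
    (DM, ONLINE, LICENSED): [
        "IBM Watson Studio", "Microsoft Azure", "Amazon ML",
    ],
    (DM, ONLINE, FREE): ["Google Colab", "Jupyter Notebook"],
}


def suggest_tools(application, workflow, price):
    """Return the tools listed for this branch, or [] if the branch is empty."""
    return list(SELECTION_TREE.get((application, workflow, price), []))


print(suggest_tools(DM, ONLINE, FREE))
print(suggest_tools(ECON, GRAPH, FREE))
```

Branches that the slide leaves empty, such as a free graphical tool for econometric analysis, simply return an empty list.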