Data Mining Tools
Kowshik
Madhumati
Mayur
Mohamed Sharique
Vidyashankar
Orange
Product Details
• Open source
• Data visualization and analysis
• Suitable for novices and experts
• Extensible through Python scripting
• Available for all popular platforms, including Windows, Mac OS X and variants of Linux
Company Details
• Founded in 1996
• Orange is distributed free under the GPL.
• Maintained and developed at the Bioinformatics Laboratory of the Faculty of Computer and Information Science, University of Ljubljana, Slovenia
Python is a widely used general-purpose, high-level programming language.
The GNU General Public License (GPL) is one of the most widely used free software licenses.
Features
• Visual Programming
• Visualization
• Interaction and Data Analytics
• Large Toolbox
• Scripting Interface
• Extendable
• Documentation
• Open Source
• Platform Independence
Success Stories
• AstraZeneca, a pharmaceutical giant, uses Orange in drug development and sponsors the development of several related parts of Orange.
• At the Jožef Stefan Institute, the visual programming interface has been upgraded in Orange4WS to support service-oriented architectures.
Screenshot
R
Product Details
• Latest R-language engine for statistical computing
• Open source, R-Enterprise, R-Cloud (paid version)
• Data visualization and analysis up to 16 TB
• Extended capabilities with reproducible R toolkits
• Windows, Mac OS and variants of Linux
Company Details
• Created in 1993 in New Zealand
• Robert Gentleman and Ross Ihaka pioneered the development of the R language.
• R is licensed under the GNU General Public License.
• Many large multinational companies use R.
Useful Functions
• Graphics Visualization
• Spatial Data Analysis
• Clustering
• Text Mining
• Social Network Analysis and Graph mining
• Statistics
• Data Manipulation
Success Stories
• Bank of America
• Bing
• Facebook
• Ford
• Google
Screenshot
WEKA
Product Details
• Open source
• A collection of machine learning algorithms
• Data visualization and analysis
• Java-based platform
• Used by many researchers and practitioners
Company Details
• Founded in 1997
• Developed at the University of Waikato
The GNU General Public License is one of the most widely used free software licenses.
Features
• Released under the General Public License
• GUI for interacting with data files
• Explorer is the main user interface of WEKA
• Basic tasks including data pre-processing, classification, regression, clustering, association rules and visualization
• Loads data files in multiple formats
• One exceptional feature of WEKA is the database connection using JDBC with any RDBMS package
Success Stories
• The Weka mailing list has over 1100 subscribers in 50 countries, including subscribers from many major companies such as Rechtsportal
Screenshot
RapidMiner
Product Details
• Open source
• Data visualization and analysis
• Machine learning
• Data mining and text mining
• Business intelligence
• Runs on the Java runtime
• Available on all major operating systems and platforms
Company Details
• Started as YALE in 2001 by Ralf Klinkenberg, Ingo Mierswa, and Simon Fischer
• From 2006 it was developed by Rapid-I, the company founded by Ralf Klinkenberg and Ingo Mierswa, and it was renamed RapidMiner
• Licensed under the AGPL
Features
• A visual, code-free environment, so no programming is needed
• Design of analysis processes
• Predictive analytics (with pre-made templates)
• Data loading
• Data transformation
• Data modelling
• Data visualization (with many visualization types)
• Works with different types and sizes of data sources
• Platform independence
• Acts as a powerful scripting language engine along with a graphical user interface
• Modular operator concept
Success Stories
• Cisco
• PayPal
• eBay
• Miele
• Volkswagen
Screenshot
COMPARISON OF ALL TOOLS

| | WEKA | RapidMiner | R | Orange |
|---|---|---|---|---|
| Formats supported | Only 4 file formats are supported | Supports more file formats (approx. 22) | Supports more file formats | Supports more file formats |
| User interface | Easy user interface | Difficult user interface | Simple on Unix; difficult on Windows and Mac | Easy |
| Connectivity | Worse connectivity with Excel and non-Java databases | Easily connected with Excel | Easy connectivity with Excel and other databases | Better than WEKA |
Orange has elegant and concise scripting and can also be run in an ETL
GUI mode.
R has elegant and concise scripting integrated with a vast statistical
library.
RapidMiner has a lot of functionality, is polished and has good
connectivity.
WEKA is the easiest GUI to learn and use.


Editor's Notes

  1. WEKA contains a GUI for interacting with data files and producing visual results.
  2. Explorer has several panels providing access to the main components of the workbench:
     • The Preprocess panel imports data from a database, a CSV file, etc., and preprocesses it using a filtering algorithm. Such filters can transform the data and delete instances and attributes according to specific criteria.
     • The Classify panel applies classification and regression algorithms to the dataset, estimates the accuracy of the resulting predictive model, and visualises erroneous predictions, ROC curves or the model itself.
     • The Associate panel provides access to association rule learning, which identifies interrelationships between attributes in the data.
     • The Cluster panel provides access to clustering techniques, including the simple k-means algorithm and many others.
     • The Select attributes panel provides access to algorithms that identify the most predictive attributes in a dataset.
     • The Visualize panel shows a scatter plot matrix in which individual scatter plots can be selected, enlarged and analysed using various selection operators.
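
To give a feel for what the simple k-means technique mentioned for the Cluster panel does, here is a minimal plain-Python sketch. It is not WEKA's implementation: the function name `kmeans`, the sample points, and the choice to start from the first `k` points as centroids are all assumptions made for illustration.

```python
def kmeans(points, k, iterations=100):
    """Simple k-means on a list of equal-length numeric tuples.

    Illustrative only: the initial centroids are just the first k points
    (an assumption for determinism, not how any particular tool does it).
    """
    centroids = list(points[:k])
    assignments = None
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        # (squared Euclidean distance).
        new_assignments = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        if new_assignments == assignments:
            break  # no point changed cluster, so the result is stable
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, assignments


# Hypothetical sample data: two visually separate groups of 2-D points.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
centroids, labels = kmeans(points, k=2)
print(labels)
print(centroids)
```

The loop alternates between assigning points to the nearest centroid and recomputing each centroid as the mean of its members, stopping once the assignments no longer change. Tools such as WEKA wrap this kind of procedure behind the GUI, so the user only chooses the number of clusters and the attributes to use.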