RAPIDMINER: INTRODUCTION TO DATAMINING
AGENDA
  • Introduction to RapidMiner
  • Use of RapidMiner for Data Mining
  • Download and Installation Steps
  • Memory Usage, Plug-ins & Settings
  • Supported File Formats
Different levels of analysis that are available:
  • Artificial neural networks: Non-linear predictive models that resemble biological neural networks in structure.
  • Genetic algorithms: Optimization techniques that use processes such as genetic combination, mutation, and natural selection in a design based on the concepts of natural evolution.
  • Decision trees: Provide a set of rules that you can apply to a new dataset to predict the outcome. Examples: Classification and Regression Trees (CART) and Chi Square Automatic Interaction Detection (CHAID). CART and CHAID are decision tree techniques used for classification of a dataset.
  • Rule induction: The extraction of useful if-then rules from data based on statistical significance.
  • Nearest neighbor: Classifies records based on the k most similar records.
  • Data visualization: Visual interpretation of complex relationships in multidimensional data.
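The nearest-neighbor idea above can be sketched in a few lines of Python. This is only an illustration of the technique, not RapidMiner's implementation: the function name `knn_classify`, the choice of Euclidean distance, and the toy records are all assumptions made for this example.

```python
import math
from collections import Counter

def knn_classify(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training records.

    train: list of (features, label) pairs; query: a feature tuple.
    """
    # Sort the training records by Euclidean distance to the query.
    by_distance = sorted(train, key=lambda rec: math.dist(rec[0], query))
    # Count the labels of the k closest records and return the most common one.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Toy records, invented for illustration only.
train = [
    ((1.0, 1.0), "low"),
    ((1.2, 0.8), "low"),
    ((0.9, 1.1), "low"),
    ((5.0, 5.0), "high"),
    ((5.2, 4.9), "high"),
]

print(knn_classify(train, (1.1, 1.0)))
```

A small k makes the vote sensitive to individual records, while a larger k smooths it at the cost of blurring class boundaries.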
Applications
Can be divided into four major kinds:
  • Classification
  • Numerical prediction
  • Association
  • Clustering
Some examples:
  • Automatic abstraction
  • Financial forecasting
  • Targeted marketing
  • Medical diagnosis
  • Credit card fraud detection
  • Weather forecasting, etc.
Introduction to RapidMiner
RapidMiner (formerly YALE*) is an environment for machine learning and data mining experiments. RapidMiner is used for both research and real-world data mining tasks.
Software versions:
  • Community Edition
  • Enterprise Edition (Community Edition + More Features + Services + Guarantees)
*YALE: Yet Another Learning Environment
Some properties of RapidMiner:
  • Written in Java
  • Knowledge discovery processes are modelled as operator trees
  • Internal XML representation ensures a standardized interchange format for data mining experiments
  • Scripting language allows for automating large-scale experiments
  • Multi-layered data view concept ensures efficient and transparent data handling
  • GUI, command-line mode (batch mode), and Java API for using RapidMiner from other programs
  • Several plugins already exist
  • A large set of high-dimensional visualization schemes for data and models offered by its plotting facility
Applications: text mining, multimedia mining, feature engineering, data stream mining and tracking drifting concepts, development of ensemble methods, and distributed data mining.
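To make the "operator tree with an XML representation" idea concrete, here is a rough Python sketch that builds a small nested tree and serializes it to XML. The slides do not show the actual file format, so the element names (`operator`, `parameter`), the attribute names, and every operator name and class below are assumptions chosen for illustration, not RapidMiner's documented schema.

```python
import xml.etree.ElementTree as ET

def operator(name, op_class, params=None, children=()):
    """Build one operator node; element and attribute names are assumed."""
    node = ET.Element("operator", {"name": name, "class": op_class})
    # Each parameter becomes a child element holding a key/value pair.
    for key, value in (params or {}).items():
        ET.SubElement(node, "parameter", {"key": key, "value": str(value)})
    # Nested operators are appended as children, which forms the tree.
    node.extend(children)
    return node

# Hypothetical operator names and classes, for illustration only.
process = operator("Root", "Process", children=[
    operator("Input", "ExampleSource", {"attributes": "data.aml"}),
    operator("Validation", "CrossValidation", {"folds": 10}, children=[
        operator("Learner", "DecisionTree"),
        operator("Evaluation", "PerformanceEvaluator"),
    ]),
])

process_xml = ET.tostring(process, encoding="unicode")
print(process_xml)
```

Because the whole process is a single XML document, it can be stored, exchanged, and re-run without the GUI, which is the interchange benefit the list above refers to.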
Use of RapidMiner for Data Mining
Using RapidMiner
  • The GUI can be used to design the XML description of the operator tree
  • Breakpoints can be used to check intermediate results
Use from a separate program
  • The command-line version and the Java API can be used to invoke RapidMiner in your programs without using the GUI
Download and Installation Steps
Download
  • The latest version of RapidMiner can be downloaded from http://rapid-i.com/content/blogsection/7/82/lang,en/ by selecting the appropriate version (Windows x86, x64, etc.) and RapidMiner edition
Installation (Windows executable)
  • Download the Windows executable (.exe) file
  • Double-click the rapidminer-xxx-instal.exe file to run it
  • Follow the instructions

Supported File Formats
RapidMiner can read data files and can read and write models, parameter sets, and attribute sets. The most important are the files holding the examples (instances).
Data files & attribute description files
  • ARFFEXAMPLESOURCE: reads the .arff format
  • DATABASEEXAMPLESOURCE: reads from databases
  • SPARSEFORMATEXAMPLESOURCE
  • DENSEFORMATEXAMPLESOURCE
An attribute description file (.aml) is used to retrieve metadata about the instances. XML attributes that can be set:
  • Name: unique name of the attribute
  • Sourcefile: name of the file containing the data (a default is used if not specified)
  • Sourcecol: column within the file (starting from 1)
  • Sourcecol_end: attributes generated for the column range sourcecol to sourcecol_end share the same properties
  • Valuetype: one of nominal, numeric, integer, real, ordered, binominal, polynominal, and file_path
  • Blocktype: one of single_value, value_series, value_series_start, value_series_end, interval, interval_start, interval_end
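As a rough illustration of how the attribute properties listed above could fit together, the Python sketch below generates a small attribute description document. Only the property names and the allowed value and block types come from the slide; their lowercase spelling, the element names (`attributeset`, `attribute`, `label`), the `default_source` attribute, the helper `describe`, and the file and column names are assumptions, so check the result against the RapidMiner documentation before relying on it.

```python
import xml.etree.ElementTree as ET

# Allowed values as listed on the slide.
VALUE_TYPES = {"nominal", "numeric", "integer", "real", "ordered",
               "binominal", "polynominal", "file_path"}
BLOCK_TYPES = {"single_value", "value_series", "value_series_start",
               "value_series_end", "interval", "interval_start",
               "interval_end"}

def describe(tag, name, sourcecol, valuetype,
             blocktype="single_value", sourcefile=None):
    """Build one attribute description element (assumed layout)."""
    if valuetype not in VALUE_TYPES:
        raise ValueError(f"unknown valuetype: {valuetype}")
    if blocktype not in BLOCK_TYPES:
        raise ValueError(f"unknown blocktype: {blocktype}")
    if sourcecol < 1:
        raise ValueError("sourcecol starts from 1")
    attrs = {"name": name, "sourcecol": str(sourcecol),
             "valuetype": valuetype, "blocktype": blocktype}
    # sourcefile is optional; a default source is used when it is omitted.
    if sourcefile is not None:
        attrs["sourcefile"] = sourcefile
    return ET.Element(tag, attrs)

# Hypothetical data file and columns, for illustration only.
attribute_set = ET.Element("attributeset", {"default_source": "weather.dat"})
attribute_set.append(describe("attribute", "outlook", 1, "nominal"))
attribute_set.append(describe("attribute", "temperature", 2, "real"))
attribute_set.append(describe("label", "play", 3, "binominal"))

aml_text = ET.tostring(attribute_set, encoding="unicode")
print(aml_text)
```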
Model files (.mod files)
Contain the models generated by previous runs.
  • MODELWRITER: writes model files
  • MODELLOADER: reads model files
  • MODELAPPLIER: applies model files
Attribute construction files (.att files)
  • ATTRIBUTECONSTRUCTIONWRITER: writes an attribute set
  • ATTRIBUTECONSTRUCTIONLOADER: reads an attribute set
Parameter set files (.par files)
  • GRIDPARAMETEROPTIMIZATION: generates a set of optimal parameters for a particular task
  • PARAMETERSETLOADER: uses the parameter files
Attribute weight files (.wgt files)
Attribute selection is treated as attribute weighting, which allows for more flexibility.
  • ATTRIBUTEWEIGHTSWRITER: writes attribute weights to a file
  • ATTRIBUTEWEIGHTSLOADER: reads the attribute weights
  • ATTRIBUTEWEIGHTSAPPLIER: applies the weights to the example sets
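The idea behind grid parameter optimization, trying every combination of candidate parameter values and keeping the best-scoring one, can be sketched as follows. This is a generic illustration rather than the operator's actual implementation: `grid_search`, `toy_score`, and the parameter names `C` and `gamma` are hypothetical, and in practice the score would come from a validation run instead of a formula.

```python
from itertools import product

def grid_search(evaluate, grid):
    """Return (best_params, best_score) over all combinations in `grid`.

    evaluate: function mapping a parameter dict to a score (higher is better).
    grid: dict mapping each parameter name to a list of candidate values.
    """
    names = sorted(grid)
    best_params, best_score = None, float("-inf")
    # product() enumerates every combination of candidate values.
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

def toy_score(params):
    # Stand-in for a real performance estimate; peaks at C=10, gamma=0.1.
    return -((params["C"] - 10) ** 2) - (params["gamma"] - 0.1) ** 2

grid = {"C": [1, 10, 100], "gamma": [0.01, 0.1, 1.0]}
best, score = grid_search(toy_score, grid)
print(best, score)
```

The winning parameter set is what would be saved to a parameter set file and later reloaded, so the search does not have to be repeated.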
More questions…
Reach us at support@dataminingtools.net
VISIT: WWW.DATAMININGTOOLS.NET