SlideShare a Scribd company logo
1 of 14
Introduction to XLMiner™:  PARTITION DATA XLMiner and Microsoft Office are registered trademarks of the respective owners.
Introduction to Partition Data Generally the data sets used in mining are enormous. Hence in order to mine data easily ,one method is to divide/partition data. Partitioning data means dividing the data set into multiple partitions that are mutually exclusive i.e. they do not overlap or the partitions have no data records are common. Partitioning data generally results in 3 sets of data: Training Data set :- This partition is used to create/build the mining model. Validation Data set :- : It is used to check whether the model developed using the training set is accurate or not. The validation set consists of data whose result (the value of the variable to be determined) is already known so that results obtained after applying the model and the actual results can be matched. Test data set :- It is used to determine how the model would perform when it encounters real world data.  http://dataminingtools.net
Types of Partitions XLMiner allows us to create 2 kinds of partitions: Standard Partition: Creates 3 partitions based on the partition ratios provided. Data records are randomly elected and every record  has an equal chance of lying in any of the partition. ,[object Object]
Specify percentages :Unlike automatic, if selected ,the user can specify the ratio of the partitions created in terms of percentages.
Equal partitions: Selecting this option sets a partitioning ratio of 33.3(training): 33.3(validation): 33.3(test) .Partition with oversampling: This method of partitioning is used when the percentage of successes in the output variable is very low in the dataset but we want to train the data with a particular percentage of successes. http://dataminingtools.net
Data Set used for Partition http://dataminingtools.net
Standard Partition (Automatic)-Step 1 http://dataminingtools.net
Standard Partition (Automatic)-Output 	Testing Set			Validation Set http://dataminingtools.net
Standard Partition (Specify)-Step 1 Selecting “Specify percentages” allows us to set the partitioning ratios as per our need. Here we have set a ratio of 50(testing):30(validation):20(test) http://dataminingtools.net
Standard Partition (Equal)-Step 1 Selecting “Equal” sets the partitioning ratio at 33.3% for each partition creating 3 equal sized partitions. http://dataminingtools.net
Oversampled Partition – Data Set In order to oversample a data set, it must contain at least 1 data item that accepts only 2 distinct values, not more and only then can it be used as the success class(the data item which is oversampled) http://dataminingtools.net
Oversampled Partition – Step 1 http://dataminingtools.net
Oversampled Partition – Output The records in the training data set http://dataminingtools.net
Oversampled Partition – Output Rows in Validation set = 27,  		Rows in testing set = 30% of 27 = 12. http://dataminingtools.net

More Related Content

Viewers also liked

XL-Miner: Classification
XL-Miner: ClassificationXL-Miner: Classification
XL-Miner: Classificationxlminer content
 
XL-MINER:Introduction To Xl Miner
XL-MINER:Introduction To Xl MinerXL-MINER:Introduction To Xl Miner
XL-MINER:Introduction To Xl Minerxlminer content
 
Data Mining: Mining ,associations, and correlations
Data Mining: Mining ,associations, and correlationsData Mining: Mining ,associations, and correlations
Data Mining: Mining ,associations, and correlationsDataminingTools Inc
 
Data Mining: Mining stream time series and sequence data
Data Mining: Mining stream time series and sequence dataData Mining: Mining stream time series and sequence data
Data Mining: Mining stream time series and sequence dataDataminingTools Inc
 
Data Mining: Graph mining and social network analysis
Data Mining: Graph mining and social network analysisData Mining: Graph mining and social network analysis
Data Mining: Graph mining and social network analysisDataminingTools Inc
 
Data warehouse and olap technology
Data warehouse and olap technologyData warehouse and olap technology
Data warehouse and olap technologyDataminingTools Inc
 

Viewers also liked (20)

XL-MINER:Data Utilities
XL-MINER:Data UtilitiesXL-MINER:Data Utilities
XL-MINER:Data Utilities
 
XL-Miner: Classification
XL-Miner: ClassificationXL-Miner: Classification
XL-Miner: Classification
 
XL-Miner: Time Series
XL-Miner: Time SeriesXL-Miner: Time Series
XL-Miner: Time Series
 
XL MINER: Associations
XL MINER: AssociationsXL MINER: Associations
XL MINER: Associations
 
XL-MINER:Introduction To Xl Miner
XL-MINER:Introduction To Xl MinerXL-MINER:Introduction To Xl Miner
XL-MINER:Introduction To Xl Miner
 
Areas of machine leanring
Areas of machine leanringAreas of machine leanring
Areas of machine leanring
 
XL-MINER:Prediction
XL-MINER:PredictionXL-MINER:Prediction
XL-MINER:Prediction
 
Prueba de corridas arriba y abajo de la media
Prueba de corridas arriba y abajo de la mediaPrueba de corridas arriba y abajo de la media
Prueba de corridas arriba y abajo de la media
 
Data Mining: Mining ,associations, and correlations
Data Mining: Mining ,associations, and correlationsData Mining: Mining ,associations, and correlations
Data Mining: Mining ,associations, and correlations
 
XL-MINER: Data Exploration
XL-MINER: Data ExplorationXL-MINER: Data Exploration
XL-MINER: Data Exploration
 
Introduction To XL-Miner
Introduction To XL-MinerIntroduction To XL-Miner
Introduction To XL-Miner
 
XL-MINER: Data Utilities
XL-MINER: Data UtilitiesXL-MINER: Data Utilities
XL-MINER: Data Utilities
 
AI: AI & Searching
AI: AI & SearchingAI: AI & Searching
AI: AI & Searching
 
Data Mining: Mining stream time series and sequence data
Data Mining: Mining stream time series and sequence dataData Mining: Mining stream time series and sequence data
Data Mining: Mining stream time series and sequence data
 
Data Mining: Graph mining and social network analysis
Data Mining: Graph mining and social network analysisData Mining: Graph mining and social network analysis
Data Mining: Graph mining and social network analysis
 
AI: AI & Problem Solving
AI: AI & Problem SolvingAI: AI & Problem Solving
AI: AI & Problem Solving
 
Data Mining: Data processing
Data Mining: Data processingData Mining: Data processing
Data Mining: Data processing
 
Ria
RiaRia
Ria
 
Radio immuno assay
Radio immuno assayRadio immuno assay
Radio immuno assay
 
Data warehouse and olap technology
Data warehouse and olap technologyData warehouse and olap technology
Data warehouse and olap technology
 

Similar to XL-MINER:Partition (20)

Machine learning session6(decision trees random forrest)
Machine learning   session6(decision trees random forrest)Machine learning   session6(decision trees random forrest)
Machine learning session6(decision trees random forrest)
 
prova4
prova4prova4
prova4
 
provalast
provalastprovalast
provalast
 
test3
test3test3
test3
 
test2
test2test2
test2
 
provoora
provooraprovoora
provoora
 
remoto2
remoto2remoto2
remoto2
 
provacompleta2
provacompleta2provacompleta2
provacompleta2
 
finalelocale2
finalelocale2finalelocale2
finalelocale2
 
domenica2
domenica2domenica2
domenica2
 
provarealw4
provarealw4provarealw4
provarealw4
 
test2
test2test2
test2
 
prova3
prova3prova3
prova3
 
stasera1
stasera1stasera1
stasera1
 
provarealw2
provarealw2provarealw2
provarealw2
 
prova5
prova5prova5
prova5
 
provarealw3
provarealw3provarealw3
provarealw3
 
finalelocale
finalelocalefinalelocale
finalelocale
 
testsfw3
testsfw3testsfw3
testsfw3
 
 

Recently uploaded

Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?XfilesPro
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetHyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetEnjoy Anytime
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 

Recently uploaded (20)

Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetHyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 

XL-MINER:Partition

  • 1. Introduction to XLMiner™: PARTITION DATA XLMiner and Microsoft Office are registered trademarks of the respective owners.
  • 2. Introduction to Partition Data Generally the data sets used in mining are enormous. Hence in order to mine data easily ,one method is to divide/partition data. Partitioning data means dividing the data set into multiple partitions that are mutually exclusive i.e. they do not overlap or the partitions have no data records are common. Partitioning data generally results in 3 sets of data: Training Data set :- This partition is used to create/build the mining model. Validation Data set :- : It is used to check whether the model developed using the training set is accurate or not. The validation set consists of data whose result (the value of the variable to be determined) is already known so that results obtained after applying the model and the actual results can be matched. Test data set :- It is used to determine how the model would perform when it encounters real world data. http://dataminingtools.net
  • 3.
  • 4. Specify percentages :Unlike automatic, if selected ,the user can specify the ratio of the partitions created in terms of percentages.
  • 5. Equal partitions: Selecting this option sets a partitioning ratio of 33.3(training): 33.3(validation): 33.3(test) .Partition with oversampling: This method of partitioning is used when the percentage of successes in the output variable is very low in the dataset but we want to train the data with a particular percentage of successes. http://dataminingtools.net
  • 6. Data Set used for Partition http://dataminingtools.net
  • 7. Standard Partition (Automatic)-Step 1 http://dataminingtools.net
  • 8. Standard Partition (Automatic)-Output Testing Set Validation Set http://dataminingtools.net
  • 9. Standard Partition (Specify)-Step 1 Selecting “Specify percentages” allows us to set the partitioning ratios as per our need. Here we have set a ratio of 50(testing):30(validation):20(test) http://dataminingtools.net
  • 10. Standard Partition (Equal)-Step 1 Selecting “Equal” sets the partitioning ratio at 33.3% for each partition creating 3 equal sized partitions. http://dataminingtools.net
  • 11. Oversampled Partition – Data Set In order to oversample a data set, it must contain at least 1 data item that accepts only 2 distinct values, not more and only then can it be used as the success class(the data item which is oversampled) http://dataminingtools.net
  • 12. Oversampled Partition – Step 1 http://dataminingtools.net
  • 13. Oversampled Partition – Output The records in the training data set http://dataminingtools.net
  • 14. Oversampled Partition – Output Rows in Validation set = 27, Rows in testing set = 30% of 27 = 12. http://dataminingtools.net
  • 15. Thank you For more visit: http://dataminingtools.net http://dataminingtools.net
  • 16. Visit more self help tutorials Pick a tutorial of your choice and browse through it at your own pace. The tutorials section is free, self-guiding and will not involve any additional support. Visit us at www.dataminingtools.net