Attribute Reduction: An Implementation of Heuristic Algorithm using Apache Spark
https://www.irjet.net/archives/V4/i12/IRJET-V4I12167.pdf
International Research Journal of Engineering and Technology (IRJET)
e-ISSN: 2395-0056 | p-ISSN: 2395-0072
Volume: 04 Issue: 12 | Dec-2017 | www.irjet.net
© 2017, IRJET | Impact Factor value: 6.171 | ISO 9001:2008 Certified Journal | Page 877

Attribute Reduction: An Implementation of Heuristic Algorithm using Apache Spark

Pratik Ghodke1, Vishesh Shah2, Sani Asnain Mulla3, Aman Pawar4, Prof. Bhushan Pawar5
1,2,3,4 Students, 5 Assistant Professor, Department of Computer Engineering, Savitribai Phule Pune University, India

Abstract - Most weather events occur in the troposphere, the lowest layer of the atmosphere. Weather describes the state of the atmosphere: how hot or cold it is, minimum and maximum temperature, clear or cloudy skies, and atmospheric pressure. These measurements change from moment to moment with atmospheric conditions. Each measurement can be treated as an attribute of the weather data and is important, but many attributes within this data are redundant and unnecessary. Weather forecast data is complex and consists of structured, unstructured and semi-structured data. However, many attribute reduction algorithms are time-consuming. To solve this problem, we use rough set methodology to build a weather forecast knowledge representation system. Taking advantage of in-memory computing, we propose an attribute reduction algorithm for weather forecast data using Spark. The algorithm uses a heuristic formula that measures the importance of each attribute in order to reduce the search space, together with a procedure for simplifying the weather forecast decision table, which improves computational efficiency.

Key Words: Big Data, Spark, Resilient Distributed Datasets, Rough Set Theory

1. INTRODUCTION
Weather forecast data plays a very important role in the development of a national economy that is based on natural resources.
As weather data is collected from various sources, some attributes may be redundant and may not help predict the climate of a particular area at a given time. Such attributes can increase the running cost and decrease the performance of data mining algorithms. To make an algorithm more effective, it is necessary to remove unimportant attributes and select only the effective ones. Many attribute reduction algorithms have been introduced to solve this problem, but they are time-consuming. Rough set theory is used for extracting the vital information. The features and advantages of Spark are useful for building attribute reduction algorithms that compress the data by removing redundant attributes, which helps decision making and reduces the processing time of the algorithm. The heuristic function measures the importance of each attribute to reduce the search space of the algorithm.

2. THEORETICAL BACKGROUND
[1] Pawlak proposed Rough Set Theory in 1982, a powerful mathematical technique for handling vagueness and ambiguity in data. It demonstrates that the rough set technique can be used to combine and examine approximations in a distributed context.
[2] M. Zaharia proposed Resilient Distributed Datasets, which describe a programming interface that efficiently provides fault tolerance.
[3] M. R. Anderson and M. Cafarella proposed a data-centric system called Zombie, which speeds up feature engineering through intelligent input selection that optimizes the inner loop of the process.
[4] J. Zhang et al. proposed three parallel matrix-based techniques for operating on large-scale, imperfect data. They are built on MapReduce and run on Twister, a lightweight runtime system.
[5] J. Zhang et al. proposed a parallel approach for large-scale attribute reduction based on Hadoop MapReduce. Their analysis shows that these parallel modules are powerful and efficient for big data. Classification accuracy is improved and processing is faster.

2.1 SPARK
Spark is a fast cluster computing technology. It extends the MapReduce model to support interactive queries and stream processing. In-memory cluster computing is the main feature of Spark and speeds up the processing of an application. The main components of Spark are GraphX, MLlib, Spark Streaming and Spark SQL. The fundamental data structure of Spark is the Resilient Distributed Dataset (RDD). An RDD is a fault-tolerant collection of elements that can be operated on in parallel. It is immutable. Each dataset is divided into partitions that are computed on different cluster nodes. RDDs can hold Python, Java or Scala objects, including user-defined classes.
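The idea that a dataset is divided into partitions, processed independently and then merged can be illustrated without a cluster. The following is a minimal sketch in plain Python, not Spark code: the function names and the sample observations are hypothetical, and the list comprehension stands in for work that Spark would schedule on separate nodes (where a comparable result is typically obtained with RDD operations such as reduceByKey).

```python
from collections import Counter
from functools import reduce

def split_into_partitions(records, n):
    """Deal records round-robin into n partitions (a stand-in for RDD partitioning)."""
    return [records[i::n] for i in range(n)]

def count_partition(partition):
    """Work done independently on one partition: count each distinct value."""
    return Counter(partition)

def merge_counts(a, b):
    """Combine two partial results; the operation is associative and commutative."""
    return a + b

# Hypothetical sample data, not taken from the paper.
observations = ["sunny", "rain", "sunny", "cloudy", "rain", "sunny", "cloudy"]

partitions = split_into_partitions(observations, 3)
partials = [count_partition(p) for p in partitions]  # would run in parallel on a cluster
total = reduce(merge_counts, partials, Counter())

print(total["sunny"], total["rain"], total["cloudy"])  # 3 2 2
```

Because the merge step does not depend on the order in which partial results arrive, the same answer is obtained however the records are split, which is what makes this pattern suitable for distributed execution.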
3. ALGORITHM
Based on the notation of Rough Set Theory, a weather forecast decision table can be represented as W = (U, A, V, f).
U: the non-empty set of all weather forecast records, U = {w1, w2, w3, ..., wn}.
A: the non-empty set of attributes of the forecast data, A = C ∪ D, where the elements of C are conditional attributes and the elements of D are decision attributes for the weather forecast records.
V: the set of values of the attributes in A.
f: a mapping function that returns the value in V associated with a single record of U.
POS: the positive region, used to classify new elements from the previously classified elements of U.

Module 1
Input: Weather forecast decision table.
Output: A consistent weather forecast decision table, W’ = (U’, A’, V’, f’).
Steps:
1. Load the weather forecast decision table using the appropriate function, method or API provided by the language (e.g. the textFile() method in Apache Spark).
2. Classify all elements of U by the conditional attributes of A, so that each equivalence class Ei is a key mapped to a value Vi, the ids of the records in U that belong to it.
3. Find the most frequent decision value in each equivalence class E and add it to the new forecast decision table W’.
4. Stop.

Module 2
Input: W’ = (U’, A’, V’, f’).
Output: The reduced attribute set.
Steps:
1. Initialize a temporary attribute reduction set.
2. Execute Module 3 to calculate the equivalence classes E for each attribute a ∈ A’.
3. Execute Module 4 to compute the significance and POS of each candidate attribute.
4. Select the most significant attribute from A’ and add it to the resultant attribute set.
5. Eliminate from U’ the records associated with the selected attribute a and update U’.
6. Use Module 1 to transform the updated weather forecast decision table.
7. Check whether U’ is empty. If U’ is empty, display the resultant attribute set; otherwise repeat from step 3.
8. Stop.

Module 3
Input: Resultant attribute set.
Output: Equivalence classes E’i.
Steps:
1. Classify all elements of U’ by the conditional attributes of A’, so that each equivalence class E’i is a key mapped to a value V’i, the ids of the records in U’ that belong to it.
2. Stop.

Module 4
Input: Equivalence classes of the candidate attribute.
Output: Significance and POS of the candidate attribute.
Steps:
1. Check the consistency of the candidate attribute. If it is consistent, compute the POS of the attribute; otherwise remove the equivalence class E’.
2. Compute the significance of the candidate attribute from the calculated POS.
3. Display the significance and POS of the candidate attribute.
4. Stop.

4. METHODOLOGY
Data is growing at a tremendous rate in the data age. Complex data consists of structured, unstructured and semi-structured data generated by diverse systems. Each attribute in this dataset is important,
but many attributes within it are redundant and unnecessary. To remove them, an attribute reduction algorithm is run on Spark. In-memory cluster computing is the main trait of Spark and speeds up the processing of an application. A cluster manager is used for better resource usage and processing speed. Each application is assigned its own executors, which stay alive until the whole application has finished and which run tasks in multiple threads. Spark is agnostic to the underlying cluster manager. The driver program must listen for and accept connections from its executors throughout its lifetime, and it must be reachable over the network from the worker nodes.

In the architecture, Module 1 first transforms the decision table into a consistent one using rough-set equivalence classes, because attribute reduction cannot be performed on an inconsistent decision table. Module 2 then applies the heuristic function, which reduces the time spent computing equivalence classes and narrows the search space. Module 3 chooses the best candidate from the equivalence classes, and this candidate is added to the attribute reduct; the reduct is updated every time a candidate is added. Module 4 computes the positive region: it checks whether the decision attribute values are consistent or inconsistent, and if they are inconsistent the equivalence class is deleted. The intermediate result is stored.

5. EXPERIMENTAL ENVIRONMENT

To run the parallel algorithms, we create a cluster of four computers. One computer serves as the master node, and the other three serve as slave nodes. They are connected to each other via Gigabit Ethernet.
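
The logic of Modules 1 to 4 in Section 3 can be tried on a single machine before it is run on such a cluster. The following is a minimal sketch in plain Python, not the authors' Scala/Spark implementation. The function names, the tiny weather table, and the use of positive-region size as the significance measure are assumptions made for illustration, since the paper does not state the significance formula. Step 6 of Module 2 (re-applying Module 1 after each selection) is left out for brevity.

```python
from collections import Counter, defaultdict


def equivalence_classes(rows, attrs):
    """Module 3: group record ids by their values on the given attributes."""
    classes = defaultdict(list)
    for rid, row in rows.items():
        classes[tuple(row[a] for a in attrs)].append(rid)
    return dict(classes)


def make_consistent(rows, cond_attrs, decision):
    """Module 1: keep one record per class, labelled with its most frequent decision."""
    table = {}
    classes = equivalence_classes(rows, cond_attrs)
    for new_id, (key, ids) in enumerate(classes.items()):
        majority = Counter(rows[i][decision] for i in ids).most_common(1)[0][0]
        record = dict(zip(cond_attrs, key))
        record[decision] = majority
        table[new_id] = record
    return table


def positive_region(rows, attrs, decision):
    """Module 4: ids of records whose class carries a single decision value."""
    pos = set()
    for ids in equivalence_classes(rows, attrs).values():
        if len({rows[i][decision] for i in ids}) == 1:
            pos.update(ids)
    return pos


def greedy_reduct(rows, cond_attrs, decision):
    """Module 2: repeatedly add the attribute with the largest positive region
    (assumed significance measure) and drop the records it covers."""
    remaining = dict(rows)
    candidates = list(cond_attrs)
    reduct = []
    while remaining and candidates:
        best = max(
            candidates,
            key=lambda a: len(positive_region(remaining, reduct + [a], decision)),
        )
        reduct.append(best)
        candidates.remove(best)
        covered = positive_region(remaining, reduct, decision)
        remaining = {i: r for i, r in remaining.items() if i not in covered}
    return reduct


# Hypothetical six-record weather table; "play" is the decision attribute.
weather = {
    0: {"outlook": "sunny", "temp": "hot", "windy": "no", "play": "no"},
    1: {"outlook": "sunny", "temp": "hot", "windy": "yes", "play": "no"},
    2: {"outlook": "overcast", "temp": "hot", "windy": "no", "play": "yes"},
    3: {"outlook": "rain", "temp": "mild", "windy": "no", "play": "yes"},
    4: {"outlook": "rain", "temp": "mild", "windy": "yes", "play": "no"},
    5: {"outlook": "overcast", "temp": "cool", "windy": "yes", "play": "yes"},
}

reduct = greedy_reduct(weather, ["outlook", "temp", "windy"], "play")
print(reduct)  # ['outlook', 'windy']
```

On this table the attribute temp is never selected, because outlook and windy together already determine the decision for every record. In a Spark version, the dictionary grouping in equivalence_classes would typically be replaced by a key-value grouping over an RDD.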
Detailed information regarding the configuration of software and hardware is given in Table 1.

Table 1: Environment of Experiments
Resource | Description
Hardware | 2 GHz processor, network cards, 8 GB RAM
Software | Ubuntu 14.04, Scala, Apache Spark

6. PROPOSED OUTCOMES

The outcome of the project will be stored in an equivalence class that the user can access. The output represents the reduction in the attributes and in the data processing cost, the latter owing to Spark's in-memory cluster computing. The state of memory is stored as an object that is shareable. The algorithm is capable of better load sharing and handles fault tolerance. Replicated copies of the RDDs are available on disk at each node.

REFERENCES

[1] Z. Pawlak and A. Skowron, "Rough sets: Some extensions," Information Sciences, vol. 177, no. 1, pp. 28–40, 2007.
[2] M. Zaharia, M. Chowdhury, T. Das, A. Dave, J. Ma, M. McCauley, M. J. Franklin, S. Shenker, and I. Stoica, "Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing," in USENIX Conference on Networked Systems Design and Implementation. USENIX Association, 2012, pp. 2–2.
[3] M. R. Anderson and M. Cafarella, "Input selection for fast feature engineering," in 2016 IEEE 32nd International Conference on Data Engineering (ICDE). IEEE, 2016, pp. 577–588.
[4] J. Zhang, J.-S. Wong, Y. Pan, and T. Li, "A parallel matrix-based method for computing approximations in incomplete information systems," IEEE Transactions on Knowledge and Data Engineering, vol. 27, no. 2, pp. 326–339, 2015.
[5] J. Zhang, T. Li, and Y. Pan, "PLAR: Parallel large-scale attribute reduction on cloud systems," in International Conference on Parallel and Distributed