International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 05 Issue: 08 | Aug 2018 www.irjet.net p-ISSN: 2395-0072
© 2018, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 1027
A STUDY ON DATA MINING IN SOFTWARE
Sreenivasulu Tholuchuri
MCA, Hyderabad, India
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - Data mining for software engineering is a process
of discovering knowledge from software engineering data held in
databases. In simple words, it is a series of actions to extract
knowledge from useful patterns and relationships in huge volumes
of data and to use that knowledge to improve the software
engineering process. It uses tools from artificial intelligence
and statistics together with database management to analyze large
digital collections, known as data sets. Data mining is widely used in
business (insurance, banking, and retail), scientific research
(astronomy, medicine), and government defense departments.
Key Words: Data Mining, Software Engineering, Error
Detection, Clustering, Association
1. INTRODUCTION
Data mining, also called knowledge discovery in databases
(KDD), aims at making productive use of mined knowledge in
an operable way.
1.1 Early Ages
During the 1980s, data storage capacities in computers
increased greatly, and many big companies began to
store transactional data, which resulted in collections of huge
volumes of records, often called data warehouses. These data
warehouses were too large to be analyzed with traditional
statistical approaches.
With the aim of knowledge discovery, several computer
science workshops and conferences were held to adapt
techniques from the field of Artificial Intelligence (AI),
such as neural networks, genetic algorithms, and machine
learning. This led to the First International Conference on
Knowledge Discovery and Data Mining (FICKDD), held in
Montreal, and to the launch in 1997 of the journal Data Mining
and Knowledge Discovery. This was also the period when
many early data-mining companies were formed and
products were introduced.
Nowadays, vast amounts of data are collected daily.
Making sense of such data is an important need.
“We are living in the information age” is a popular saying;
however, we are actually living in the data age. Terabytes or
petabytes of data pour into our computer networks, the
World Wide Web (WWW), and various data storage devices
every day from business, society, science and engineering,
medicine, and almost every other aspect of daily life. This
explosive growth of available data volume is a result of the
computerization of our society and the fast development of
powerful data collection and storage tools. Businesses
worldwide generate gigantic data sets, including sales
transactions, stock trading records, product descriptions,
sales promotions, company profiles and performance, and
customer feedback.
For example, large stores, such as Wal-Mart, handle
hundreds of millions of transactions per week at thousands of
branches around the world. Scientific and engineering
practices continuously generate data on the order of thousands
of terabytes from remote sensing, process
measuring, scientific experiments, system performance,
engineering observations, and environment surveillance. The
medical and health industry generates tremendous amounts
of data from medical records, patient monitoring, and
medical imaging. Billions of Web searches supported by
search engines process thousands of terabytes of data daily.
Communities and social media have become increasingly
important data sources, producing digital pictures and
videos, blogs, Web communities, and various kinds of social
networks. The list of sources that generate huge amounts of
data is endless. This explosively growing, universally
available, and gigantic body of data makes our time truly the
data age. Powerful and versatile tools are badly needed to
automatically uncover valuable information from the
tremendous amounts of data and to transform such data into
organized knowledge. This need has led to the birth of
data mining. The field is young, dynamic, and promising. Data
mining has made and will continue to make great strides in our
journey from the data age toward the coming information
age.
2. DATA MINING
Data mining is more appropriately named “knowledge mining
from data,” which is somewhat long. However, the
shorter term, knowledge mining, may not reflect the emphasis
on mining from large amounts of data. Nevertheless, mining is
a vivid term characterizing the process that finds a small
set of precious nuggets from a great deal of raw material.
In addition, other terms have a similar meaning to data
mining, for example, knowledge mining from data,
knowledge extraction, data/pattern analysis, data
archaeology, and data dredging.
Many people treat data mining as a synonym for another
popularly used term, knowledge discovery from data, or
KDD, while others view data mining as merely an essential
step in the process of knowledge discovery.
The knowledge discovery process is shown in Fig -1
as an iterative sequence of the following steps:
1) Data cleaning (to remove inconsistent data and
noise)
2) Data integration (phase where multiple data
sources may be combined)
3) Data selection (phase where data relevant to the
analysis task are retrieved from the database)
4) Data transformation (phase where data are
transformed and consolidated into forms suitable for
mining by performing summary or aggregation
operations)
5) Data mining (an essential process where intelligent
methods are applied to extract data patterns)
6) Pattern evaluation (phase to identify the truly
interesting patterns representing knowledge, based
on interestingness measures)
7) Knowledge presentation (phase where visualization
and knowledge representation techniques are used
to show mined knowledge to users)
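The seven steps above can be sketched as a chain of small functions. This is only an illustrative toy, not a method given in this paper: the sample records, the field names, and the choice of "total quantity per item" as both the mined pattern and the interestingness measure are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical raw records from two sources; None marks a missing value.
store_a = [{"customer": "c1", "item": "bread", "qty": 2},
           {"customer": "c2", "item": "milk", "qty": None}]
store_b = [{"customer": "c1", "item": "milk", "qty": 1},
           {"customer": "c3", "item": "bread", "qty": 4}]

def clean(records):
    """Step 1: drop records that contain missing values."""
    return [r for r in records if all(v is not None for v in r.values())]

def integrate(*sources):
    """Step 2: combine several data sources into one list."""
    return [r for source in sources for r in source]

def select(records, fields):
    """Step 3: keep only the fields relevant to the analysis task."""
    return [{f: r[f] for f in fields} for r in records]

def transform(records):
    """Step 4: aggregate the quantity sold per item."""
    totals = Counter()
    for r in records:
        totals[r["item"]] += r["qty"]
    return totals

def mine(totals, min_total):
    """Step 5: a trivial 'pattern': items whose total reaches min_total."""
    return {item: t for item, t in totals.items() if t >= min_total}

def evaluate(patterns):
    """Step 6: rank patterns by an assumed interestingness measure (the total)."""
    return sorted(patterns.items(), key=lambda kv: kv[1], reverse=True)

def present(ranked):
    """Step 7: render the ranked patterns as readable strings."""
    return [f"{item}: {total}" for item, total in ranked]

def run_pipeline(min_total):
    records = integrate(clean(store_a), clean(store_b))
    records = select(records, ["item", "qty"])
    return present(evaluate(mine(transform(records), min_total)))

print(run_pipeline(min_total=2))
```

In a real setting each step would be far more involved; the point of the sketch is only that the first four functions prepare the data and the last three act on what the mining step returns.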
Steps 1 to 4 are different forms of data preprocessing, where
data are prepared for mining. The data mining process may
interact with the user or a knowledge base. The interesting
patterns are shown to the user and may be stored as new
knowledge in the knowledge base. The previous view shows
data mining as one step in the knowledge discovery process,
albeit an essential one because it uncovers hidden patterns
for evaluation. However, in industry, in media, and in the
research milieu, the term data mining is generally used to
refer to the entire knowledge discovery process (perhaps
because the term is shorter than knowledge discovery from
data). Therefore, we adopt a broad view of data mining
functionality: Data mining is the process of discovering
interesting patterns and knowledge from huge amounts of
data. The data sources can include databases, data
warehouses, the Web, other information repositories, or data
that flow into the system dynamically.
Fig -1: Data Mining Steps
3. GOALS OF SOFTWARE ENGINEERING
Requirement Analysis: In this phase of the SE task, software
requirements are gathered from the client, analyzed, and
documented. A requirement is a functional or non-functional need
to be implemented in the system. The client’s acceptance is
mandatory before proceeding further.
The document prepared in this phase is called the
Software Requirement Specification (SRS).
System Design: System design is the process of defining the user
interfaces, modules, architecture, and data for a system to
satisfy client requirements. Here the overall product design is
worked out as per the client requirements, and different types of
SDK may be used.
Development/Programming: The source code of the
program is written in one or more programming languages as
per the client requirements. This is called the programming
process in software development. The term coding is reserved for
the actual writing of source code, which is a main part of
software development. A software development organization
requires good programmers to define a standard style of code,
called coding standards. These give a uniform appearance to the
code written by different programmers. The code should be
understandable and reusable, and it should follow good
programming practices. Naming conventions, limits on data types,
and the use of variables and constants are the main coding standards.
Error Detection/Bug Fixes: Error detection, or bug fixing, is an
essential process for effective and proper software project
planning. Data about software bugs is kept in bug
repositories, which contain information related to the bugs. A
bug fix involves both data and code. There are many types of
bugs (programming bugs, design bugs, and data bugs) that create
errors in system implementation and may require fixes, which are
resolved by the development team.
Testing: Software testing is costly, but it is an
important phase in software development. There are
different stages in testing to validate or verify software.
Verification and validation processes are concerned with
checking that the software being developed meets its
specification and delivers the functionality expected by the
people paying for the software.
During testing, errors can mask (hide) other errors. When an
error leads to unexpected outputs, you can never be sure if
later output anomalies are due to a new error or are side
effects of the original error. Because analysis is a static
process, you don’t have to be concerned with interactions
between errors. Consequently, a single analysis session can
discover many errors in a system.
Maintenance: Good software should deliver the required
functionality and performance to the user and should be
maintainable, dependable, and usable. Software should be
written in such a way that it can evolve to meet the
changing needs of customers. This is a critical attribute
because software change is an inevitable requirement of a
changing business environment.
Agile methods, used in the maintenance process
itself, are likely to be effective, whether or not an agile
approach has been used for system development. Incremental
delivery, design for change, and maintaining simplicity all
make sense when software is being modified. In fact, you can
think of an agile development process as a process of
software evolution.
Fig -2: Software Engineering Tasks, Data Mining
Techniques & Data Mining in Software Engineering
4. TECHNIQUES IN DATA MINING
4.1 Association rule
Association rule mining is an example where the use
of constraints and interestingness measures can ensure the
completeness of mining. The problem of mining association
rules can be reduced to that of mining frequent item sets.
The association rule mining approach is applied to the
records in order to discover the patterns that are likely to
cause high-severity defects.
Most association rule mining algorithms employ a
support-confidence framework. Even though minimum
support and confidence thresholds help weed out or exclude
the exploration of a good number of uninteresting rules,
many of the rules generated are still not interesting to the
users. Unfortunately, this is especially true when mining at low
support thresholds or mining for long patterns. This has
been a major obstacle to the successful application of association
rule mining.
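To make the support-confidence framework concrete, the following minimal sketch computes both measures over a handful of transactions and keeps only the single-item rules that pass both thresholds. The transactions, the item names, and the function names are hypothetical, and the brute-force enumeration of item pairs stands in for a real frequent item set algorithm.

```python
from itertools import combinations

# Hypothetical market-basket transactions.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """support(antecedent and consequent) / support(antecedent)."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0
    return support(set(antecedent) | set(consequent), transactions) / base

def rules(transactions, min_support, min_confidence):
    """Brute-force search for rules lhs -> rhs between single items."""
    items = sorted(set().union(*transactions))
    found = []
    for a, b in combinations(items, 2):
        for lhs, rhs in ((a, b), (b, a)):
            s = support({lhs, rhs}, transactions)
            if s < min_support:
                continue  # pruned by the minimum support threshold
            c = confidence({lhs}, {rhs}, transactions)
            if c >= min_confidence:
                found.append((lhs, rhs, s, c))
    return found

for lhs, rhs, s, c in rules(transactions, min_support=0.5, min_confidence=0.9):
    print(f"{lhs} -> {rhs} (support={s:.2f}, confidence={c:.2f})")
```

Lowering `min_support` or `min_confidence` in this toy quickly lets more rules through, which illustrates why rules mined at low thresholds are often uninteresting.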
4.2 Classification:
Classification is the process of finding a model (or
function) that describes and distinguishes data classes or
concepts. The models are derived based on the analysis of a
set of training data (i.e., data objects for which the class
labels are known). The model is used to predict the class
label of objects for which the class label is unknown.
A neural network, when used for classification, is
typically a collection of neuron-like processing units with
weighted connections between the units. There are various
other methods for constructing classification models, such as
Bayesian classification, support vector machines, and
k-nearest-neighbor classification.
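As one small illustration of building a classifier from labeled training data, the sketch below implements k-nearest-neighbor classification with a majority vote. The two features (module size and complexity, on made-up scales), the labels "clean" and "defect-prone", and the training points are all hypothetical and are not taken from this paper.

```python
import math
from collections import Counter

# Hypothetical training set: (size, complexity) of a module -> known class label.
training = [
    ((1.0, 0.5), "clean"),
    ((1.5, 0.8), "clean"),
    ((2.0, 1.0), "clean"),
    ((8.0, 4.0), "defect-prone"),
    ((9.0, 4.5), "defect-prone"),
    ((7.5, 3.8), "defect-prone"),
]

def knn_predict(training, point, k=3):
    """Predict the label of point by majority vote among its k nearest examples."""
    # Sort the labeled examples by Euclidean distance to the query point.
    nearest = sorted(training, key=lambda example: math.dist(example[0], point))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

print(knn_predict(training, (1.2, 0.6)))
print(knn_predict(training, (8.5, 4.2)))
```

Here the "model" is simply the stored training set; other methods named above, such as Bayesian classifiers or support vector machines, instead fit explicit parameters.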
Regression analysis is a statistical
technique that is most often used for numeric
prediction, although other methods exist as well. Regression
also encompasses the identification of distribution trends
based on the available data.
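Numeric prediction by regression can be illustrated with an ordinary least-squares fit of a straight line. The sketch below is a generic textbook formulation, not a procedure from this paper, and the data (module size against defect count) are invented so that the fit is exact.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(model, x):
    """Apply a fitted (slope, intercept) pair to a new x value."""
    slope, intercept = model
    return slope * x + intercept

# Hypothetical observations: module size (KLOC) and number of reported defects.
sizes = [1, 2, 3, 4]
defects = [3, 5, 7, 9]

model = fit_line(sizes, defects)
print(model, predict(model, 5))
```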
The resulting classification should maximally distinguish
each class from the others, presenting an organized picture
of the data set.
4.3 Clustering:
Clustering plays a central role in customer
relationship management, which groups customers based on
their similarities. Using relevance mining techniques, we can
better understand features of each customer group and
develop customized customer reward programs.
Clustering techniques consider data tuples as
objects. They partition the objects into groups, or clusters, so
that objects within a cluster are “similar” to one another and
“dissimilar” to objects in other clusters. Similarity is
commonly defined in terms of how “close” the objects are in
space, based on a distance function. The “quality” of a cluster
may be represented by its diameter, the maximum distance
between any two objects in the cluster.
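The diameter measure described above can be computed directly from its definition. The following sketch assumes Euclidean distance as the distance function and uses two made-up clusters of 2-D points; both choices are illustrative only.

```python
import math
from itertools import combinations

def diameter(cluster):
    """Maximum Euclidean distance between any two objects in the cluster."""
    if len(cluster) < 2:
        return 0.0
    return max(math.dist(p, q) for p, q in combinations(cluster, 2))

# Hypothetical clusters of 2-D points.
tight = [(0, 0), (0, 1), (1, 0)]
loose = [(0, 0), (3, 4)]

print(diameter(tight), diameter(loose))
```

A smaller diameter indicates that the objects in a cluster lie closer together, so under this measure `tight` is the better-quality cluster of the two.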
The set of entities to cluster needs to be
identified before applying clustering to a software system.
The next phase is attribute selection. Most software
clustering methods first transform a fact base into a data
table, where each row describes one entity to be clustered.
Each column contains the value for a specific attribute. After
all preparation steps are complete, the clustering
algorithm can start to execute. Clustering algorithms used in
software engineering include graph-theoretical algorithms,
construction algorithms, optimization algorithms, and
hierarchical algorithms. For high-dimensional data, many of
the existing methods fail due to the curse of dimensionality,
which renders particular distance functions problematic
in high-dimensional spaces. This led to a new era of
clustering algorithms for high-dimensional data that focus
on subspace clustering and correlation clustering, which also
looks for arbitrarily rotated subspace clusters that can be
modeled by giving a correlation of their attributes.
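The preparation steps for software clustering described above (fact base, data table, then a clustering algorithm) might be sketched as follows. Everything specific here is an assumption made for illustration: the file and type names in the fact base, the use of Jaccard distance on binary attribute rows, and single-linkage agglomerative merging as a simple instance of a hierarchical algorithm.

```python
# Hypothetical fact base: (entity, relation, attribute) triples.
facts = [
    ("parser.c", "uses", "token_t"),
    ("parser.c", "uses", "ast_t"),
    ("lexer.c", "uses", "token_t"),
    ("lexer.c", "uses", "buffer_t"),
    ("net.c", "uses", "socket_t"),
    ("http.c", "uses", "socket_t"),
    ("http.c", "uses", "buffer_t"),
]

def to_table(facts):
    """One row per entity, one 0/1 column per attribute."""
    attributes = sorted({obj for _, _, obj in facts})
    entities = sorted({subj for subj, _, _ in facts})
    used = {(subj, obj) for subj, _, obj in facts}
    table = {e: [1 if (e, a) in used else 0 for a in attributes] for e in entities}
    return attributes, table

def jaccard_distance(row_a, row_b):
    """1 - |shared attributes| / |attributes used by either entity|."""
    both = sum(1 for a, b in zip(row_a, row_b) if a and b)
    either = sum(1 for a, b in zip(row_a, row_b) if a or b)
    return 1.0 - both / either if either else 0.0

def single_linkage(table, threshold):
    """Merge the two closest clusters until their distance exceeds threshold."""
    clusters = [[e] for e in sorted(table)]

    def dist(c1, c2):
        return min(jaccard_distance(table[a], table[b]) for a in c1 for b in c2)

    while len(clusters) > 1:
        d, i, j = min((dist(clusters[i], clusters[j]), i, j)
                      for i in range(len(clusters))
                      for j in range(i + 1, len(clusters)))
        if d > threshold:
            break
        clusters[i] = sorted(clusters[i] + clusters[j])
        del clusters[j]
    return clusters

attributes, table = to_table(facts)
print(attributes)
print(single_linkage(table, threshold=0.6))
```

With only four attribute columns the distances are still meaningful; with very many sparse columns most pairs of rows would share nothing, which is the kind of difficulty that motivates subspace-oriented methods.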
Fig -3: Characterization of Data Mining
5. CONCLUSIONS
In this paper, I have tried to provide an analysis of data
mining and its origin, and of why data mining is essential in this
era of computing. I have also reviewed software engineering and
its various phases, how data mining in software engineering is
classified, and the data mining techniques used in the process
for fruitful knowledge discovery.

Multistoried and Multi Bay Steel Building Frame by using Seismic DesignMultistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignIRJET Journal
 
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...IRJET Journal
 

More from IRJET Journal (20)

TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
 
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURESTUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
 
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
 
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsEffect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil Characteristics
 
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
 
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
 
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
 
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
 
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASA REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADAS
 
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
 
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProP.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
 
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
 
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemSurvey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare System
 
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesReview on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridges
 
React based fullstack edtech web application
React based fullstack edtech web applicationReact based fullstack edtech web application
React based fullstack edtech web application
 
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
 
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
 
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
 
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignMultistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
 
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
 

Recently uploaded

Online food ordering system project report.pdf
Online food ordering system project report.pdfOnline food ordering system project report.pdf
Online food ordering system project report.pdfKamal Acharya
 
School management system project Report.pdf
School management system project Report.pdfSchool management system project Report.pdf
School management system project Report.pdfKamal Acharya
 
S1S2 B.Arch MGU - HOA1&2 Module 3 -Temple Architecture of Kerala.pptx
S1S2 B.Arch MGU - HOA1&2 Module 3 -Temple Architecture of Kerala.pptxS1S2 B.Arch MGU - HOA1&2 Module 3 -Temple Architecture of Kerala.pptx
S1S2 B.Arch MGU - HOA1&2 Module 3 -Temple Architecture of Kerala.pptxSCMS School of Architecture
 
Ground Improvement Technique: Earth Reinforcement
Ground Improvement Technique: Earth ReinforcementGround Improvement Technique: Earth Reinforcement
Ground Improvement Technique: Earth ReinforcementDr. Deepak Mudgal
 
Max. shear stress theory-Maximum Shear Stress Theory ​ Maximum Distortional ...
Max. shear stress theory-Maximum Shear Stress Theory ​  Maximum Distortional ...Max. shear stress theory-Maximum Shear Stress Theory ​  Maximum Distortional ...
Max. shear stress theory-Maximum Shear Stress Theory ​ Maximum Distortional ...ronahami
 
Introduction to Robotics in Mechanical Engineering.pptx
Introduction to Robotics in Mechanical Engineering.pptxIntroduction to Robotics in Mechanical Engineering.pptx
Introduction to Robotics in Mechanical Engineering.pptxhublikarsn
 
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptxHOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptxSCMS School of Architecture
 
Online electricity billing project report..pdf
Online electricity billing project report..pdfOnline electricity billing project report..pdf
Online electricity billing project report..pdfKamal Acharya
 
8th International Conference on Soft Computing, Mathematics and Control (SMC ...
8th International Conference on Soft Computing, Mathematics and Control (SMC ...8th International Conference on Soft Computing, Mathematics and Control (SMC ...
8th International Conference on Soft Computing, Mathematics and Control (SMC ...josephjonse
 
Basic Electronics for diploma students as per technical education Kerala Syll...
Basic Electronics for diploma students as per technical education Kerala Syll...Basic Electronics for diploma students as per technical education Kerala Syll...
Basic Electronics for diploma students as per technical education Kerala Syll...ppkakm
 
Hostel management system project report..pdf
Hostel management system project report..pdfHostel management system project report..pdf
Hostel management system project report..pdfKamal Acharya
 
"Lesotho Leaps Forward: A Chronicle of Transformative Developments"
"Lesotho Leaps Forward: A Chronicle of Transformative Developments""Lesotho Leaps Forward: A Chronicle of Transformative Developments"
"Lesotho Leaps Forward: A Chronicle of Transformative Developments"mphochane1998
 
Standard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power PlayStandard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power PlayEpec Engineered Technologies
 
Theory of Time 2024 (Universal Theory for Everything)
Theory of Time 2024 (Universal Theory for Everything)Theory of Time 2024 (Universal Theory for Everything)
Theory of Time 2024 (Universal Theory for Everything)Ramkumar k
 
Memory Interfacing of 8086 with DMA 8257
Memory Interfacing of 8086 with DMA 8257Memory Interfacing of 8086 with DMA 8257
Memory Interfacing of 8086 with DMA 8257subhasishdas79
 
Employee leave management system project.
Employee leave management system project.Employee leave management system project.
Employee leave management system project.Kamal Acharya
 
8086 Microprocessor Architecture: 16-bit microprocessor
8086 Microprocessor Architecture: 16-bit microprocessor8086 Microprocessor Architecture: 16-bit microprocessor
8086 Microprocessor Architecture: 16-bit microprocessorAshwiniTodkar4
 
fitting shop and tools used in fitting shop .ppt
fitting shop and tools used in fitting shop .pptfitting shop and tools used in fitting shop .ppt
fitting shop and tools used in fitting shop .pptAfnanAhmad53
 
COST-EFFETIVE and Energy Efficient BUILDINGS ptx
COST-EFFETIVE  and Energy Efficient BUILDINGS ptxCOST-EFFETIVE  and Energy Efficient BUILDINGS ptx
COST-EFFETIVE and Energy Efficient BUILDINGS ptxJIT KUMAR GUPTA
 
Computer Networks Basics of Network Devices
Computer Networks  Basics of Network DevicesComputer Networks  Basics of Network Devices
Computer Networks Basics of Network DevicesChandrakantDivate1
 

Recently uploaded (20)

Online food ordering system project report.pdf
Online food ordering system project report.pdfOnline food ordering system project report.pdf
Online food ordering system project report.pdf
 
School management system project Report.pdf
School management system project Report.pdfSchool management system project Report.pdf
School management system project Report.pdf
 
S1S2 B.Arch MGU - HOA1&2 Module 3 -Temple Architecture of Kerala.pptx
S1S2 B.Arch MGU - HOA1&2 Module 3 -Temple Architecture of Kerala.pptxS1S2 B.Arch MGU - HOA1&2 Module 3 -Temple Architecture of Kerala.pptx
S1S2 B.Arch MGU - HOA1&2 Module 3 -Temple Architecture of Kerala.pptx
 
Ground Improvement Technique: Earth Reinforcement
Ground Improvement Technique: Earth ReinforcementGround Improvement Technique: Earth Reinforcement
Ground Improvement Technique: Earth Reinforcement
 
Max. shear stress theory-Maximum Shear Stress Theory ​ Maximum Distortional ...
Max. shear stress theory-Maximum Shear Stress Theory ​  Maximum Distortional ...Max. shear stress theory-Maximum Shear Stress Theory ​  Maximum Distortional ...
Max. shear stress theory-Maximum Shear Stress Theory ​ Maximum Distortional ...
 
Introduction to Robotics in Mechanical Engineering.pptx
Introduction to Robotics in Mechanical Engineering.pptxIntroduction to Robotics in Mechanical Engineering.pptx
Introduction to Robotics in Mechanical Engineering.pptx
 
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptxHOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
 
Online electricity billing project report..pdf
Online electricity billing project report..pdfOnline electricity billing project report..pdf
Online electricity billing project report..pdf
 
8th International Conference on Soft Computing, Mathematics and Control (SMC ...
8th International Conference on Soft Computing, Mathematics and Control (SMC ...8th International Conference on Soft Computing, Mathematics and Control (SMC ...
8th International Conference on Soft Computing, Mathematics and Control (SMC ...
 
Basic Electronics for diploma students as per technical education Kerala Syll...
Basic Electronics for diploma students as per technical education Kerala Syll...Basic Electronics for diploma students as per technical education Kerala Syll...
Basic Electronics for diploma students as per technical education Kerala Syll...
 
Hostel management system project report..pdf
Hostel management system project report..pdfHostel management system project report..pdf
Hostel management system project report..pdf
 
"Lesotho Leaps Forward: A Chronicle of Transformative Developments"
"Lesotho Leaps Forward: A Chronicle of Transformative Developments""Lesotho Leaps Forward: A Chronicle of Transformative Developments"
"Lesotho Leaps Forward: A Chronicle of Transformative Developments"
 
Standard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power PlayStandard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power Play
 
Theory of Time 2024 (Universal Theory for Everything)
Theory of Time 2024 (Universal Theory for Everything)Theory of Time 2024 (Universal Theory for Everything)
Theory of Time 2024 (Universal Theory for Everything)
 
Memory Interfacing of 8086 with DMA 8257
Memory Interfacing of 8086 with DMA 8257Memory Interfacing of 8086 with DMA 8257
Memory Interfacing of 8086 with DMA 8257
 
Employee leave management system project.
Employee leave management system project.Employee leave management system project.
Employee leave management system project.
 
8086 Microprocessor Architecture: 16-bit microprocessor
8086 Microprocessor Architecture: 16-bit microprocessor8086 Microprocessor Architecture: 16-bit microprocessor
8086 Microprocessor Architecture: 16-bit microprocessor
 
fitting shop and tools used in fitting shop .ppt
fitting shop and tools used in fitting shop .pptfitting shop and tools used in fitting shop .ppt
fitting shop and tools used in fitting shop .ppt
 
COST-EFFETIVE and Energy Efficient BUILDINGS ptx
COST-EFFETIVE  and Energy Efficient BUILDINGS ptxCOST-EFFETIVE  and Energy Efficient BUILDINGS ptx
COST-EFFETIVE and Energy Efficient BUILDINGS ptx
 
Computer Networks Basics of Network Devices
Computer Networks  Basics of Network DevicesComputer Networks  Basics of Network Devices
Computer Networks Basics of Network Devices
 

IRJET- A Study on Data Mining in Software

International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 05 Issue: 08 | Aug 2018 www.irjet.net p-ISSN: 2395-0072
© 2018, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 1027

A STUDY ON DATA MINING IN SOFTWARE
Sreenivasulu Tholuchuri
MCA, Hyderabad, India

Abstract - Data mining for software engineering is a process of discovering knowledge in software engineering data stored in databases. In simple words, it is a series of actions that extracts knowledge, in the form of useful patterns and relationships, from huge volumes of data and uses that knowledge to improve the software engineering process. It combines tools from artificial intelligence and statistics with database management to analyze large digital collections, known as data sets. Data mining is widely used in business (insurance, banking, and retail), scientific research (astronomy, medicine), and government defense departments.

Key Words: Data Mining, Software Engineering, Error Detection, Clustering, Association

1. INTRODUCTION
Data mining, also called knowledge discovery in databases (KDD), aims to make productive use of mined knowledge in an operable way.

1.1 Early Ages
During the 1980s, data storage capacities in computers increased greatly, and many large companies began to store transactional data. This resulted in huge collections of records, often called data warehouses. These data warehouses were too large to be analyzed with traditional statistical approaches.

With the aim of knowledge discovery, several computer science workshops and conferences were held to adapt techniques from the field of Artificial Intelligence (AI), such as neural networks, genetic algorithms, and machine learning.
This led to the First International Conference on Knowledge Discovery and Data Mining (FICKDD), held in Montreal, and to the launch in 1997 of the journal Data Mining and Knowledge Discovery. This was also the period when many early data-mining companies were formed and products were introduced.

Nowadays, vast amounts of data are collected daily, and making sense of such data is an important need. “We are living in the information age” is a popular saying; however, we are actually living in the data age. Terabytes or petabytes of data pour into our computer networks, the World Wide Web (WWW), and various data storage devices every day from business, society, science and engineering, medicine, and almost every other aspect of daily life. This explosive growth of available data volume is a result of the computerization of our society and the fast development of powerful data collection and storage tools.

Businesses worldwide generate gigantic data sets, including sales transactions, stock trading records, product descriptions, sales promotions, company profiles and performance, and customer feedback. For example, large stores, such as Wal-Mart, handle hundreds of millions of transactions per week at thousands of branches around the world. Scientific and engineering practices continuously generate data on the order of thousands of terabytes, from remote sensing, process measuring, scientific experiments, system performance, engineering observations, and environment surveillance. The medical and health industry generates tremendous amounts of data from medical records, patient monitoring, and medical imaging. Billions of Web searches, supported by search engines, process thousands of terabytes of data daily. Communities and social media have become increasingly important data sources, producing digital pictures and videos, blogs, Web communities, and various kinds of social networks. The list of sources that generate huge amounts of data is endless.
This explosively growing, universally available, and gigantic body of data makes our time truly the data age. Powerful and versatile tools are badly needed to automatically uncover valuable information from the tremendous amounts of data and to transform such data into organized knowledge. This necessity has led to the birth of data mining. The field is young, dynamic, and promising. Data mining has made and will continue to make great strides in our journey from the data age toward the coming information age.

2. DATA MINING
Data mining is more appropriately named “knowledge mining from data,” which is somewhat long. However, the shorter term, knowledge mining, may not reflect the emphasis on mining from large amounts of data. Nevertheless, mining is an expressive term characterizing the process that finds a small set of precious nuggets in a great deal of raw material. In addition, other terms have a similar meaning to data mining, for example, knowledge mining from data, knowledge extraction, data/pattern analysis, data archaeology, and data dredging.

Many people treat data mining as a synonym for another popularly used term, knowledge discovery from data, or KDD, while others view data mining as merely an essential step in the process of knowledge discovery. The knowledge discovery process is shown in the figure below as an iterative sequence of the following steps:
1) Data cleaning (to remove inconsistent data and noise)
2) Data integration (where multiple data sources may be combined)
3) Data selection (where data relevant to the analysis task are retrieved from the database)
4) Data transformation (where data are transformed and consolidated into forms suitable for mining by performing summary or aggregation operations)
5) Data mining (a crucial process where intelligent methods are applied to extract data patterns)
6) Pattern evaluation (to identify the truly interesting patterns representing knowledge, based on interestingness measures)
7) Knowledge presentation (where visualization and knowledge representation techniques are used to present mined knowledge to users)

Steps 1 to 4 are different forms of data preprocessing, where data are prepared for mining. The data mining process may interact with the user or a knowledge base. The interesting patterns are presented to the user and may be stored as new knowledge in the knowledge base.

The previous view shows data mining as one step in the knowledge discovery process, albeit an essential one because it uncovers hidden patterns for evaluation. However, in industry, in media, and in the research milieu, the term data mining is generally used to refer to the entire knowledge discovery process (perhaps because the term is shorter than knowledge discovery from data). Therefore, we adopt a broad view of data mining functionality: data mining is the process of discovering interesting patterns and knowledge from huge amounts of data. The data sources can include databases, data warehouses, the Web, other information repositories, or data that flow into the system dynamically.

Fig -1: Data Mining Steps

3. GOALS OF SOFTWARE ENGINEERING
Requirement Analysis: In this phase of software engineering (SE), software requirements are gathered from the client, analyzed, and documented. A requirement is a functional or non-functional need to be implemented in the system. The client's acceptance is mandatory before proceeding further. The document prepared in this phase is called the Software Requirement Specification (SRS).

System Design: System design is the process of defining the user interfaces, modules, architecture, and data for a system to satisfy client requirements. Here the overall product design is produced as per the client requirements, and different types of SDK may be used.

Development/Programming: The source code of the program is written in different programming languages as per the client requirements. This is called the programming process in software development. Coding refers to the actual writing of source code and is a main part of software development. A software development organization requires good programmers to define a standard style of code, called coding standards. Coding standards give a uniform appearance to code written by different programmers. The code should be understandable and reusable, and should follow good programming practices. Naming conventions, restrictions on data types, and the use of variables and constants are the main coding standards.

Error Detection/Bug Fixes: Error detection or bug fixing is an essential process for effective and proper software project planning. Data about software bugs are kept in bug repositories, which contain information related to the bugs. A bug fix involves both data and code. There are many types of bugs, such as programming bugs, design bugs, and data bugs, that create errors in system implementation and may require fixes, which are resolved by the development team.

Testing: Software testing is costly, but it is an important phase in software development. There are different stages in testing to validate or verify software.
Verification and validation processes are concerned with checking that the software being developed meets its specification and delivers the functionality expected by the people paying for the software. During testing, errors can mask (hide) other errors. When an error leads to unexpected outputs, you can never be sure whether later output anomalies are due to a new error or are side effects of the original error. Static analysis, by contrast, does not execute the software, so you do not have to be concerned with interactions between errors. Consequently, a single analysis session can discover many errors in a system.
Maintenance: Good software should deliver the required functionality and performance to the user and should be maintainable, dependable, and usable. Software should be written in such a way that it can evolve to meet the changing needs of customers. This is a critical attribute because software change is an inevitable requirement of a changing business environment. Agile procedures, used in the maintenance process itself, are likely to be effective, whether or not an agile approach has been used for system development. Incremental delivery, design for change, and maintaining simplicity all make sense when software is being modified. In fact, you can think of an agile development process as a process of software evolution.

Fig -2: Software Engineering Tasks, Data Mining Techniques & Data Mining in Software Engineering

4. TECHNIQUES IN DATA MINING

4.1 Association rule
Association rule mining is an example where the use of constraints and interestingness measures can ensure the completeness of mining. The problem of mining association rules can be reduced to that of mining frequent itemsets. The association rule mining approach is applied to defect records in order to discover patterns that are likely to cause high-severity defects. Most association rule mining algorithms employ a support-confidence framework. Although minimum support and confidence thresholds help weed out or exclude the exploration of a good number of uninteresting rules, many of the rules generated are still not interesting to the users. Unfortunately, this is especially true when mining at low support thresholds or mining for long patterns.
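The support-confidence framework mentioned above can be illustrated with a minimal sketch. The defect records, attribute names, and thresholds below are hypothetical and chosen only for illustration, and the brute-force enumeration stands in for a real frequent-itemset algorithm such as Apriori, which the paper does not specify.

```python
from itertools import combinations

# Hypothetical defect records: each set lists attributes observed together.
records = [
    {"module=ui", "no_review", "high_severity"},
    {"module=ui", "no_review", "high_severity"},
    {"module=db", "no_review", "high_severity"},
    {"module=db", "reviewed", "low_severity"},
    {"module=ui", "reviewed", "low_severity"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """support(antecedent and consequent together) / support(antecedent)."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0
    joint = support(set(antecedent) | set(consequent), transactions)
    return joint / base


def frequent_itemsets(transactions, min_support, max_size=2):
    """Brute-force search for itemsets of up to max_size items
    whose support is at least min_support."""
    items = sorted(set().union(*transactions))
    found = {}
    for size in range(1, max_size + 1):
        for combo in combinations(items, size):
            s = support(combo, transactions)
            if s >= min_support:
                found[frozenset(combo)] = s
    return found


if __name__ == "__main__":
    for itemset, s in frequent_itemsets(records, min_support=0.6).items():
        print(sorted(itemset), s)
    print(confidence({"no_review"}, {"high_severity"}, records))
```

Lowering `min_support` in this sketch quickly increases the number of itemsets returned, which mirrors the observation that low support thresholds produce many rules of little interest.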
The large number of uninteresting rules has been a major obstacle to the successful application of association rule mining.

4.2 Classification
Classification is the process of finding a model (or function) that describes and distinguishes data classes or concepts. The model is derived from the analysis of a set of training data (i.e., data objects for which the class labels are known) and is used to predict the class label of objects for which the class label is unknown. A neural network, when used for classification, is typically a collection of neuron-like processing units with weighted connections between the units. There are various other methods for constructing classification models, such as Bayesian classification, support vector machines, and k-nearest-neighbor classification. Regression analysis is a statistical approach that is most often used for numeric prediction, although other methods exist as well. Regression also encompasses the identification of distribution trends based on the available data. The resulting classification should maximally distinguish each class from the others, presenting an organized picture of the data set.

4.3 Clustering
Clustering plays a central role in customer relationship management, where it groups customers based on their similarities. Using relevance mining techniques, we can better understand the features of each customer group and develop customized customer reward programs. Clustering techniques consider data tuples as objects. They partition the objects into groups, or clusters, so that objects within a cluster are “similar” to one another and “dissimilar” to objects in other clusters. Similarity is commonly defined in terms of how “close” the objects are in space, based on a distance function. The “quality” of a cluster may be represented by its diameter, the maximum distance between any two objects in the cluster.

Before clustering is applied to a software system, the set of entities to cluster needs to be identified.
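As a small illustration of the diameter measure described in this section, the sketch below computes the maximum pairwise distance within a cluster. The choice of Euclidean distance and the two-attribute entities are assumptions made for the example; the paper does not prescribe a particular distance function.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two points given as equal-length tuples."""
    return math.dist(a, b)


def diameter(cluster, distance=euclidean):
    """Maximum distance between any two objects in the cluster.
    A cluster with fewer than two objects has diameter 0."""
    if len(cluster) < 2:
        return 0.0
    return max(distance(a, b) for a, b in combinations(cluster, 2))


# Hypothetical entities described by two numeric attributes each.
tight = [(1.0, 2.0), (1.0, 3.0), (2.0, 2.0)]
loose = [(1.0, 2.0), (9.0, 8.0), (4.0, 6.0)]

if __name__ == "__main__":
    print(diameter(tight))
    print(diameter(loose))
```

A smaller diameter indicates that the objects in a cluster lie closer together, so under this measure `tight` would be judged a better cluster than `loose`.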
The next phase is attribute selection. Most software clustering methods first transform a fact base into a data table, where each row describes one entity to be clustered and each column contains the value of a specific attribute. After all preparation steps are complete, the clustering algorithm can be executed. Clustering algorithms used in software engineering include graph-theoretical algorithms, construction algorithms, optimization algorithms, and hierarchical algorithms. For high-dimensional data, many of the existing methods fail due to the curse of dimensionality, which renders particular distance functions problematic in high-dimensional spaces. This led to a new generation of clustering algorithms for high-dimensional data that focus on subspace clustering and on correlation clustering, which also looks for arbitrarily rotated subspace clusters that can be modeled by a correlation of their attributes.

Fig -3: Characterization of Data Mining

5. CONCLUSIONS
In this paper, I have tried to provide an analysis of data mining and its origin, and of why data mining is essential in this era of computing. I have also reviewed software engineering and its various phases, how data mining in software engineering is classified, and the data mining techniques used in the process of fruitful knowledge discovery.

REFERENCES
[1] Tao Xie, Jian Pei, Ahmed E. Hassan, "Mining Software Engineering Data"
[2] Lovedeep, Varinder Kaur Atri, "Applications of Data Mining Techniques in Software Engineering", IJEECS
[3] Jiawei Han, Micheline Kamber, Jian Pei, "Data Mining: Concepts and Techniques", Third Edition
[4] Ian Sommerville, "Software Engineering", Ninth Edition