This document provides an overview of data mining including:
- Data mining techniques such as classification, prediction, and clustering, which are used to analyze patterns in data.
- The importance of data mining in fields such as banking, retail, and healthcare, where it is used to discover useful knowledge in large datasets.
- Issues in data mining, including security, performance, and methodology challenges, and future trends such as more advanced algorithms and computing resources for handling large and diverse datasets.
1. DATA MINING: A DETAILED STUDY AND ITS LITERATURE SURVEY
SUBMITTED BY: ANKUR UTSAV
DEPARTMENT OF ELECTRONICS AND COMMUNICATION ENGINEERING,
BIRLA INSTITUTE OF TECHNOLOGY, PATNA
2. CONTENTS
INTRODUCTION
DATA MINING
IMPORTANCE OF DATA MINING
ISSUES OF DATA MINING
TECHNIQUES OF DATA MINING
DATA MINING APPLICATIONS
CONCLUSION AND FUTURE TRENDS
REFERENCES
3. INTRODUCTION
• Data mining is the process of analyzing hidden patterns in data.
• It mainly involves extracting the data, transforming it, and loading it into a data warehouse.
• Data mining uses complex mathematical algorithms to organize the data.
• Data mining is applicable in fields such as banking and financial services, health care, and telecommunications.
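The extract, transform, and load steps mentioned in the introduction can be sketched in a few lines. This is only an illustrative sketch: the CSV data, the `sales` table, and the function names are hypothetical, and an in-memory SQLite database stands in for a real data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical raw export; in practice this would come from an operational system.
RAW_CSV = """customer,amount
alice, 120.50
BOB,80
alice,30
"""

def extract(text):
    """Extract: parse raw CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: normalise customer names and convert amounts to numbers."""
    return [(r["customer"].strip().lower(), float(r["amount"])) for r in rows]

def load(rows, conn):
    """Load: write the cleaned rows into a (stand-in) warehouse table."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")  # assumption: SQLite plays the warehouse role
load(transform(extract(RAW_CSV)), conn)

# Once loaded, the data can be queried for simple aggregate knowledge.
totals = dict(conn.execute("SELECT customer, SUM(amount) FROM sales GROUP BY customer"))
print(totals)
```

Keeping the three steps as separate functions mirrors the way the slide lists them, so each stage can be swapped out independently.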
5. HISTORY
• Data mining is the evolution of a field with a long history; the term "data mining" was coined in the 1990s.
• Its roots can be traced along three family lines:
• i.) classical statistics, ii.) artificial intelligence (AI), and iii.) machine learning.
• Statistics: It is the foundation of most of the technologies on which data mining is built, e.g. regression analysis, standard distribution, standard deviation, variance, discriminant analysis, cluster analysis, and confidence intervals.
• AI: Artificial intelligence is the discipline that tries to imitate how the brain works using programming methods, e.g. writing a program that plays chess.
6. • Machine learning: It combines AI and statistics. It is a branch of AI that comprises the algorithms applied to the statistical models discussed above; predictive diagnostics are carried out using clustering and classification.
• Data mining is essentially the adaptation of machine learning technology to business applications.
8. IMPORTANCE OF DATA MINING
• Data can create profits. It is a crucial financial asset of an enterprise.
• Many businesses use data mining to discover and explore knowledge in their available data sets.
• Data mining helps in forecasting future trends.
• Data mining plays a vital role in the early stages of data management, with the help of skilled and efficient data entry service providers.
• Data mining is also useful for detecting anomalous data patterns, which is essential in fraud detection and in areas where data is weak or has been falsely modified.
9. ISSUES OF DATA MINING
• Security and social issues: Nowadays, security is the most common concern when collected data is to be shared.
• User interface issues: The knowledge discovered by data mining techniques is useful only if it is interesting and the user can understand it.
• Mining methodology issues: These issues concern the approaches used in data mining and their limitations.
10. • Data source issues: Many issues are linked to the data sources, including practical ones such as the diversity of data types and philosophical ones such as the data glut problem.
• Performance issues: Performance-related issues include the following:
• i.) Efficiency and scalability of data mining algorithms: algorithms must be efficient and scalable in order to extract information effectively from huge amounts of data.
• ii.) Parallel, distributed, and incremental mining algorithms: parallel and distributed data mining algorithms come into the picture because of the huge size of databases, the wide distribution of data, and the complexity of data mining methods.
12. TECHNIQUES OF DATA MINING
• Classification: It builds a model that determines the class of an item from its attributes. It is a classic data mining technique rooted in machine learning: every item in a data set is assigned to one of a predefined set of classes.
• Prediction: Its job is to forecast the probable values of missing or future data.
• Time series: It is a sequence of events in which the next event is determined by one or more of the preceding events.
• Association: It discovers the association or connection among a set of items. It is one of the best-known data mining techniques. Here a pattern is discovered from the relationship between a particular item and other items. Association rules are if/then statements that help reveal relationships among seemingly unrelated data in a relational database.
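The if/then association rules described on this slide are commonly scored with two measures, support and confidence. The slides do not name these measures, so the following is a minimal sketch under that assumption; the transactions, thresholds, and function names are hypothetical.

```python
from itertools import combinations

# Hypothetical market-basket transactions (illustrative data, not from the slides).
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"bread", "milk", "butter", "eggs"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Estimate of P(consequent | antecedent): support(A and C) / support(A)."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

def rules(transactions, min_support=0.4, min_confidence=0.7):
    """Enumerate single-item rules 'if lhs then rhs' that pass both thresholds."""
    items = sorted(set().union(*transactions))
    found = []
    for a, b in combinations(items, 2):
        for lhs, rhs in ((a, b), (b, a)):
            s = support({lhs, rhs}, transactions)
            if s < min_support:
                continue  # the pair is too rare to be interesting
            c = confidence({lhs}, {rhs}, transactions)
            if c >= min_confidence:
                found.append((lhs, rhs, s, c))
    return found

for lhs, rhs, s, c in rules(transactions):
    print(f"if {lhs} then {rhs}: support={s:.2f}, confidence={c:.2f}")
```

This brute-force enumeration only considers one item on each side of a rule; practical systems prune the search, for example with the Apriori algorithm.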
13. • Clustering: Clustering is used to identify data items that are similar to one another. It forms useful and meaningful groups of items that share the same characteristics, using automated techniques.
• It is a main task of exploratory data mining and is used in many fields, such as statistical analysis and bioinformatics. There are five types of cluster: well-separated, centre-based, contiguous, density-based, and shared-property.
• Summarization: It is the generalization of data. A set of relevant data is summarized, resulting in a smaller set that gives an aggregated view of the data.
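As an illustration of centre-based clustering, the sketch below implements plain k-means, one widely used algorithm of that type. The slides do not name a specific algorithm, so this choice is an assumption; the sample points are hypothetical, and seeding the centroids with the first k points is a simplification made to keep the example deterministic.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain k-means on 2-D points; the first k points seed the centroids."""
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties out
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids

# Hypothetical data: two visibly separate groups of points.
points = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 9.5), (1.0, 0.5), (8.5, 9.0)]
labels, centroids = kmeans(points, k=2)
print(labels)
print(centroids)
```

Each point ends up labelled with the index of its nearest centre, which is exactly the grouping of similar items the slide describes.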
15. DATA MINING APPLICATION
• Data mining is mainly concerned with the analysis of data that has been collected. It is a dynamic and fast-expanding field with great strengths.
• Medical and Pharmacy: Data mining makes it possible to characterize patient behaviour in order to anticipate upcoming office visits. It also helps identify successful therapies for different illnesses. Its applications are steadily growing across a variety of domains, uncovering previously unknown information that can improve business effectiveness and efficiency.
16. • Web mining: Web mining is the application of data mining techniques to discover patterns, structures, and knowledge from the Web.
• Health Care: Health care is one of the first important areas of activity that boosted the intensive development of data mining methods, ranging from visualization techniques and the prediction of health care costs to computer-aided diagnosis.
• Banking and Finance: The banking and financial services domain is one of the first and most important areas for data mining applications. In banking, data mining methods have been used intensively for modelling and forecasting credit fraud, risk assessment, etc.
17. • Retail Industry: Data mining plays a vital role in the retail industry, which collects huge amounts of data on sales, customer purchasing history, goods shipping, expenditure, and services. As the population grows day by day, the quantity of data collected will grow as well. This indicates that demand for data mining will be very high in the future.
18. CONCLUSION
• Data Mining is the process of extracting knowledge from massive sets of data.
• Data mining is a very important domain because it deals with data, and the population is increasing day by day.
• Researchers mainly focus on the issues and challenges of data mining. Data mining software is used to analyze the data.
• Sound predictions can be made through data analysis and algorithms.
• We can use data mining for decision making.
19. FUTURE TRENDS
• Data mining trends are changing:
• Past: Earlier, statistical and machine learning algorithms were applied to numerical and structured data in traditional databases. The main area of application was business. The computing resources were fourth-generation programming languages (4GL) and related techniques.
• Present: Nowadays, data mining uses advanced statistical, machine learning, artificial intelligence, and pattern recognition techniques. It is applicable to structured, semi-structured, and unstructured data formats. Its main areas of application are business, the web, medical diagnosis, etc. Its computing resources are high-speed networks, high-end storage devices, distributed computing, etc.
20. • Future: In the future, data mining will use soft computing techniques such as fuzzy logic, neural networks, and genetic programming. Its data formats will include high-dimensional data, high-speed data streams, sequences, noisy time series, graphs, etc. Its main areas of application will be business, the web, medical diagnosis, scientific and research analysis (biomedical applications, remote sensing, etc.), social networks, etc. Its computing resources will be multi-agent technologies and cloud computing.
21. REFERENCES
• [1] Pang-Ning Tan, Michael Steinbach, Vipin Kumar, "Introduction to Data Mining", Addison
Wesley, 2002.
• [2] S. Mitra, S. K. Pal, and P. Mitra, Data mining in soft computing framework: A survey, IEEE Transactions on Neural Networks, 13(1), 3-14, 2002.
• [3] Parvez Ahmad, Saqib Qamar, Syed Qasim Afser Rizvi, Techniques of Data Mining in
Healthcare : A Review, International Journal of Computer Applications (0975 – 8887) Volume 120
– No.15, June 2015.
• [4] Hsinchun Chen, Sherrilynne, S. Fuller, Carol Friedman and William Hersh, Knowledge
Management, Data Mining and text mining in medical informatics.
22. • [5] Zhu, Xingquan; Davidson, Ian (2007). Knowledge Discovery and Data Mining: Challenges and Realities. New York, NY: Hershey. pp. 31–48. ISBN 978-1-59904-252-7.
• [6] Md. Ansarul Haque, Tamjid Rahman, “Sentiment Analysis by Using Fuzzy Logic”, International Journal of Computer Science, Engineering and Information Technology (IJCSEIT), Vol. 4, No. 1, February 2014.
• [7] Vijayaran S, Sudha, “An Effective Classification Rule Technique for Heart Disease Prediction”, International Journal of Engineering Associates, February 2013.
• [8] Fayyad, U., Piatetsky-Shapiro, G., and Smyth, P., “From Data Mining to Knowledge Discovery in Databases”, The MIT Press, ISBN 0–26256097–6, 1996.
• [9] Huan Liu and Lei Yu, “Toward Integrating Feature Selection Algorithms for Classification and Clustering”, IEEE Transactions on Knowledge and Data Engineering, Volume 17, Issue 4, April 2005.
• [10] Meenu Sharma, “Clustering in Data Mining: A Brief Review”, International Journal of Core Engineering & Management (IJCEM), Volume 1, Issue 5, August 2014.