SlideShare a Scribd company logo
IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308 
_______________________________________________________________________________________ 
Volume: 03 Issue: 08 | Aug-2014, Available @ http://www.ijret.org 185 
A COMPREHENSIVE SURVEY ON DATA MINING Kautkar Rohit A1 1M. Tech Scholar, Computer science and technology, Maharashtra Institute of Technology (MIT), Aurangabad, Maharashtra, India Abstract Now a day’s internet is a significant place for interchanging of data like text, images, audio, and video and for share-out information preferably in digital form. The usage of internet leads to accessing the immense amount of data. Data may be unstructured data, structure data, and semi-structured data. So we are storing and processing such vast amount of data having gigantic complexity. Researchers from the University of Berkeley estimate that every year about one Exabyte (= 1 Million Terabyte) of data brought forth, of which a large portion is in digital form. For ex., 100 hours of videos are uploaded on YouTube every minute (statistics from YouTube). The query is how to analyze such ample data effectively and efficiently? Answer is data mining. Data mining pertains to the process of analyzing, studying such declamatory quantity of data for witnessing useful patterns and knowledge. Today we have accumulation of large amount of data but deficiency of knowledge. In this paper our focusing on surveillance of a nimble arising field data mining which is also known as knowledge discovery from data (KDD). We are constituting fundamentals of data mining, also several strategies for analyzing data like classification, estimation, prediction, association rules, clustering etc., data mining process, its types like web, content and structure mining. After understating basics, we are presenting the different data mining models like decision tree, neural network with their bedrocks. Also presents real world applications and future scope of the dynamic and extraordinary discipline. Keywords: - Internet, Data, Unstructured Data, Structure Data, Semi-Structured Data, Data Mining, Knowledge Discovery from Data (KDD) 
--------------------------------------------------------------------***---------------------------------------------------------------------- 1. INTRODUCTION Now day’s internet is a crucial place for interchanging the information and for sharing, dealing out the digital contents like text, audio, video, images etc. In that social networking sites acts crucial role like face book (www.facebook.com), LinkedIn micro blogging sites like Twitter (www.twitter.com) for sharing digital contents. So they produce, process huge and gigantic quantity of data every day in fact every minute. For example According to public statistics more than 48 hours of video contents are uploaded every minute and billions views are rendered every day on YouTube (www.youtube.com). Researchers from the University of Berkeley estimate that every year about 1 Exabyte (= 1 Million Terabyte) of data are generated, of which a large portion is available in digital form [3]. For goes on updates it connects to social networking sites like Face book, Twitter etc. so amount of data produce is quite vast and may have size in millions megabytes (or more) and complex in structure. So it leads to use highly efficient, advanced tools and techniques for intension of analyzing, processing such data. Analyzing and processing of data allow comprehending useful information and knowledge about the data. The term "Data mining" is acquainted in 1990s [12]. So inquisitory for knowledge in data is nothing but data mining [13]. Mining is important since it grant learning about versatile trends lives in data [18]. 
For example, Google search engine which accept and process huge amount of information every minute. If it just have a large accumulation of data and do not have knowledge regarding it? Then it’s pathetic. On other hand, if we employ data mining tools and techniques then we may find practicable patterns and trends. Google demonstrates different current trends according to geographical area which is example of gaining knowledge from user submitted queries. Search engines and social networking sites like Google and face book produce, store and process huge amount of data in terms of volume, velocity and variety denoted as big data. Big data integrates structured, semi-structured and structured data. So in reality data mining go through big data which in huge in size and complex in structure. The ambitious matter is that, it is continuously and rapidly increasing. There are several reasons for increasing the data on internet in rapid fashion. The use of internet leads to access, process and creation of data. For example, the use of face book leads sharing or uploading any images, videos. It’s nothing but creation of data. Another example is use of twitter. When we post any twit then it contributes to creation of Data. The Company’s which deals with huge amount of data everyday for them the field of data mining is vital.
IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308 
_______________________________________________________________________________________ 
Volume: 03 Issue: 08 | Aug-2014, Available @ http://www.ijret.org 186 
The data over internet may available in different types. Figure 1 depicted its types like structure, semi-structure and structure. When we have vast amount of data then it produces big data which creates another challenge in every aspect. The sources of data may be internal or external. Such data analyzed by predefined tools and technique referred as data mining. It allows discovering utile trends and patterns in data. 
Fig 1: Types of Data The simplest example of pattern, milk can be purchased with bread or computer can be purchased with Antivirus software [13]. According to Gartner group (www.gartner.com), data mining is the process of discovering meaningful correlations, patterns and trends by shifting large amount of data store in repositories, using pattern recognition technologies as well as statistical and mathematical techniques [12]. We have large amount of data available but don’t have knowledge concerning it. So the data mining imparts a way for experiencing knowledge from data. For example, Wall mart a megastore, which records millions of transaction every day, give rise to creation of huge data [13]. So it we don’t have a satisfying way to canvass it then its utility will be atrophied. So the usefulness can be enhanced by analyzing it with different tools and techniques. At present, the data mining established its grandness in field of information retrieval and processing and there is no uncertainty that in near future, it will keep arising. Data mining also called Knowledge discovery in Database (KDD) i.e. the process of exploring useful patterns and knowledge from data. Knowledge discovery is the treatment of analyzing data for future projection [21]. To gratify the demand of analyzing the available large volume data sets, more efficiently the data mining employs diverse fields like Machine learning, Information retrieval, Statistics, and visualization. Figure 2 shows; there are two primal intentions of data mining process- prediction and description. Prediction permit to influence the stranger or future values of pursuit where as description grant noticing patterns and describing the data [6]. The Meta Group estimates that the market size for data mining market will grow from $50 million in 1996 to $800 million by 2000 [11]. 
As shown in figure 3, the consolidation of diverse fields leads to new dynamic, encouraged area data mining. The data mining construct is actually formulated from statistics and machine learning. It also consolidates artificial intelligence, Information retrieval, visualization etc. We can conflate data mining with machine learning techniques for deceitful usage of cellular telephones grounded on profiling customers [29]. Fundamentally machines ascertain from data collected in past like humans learns from past experience and every time uses yesteryear experience to perform more beneficial. It is similar to supervised learning also referred as classification. It uses classification algorithms like decision tree or naive bayes [14]. When genuine utile pattern detected, then its usefulness can be determined with assist of statistics. We may visualize our data that to analyzed, the data visualization is ministrant for discovering interesting patterns easily. Basically computers cater easy mode to view data in more powerful and efficient manner. Now there are more powerful visualization tools are available [18]. The visualization tools can help for discovering approximate idea about the kind of data and its quality, also to anticipate where the useful patterns will situate [30]. At start-of paper, we stated, the internet is place for interchanging the digital contents over web. So it having gigantic size of information available and to explore accurate information in which user is interested is challenge. The search is most prominent practical application of web. The field which is participating in this problem is Text mining. The target of Information retrieval system is to elicit documents pertained to user query (a set of keywords) from huge amount of available information. The user’s information demand can be provide in form of query to Search engines [14]. Text analysis is performed for ordering documents. After exploring documents, their degree of matching keyword query is used to rank those [19]. Different commercial algorithms used by search engines. Basically Information Retrieval (IR) Systems are employs Information agents [30] which search for appropriate documents. The challenge is retrieve relevant documents to user query with fast response. Also the web pages are not just text; they also with hyperlinks and anchored text [14]. Data mining system can be very complex or simple as it integrates different arenas. All subfields are important in data mining as they grant constructing solution to a greater extent complex problem. It also support for miscellany of data mining system [13]. So data mining system can be class based on measures like kind of database used for mining, kinds of knowledge mined and levels of abstraction of knowledge mined. Data mining system can be constructed based on what type of pattern we are exploring, regular or irregular [13]. Statistics are useful for assess data. The data may continuous or discrete. So different themes used to measure continuous data like average, mean, median, mode. To measure discrete data, histograms can be used which basically represent single values [31].
IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308 
_______________________________________________________________________________________ 
Volume: 03 Issue: 08 | Aug-2014, Available @ http://www.ijret.org 187 
Fig 2: Data Mining Process Goal 
Fig 3: Association of data mining with other fields 2. RELATED WORK Various data mining techniques that allow extracting unidentified relationships among the data items from large data collection that are useful for decision making [24]. Techniques for performing clustering on supervised data proposed [23]. Data mining techniques can play an important role in rule refinement even if the sample size is limited [8]. Artificial neural network models are proposed for imitate the human brain model. Decision tree allow segmenting the data in to appropriate groups. Both artificial neural network and decision tree have idea where to look for patterns in data. The use of market basket analysis & clustering techniques does not require any knowledge about relationships in the data, knowledge is discovered when these techniques are applied to the data. Market basket analysis tools sift through data to let retailers know what products are being purchased together. Clusters prove to be most useful when they are integrated into a marketing strategy [8]. Different marketing schemes shows that the firms profit can be increased by focusing on different marketing resources [25]. Apriori Algorithm [26] for data mining for frequent pattern mining proposed which uses prior knowledge about data. Different approaches to automatic construction of web pages by using log file data proposed [27]. The expected location of web pages can be handle page caching by browser. For this, algorithm automatically discovers pages in a website whose location is different from where visitors expect to find them [28]. 
The concept of a KDDM process model was primitively devised in 1989. Initial KD systems provided only a single DM technique, such as a decision tree or clustering algorithm, with a very imperfect support for the overall process framework. Initially it was using exclusive Data mining technique like association rule or clustering or decision tree. In Advances Knowledge Discovery and Data Mining define the process as being highly interactive and complex [24]. They also hint that KDDM process may comprise aggregation of Data mining techniques for processing and observing more complex patterns [10]. 3. DATA MINING STRATEGIES AND PROCESS 3.1 Data Mining Strategies There are following strategies for analyzing data items- 
1. Classification in which items are ordered /classified in to usable target classes. 
2. Estimation allows ascertaining unidentified output variables. 
3. Prediction allows determining the future consequence. It is similar to classification and estimation. 
4. Association Rules allow analyzing the big data sets for detecting useful patterns and relationships between data items. Masses of applications are consume association rule concept. 
5. Clustering allow grouping of items according to alike properties and behaviors. There are various algorithms are available for clustering like K-means. 
3.2. Data Mining Process Exploring utile patterns and knowledge from available data invokes fuse of steps as depicted in figure 4, like [13]- 
1. Data Cleaning 
As we go through, data is collected from internal or external sources and may contain noise and inconsistent information. Data cleaning phase allow getting rid of noise and inconsistent data. When data set is large, then data cleaning is time demanding process. 
2. Data integration 
The data may available in scattered structure. So data integration phase allow to blend it from different sources. Phase 1 and 2 are considered as preprocessing phases and accompanying data may be stored in to data warehouse. (Not mandatory).
IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308 
_______________________________________________________________________________________ 
Volume: 03 Issue: 08 | Aug-2014, Available @ http://www.ijret.org 188 
Fig 4: Data Mining Process [22] 
3. Data selection 
Allow retrieving data from database for analyzing. According to problem domain, the different data sources can be selected. 
4. Data transformation 
The data transformed into the form which will be appropriate for processing and analyzing. 
5. Data mining 
Actual grasping of useful knowledge and patterns in data is achieved by data mining. 
6. Pattern evaluation 
In this, noticing truly interesting patterns, various standards like lift, support, confidence etc. can be used. 
7. Knowledge presentation 
Visualizing knowledge and representation technique can be used to constitute results. 4. DATA MINING MODELS There are many different data models are available for figuring out the data mining problems. The models are like decision tree, neural network, naive bayes, classifiers, lazy learners etc. 4.1 Decision Trees It is also mentioned as classification tree. The decision tree allows distinguishing data and classifying it into different subdivisions or groups. As shown in Figure 5, it follows the divide and conquers scheme. As it is decision tree it permits drawing decisions by trialing different potential considerations. The node of tree depicting the testing of attributes values. It allows testing of condition with several relational and comparison operators. Decision tree can be used for both description and prediction. Various algorithms are available like ID5, C4.5 and C5 etc. Decision tree act upon data that can be presents using table where each row is keyed out as instance shown in Table 1. Various useful patterns can be extracted from table for learning and knowledge intention as shown in figure 5. 
Fig 5: Decision Tree Decision trees are interesting as they are based on rules. They are used to explore relationships amongst the data items. The decision trees are structure used to divide large data set in to small sets based on rules. Rules can be written into English language [31]. Table 1: Table with Different Instance of Data 
Sr.No 
Employee Name 
Age 
Gender 
Salary 
Item Buy 
1 
Ram 
30 
M 
40000 
TV 
2 
Chetan 
35 
M 
52000 
{TV - >VCD Player} 
3 
Prasad 
25 
M 
25000 
{Laptop- >Antivirus Software} 
4 
Sangeeta 
45 
F 
45000 
{Laptop- >Antivirus software} 
4.2 Neural Network The concept of neural network is fundamentally inherited from the construction of human brain. Basically the neural networks are used for acquiring knowledge from various data sets and amend the performance of system. In Neural Network the knowledge is presented using interconnected processors addressed neurons. Each neuron consist weight value and with defined function. [1, 5] As indicated in figure 6, there are different input vectors(x1, x2…..xm) with consorted weights (w1, w2…..wm) and possible outcome positive or negative. The neurons are cultivated to reacts each input vector. The vectors from training set is applied over Neural Network, if the output is right then no alteration is made otherwise the weights and bias are adjusted. When the NN is fully trained, then it will classify the applied input vector by using its knowledge. Different classifiers can be used to classify the output.
IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308 
_______________________________________________________________________________________ 
Volume: 03 Issue: 08 | Aug-2014, Available @ http://www.ijret.org 189 
Fig 6: Neural Network 4.3 Association Rules Association rules basically give the relation between different attributes and their values. Gobs of useful associations can be explored by analyzing the data sets. Association rule utilizes the concept of support and confidence to determine the usefulness of rule. Patterns are practicable for probing the buying habits of customer for business analysis. Support for items X1 and Y1 for particular transaction can be given by percentage of transactions from data set which let in both items i.e. X1 U Y1. Confidence for association rule X1->Y1 is given by percentage of transaction from data set containing X1 also with Y1 [13]. The rule is genuinely useful if it has support and confidence value above the defined support count for that dataset. The support count can fix by domain expert [13]. 4.4 Naive Bayes The naive bayes constructs model that predict the probability of particular outcome. The possible outcomes can be predicted by finding patterns and correlations between the data instances. Afterward it picks out the data mining model for representation of the patterns and relations. 4.5 Machine Learning and Statistics As we know the data mining field extensively uses the concepts of machine learning and statistics for mining complex patterns. The statistics and machine learning are really much pertained arenas and in much sense depend on each other. The statistics are basically used for constructing the hypothesis where the hypothesis is used in the machine learning for generalization process. 5. TYPES OF MINING 5.1 Web Mining The web mining grants eliciting information from web documents and services using data mining techniques. Web mining term devised by Etzioni [4]. 
Internet comprises immense volume of data and is the most declamatory data source for acquiring and apportioning information. The web mining pertains to the discovering useful information from the web hyperlink structure, web page contents and web usage data. There are three classes as shown in figure 7 [4]- 
Fig 7: Web Mining Types 5.1.1 Web Structure Mining Web structure mining has to do with discovering the useful knowledge is from the hyperlinks. Basically, hyperlinks represent structure of web. Here, we are interested in structure of web documents and also in structure of hyperlinks [4]. It as well applies concept of social network analysis in which the graph can be used to represent the structure. For example, on twitter, the graph based database (FLOCK DB) can be utilized to represent users with their followers. 5.1.2 Web Content Mining Basically it grants to extract the useful patterns and knowledge from the web page contents. It is based on traditional mining process which allows classifying and clustering web pages (referred as items) according to their types of contents [4].There are two different standpoints are available in concern with web content mining- 
1) Information Retrieval 2) DB View 
Basically information retrieval is for semi structured and unstructured data. It permits querying the data and finding useful information. Database view is basically for the structured information having proper organization for storage and queries. 5.1.3 Web Usage Mining Web usage mining rivets the technique that could predict user behavior whenever the user interacts with web. The data is viewed in terms of interactivity with web application. The web usage mining basically uses the web logs and server logs for analyzing it. The usage data can be depicted with relational tables or graphs. The usage mining is applies concepts like association rules, machine learning etc. for mining aim. The usage mining is important at many places like web application construction, marketing etc.
IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308 
_______________________________________________________________________________________ 
Volume: 03 Issue: 08 | Aug-2014, Available @ http://www.ijret.org 190 
The web usage mining is useful in learning the user profile for providing better personalized experience. It is also helpful in the dynamic recommendation systems. Web usage mining permits to study user access patterns for giving better experience [19]. The issue of privacy arises; it exposes the privacy because the user profile data is used for clustering and analyzing the user profiles. 6. APPLICATIONS OF DATA MINING The data mining is applicable in panoptic and diverse areas. The data mining tools and techniques are really play important role in fields where data is integral, highly sensitive and important [3]. 
1. Retail Industry 
Huge amount of data is generated and stored. Data may comprise transaction details of sales, customer shopping etc. The stored data can be utilized for ascertaining the buying habits of customers, ameliorating the business, supervising the shelf space etc. 
2. Telecommunication Industry 
In telecommunication industry it can be useful for improving service quality, making better use of resources etc. 
3. Biological Data Analysis 
The data mining is useful here, to interpret the biological sequence and structure. 
4. Semantic web 
The web pages are analyzed and organized according to their types of contents. It utilizes the proficiency referred as Resource description framework (RDF) for classifying web pages. The RDF is implemented at lots of websites like orkut, face book for tagging purpose. 
5. Business Trends 
To serve customer more accurately, quickly and efficiently the data mining can be used. 
6. Financial Data Analysis 
Data collected from different sources of financial sectors and analyzed using data mining. 
7. Sports 
Many games played worldwide. So on each day many games are planned and played. It causes the vast amount of data creation. The data about each game and player can be analyzed and used to predict the performance of player. 
8. Manufacturing Process 
In manufacturing the data mining is useful for finding faults in manufacturing process and those faults can be amended. The data is collected from the manufacturing system. 7. CONCLUSIONS 
This study paper canvassed the extremely dynamic and substantial area data mining. In this paper we confronted, the basics of data mining, why data mining is important, different strategies for data mining like classification, estimation, prediction, clustering, and association rules. Also we studied that many applications now day’s based on strategies association rule for mining various trends. Also data mining process describes different phases in order to find the useful patterns and knowledge. It too introduces various data mining models like decision tree which allow making decision, NN which mimic the human brain structure and used for mining patterns more accurately, association rules for exploring relationships between data items. Naive bayes, machine learning and statistics are original ancestral of data mining field. The important portion of paper gives the type of web mining like structure mining, usage mining and content mining with their description. At long last it presents the diverse areas uses the data mining. We also resolve that analyzing the huge amount of data with prominent complexity is challenging chore and data mining supplies a direction to deal with such data efficiently and effectively. The data mining is one of the spectacular and challenging field today’s date and no doubts that, hereafter it will play vital primal in many areas. ACKNOWLEDGEMENTS I am enormously grateful to Prof. Mrs. K.V. Bhosale, Prof. V. Kala, Prof. R.B. Mapari, and Prof. Sonavane of Computer science and technology Department, MIT, Aurangabad. As well thankful to Prof. M. M. Goswami, Prof. P. Ghotkar, Prof. Mrs. R.D. Kalambe, Prof. Mrs. M.S. Arade, Prof. Miss. N. S. Nikale, Prof. Mrs. S. P. Dudhe of Government Polytechnic, Nashik. I genuinely thankful to Prof. J. Gorane,Prof. N. A. Anwat who directed me in right way for penning above work. REFERENCES 
[1] Singh, Y., & Chauhan, A. S. (2009). Neural Networks In Data Mining. Journal Of Theoretical And Applied Information Technology, 5(6), Pages 36-42. 
[2] Keim, D. A. (2002). Information Visualization and Visual Data Mining. Visualization and Computer Graphics, IEEE Transactions On, Volume 8(1), Pages 1-8. 
[3] Silwattananusarn, T., & Tuamsuk, K. (2012). Data Mining and Its Applications for Knowledge Management: A Literature Review from 2007 to 2012. Arxiv Preprint Arxiv: 1210.2872. 
[4] Kosala, R., & Blockeel, H. (2000). Web Mining Research: A Survey. ACM Sigkdd Explorations Newsletter, Volume 2(1), Pages 1-15. 
[5] Lu, H., Setiono, R., & Liu, H. (1996). Effective Data Mining Using Neural Networks. Knowledge and Data Engineering, IEEE Transactions On, Volume 8(6), Pages 957-961. 
[6] Chaudhary, R., Singh, P., & Mahajan, R. A SURVEY ON DATA MINING TECHNIQUES. 
[7] Jain, N., & Srivastava, V. (2013). DATA MINING TECHNIQUES: A SURVEY PAPER. IJRET: International Journal of Research in Engineering and Technology, Volume 2(11).
IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308 
_______________________________________________________________________________________ 
Volume: 03 Issue: 08 | Aug-2014, Available @ http://www.ijret.org 191 
[8] Hilage, T. A., & Kulkarni, R. V. (2012). Review of Literature on Data Mining. IJRRAS, Volume 10(1), Pages 107-114. 
[9] Mann, A. K., & Kaur, N. (2013). Survey Paper on Clustering Techniques. International Journal of Science, Engineering and Technology Research, Volume 2(4), Pp-0803. 
[10] Kurgan, L. A., & Musilek, P. (2006). A Survey Of Knowledge Discovery And Data Mining Process Models. The Knowledge Engineering Review, Volume 21(01), Pages 1-24. 
[11] Goebel, M., & Gruenwald, L. (1999). A Survey of Data Mining and Knowledge Discovery Software Tools. ACM SIGKDD Explorations Newsletter, Volume 1(1), Pages 20-33. 
[12] Sharma, M. Data Mining: A Literature Survey. 
[13] Han, J., & Kamber, M. (2006). Data Mining, Southeast Asia Edition: Concepts and Techniques. Morgan Kaufmann. 
[14] Liu, B. (2007). Web Data Mining. Springer-Verlag Berlin Heidelberg. 
[15] Bramer, M. (2013). Principles of Data Mining. Springer. 
[16] Witten, I. H., & Frank, E. (2005). Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann. 
[17] Wattenhofer, M., Wattenhofer, R., & Zhu, Z. (2012, June). The Youtube Social Network. In ICWSM. 
[18] Hand, D. J., Mannila, H., & Smyth, P. (2001). Principles of Data Mining. MIT Press. 
[19] Markov, Z., & Larose, D. T. (2007). Data Mining the Web: Uncovering Patterns in Web Content, Structure, and Usage. John Wiley & Sons. 
[20] Larose, D. T. (2006). Data Mining Methods & Models. John Wiley & Sons. 
[21] Bhosle, K. (2010). KDS for Sericulture Cocoon Production. International Journal of Computer Applications, Volume 1(18), Pages 86-90. 
[22] Raval, K. M. Data Mining Techniques. International Journal of Advanced Research in Computer Science and Software Engineering, Volume 2(10), Pages 439-442. 
[23] Ahmed, S., Coenen, F., & Leng, P. (2006). Tree- based partitioning of date for association rule mining. Knowledge and information systems, Volume 10(3), Pages 315-331. 
[24] Fayyad, U. M. (1996). Data mining and knowledge discovery: Making sense out of data. IEEE Intelligent Systems, Volume 11(5), Pages 20-25. 
[25] Peppers, D. (1993). The one to one future: Building relationship one customer at a time. 
[26] Agrawal, R., & Srikant, R. (1994, September). Fast algorithms for mining association rules. In Proc. 20th int. conf. very large data bases, VLDB (Volume. 1215, Pages. 487-499). 
[27] Perkowitz, M., & Etzioni, O. (1999, July). Adaptive web sites: Conceptual cluster mining. In IJCAI (Volume. 99, Pages. 264-269). 
[28] Srikant, R., & Yang, Y. (2001, April). Mining web logs to improve website organization. 
In Proceedings of the 10th international conference on World Wide Web (Pages. 430-437). ACM. 
[29] Fawcett, T., & Provost, F. J. (1996, August). Combining Data Mining and Machine Learning for Effective User Profiling. In KDD (Pages. 8-13). 
[30] Sumathi, S., & Sivanandam, S. N. (2006). Introduction to data mining and its applications. Springer. 
[31] Berry, M. J., & Linoff, G. (1997). Data mining techniques: for marketing, sales, and customer support. John Wiley & Sons, Inc. 
BIOGRAPHIES 
Kautkar Rohit A, Pursuing M. Tech From MIT, Aurangabad. Since last three years he is assisting as a lecturer in Computer Technology at Government Polytechnic, Nasik. 
He has accomplished B. Tech (Computer science and Engineering) from SGGSIE&T, Nanded. He also completed Diploma in Information Technology from Government Polytechnic, Nasik. His domain of pursuit is Data Mining as well Big data, Natural Language Processing and web associated fields.

More Related Content

What's hot

Issues, challenges, and solutions
Issues, challenges, and solutionsIssues, challenges, and solutions
Issues, challenges, and solutions
csandit
 
C03406021027
C03406021027C03406021027
C03406021027
theijes
 
RESEARCH IN BIG DATA – AN OVERVIEW
RESEARCH IN BIG DATA – AN OVERVIEWRESEARCH IN BIG DATA – AN OVERVIEW
RESEARCH IN BIG DATA – AN OVERVIEW
ieijjournal
 
Elementary Concepts of data minig
Elementary Concepts of data minigElementary Concepts of data minig
Elementary Concepts of data minig
Dr Anjan Krishnamurthy
 
New approaches of Data Mining for the Internet of things with systems: Litera...
New approaches of Data Mining for the Internet of things with systems: Litera...New approaches of Data Mining for the Internet of things with systems: Litera...
New approaches of Data Mining for the Internet of things with systems: Litera...
IRJET Journal
 
Big DataParadigm, Challenges, Analysis, and Application
Big DataParadigm, Challenges, Analysis, and ApplicationBig DataParadigm, Challenges, Analysis, and Application
Big DataParadigm, Challenges, Analysis, and Application
Uyoyo Edosio
 
A Survey on Big Data Mining Challenges
A Survey on Big Data Mining ChallengesA Survey on Big Data Mining Challenges
A Survey on Big Data Mining Challenges
Editor IJMTER
 
LITERATURE SURVEY ON BIG DATA AND PRESERVING PRIVACY FOR THE BIG DATA IN CLOUD
LITERATURE SURVEY ON BIG DATA AND PRESERVING PRIVACY FOR THE BIG DATA IN CLOUDLITERATURE SURVEY ON BIG DATA AND PRESERVING PRIVACY FOR THE BIG DATA IN CLOUD
LITERATURE SURVEY ON BIG DATA AND PRESERVING PRIVACY FOR THE BIG DATA IN CLOUD
International Journal of Technical Research & Application
 
Introduction to Data Mining
Introduction to Data MiningIntroduction to Data Mining
Introduction to Data Mining
AbcdDcba12
 
Data Mining
Data MiningData Mining
Data Mining
Megha Sharma
 
Big Data Paradigm - Analysis, Application and Challenges
Big Data Paradigm - Analysis, Application and ChallengesBig Data Paradigm - Analysis, Application and Challenges
Big Data Paradigm - Analysis, Application and Challenges
Uyoyo Edosio
 
A review on Visualization Approaches of Data mining in heavy spatial databases
A review on Visualization Approaches of Data mining in heavy spatial databasesA review on Visualization Approaches of Data mining in heavy spatial databases
A review on Visualization Approaches of Data mining in heavy spatial databases
IOSR Journals
 
Big data
Big dataBig data
Big data
Abhishek Palo
 
Top 10 Read articles in Web & semantic technology
 Top  10 Read articles in Web & semantic technology Top  10 Read articles in Web & semantic technology
Top 10 Read articles in Web & semantic technology
dannyijwest
 
Data Science Innovations : Democratisation of Data and Data Science
Data Science Innovations : Democratisation of Data and Data Science  Data Science Innovations : Democratisation of Data and Data Science
Data Science Innovations : Democratisation of Data and Data Science
suresh sood
 
Data Mining
Data MiningData Mining
Data Mining
Mîrză MuNib
 

What's hot (16)

Issues, challenges, and solutions
Issues, challenges, and solutionsIssues, challenges, and solutions
Issues, challenges, and solutions
 
C03406021027
C03406021027C03406021027
C03406021027
 
RESEARCH IN BIG DATA – AN OVERVIEW
RESEARCH IN BIG DATA – AN OVERVIEWRESEARCH IN BIG DATA – AN OVERVIEW
RESEARCH IN BIG DATA – AN OVERVIEW
 
Elementary Concepts of data minig
Elementary Concepts of data minigElementary Concepts of data minig
Elementary Concepts of data minig
 
New approaches of Data Mining for the Internet of things with systems: Litera...
New approaches of Data Mining for the Internet of things with systems: Litera...New approaches of Data Mining for the Internet of things with systems: Litera...
New approaches of Data Mining for the Internet of things with systems: Litera...
 
Big DataParadigm, Challenges, Analysis, and Application
Big DataParadigm, Challenges, Analysis, and ApplicationBig DataParadigm, Challenges, Analysis, and Application
Big DataParadigm, Challenges, Analysis, and Application
 
A Survey on Big Data Mining Challenges
A Survey on Big Data Mining ChallengesA Survey on Big Data Mining Challenges
A Survey on Big Data Mining Challenges
 
LITERATURE SURVEY ON BIG DATA AND PRESERVING PRIVACY FOR THE BIG DATA IN CLOUD
LITERATURE SURVEY ON BIG DATA AND PRESERVING PRIVACY FOR THE BIG DATA IN CLOUDLITERATURE SURVEY ON BIG DATA AND PRESERVING PRIVACY FOR THE BIG DATA IN CLOUD
LITERATURE SURVEY ON BIG DATA AND PRESERVING PRIVACY FOR THE BIG DATA IN CLOUD
 
Introduction to Data Mining
Introduction to Data MiningIntroduction to Data Mining
Introduction to Data Mining
 
Data Mining
Data MiningData Mining
Data Mining
 
Big Data Paradigm - Analysis, Application and Challenges
Big Data Paradigm - Analysis, Application and ChallengesBig Data Paradigm - Analysis, Application and Challenges
Big Data Paradigm - Analysis, Application and Challenges
 
A review on Visualization Approaches of Data mining in heavy spatial databases
A review on Visualization Approaches of Data mining in heavy spatial databasesA review on Visualization Approaches of Data mining in heavy spatial databases
A review on Visualization Approaches of Data mining in heavy spatial databases
 
Big data
Big dataBig data
Big data
 
Top 10 Read articles in Web & semantic technology
 Top  10 Read articles in Web & semantic technology Top  10 Read articles in Web & semantic technology
Top 10 Read articles in Web & semantic technology
 
Data Science Innovations : Democratisation of Data and Data Science
Data Science Innovations : Democratisation of Data and Data Science  Data Science Innovations : Democratisation of Data and Data Science
Data Science Innovations : Democratisation of Data and Data Science
 
Data Mining
Data MiningData Mining
Data Mining
 

Viewers also liked

Improving power quality in microgrid by means of using
Improving power quality in microgrid by means of usingImproving power quality in microgrid by means of using
Improving power quality in microgrid by means of using
eSAT Publishing House
 
Analysis of power quality improvement in grid
Analysis of power quality improvement in gridAnalysis of power quality improvement in grid
Analysis of power quality improvement in grid
eSAT Publishing House
 
E hdas e- healthcare diagnosis & advisory system
E  hdas e- healthcare diagnosis & advisory systemE  hdas e- healthcare diagnosis & advisory system
E hdas e- healthcare diagnosis & advisory system
eSAT Publishing House
 
Properties of the cake layer in the ultrafiltration of polydisperse colloidal...
Properties of the cake layer in the ultrafiltration of polydisperse colloidal...Properties of the cake layer in the ultrafiltration of polydisperse colloidal...
Properties of the cake layer in the ultrafiltration of polydisperse colloidal...
eSAT Publishing House
 
Reconfigurable and versatile bil rc architecture design with an area and powe...
Reconfigurable and versatile bil rc architecture design with an area and powe...Reconfigurable and versatile bil rc architecture design with an area and powe...
Reconfigurable and versatile bil rc architecture design with an area and powe...
eSAT Publishing House
 
The association of predisposing and enabling factors
The association of predisposing and enabling factorsThe association of predisposing and enabling factors
The association of predisposing and enabling factors
eSAT Publishing House
 
The review on automatic license plate recognition
The review on automatic license plate recognitionThe review on automatic license plate recognition
The review on automatic license plate recognition
eSAT Publishing House
 
Low complexity digit serial fir filter by multiple constant multiplication al...
Low complexity digit serial fir filter by multiple constant multiplication al...Low complexity digit serial fir filter by multiple constant multiplication al...
Low complexity digit serial fir filter by multiple constant multiplication al...
eSAT Publishing House
 
Optimal cfo channel estimation for adaptive receiver design
Optimal cfo channel estimation for adaptive receiver designOptimal cfo channel estimation for adaptive receiver design
Optimal cfo channel estimation for adaptive receiver design
eSAT Publishing House
 
Implementation of pid control to reduce wobbling in a
Implementation of pid control to reduce wobbling in aImplementation of pid control to reduce wobbling in a
Implementation of pid control to reduce wobbling in a
eSAT Publishing House
 
Performance analysis of autonomous microgrid
Performance analysis of autonomous microgridPerformance analysis of autonomous microgrid
Performance analysis of autonomous microgrid
eSAT Publishing House
 
Design and implementation of secured scan based attacks on ic’s by using on c...
Design and implementation of secured scan based attacks on ic’s by using on c...Design and implementation of secured scan based attacks on ic’s by using on c...
Design and implementation of secured scan based attacks on ic’s by using on c...
eSAT Publishing House
 
Comparative study of tribological characteristics of
Comparative study of tribological characteristics ofComparative study of tribological characteristics of
Comparative study of tribological characteristics of
eSAT Publishing House
 
Smoke detection in video for early warning using
Smoke detection in video for early warning usingSmoke detection in video for early warning using
Smoke detection in video for early warning using
eSAT Publishing House
 
Design of all digital phase locked loop
Design of all digital phase locked loopDesign of all digital phase locked loop
Design of all digital phase locked loop
eSAT Publishing House
 
Comparative investigation of vlm codes for joined wing
Comparative investigation of vlm codes for joined wingComparative investigation of vlm codes for joined wing
Comparative investigation of vlm codes for joined wing
eSAT Publishing House
 
Wireless server for total healthcare system for clients
Wireless server for total healthcare system for clientsWireless server for total healthcare system for clients
Wireless server for total healthcare system for clients
eSAT Publishing House
 
Investigation of behaviour of 3 degrees of freedom
Investigation of behaviour of 3 degrees of freedomInvestigation of behaviour of 3 degrees of freedom
Investigation of behaviour of 3 degrees of freedom
eSAT Publishing House
 
A survey on incremental relaying protocols in cooperative communication
A survey on incremental relaying protocols in cooperative communicationA survey on incremental relaying protocols in cooperative communication
A survey on incremental relaying protocols in cooperative communication
eSAT Publishing House
 
Power system harmonic reduction using shunt active filter
Power system harmonic reduction using shunt active filterPower system harmonic reduction using shunt active filter
Power system harmonic reduction using shunt active filter
eSAT Publishing House
 

Viewers also liked (20)

Improving power quality in microgrid by means of using
Improving power quality in microgrid by means of usingImproving power quality in microgrid by means of using
Improving power quality in microgrid by means of using
 
Analysis of power quality improvement in grid
Analysis of power quality improvement in gridAnalysis of power quality improvement in grid
Analysis of power quality improvement in grid
 
E hdas e- healthcare diagnosis & advisory system
E  hdas e- healthcare diagnosis & advisory systemE  hdas e- healthcare diagnosis & advisory system
E hdas e- healthcare diagnosis & advisory system
 
Properties of the cake layer in the ultrafiltration of polydisperse colloidal...
Properties of the cake layer in the ultrafiltration of polydisperse colloidal...Properties of the cake layer in the ultrafiltration of polydisperse colloidal...
Properties of the cake layer in the ultrafiltration of polydisperse colloidal...
 
Reconfigurable and versatile bil rc architecture design with an area and powe...
Reconfigurable and versatile bil rc architecture design with an area and powe...Reconfigurable and versatile bil rc architecture design with an area and powe...
Reconfigurable and versatile bil rc architecture design with an area and powe...
 
The association of predisposing and enabling factors
The association of predisposing and enabling factorsThe association of predisposing and enabling factors
The association of predisposing and enabling factors
 
The review on automatic license plate recognition
The review on automatic license plate recognitionThe review on automatic license plate recognition
The review on automatic license plate recognition
 
Low complexity digit serial fir filter by multiple constant multiplication al...
Low complexity digit serial fir filter by multiple constant multiplication al...Low complexity digit serial fir filter by multiple constant multiplication al...
Low complexity digit serial fir filter by multiple constant multiplication al...
 
Optimal cfo channel estimation for adaptive receiver design
Optimal cfo channel estimation for adaptive receiver designOptimal cfo channel estimation for adaptive receiver design
Optimal cfo channel estimation for adaptive receiver design
 
Implementation of pid control to reduce wobbling in a
Implementation of pid control to reduce wobbling in aImplementation of pid control to reduce wobbling in a
Implementation of pid control to reduce wobbling in a
 
Performance analysis of autonomous microgrid
Performance analysis of autonomous microgridPerformance analysis of autonomous microgrid
Performance analysis of autonomous microgrid
 
Design and implementation of secured scan based attacks on ic’s by using on c...
Design and implementation of secured scan based attacks on ic’s by using on c...Design and implementation of secured scan based attacks on ic’s by using on c...
Design and implementation of secured scan based attacks on ic’s by using on c...
 
Comparative study of tribological characteristics of
Comparative study of tribological characteristics ofComparative study of tribological characteristics of
Comparative study of tribological characteristics of
 
Smoke detection in video for early warning using
Smoke detection in video for early warning usingSmoke detection in video for early warning using
Smoke detection in video for early warning using
 
Design of all digital phase locked loop
Design of all digital phase locked loopDesign of all digital phase locked loop
Design of all digital phase locked loop
 
Comparative investigation of vlm codes for joined wing
Comparative investigation of vlm codes for joined wingComparative investigation of vlm codes for joined wing
Comparative investigation of vlm codes for joined wing
 
Wireless server for total healthcare system for clients
Wireless server for total healthcare system for clientsWireless server for total healthcare system for clients
Wireless server for total healthcare system for clients
 
Investigation of behaviour of 3 degrees of freedom
Investigation of behaviour of 3 degrees of freedomInvestigation of behaviour of 3 degrees of freedom
Investigation of behaviour of 3 degrees of freedom
 
A survey on incremental relaying protocols in cooperative communication
A survey on incremental relaying protocols in cooperative communicationA survey on incremental relaying protocols in cooperative communication
A survey on incremental relaying protocols in cooperative communication
 
Power system harmonic reduction using shunt active filter
Power system harmonic reduction using shunt active filterPower system harmonic reduction using shunt active filter
Power system harmonic reduction using shunt active filter
 

Similar to A comprehensive survey on data mining

A COMPREHENSIVE SURVEY ON DATA MINING
A COMPREHENSIVE SURVEY ON DATA MININGA COMPREHENSIVE SURVEY ON DATA MINING
A COMPREHENSIVE SURVEY ON DATA MINING
Arlene Smith
 
A Clustering Based Approach for knowledge discovery on web.
A Clustering Based Approach for knowledge discovery on web.A Clustering Based Approach for knowledge discovery on web.
A Clustering Based Approach for knowledge discovery on web.
NIET Journal of Engineering & Technology (NIETJET)
 
Data Mining @ Information Age
Data Mining @ Information AgeData Mining @ Information Age
Data Mining @ Information Age
IIRindia
 
Isolating values from big data with the help of four v’s
Isolating values from big data with the help of four v’sIsolating values from big data with the help of four v’s
Isolating values from big data with the help of four v’s
eSAT Journals
 
A LITERATURE REVIEW ON DATAMINING
A LITERATURE REVIEW ON DATAMININGA LITERATURE REVIEW ON DATAMINING
A LITERATURE REVIEW ON DATAMINING
Carrie Romero
 
Fundamentals of data mining and its applications
Fundamentals of data mining and its applicationsFundamentals of data mining and its applications
Fundamentals of data mining and its applications
Subrat Swain
 
IRJET- Scope of Big Data Analytics in Industrial Domain
IRJET- Scope of Big Data Analytics in Industrial DomainIRJET- Scope of Big Data Analytics in Industrial Domain
IRJET- Scope of Big Data Analytics in Industrial Domain
IRJET Journal
 
Big data Paper
Big data PaperBig data Paper
Big data Paper
Daryaz Fares
 
A Deep Dissertion Of Data Science Related Issues And Its Applications
A Deep Dissertion Of Data Science  Related Issues And Its ApplicationsA Deep Dissertion Of Data Science  Related Issues And Its Applications
A Deep Dissertion Of Data Science Related Issues And Its Applications
Tracy Hill
 
A STUDY- KNOWLEDGE DISCOVERY APPROACHESAND ITS IMPACT WITH REFERENCE TO COGNI...
A STUDY- KNOWLEDGE DISCOVERY APPROACHESAND ITS IMPACT WITH REFERENCE TO COGNI...A STUDY- KNOWLEDGE DISCOVERY APPROACHESAND ITS IMPACT WITH REFERENCE TO COGNI...
A STUDY- KNOWLEDGE DISCOVERY APPROACHESAND ITS IMPACT WITH REFERENCE TO COGNI...
ijistjournal
 
Ijet v5 i6p12
Ijet v5 i6p12Ijet v5 i6p12
Semantic Web Mining of Un-structured Data: Challenges and Opportunities
Semantic Web Mining of Un-structured Data: Challenges and OpportunitiesSemantic Web Mining of Un-structured Data: Challenges and Opportunities
Semantic Web Mining of Un-structured Data: Challenges and Opportunities
CSCJournals
 
Data Mining – A Perspective Approach
Data Mining – A Perspective ApproachData Mining – A Perspective Approach
Data Mining – A Perspective Approach
IRJET Journal
 
A Review On Data Mining From Past To The Future
A Review On Data Mining From Past To The FutureA Review On Data Mining From Past To The Future
A Review On Data Mining From Past To The Future
Kaela Johnson
 
DEALING CRISIS MANAGEMENT USING AI
DEALING CRISIS MANAGEMENT USING AIDEALING CRISIS MANAGEMENT USING AI
DEALING CRISIS MANAGEMENT USING AI
IJCSEA Journal
 
DEALING CRISIS MANAGEMENT USING AI
DEALING CRISIS MANAGEMENT USING AIDEALING CRISIS MANAGEMENT USING AI
DEALING CRISIS MANAGEMENT USING AI
IJCSEA Journal
 
DEALING CRISIS MANAGEMENT USING AI
DEALING CRISIS MANAGEMENT USING AIDEALING CRISIS MANAGEMENT USING AI
DEALING CRISIS MANAGEMENT USING AI
IJCSEA Journal
 
1. Web Mining – Web mining is an application of data mining for di.docx
1. Web Mining – Web mining is an application of data mining for di.docx1. Web Mining – Web mining is an application of data mining for di.docx
1. Web Mining – Web mining is an application of data mining for di.docx
braycarissa250
 
Big Data Analytics
Big Data AnalyticsBig Data Analytics
Big Data Analytics
JOSEPH FRANCIS
 
Efficient Data Filtering Algorithm for Big Data Technology in Telecommunicati...
Efficient Data Filtering Algorithm for Big Data Technology in Telecommunicati...Efficient Data Filtering Algorithm for Big Data Technology in Telecommunicati...
Efficient Data Filtering Algorithm for Big Data Technology in Telecommunicati...
Onyebuchi nosiri
 

Similar to A comprehensive survey on data mining (20)

A COMPREHENSIVE SURVEY ON DATA MINING
A COMPREHENSIVE SURVEY ON DATA MININGA COMPREHENSIVE SURVEY ON DATA MINING
A COMPREHENSIVE SURVEY ON DATA MINING
 
A Clustering Based Approach for knowledge discovery on web.
A Clustering Based Approach for knowledge discovery on web.A Clustering Based Approach for knowledge discovery on web.
A Clustering Based Approach for knowledge discovery on web.
 
Data Mining @ Information Age
Data Mining @ Information AgeData Mining @ Information Age
Data Mining @ Information Age
 
Isolating values from big data with the help of four v’s
Isolating values from big data with the help of four v’sIsolating values from big data with the help of four v’s
Isolating values from big data with the help of four v’s
 
A LITERATURE REVIEW ON DATAMINING
A LITERATURE REVIEW ON DATAMININGA LITERATURE REVIEW ON DATAMINING
A LITERATURE REVIEW ON DATAMINING
 
Fundamentals of data mining and its applications
Fundamentals of data mining and its applicationsFundamentals of data mining and its applications
Fundamentals of data mining and its applications
 
IRJET- Scope of Big Data Analytics in Industrial Domain
IRJET- Scope of Big Data Analytics in Industrial DomainIRJET- Scope of Big Data Analytics in Industrial Domain
IRJET- Scope of Big Data Analytics in Industrial Domain
 
Big data Paper
Big data PaperBig data Paper
Big data Paper
 
A Deep Dissertion Of Data Science Related Issues And Its Applications
A Deep Dissertion Of Data Science  Related Issues And Its ApplicationsA Deep Dissertion Of Data Science  Related Issues And Its Applications
A Deep Dissertion Of Data Science Related Issues And Its Applications
 
A STUDY- KNOWLEDGE DISCOVERY APPROACHESAND ITS IMPACT WITH REFERENCE TO COGNI...
A STUDY- KNOWLEDGE DISCOVERY APPROACHESAND ITS IMPACT WITH REFERENCE TO COGNI...A STUDY- KNOWLEDGE DISCOVERY APPROACHESAND ITS IMPACT WITH REFERENCE TO COGNI...
A STUDY- KNOWLEDGE DISCOVERY APPROACHESAND ITS IMPACT WITH REFERENCE TO COGNI...
 
Ijet v5 i6p12
Ijet v5 i6p12Ijet v5 i6p12
Ijet v5 i6p12
 
Semantic Web Mining of Un-structured Data: Challenges and Opportunities
Semantic Web Mining of Un-structured Data: Challenges and OpportunitiesSemantic Web Mining of Un-structured Data: Challenges and Opportunities
Semantic Web Mining of Un-structured Data: Challenges and Opportunities
 
Data Mining – A Perspective Approach
Data Mining – A Perspective ApproachData Mining – A Perspective Approach
Data Mining – A Perspective Approach
 
A Review On Data Mining From Past To The Future
A Review On Data Mining From Past To The FutureA Review On Data Mining From Past To The Future
A Review On Data Mining From Past To The Future
 
DEALING CRISIS MANAGEMENT USING AI
DEALING CRISIS MANAGEMENT USING AIDEALING CRISIS MANAGEMENT USING AI
DEALING CRISIS MANAGEMENT USING AI
 
DEALING CRISIS MANAGEMENT USING AI
DEALING CRISIS MANAGEMENT USING AIDEALING CRISIS MANAGEMENT USING AI
DEALING CRISIS MANAGEMENT USING AI
 
DEALING CRISIS MANAGEMENT USING AI
DEALING CRISIS MANAGEMENT USING AIDEALING CRISIS MANAGEMENT USING AI
DEALING CRISIS MANAGEMENT USING AI
 
1. Web Mining – Web mining is an application of data mining for di.docx
1. Web Mining – Web mining is an application of data mining for di.docx1. Web Mining – Web mining is an application of data mining for di.docx
1. Web Mining – Web mining is an application of data mining for di.docx
 
Big Data Analytics
Big Data AnalyticsBig Data Analytics
Big Data Analytics
 
Efficient Data Filtering Algorithm for Big Data Technology in Telecommunicati...
Efficient Data Filtering Algorithm for Big Data Technology in Telecommunicati...Efficient Data Filtering Algorithm for Big Data Technology in Telecommunicati...
Efficient Data Filtering Algorithm for Big Data Technology in Telecommunicati...
 

More from eSAT Publishing House

Likely impacts of hudhud on the environment of visakhapatnam
Likely impacts of hudhud on the environment of visakhapatnamLikely impacts of hudhud on the environment of visakhapatnam
Likely impacts of hudhud on the environment of visakhapatnam
eSAT Publishing House
 
Impact of flood disaster in a drought prone area – case study of alampur vill...
Impact of flood disaster in a drought prone area – case study of alampur vill...Impact of flood disaster in a drought prone area – case study of alampur vill...
Impact of flood disaster in a drought prone area – case study of alampur vill...
eSAT Publishing House
 
Hudhud cyclone – a severe disaster in visakhapatnam
Hudhud cyclone – a severe disaster in visakhapatnamHudhud cyclone – a severe disaster in visakhapatnam
Hudhud cyclone – a severe disaster in visakhapatnam
eSAT Publishing House
 
Groundwater investigation using geophysical methods a case study of pydibhim...
Groundwater investigation using geophysical methods  a case study of pydibhim...Groundwater investigation using geophysical methods  a case study of pydibhim...
Groundwater investigation using geophysical methods a case study of pydibhim...
eSAT Publishing House
 
Flood related disasters concerned to urban flooding in bangalore, india
Flood related disasters concerned to urban flooding in bangalore, indiaFlood related disasters concerned to urban flooding in bangalore, india
Flood related disasters concerned to urban flooding in bangalore, india
eSAT Publishing House
 
Enhancing post disaster recovery by optimal infrastructure capacity building
Enhancing post disaster recovery by optimal infrastructure capacity buildingEnhancing post disaster recovery by optimal infrastructure capacity building
Enhancing post disaster recovery by optimal infrastructure capacity building
eSAT Publishing House
 
Effect of lintel and lintel band on the global performance of reinforced conc...
Effect of lintel and lintel band on the global performance of reinforced conc...Effect of lintel and lintel band on the global performance of reinforced conc...
Effect of lintel and lintel band on the global performance of reinforced conc...
eSAT Publishing House
 
Wind damage to trees in the gitam university campus at visakhapatnam by cyclo...
Wind damage to trees in the gitam university campus at visakhapatnam by cyclo...Wind damage to trees in the gitam university campus at visakhapatnam by cyclo...
Wind damage to trees in the gitam university campus at visakhapatnam by cyclo...
eSAT Publishing House
 
Wind damage to buildings, infrastrucuture and landscape elements along the be...
Wind damage to buildings, infrastrucuture and landscape elements along the be...Wind damage to buildings, infrastrucuture and landscape elements along the be...
Wind damage to buildings, infrastrucuture and landscape elements along the be...
eSAT Publishing House
 
Shear strength of rc deep beam panels – a review
Shear strength of rc deep beam panels – a reviewShear strength of rc deep beam panels – a review
Shear strength of rc deep beam panels – a review
eSAT Publishing House
 
Role of voluntary teams of professional engineers in dissater management – ex...
Role of voluntary teams of professional engineers in dissater management – ex...Role of voluntary teams of professional engineers in dissater management – ex...
Role of voluntary teams of professional engineers in dissater management – ex...
eSAT Publishing House
 
Risk analysis and environmental hazard management
Risk analysis and environmental hazard managementRisk analysis and environmental hazard management
Risk analysis and environmental hazard management
eSAT Publishing House
 
Review study on performance of seismically tested repaired shear walls
Review study on performance of seismically tested repaired shear wallsReview study on performance of seismically tested repaired shear walls
Review study on performance of seismically tested repaired shear walls
eSAT Publishing House
 
Monitoring and assessment of air quality with reference to dust particles (pm...
Monitoring and assessment of air quality with reference to dust particles (pm...Monitoring and assessment of air quality with reference to dust particles (pm...
Monitoring and assessment of air quality with reference to dust particles (pm...
eSAT Publishing House
 
Low cost wireless sensor networks and smartphone applications for disaster ma...
Low cost wireless sensor networks and smartphone applications for disaster ma...Low cost wireless sensor networks and smartphone applications for disaster ma...
Low cost wireless sensor networks and smartphone applications for disaster ma...
eSAT Publishing House
 
Coastal zones – seismic vulnerability an analysis from east coast of india
Coastal zones – seismic vulnerability an analysis from east coast of indiaCoastal zones – seismic vulnerability an analysis from east coast of india
Coastal zones – seismic vulnerability an analysis from east coast of india
eSAT Publishing House
 
Can fracture mechanics predict damage due disaster of structures
Can fracture mechanics predict damage due disaster of structuresCan fracture mechanics predict damage due disaster of structures
Can fracture mechanics predict damage due disaster of structures
eSAT Publishing House
 
Assessment of seismic susceptibility of rc buildings
Assessment of seismic susceptibility of rc buildingsAssessment of seismic susceptibility of rc buildings
Assessment of seismic susceptibility of rc buildings
eSAT Publishing House
 
A geophysical insight of earthquake occurred on 21 st may 2014 off paradip, b...
A geophysical insight of earthquake occurred on 21 st may 2014 off paradip, b...A geophysical insight of earthquake occurred on 21 st may 2014 off paradip, b...
A geophysical insight of earthquake occurred on 21 st may 2014 off paradip, b...
eSAT Publishing House
 
Effect of hudhud cyclone on the development of visakhapatnam as smart and gre...
Effect of hudhud cyclone on the development of visakhapatnam as smart and gre...Effect of hudhud cyclone on the development of visakhapatnam as smart and gre...
Effect of hudhud cyclone on the development of visakhapatnam as smart and gre...
eSAT Publishing House
 

More from eSAT Publishing House (20)

Likely impacts of hudhud on the environment of visakhapatnam
Likely impacts of hudhud on the environment of visakhapatnamLikely impacts of hudhud on the environment of visakhapatnam
Likely impacts of hudhud on the environment of visakhapatnam
 
Impact of flood disaster in a drought prone area – case study of alampur vill...
Impact of flood disaster in a drought prone area – case study of alampur vill...Impact of flood disaster in a drought prone area – case study of alampur vill...
Impact of flood disaster in a drought prone area – case study of alampur vill...
 
Hudhud cyclone – a severe disaster in visakhapatnam
Hudhud cyclone – a severe disaster in visakhapatnamHudhud cyclone – a severe disaster in visakhapatnam
Hudhud cyclone – a severe disaster in visakhapatnam
 
Groundwater investigation using geophysical methods a case study of pydibhim...
Groundwater investigation using geophysical methods  a case study of pydibhim...Groundwater investigation using geophysical methods  a case study of pydibhim...
Groundwater investigation using geophysical methods a case study of pydibhim...
 
Flood related disasters concerned to urban flooding in bangalore, india
Flood related disasters concerned to urban flooding in bangalore, indiaFlood related disasters concerned to urban flooding in bangalore, india
Flood related disasters concerned to urban flooding in bangalore, india
 
Enhancing post disaster recovery by optimal infrastructure capacity building
Enhancing post disaster recovery by optimal infrastructure capacity buildingEnhancing post disaster recovery by optimal infrastructure capacity building
Enhancing post disaster recovery by optimal infrastructure capacity building
 
Effect of lintel and lintel band on the global performance of reinforced conc...
Effect of lintel and lintel band on the global performance of reinforced conc...Effect of lintel and lintel band on the global performance of reinforced conc...
Effect of lintel and lintel band on the global performance of reinforced conc...
 
Wind damage to trees in the gitam university campus at visakhapatnam by cyclo...
Wind damage to trees in the gitam university campus at visakhapatnam by cyclo...Wind damage to trees in the gitam university campus at visakhapatnam by cyclo...
Wind damage to trees in the gitam university campus at visakhapatnam by cyclo...
 
Wind damage to buildings, infrastrucuture and landscape elements along the be...
Wind damage to buildings, infrastrucuture and landscape elements along the be...Wind damage to buildings, infrastrucuture and landscape elements along the be...
Wind damage to buildings, infrastrucuture and landscape elements along the be...
 
Shear strength of rc deep beam panels – a review
Shear strength of rc deep beam panels – a reviewShear strength of rc deep beam panels – a review
Shear strength of rc deep beam panels – a review
 
Role of voluntary teams of professional engineers in dissater management – ex...
Role of voluntary teams of professional engineers in dissater management – ex...Role of voluntary teams of professional engineers in dissater management – ex...
Role of voluntary teams of professional engineers in dissater management – ex...
 
Risk analysis and environmental hazard management
Risk analysis and environmental hazard managementRisk analysis and environmental hazard management
Risk analysis and environmental hazard management
 
Review study on performance of seismically tested repaired shear walls
Review study on performance of seismically tested repaired shear wallsReview study on performance of seismically tested repaired shear walls
Review study on performance of seismically tested repaired shear walls
 
Monitoring and assessment of air quality with reference to dust particles (pm...
Monitoring and assessment of air quality with reference to dust particles (pm...Monitoring and assessment of air quality with reference to dust particles (pm...
Monitoring and assessment of air quality with reference to dust particles (pm...
 
Low cost wireless sensor networks and smartphone applications for disaster ma...
Low cost wireless sensor networks and smartphone applications for disaster ma...Low cost wireless sensor networks and smartphone applications for disaster ma...
Low cost wireless sensor networks and smartphone applications for disaster ma...
 
Coastal zones – seismic vulnerability an analysis from east coast of india
Coastal zones – seismic vulnerability an analysis from east coast of indiaCoastal zones – seismic vulnerability an analysis from east coast of india
Coastal zones – seismic vulnerability an analysis from east coast of india
 
Can fracture mechanics predict damage due disaster of structures
Can fracture mechanics predict damage due disaster of structuresCan fracture mechanics predict damage due disaster of structures
Can fracture mechanics predict damage due disaster of structures
 
Assessment of seismic susceptibility of rc buildings
Assessment of seismic susceptibility of rc buildingsAssessment of seismic susceptibility of rc buildings
Assessment of seismic susceptibility of rc buildings
 
A geophysical insight of earthquake occurred on 21 st may 2014 off paradip, b...
A geophysical insight of earthquake occurred on 21 st may 2014 off paradip, b...A geophysical insight of earthquake occurred on 21 st may 2014 off paradip, b...
A geophysical insight of earthquake occurred on 21 st may 2014 off paradip, b...
 
Effect of hudhud cyclone on the development of visakhapatnam as smart and gre...
Effect of hudhud cyclone on the development of visakhapatnam as smart and gre...Effect of hudhud cyclone on the development of visakhapatnam as smart and gre...
Effect of hudhud cyclone on the development of visakhapatnam as smart and gre...
 

Recently uploaded

A high-Speed Communication System is based on the Design of a Bi-NoC Router, ...
A high-Speed Communication System is based on the Design of a Bi-NoC Router, ...A high-Speed Communication System is based on the Design of a Bi-NoC Router, ...
A high-Speed Communication System is based on the Design of a Bi-NoC Router, ...
DharmaBanothu
 
一比一原版(USF毕业证)旧金山大学毕业证如何办理
一比一原版(USF毕业证)旧金山大学毕业证如何办理一比一原版(USF毕业证)旧金山大学毕业证如何办理
一比一原版(USF毕业证)旧金山大学毕业证如何办理
uqyfuc
 
Introduction to Computer Networks & OSI MODEL.ppt
Introduction to Computer Networks & OSI MODEL.pptIntroduction to Computer Networks & OSI MODEL.ppt
Introduction to Computer Networks & OSI MODEL.ppt
Dwarkadas J Sanghvi College of Engineering
 
DESIGN AND MANUFACTURE OF CEILING BOARD USING SAWDUST AND WASTE CARTON MATERI...
DESIGN AND MANUFACTURE OF CEILING BOARD USING SAWDUST AND WASTE CARTON MATERI...DESIGN AND MANUFACTURE OF CEILING BOARD USING SAWDUST AND WASTE CARTON MATERI...
DESIGN AND MANUFACTURE OF CEILING BOARD USING SAWDUST AND WASTE CARTON MATERI...
OKORIE1
 
AN INTRODUCTION OF AI & SEARCHING TECHIQUES
AN INTRODUCTION OF AI & SEARCHING TECHIQUESAN INTRODUCTION OF AI & SEARCHING TECHIQUES
AN INTRODUCTION OF AI & SEARCHING TECHIQUES
drshikhapandey2022
 
This study Examines the Effectiveness of Talent Procurement through the Imple...
This study Examines the Effectiveness of Talent Procurement through the Imple...This study Examines the Effectiveness of Talent Procurement through the Imple...
This study Examines the Effectiveness of Talent Procurement through the Imple...
DharmaBanothu
 
一比一原版(psu学位证书)美国匹兹堡州立大学毕业证如何办理
一比一原版(psu学位证书)美国匹兹堡州立大学毕业证如何办理一比一原版(psu学位证书)美国匹兹堡州立大学毕业证如何办理
一比一原版(psu学位证书)美国匹兹堡州立大学毕业证如何办理
nonods
 
Ericsson LTE Throughput Troubleshooting Techniques.ppt
Ericsson LTE Throughput Troubleshooting Techniques.pptEricsson LTE Throughput Troubleshooting Techniques.ppt
Ericsson LTE Throughput Troubleshooting Techniques.ppt
wafawafa52
 
Object Oriented Analysis and Design - OOAD
Object Oriented Analysis and Design - OOADObject Oriented Analysis and Design - OOAD
Object Oriented Analysis and Design - OOAD
PreethaV16
 
Particle Swarm Optimization–Long Short-Term Memory based Channel Estimation w...
Particle Swarm Optimization–Long Short-Term Memory based Channel Estimation w...Particle Swarm Optimization–Long Short-Term Memory based Channel Estimation w...
Particle Swarm Optimization–Long Short-Term Memory based Channel Estimation w...
IJCNCJournal
 
Northrop Grumman - Aerospace Structures Overvi.pdf
Northrop Grumman - Aerospace Structures Overvi.pdfNorthrop Grumman - Aerospace Structures Overvi.pdf
Northrop Grumman - Aerospace Structures Overvi.pdf
takipo7507
 
Blood finder application project report (1).pdf
Blood finder application project report (1).pdfBlood finder application project report (1).pdf
Blood finder application project report (1).pdf
Kamal Acharya
 
Digital Twins Computer Networking Paper Presentation.pptx
Digital Twins Computer Networking Paper Presentation.pptxDigital Twins Computer Networking Paper Presentation.pptx
Digital Twins Computer Networking Paper Presentation.pptx
aryanpankaj78
 
Supermarket Management System Project Report.pdf
Supermarket Management System Project Report.pdfSupermarket Management System Project Report.pdf
Supermarket Management System Project Report.pdf
Kamal Acharya
 
Call Girls Goa (india) ☎️ +91-7426014248 Goa Call Girl
Call Girls Goa (india) ☎️ +91-7426014248 Goa Call GirlCall Girls Goa (india) ☎️ +91-7426014248 Goa Call Girl
Call Girls Goa (india) ☎️ +91-7426014248 Goa Call Girl
sapna sharmap11
 
Advancements in Automobile Engineering for Sustainable Development.pdf
Advancements in Automobile Engineering for Sustainable Development.pdfAdvancements in Automobile Engineering for Sustainable Development.pdf
Advancements in Automobile Engineering for Sustainable Development.pdf
JaveedKhan59
 
An In-Depth Exploration of Natural Language Processing: Evolution, Applicatio...
An In-Depth Exploration of Natural Language Processing: Evolution, Applicatio...An In-Depth Exploration of Natural Language Processing: Evolution, Applicatio...
An In-Depth Exploration of Natural Language Processing: Evolution, Applicatio...
DharmaBanothu
 
UNIT 4 LINEAR INTEGRATED CIRCUITS-DIGITAL ICS
UNIT 4 LINEAR INTEGRATED CIRCUITS-DIGITAL ICSUNIT 4 LINEAR INTEGRATED CIRCUITS-DIGITAL ICS
UNIT 4 LINEAR INTEGRATED CIRCUITS-DIGITAL ICS
vmspraneeth
 
一比一原版(osu毕业证书)美国俄勒冈州立大学毕业证如何办理
一比一原版(osu毕业证书)美国俄勒冈州立大学毕业证如何办理一比一原版(osu毕业证书)美国俄勒冈州立大学毕业证如何办理
一比一原版(osu毕业证书)美国俄勒冈州立大学毕业证如何办理
upoux
 
一比一原版(爱大毕业证书)爱荷华大学毕业证如何办理
一比一原版(爱大毕业证书)爱荷华大学毕业证如何办理一比一原版(爱大毕业证书)爱荷华大学毕业证如何办理
一比一原版(爱大毕业证书)爱荷华大学毕业证如何办理
nedcocy
 

Recently uploaded (20)

A high-Speed Communication System is based on the Design of a Bi-NoC Router, ...
A high-Speed Communication System is based on the Design of a Bi-NoC Router, ...A high-Speed Communication System is based on the Design of a Bi-NoC Router, ...
A high-Speed Communication System is based on the Design of a Bi-NoC Router, ...
 
一比一原版(USF毕业证)旧金山大学毕业证如何办理
一比一原版(USF毕业证)旧金山大学毕业证如何办理一比一原版(USF毕业证)旧金山大学毕业证如何办理
一比一原版(USF毕业证)旧金山大学毕业证如何办理
 
Introduction to Computer Networks & OSI MODEL.ppt
Introduction to Computer Networks & OSI MODEL.pptIntroduction to Computer Networks & OSI MODEL.ppt
Introduction to Computer Networks & OSI MODEL.ppt
 
DESIGN AND MANUFACTURE OF CEILING BOARD USING SAWDUST AND WASTE CARTON MATERI...
DESIGN AND MANUFACTURE OF CEILING BOARD USING SAWDUST AND WASTE CARTON MATERI...DESIGN AND MANUFACTURE OF CEILING BOARD USING SAWDUST AND WASTE CARTON MATERI...
DESIGN AND MANUFACTURE OF CEILING BOARD USING SAWDUST AND WASTE CARTON MATERI...
 
AN INTRODUCTION OF AI & SEARCHING TECHIQUES
AN INTRODUCTION OF AI & SEARCHING TECHIQUESAN INTRODUCTION OF AI & SEARCHING TECHIQUES
AN INTRODUCTION OF AI & SEARCHING TECHIQUES
 
This study Examines the Effectiveness of Talent Procurement through the Imple...
This study Examines the Effectiveness of Talent Procurement through the Imple...This study Examines the Effectiveness of Talent Procurement through the Imple...
This study Examines the Effectiveness of Talent Procurement through the Imple...
 
一比一原版(psu学位证书)美国匹兹堡州立大学毕业证如何办理
一比一原版(psu学位证书)美国匹兹堡州立大学毕业证如何办理一比一原版(psu学位证书)美国匹兹堡州立大学毕业证如何办理
一比一原版(psu学位证书)美国匹兹堡州立大学毕业证如何办理
 
Ericsson LTE Throughput Troubleshooting Techniques.ppt
Ericsson LTE Throughput Troubleshooting Techniques.pptEricsson LTE Throughput Troubleshooting Techniques.ppt
Ericsson LTE Throughput Troubleshooting Techniques.ppt
 
Object Oriented Analysis and Design - OOAD
Object Oriented Analysis and Design - OOADObject Oriented Analysis and Design - OOAD
Object Oriented Analysis and Design - OOAD
 
Particle Swarm Optimization–Long Short-Term Memory based Channel Estimation w...
Particle Swarm Optimization–Long Short-Term Memory based Channel Estimation w...Particle Swarm Optimization–Long Short-Term Memory based Channel Estimation w...
Particle Swarm Optimization–Long Short-Term Memory based Channel Estimation w...
 
Northrop Grumman - Aerospace Structures Overvi.pdf
Northrop Grumman - Aerospace Structures Overvi.pdfNorthrop Grumman - Aerospace Structures Overvi.pdf
Northrop Grumman - Aerospace Structures Overvi.pdf
 
Blood finder application project report (1).pdf
Blood finder application project report (1).pdfBlood finder application project report (1).pdf
Blood finder application project report (1).pdf
 
Digital Twins Computer Networking Paper Presentation.pptx
Digital Twins Computer Networking Paper Presentation.pptxDigital Twins Computer Networking Paper Presentation.pptx
Digital Twins Computer Networking Paper Presentation.pptx
 
Supermarket Management System Project Report.pdf
Supermarket Management System Project Report.pdfSupermarket Management System Project Report.pdf
Supermarket Management System Project Report.pdf
 
Call Girls Goa (india) ☎️ +91-7426014248 Goa Call Girl
Call Girls Goa (india) ☎️ +91-7426014248 Goa Call GirlCall Girls Goa (india) ☎️ +91-7426014248 Goa Call Girl
Call Girls Goa (india) ☎️ +91-7426014248 Goa Call Girl
 
Advancements in Automobile Engineering for Sustainable Development.pdf
Advancements in Automobile Engineering for Sustainable Development.pdfAdvancements in Automobile Engineering for Sustainable Development.pdf
Advancements in Automobile Engineering for Sustainable Development.pdf
 
An In-Depth Exploration of Natural Language Processing: Evolution, Applicatio...
An In-Depth Exploration of Natural Language Processing: Evolution, Applicatio...An In-Depth Exploration of Natural Language Processing: Evolution, Applicatio...
An In-Depth Exploration of Natural Language Processing: Evolution, Applicatio...
 
UNIT 4 LINEAR INTEGRATED CIRCUITS-DIGITAL ICS
UNIT 4 LINEAR INTEGRATED CIRCUITS-DIGITAL ICSUNIT 4 LINEAR INTEGRATED CIRCUITS-DIGITAL ICS
UNIT 4 LINEAR INTEGRATED CIRCUITS-DIGITAL ICS
 
一比一原版(osu毕业证书)美国俄勒冈州立大学毕业证如何办理
一比一原版(osu毕业证书)美国俄勒冈州立大学毕业证如何办理一比一原版(osu毕业证书)美国俄勒冈州立大学毕业证如何办理
一比一原版(osu毕业证书)美国俄勒冈州立大学毕业证如何办理
 
一比一原版(爱大毕业证书)爱荷华大学毕业证如何办理
一比一原版(爱大毕业证书)爱荷华大学毕业证如何办理一比一原版(爱大毕业证书)爱荷华大学毕业证如何办理
一比一原版(爱大毕业证书)爱荷华大学毕业证如何办理
 

A comprehensive survey on data mining

  • 1. IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308 _______________________________________________________________________________________ Volume: 03 Issue: 08 | Aug-2014, Available @ http://www.ijret.org 185 A COMPREHENSIVE SURVEY ON DATA MINING Kautkar Rohit A1 1M. Tech Scholar, Computer science and technology, Maharashtra Institute of Technology (MIT), Aurangabad, Maharashtra, India Abstract Now a day’s internet is a significant place for interchanging of data like text, images, audio, and video and for share-out information preferably in digital form. The usage of internet leads to accessing the immense amount of data. Data may be unstructured data, structure data, and semi-structured data. So we are storing and processing such vast amount of data having gigantic complexity. Researchers from the University of Berkeley estimate that every year about one Exabyte (= 1 Million Terabyte) of data brought forth, of which a large portion is in digital form. For ex., 100 hours of videos are uploaded on YouTube every minute (statistics from YouTube). The query is how to analyze such ample data effectively and efficiently? Answer is data mining. Data mining pertains to the process of analyzing, studying such declamatory quantity of data for witnessing useful patterns and knowledge. Today we have accumulation of large amount of data but deficiency of knowledge. In this paper our focusing on surveillance of a nimble arising field data mining which is also known as knowledge discovery from data (KDD). We are constituting fundamentals of data mining, also several strategies for analyzing data like classification, estimation, prediction, association rules, clustering etc., data mining process, its types like web, content and structure mining. After understating basics, we are presenting the different data mining models like decision tree, neural network with their bedrocks. Also presents real world applications and future scope of the dynamic and extraordinary discipline. Keywords: - Internet, Data, Unstructured Data, Structure Data, Semi-Structured Data, Data Mining, Knowledge Discovery from Data (KDD) --------------------------------------------------------------------***---------------------------------------------------------------------- 1. INTRODUCTION Now day’s internet is a crucial place for interchanging the information and for sharing, dealing out the digital contents like text, audio, video, images etc. In that social networking sites acts crucial role like face book (www.facebook.com), LinkedIn micro blogging sites like Twitter (www.twitter.com) for sharing digital contents. So they produce, process huge and gigantic quantity of data every day in fact every minute. For example According to public statistics more than 48 hours of video contents are uploaded every minute and billions views are rendered every day on YouTube (www.youtube.com). Researchers from the University of Berkeley estimate that every year about 1 Exabyte (= 1 Million Terabyte) of data are generated, of which a large portion is available in digital form [3]. For goes on updates it connects to social networking sites like Face book, Twitter etc. so amount of data produce is quite vast and may have size in millions megabytes (or more) and complex in structure. So it leads to use highly efficient, advanced tools and techniques for intension of analyzing, processing such data. Analyzing and processing of data allow comprehending useful information and knowledge about the data. The term "Data mining" is acquainted in 1990s [12]. So inquisitory for knowledge in data is nothing but data mining [13]. Mining is important since it grant learning about versatile trends lives in data [18]. For example, Google search engine which accept and process huge amount of information every minute. If it just have a large accumulation of data and do not have knowledge regarding it? Then it’s pathetic. On other hand, if we employ data mining tools and techniques then we may find practicable patterns and trends. Google demonstrates different current trends according to geographical area which is example of gaining knowledge from user submitted queries. Search engines and social networking sites like Google and face book produce, store and process huge amount of data in terms of volume, velocity and variety denoted as big data. Big data integrates structured, semi-structured and structured data. So in reality data mining go through big data which in huge in size and complex in structure. The ambitious matter is that, it is continuously and rapidly increasing. There are several reasons for increasing the data on internet in rapid fashion. The use of internet leads to access, process and creation of data. For example, the use of face book leads sharing or uploading any images, videos. It’s nothing but creation of data. Another example is use of twitter. When we post any twit then it contributes to creation of Data. The Company’s which deals with huge amount of data everyday for them the field of data mining is vital.
  • 2. IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308 _______________________________________________________________________________________ Volume: 03 Issue: 08 | Aug-2014, Available @ http://www.ijret.org 186 The data over internet may available in different types. Figure 1 depicted its types like structure, semi-structure and structure. When we have vast amount of data then it produces big data which creates another challenge in every aspect. The sources of data may be internal or external. Such data analyzed by predefined tools and technique referred as data mining. It allows discovering utile trends and patterns in data. Fig 1: Types of Data The simplest example of pattern, milk can be purchased with bread or computer can be purchased with Antivirus software [13]. According to Gartner group (www.gartner.com), data mining is the process of discovering meaningful correlations, patterns and trends by shifting large amount of data store in repositories, using pattern recognition technologies as well as statistical and mathematical techniques [12]. We have large amount of data available but don’t have knowledge concerning it. So the data mining imparts a way for experiencing knowledge from data. For example, Wall mart a megastore, which records millions of transaction every day, give rise to creation of huge data [13]. So it we don’t have a satisfying way to canvass it then its utility will be atrophied. So the usefulness can be enhanced by analyzing it with different tools and techniques. At present, the data mining established its grandness in field of information retrieval and processing and there is no uncertainty that in near future, it will keep arising. Data mining also called Knowledge discovery in Database (KDD) i.e. the process of exploring useful patterns and knowledge from data. Knowledge discovery is the treatment of analyzing data for future projection [21]. To gratify the demand of analyzing the available large volume data sets, more efficiently the data mining employs diverse fields like Machine learning, Information retrieval, Statistics, and visualization. Figure 2 shows; there are two primal intentions of data mining process- prediction and description. Prediction permit to influence the stranger or future values of pursuit where as description grant noticing patterns and describing the data [6]. The Meta Group estimates that the market size for data mining market will grow from $50 million in 1996 to $800 million by 2000 [11]. As shown in figure 3, the consolidation of diverse fields leads to new dynamic, encouraged area data mining. The data mining construct is actually formulated from statistics and machine learning. It also consolidates artificial intelligence, Information retrieval, visualization etc. We can conflate data mining with machine learning techniques for deceitful usage of cellular telephones grounded on profiling customers [29]. Fundamentally machines ascertain from data collected in past like humans learns from past experience and every time uses yesteryear experience to perform more beneficial. It is similar to supervised learning also referred as classification. It uses classification algorithms like decision tree or naive bayes [14]. When genuine utile pattern detected, then its usefulness can be determined with assist of statistics. We may visualize our data that to analyzed, the data visualization is ministrant for discovering interesting patterns easily. Basically computers cater easy mode to view data in more powerful and efficient manner. Now there are more powerful visualization tools are available [18]. The visualization tools can help for discovering approximate idea about the kind of data and its quality, also to anticipate where the useful patterns will situate [30]. At start-of paper, we stated, the internet is place for interchanging the digital contents over web. So it having gigantic size of information available and to explore accurate information in which user is interested is challenge. The search is most prominent practical application of web. The field which is participating in this problem is Text mining. The target of Information retrieval system is to elicit documents pertained to user query (a set of keywords) from huge amount of available information. The user’s information demand can be provide in form of query to Search engines [14]. Text analysis is performed for ordering documents. After exploring documents, their degree of matching keyword query is used to rank those [19]. Different commercial algorithms used by search engines. Basically Information Retrieval (IR) Systems are employs Information agents [30] which search for appropriate documents. The challenge is retrieve relevant documents to user query with fast response. Also the web pages are not just text; they also with hyperlinks and anchored text [14]. Data mining system can be very complex or simple as it integrates different arenas. All subfields are important in data mining as they grant constructing solution to a greater extent complex problem. It also support for miscellany of data mining system [13]. So data mining system can be class based on measures like kind of database used for mining, kinds of knowledge mined and levels of abstraction of knowledge mined. Data mining system can be constructed based on what type of pattern we are exploring, regular or irregular [13]. Statistics are useful for assess data. The data may continuous or discrete. So different themes used to measure continuous data like average, mean, median, mode. To measure discrete data, histograms can be used which basically represent single values [31].
  • 3. IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308 _______________________________________________________________________________________ Volume: 03 Issue: 08 | Aug-2014, Available @ http://www.ijret.org 187 Fig 2: Data Mining Process Goal Fig 3: Association of data mining with other fields 2. RELATED WORK Various data mining techniques that allow extracting unidentified relationships among the data items from large data collection that are useful for decision making [24]. Techniques for performing clustering on supervised data proposed [23]. Data mining techniques can play an important role in rule refinement even if the sample size is limited [8]. Artificial neural network models are proposed for imitate the human brain model. Decision tree allow segmenting the data in to appropriate groups. Both artificial neural network and decision tree have idea where to look for patterns in data. The use of market basket analysis & clustering techniques does not require any knowledge about relationships in the data, knowledge is discovered when these techniques are applied to the data. Market basket analysis tools sift through data to let retailers know what products are being purchased together. Clusters prove to be most useful when they are integrated into a marketing strategy [8]. Different marketing schemes shows that the firms profit can be increased by focusing on different marketing resources [25]. Apriori Algorithm [26] for data mining for frequent pattern mining proposed which uses prior knowledge about data. Different approaches to automatic construction of web pages by using log file data proposed [27]. The expected location of web pages can be handle page caching by browser. For this, algorithm automatically discovers pages in a website whose location is different from where visitors expect to find them [28]. The concept of a KDDM process model was primitively devised in 1989. Initial KD systems provided only a single DM technique, such as a decision tree or clustering algorithm, with a very imperfect support for the overall process framework. Initially it was using exclusive Data mining technique like association rule or clustering or decision tree. In Advances Knowledge Discovery and Data Mining define the process as being highly interactive and complex [24]. They also hint that KDDM process may comprise aggregation of Data mining techniques for processing and observing more complex patterns [10]. 3. DATA MINING STRATEGIES AND PROCESS 3.1 Data Mining Strategies There are following strategies for analyzing data items- 1. Classification in which items are ordered /classified in to usable target classes. 2. Estimation allows ascertaining unidentified output variables. 3. Prediction allows determining the future consequence. It is similar to classification and estimation. 4. Association Rules allow analyzing the big data sets for detecting useful patterns and relationships between data items. Masses of applications are consume association rule concept. 5. Clustering allow grouping of items according to alike properties and behaviors. There are various algorithms are available for clustering like K-means. 3.2. Data Mining Process Exploring utile patterns and knowledge from available data invokes fuse of steps as depicted in figure 4, like [13]- 1. Data Cleaning As we go through, data is collected from internal or external sources and may contain noise and inconsistent information. Data cleaning phase allow getting rid of noise and inconsistent data. When data set is large, then data cleaning is time demanding process. 2. Data integration The data may available in scattered structure. So data integration phase allow to blend it from different sources. Phase 1 and 2 are considered as preprocessing phases and accompanying data may be stored in to data warehouse. (Not mandatory).
  • 4. IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308 _______________________________________________________________________________________ Volume: 03 Issue: 08 | Aug-2014, Available @ http://www.ijret.org 188 Fig 4: Data Mining Process [22] 3. Data selection Allow retrieving data from database for analyzing. According to problem domain, the different data sources can be selected. 4. Data transformation The data transformed into the form which will be appropriate for processing and analyzing. 5. Data mining Actual grasping of useful knowledge and patterns in data is achieved by data mining. 6. Pattern evaluation In this, noticing truly interesting patterns, various standards like lift, support, confidence etc. can be used. 7. Knowledge presentation Visualizing knowledge and representation technique can be used to constitute results. 4. DATA MINING MODELS There are many different data models are available for figuring out the data mining problems. The models are like decision tree, neural network, naive bayes, classifiers, lazy learners etc. 4.1 Decision Trees It is also mentioned as classification tree. The decision tree allows distinguishing data and classifying it into different subdivisions or groups. As shown in Figure 5, it follows the divide and conquers scheme. As it is decision tree it permits drawing decisions by trialing different potential considerations. The node of tree depicting the testing of attributes values. It allows testing of condition with several relational and comparison operators. Decision tree can be used for both description and prediction. Various algorithms are available like ID5, C4.5 and C5 etc. Decision tree act upon data that can be presents using table where each row is keyed out as instance shown in Table 1. Various useful patterns can be extracted from table for learning and knowledge intention as shown in figure 5. Fig 5: Decision Tree Decision trees are interesting as they are based on rules. They are used to explore relationships amongst the data items. The decision trees are structure used to divide large data set in to small sets based on rules. Rules can be written into English language [31]. Table 1: Table with Different Instance of Data Sr.No Employee Name Age Gender Salary Item Buy 1 Ram 30 M 40000 TV 2 Chetan 35 M 52000 {TV - >VCD Player} 3 Prasad 25 M 25000 {Laptop- >Antivirus Software} 4 Sangeeta 45 F 45000 {Laptop- >Antivirus software} 4.2 Neural Network The concept of neural network is fundamentally inherited from the construction of human brain. Basically the neural networks are used for acquiring knowledge from various data sets and amend the performance of system. In Neural Network the knowledge is presented using interconnected processors addressed neurons. Each neuron consist weight value and with defined function. [1, 5] As indicated in figure 6, there are different input vectors(x1, x2…..xm) with consorted weights (w1, w2…..wm) and possible outcome positive or negative. The neurons are cultivated to reacts each input vector. The vectors from training set is applied over Neural Network, if the output is right then no alteration is made otherwise the weights and bias are adjusted. When the NN is fully trained, then it will classify the applied input vector by using its knowledge. Different classifiers can be used to classify the output.
  • 5. IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308 _______________________________________________________________________________________ Volume: 03 Issue: 08 | Aug-2014, Available @ http://www.ijret.org 189 Fig 6: Neural Network 4.3 Association Rules Association rules basically give the relation between different attributes and their values. Gobs of useful associations can be explored by analyzing the data sets. Association rule utilizes the concept of support and confidence to determine the usefulness of rule. Patterns are practicable for probing the buying habits of customer for business analysis. Support for items X1 and Y1 for particular transaction can be given by percentage of transactions from data set which let in both items i.e. X1 U Y1. Confidence for association rule X1->Y1 is given by percentage of transaction from data set containing X1 also with Y1 [13]. The rule is genuinely useful if it has support and confidence value above the defined support count for that dataset. The support count can fix by domain expert [13]. 4.4 Naive Bayes The naive bayes constructs model that predict the probability of particular outcome. The possible outcomes can be predicted by finding patterns and correlations between the data instances. Afterward it picks out the data mining model for representation of the patterns and relations. 4.5 Machine Learning and Statistics As we know the data mining field extensively uses the concepts of machine learning and statistics for mining complex patterns. The statistics and machine learning are really much pertained arenas and in much sense depend on each other. The statistics are basically used for constructing the hypothesis where the hypothesis is used in the machine learning for generalization process. 5. TYPES OF MINING 5.1 Web Mining The web mining grants eliciting information from web documents and services using data mining techniques. Web mining term devised by Etzioni [4]. Internet comprises immense volume of data and is the most declamatory data source for acquiring and apportioning information. The web mining pertains to the discovering useful information from the web hyperlink structure, web page contents and web usage data. There are three classes as shown in figure 7 [4]- Fig 7: Web Mining Types 5.1.1 Web Structure Mining Web structure mining has to do with discovering the useful knowledge is from the hyperlinks. Basically, hyperlinks represent structure of web. Here, we are interested in structure of web documents and also in structure of hyperlinks [4]. It as well applies concept of social network analysis in which the graph can be used to represent the structure. For example, on twitter, the graph based database (FLOCK DB) can be utilized to represent users with their followers. 5.1.2 Web Content Mining Basically it grants to extract the useful patterns and knowledge from the web page contents. It is based on traditional mining process which allows classifying and clustering web pages (referred as items) according to their types of contents [4].There are two different standpoints are available in concern with web content mining- 1) Information Retrieval 2) DB View Basically information retrieval is for semi structured and unstructured data. It permits querying the data and finding useful information. Database view is basically for the structured information having proper organization for storage and queries. 5.1.3 Web Usage Mining Web usage mining rivets the technique that could predict user behavior whenever the user interacts with web. The data is viewed in terms of interactivity with web application. The web usage mining basically uses the web logs and server logs for analyzing it. The usage data can be depicted with relational tables or graphs. The usage mining is applies concepts like association rules, machine learning etc. for mining aim. The usage mining is important at many places like web application construction, marketing etc.
  • 6. IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308 _______________________________________________________________________________________ Volume: 03 Issue: 08 | Aug-2014, Available @ http://www.ijret.org 190 The web usage mining is useful in learning the user profile for providing better personalized experience. It is also helpful in the dynamic recommendation systems. Web usage mining permits to study user access patterns for giving better experience [19]. The issue of privacy arises; it exposes the privacy because the user profile data is used for clustering and analyzing the user profiles. 6. APPLICATIONS OF DATA MINING The data mining is applicable in panoptic and diverse areas. The data mining tools and techniques are really play important role in fields where data is integral, highly sensitive and important [3]. 1. Retail Industry Huge amount of data is generated and stored. Data may comprise transaction details of sales, customer shopping etc. The stored data can be utilized for ascertaining the buying habits of customers, ameliorating the business, supervising the shelf space etc. 2. Telecommunication Industry In telecommunication industry it can be useful for improving service quality, making better use of resources etc. 3. Biological Data Analysis The data mining is useful here, to interpret the biological sequence and structure. 4. Semantic web The web pages are analyzed and organized according to their types of contents. It utilizes the proficiency referred as Resource description framework (RDF) for classifying web pages. The RDF is implemented at lots of websites like orkut, face book for tagging purpose. 5. Business Trends To serve customer more accurately, quickly and efficiently the data mining can be used. 6. Financial Data Analysis Data collected from different sources of financial sectors and analyzed using data mining. 7. Sports Many games played worldwide. So on each day many games are planned and played. It causes the vast amount of data creation. The data about each game and player can be analyzed and used to predict the performance of player. 8. Manufacturing Process In manufacturing the data mining is useful for finding faults in manufacturing process and those faults can be amended. The data is collected from the manufacturing system. 7. CONCLUSIONS This study paper canvassed the extremely dynamic and substantial area data mining. In this paper we confronted, the basics of data mining, why data mining is important, different strategies for data mining like classification, estimation, prediction, clustering, and association rules. Also we studied that many applications now day’s based on strategies association rule for mining various trends. Also data mining process describes different phases in order to find the useful patterns and knowledge. It too introduces various data mining models like decision tree which allow making decision, NN which mimic the human brain structure and used for mining patterns more accurately, association rules for exploring relationships between data items. Naive bayes, machine learning and statistics are original ancestral of data mining field. The important portion of paper gives the type of web mining like structure mining, usage mining and content mining with their description. At long last it presents the diverse areas uses the data mining. We also resolve that analyzing the huge amount of data with prominent complexity is challenging chore and data mining supplies a direction to deal with such data efficiently and effectively. The data mining is one of the spectacular and challenging field today’s date and no doubts that, hereafter it will play vital primal in many areas. ACKNOWLEDGEMENTS I am enormously grateful to Prof. Mrs. K.V. Bhosale, Prof. V. Kala, Prof. R.B. Mapari, and Prof. Sonavane of Computer science and technology Department, MIT, Aurangabad. As well thankful to Prof. M. M. Goswami, Prof. P. Ghotkar, Prof. Mrs. R.D. Kalambe, Prof. Mrs. M.S. Arade, Prof. Miss. N. S. Nikale, Prof. Mrs. S. P. Dudhe of Government Polytechnic, Nashik. I genuinely thankful to Prof. J. Gorane,Prof. N. A. Anwat who directed me in right way for penning above work. REFERENCES [1] Singh, Y., & Chauhan, A. S. (2009). Neural Networks In Data Mining. Journal Of Theoretical And Applied Information Technology, 5(6), Pages 36-42. [2] Keim, D. A. (2002). Information Visualization and Visual Data Mining. Visualization and Computer Graphics, IEEE Transactions On, Volume 8(1), Pages 1-8. [3] Silwattananusarn, T., & Tuamsuk, K. (2012). Data Mining and Its Applications for Knowledge Management: A Literature Review from 2007 to 2012. Arxiv Preprint Arxiv: 1210.2872. [4] Kosala, R., & Blockeel, H. (2000). Web Mining Research: A Survey. ACM Sigkdd Explorations Newsletter, Volume 2(1), Pages 1-15. [5] Lu, H., Setiono, R., & Liu, H. (1996). Effective Data Mining Using Neural Networks. Knowledge and Data Engineering, IEEE Transactions On, Volume 8(6), Pages 957-961. [6] Chaudhary, R., Singh, P., & Mahajan, R. A SURVEY ON DATA MINING TECHNIQUES. [7] Jain, N., & Srivastava, V. (2013). DATA MINING TECHNIQUES: A SURVEY PAPER. IJRET: International Journal of Research in Engineering and Technology, Volume 2(11).
  • 7. IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308 _______________________________________________________________________________________ Volume: 03 Issue: 08 | Aug-2014, Available @ http://www.ijret.org 191 [8] Hilage, T. A., & Kulkarni, R. V. (2012). Review of Literature on Data Mining. IJRRAS, Volume 10(1), Pages 107-114. [9] Mann, A. K., & Kaur, N. (2013). Survey Paper on Clustering Techniques. International Journal of Science, Engineering and Technology Research, Volume 2(4), Pp-0803. [10] Kurgan, L. A., & Musilek, P. (2006). A Survey Of Knowledge Discovery And Data Mining Process Models. The Knowledge Engineering Review, Volume 21(01), Pages 1-24. [11] Goebel, M., & Gruenwald, L. (1999). A Survey of Data Mining and Knowledge Discovery Software Tools. ACM SIGKDD Explorations Newsletter, Volume 1(1), Pages 20-33. [12] Sharma, M. Data Mining: A Literature Survey. [13] Han, J., & Kamber, M. (2006). Data Mining, Southeast Asia Edition: Concepts and Techniques. Morgan Kaufmann. [14] Liu, B. (2007). Web Data Mining. Springer-Verlag Berlin Heidelberg. [15] Bramer, M. (2013). Principles of Data Mining. Springer. [16] Witten, I. H., & Frank, E. (2005). Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann. [17] Wattenhofer, M., Wattenhofer, R., & Zhu, Z. (2012, June). The Youtube Social Network. In ICWSM. [18] Hand, D. J., Mannila, H., & Smyth, P. (2001). Principles of Data Mining. MIT Press. [19] Markov, Z., & Larose, D. T. (2007). Data Mining the Web: Uncovering Patterns in Web Content, Structure, and Usage. John Wiley & Sons. [20] Larose, D. T. (2006). Data Mining Methods & Models. John Wiley & Sons. [21] Bhosle, K. (2010). KDS for Sericulture Cocoon Production. International Journal of Computer Applications, Volume 1(18), Pages 86-90. [22] Raval, K. M. Data Mining Techniques. International Journal of Advanced Research in Computer Science and Software Engineering, Volume 2(10), Pages 439-442. [23] Ahmed, S., Coenen, F., & Leng, P. (2006). Tree- based partitioning of date for association rule mining. Knowledge and information systems, Volume 10(3), Pages 315-331. [24] Fayyad, U. M. (1996). Data mining and knowledge discovery: Making sense out of data. IEEE Intelligent Systems, Volume 11(5), Pages 20-25. [25] Peppers, D. (1993). The one to one future: Building relationship one customer at a time. [26] Agrawal, R., & Srikant, R. (1994, September). Fast algorithms for mining association rules. In Proc. 20th int. conf. very large data bases, VLDB (Volume. 1215, Pages. 487-499). [27] Perkowitz, M., & Etzioni, O. (1999, July). Adaptive web sites: Conceptual cluster mining. In IJCAI (Volume. 99, Pages. 264-269). [28] Srikant, R., & Yang, Y. (2001, April). Mining web logs to improve website organization. In Proceedings of the 10th international conference on World Wide Web (Pages. 430-437). ACM. [29] Fawcett, T., & Provost, F. J. (1996, August). Combining Data Mining and Machine Learning for Effective User Profiling. In KDD (Pages. 8-13). [30] Sumathi, S., & Sivanandam, S. N. (2006). Introduction to data mining and its applications. Springer. [31] Berry, M. J., & Linoff, G. (1997). Data mining techniques: for marketing, sales, and customer support. John Wiley & Sons, Inc. BIOGRAPHIES Kautkar Rohit A, Pursuing M. Tech From MIT, Aurangabad. Since last three years he is assisting as a lecturer in Computer Technology at Government Polytechnic, Nasik. He has accomplished B. Tech (Computer science and Engineering) from SGGSIE&T, Nanded. He also completed Diploma in Information Technology from Government Polytechnic, Nasik. His domain of pursuit is Data Mining as well Big data, Natural Language Processing and web associated fields.