International Journal of Advanced Research in Engineering and Technology (IJARET), ISSN 0976 – 6480 (Print), ISSN 0976 – 6499 (Online), Volume 5, Issue 6, June (2014), pp. 23-26, © IAEME: http://www.iaeme.com/IJARET.asp. Journal Impact Factor (2014): 7.8273 (Calculated by GISI), www.jifactor.com

DATA MINING PREDICTION USING DATA MINING EXTENSIONS (DMX): A CASE STUDY ON E-GOVERNANCE BIRTH REGISTRATION DATA MINING MODEL

Pushpal Desai
(M.Sc. (I.T.) Programme, VNSGU, Surat, India)

ABSTRACT

This work discusses the implementation of Data Mining Extensions (DMX) queries on various Data Mining Models. In the last few years, many private companies have used Data Mining extensively for prediction analysis. In the same way, this paper implements DMX prediction queries on Data Mining Models built from e-governance data. The results of the DMX prediction queries indicate that administrators could use prediction analysis for future planning and decision making.

KEYWORDS: Data Mining Extensions (DMX), Prediction Query, Microsoft SQL Server Analysis Services.

I. INTRODUCTION

Data Mining has been applied successfully in several domains, such as Banking, Insurance, Credit Card Fraud Detection, Loan Approval, Customer Relationship Management, Weather Forecasting, Oil and Gas Exploration, Mining, Network Security, Telecommunication and Medical Science. Depending upon the problem, different Data Mining approaches are used, such as Clustering, Classification, Association Rules Mining, Time Series Analysis, Regression and Sequence Analysis. Beyond the data mining algorithms themselves, Data Mining Extensions (DMX) has been applied in areas such as “Heart disease decision support system using data mining classification modeling techniques” [6], “Risk assessment of complication of arterial high blood pressure” [7] and “Prediction control strategies for industrial processes” [8]. Similarly, in this work, DMX is used on a Birth registration e-governance data mining model.
II. METHODOLOGY

In this work, Data Mining is used to make predictions from different Data Mining Models. The Data Mining Extensions (DMX) language is designed specifically to work with Microsoft SQL Server Analysis Services. DMX can be used to create a new data mining model, train it, browse it and predict from it [1].

There are two main types of DMX statements. Data definition statements create new data mining structures and models and drop existing ones. Data manipulation statements work with existing data mining models and structures, and allow browsing and prediction from them [2]. In this work, DMX data manipulation statements are used to make predictions from existing data mining models.

A DMX prediction query can take the form of a “Prediction join”, “Natural prediction join”, “Empty prediction join” or “Singleton query” [3]. In this work, only “Empty prediction join” DMX queries are implemented; “Prediction join”, “Natural prediction join” and “Singleton query” DMX queries are not considered. Typically, prediction queries are used to predict unknown column values [3], and a regular prediction query creates predictions for cases taken from a data source [3]. The empty prediction join, in contrast, returns the most likely prediction from the content of the mining model itself [3]: no information is passed to the mining model's input columns, and the mining model returns the most likely prediction [5]. The “Predict” function is used to predict the “Delivery Method ID” state, and the “PredictProbability” function is used to obtain the probability of different states from the data mining model.
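The difference between an empty prediction join and a query that supplies an input case can be sketched as follows. This is an illustration only: the model name [Birth Registration Model], the input column names and the input values 1 and 2 are hypothetical placeholders, and the second statement (a singleton query) is shown purely for contrast, since this work does not use it. Each statement would be run on its own.

```dmx
-- Empty prediction join: there is no join clause and no input case,
-- so the prediction is based only on the content of the trained model.
SELECT Predict([Delivery Method ID]) AS [Most Likely Method]
FROM [Birth Registration Model]

-- Singleton query (not used in this work): one input case is typed inline
-- and matched to the model's input columns by name.
-- Column names and values here are hypothetical.
SELECT Predict([Delivery Method ID]) AS [Most Likely Method]
FROM [Birth Registration Model]
NATURAL PREDICTION JOIN
(SELECT 1 AS [Religion ID], 2 AS [Father Education ID]) AS t
```

Because the first statement supplies no input, its result depends only on the trained model, which suits a single model-wide summary rather than a per-record prediction.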
The Association Rules model for Birth Registration e-governance data contains input fields such as Religion, Father Education, Mother Education and Year, with Delivery Method ID and Delivery Attention ID as predict-only fields. In DMX Query 1.1, the Association Rules model for Birth Registration e-governance data is used to predict the most likely “Delivery Method ID” state. Besides the most likely outcome, data owners are often also interested in the probability of the other states of a particular attribute. The PredictProbability function can be used in this scenario [4]. In the same query, PredictProbability returns the probability of each state: Delivery Method = 1 for Caesarean, Delivery Method = 2 for Forceps / Vacuum and Delivery Method = 3 for Natural.

DMX Query 1.1

    -- Most likely state of Delivery Method ID, then the probability of each state
    SELECT
      Predict([AM_ReligionID_FatherEducationID_Input_DevliveryMethodPredict].[Delivery Method ID]) AS [Delivery Method ID],
      PredictProbability([AM_ReligionID_FatherEducationID_Input_DevliveryMethodPredict].[Delivery Method ID], 1) AS [Method 1: Caesarean],
      PredictProbability([AM_ReligionID_FatherEducationID_Input_DevliveryMethodPredict].[Delivery Method ID], 2) AS [Method 2: Forceps/Vaccum],
      PredictProbability([AM_ReligionID_FatherEducationID_Input_DevliveryMethodPredict].[Delivery Method ID], 3) AS [Method 3: Natural]
    FROM [AM_ReligionID_FatherEducationID_Input_DevliveryMethodPredict]

Similarly, in DMX Query 1.2, the Association Rules mining model is used to predict the most likely “Delivery Attention ID” along with the probabilities of its different states: Delivery Attention = 1 for Doctor, Nurse or Trained Midwife, Delivery Attention = 2 for Institutional-Government, Delivery Attention = 3 for Institutional-Private or Non-Government, Delivery Attention = 4 for Relatives or Other, and Delivery Attention = 5 for Traditional Birth Attendant.
DMX Query 1.2

    -- Most likely state of Delivery Attention ID, then the probability of each state
    SELECT
      Predict([Asso_FT_MT_EDU_DEL_METHOD].[Delivery Attention ID]) AS [Delivery Attention ID],
      PredictProbability([Asso_FT_MT_EDU_DEL_METHOD].[Delivery Attention ID], 1) AS [Method1:Doctor, Nurse or Trained Midwife],
      PredictProbability([Asso_FT_MT_EDU_DEL_METHOD].[Delivery Attention ID], 2) AS [Method2:Institutional-Government],
      PredictProbability([Asso_FT_MT_EDU_DEL_METHOD].[Delivery Attention ID], 3) AS [Method3:Institutional-Private or Non-Government],
      PredictProbability([Asso_FT_MT_EDU_DEL_METHOD].[Delivery Attention ID], 4) AS [Method4:Relatives or Other],
      PredictProbability([Asso_FT_MT_EDU_DEL_METHOD].[Delivery Attention ID], 5) AS [Method5:Traditional Birth Attendant]
    FROM [Asso_FT_MT_EDU_DEL_METHOD]

III. RESULTS

The data mining prediction queries were executed on the data mining models using the “Predict” and “PredictProbability” functions. The “Predict” function returns a predicted value, or set of values, for a specified column, and the “PredictProbability” function returns the probability of a specified state. In both DMX queries, a scalar column is passed to the Predict function, so its result is also a scalar value [4].

DMX Query 1.1 predicted “3” as the most likely value of the “Delivery Method ID” attribute. The result indicates that the most likely delivery method is “Natural”, with a probability of 0.77.

Fig 1: The result of DMX Query 1.1

DMX Query 1.2 predicted the value “3” for the “Delivery Attention ID” attribute. The result indicates that the most likely delivery attention method is “Institutional-Private or Non-Government”, with a probability of 0.54.

Fig 2: The result of DMX Query 1.2
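The most likely state and its probability, as reported in these results, can also be requested without hard-coding a state number. The following is a minimal sketch, assuming the documented DMX behaviour that PredictProbability called with only a column argument returns the probability of the most likely state (excluding the missing-state bucket). The column aliases are illustrative, and the query has not been run against the paper's model.

```dmx
-- Sketch only: the most likely Delivery Method ID and its probability.
-- No state argument is passed to PredictProbability, so it refers to
-- the same state that Predict returns.
SELECT
  Predict([Delivery Method ID]) AS [Most Likely Method],
  PredictProbability([Delivery Method ID]) AS [Probability]
FROM [AM_ReligionID_FatherEducationID_Input_DevliveryMethodPredict]
```

This form keeps working if the set of states changes after retraining, whereas the per-state form used in DMX Query 1.1 and 1.2 is needed when the probabilities of the less likely states are also of interest.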
IV. CONCLUSION

This work demonstrates the use of DMX queries for making predictions from existing data mining models. Top-level management can use the predictions derived from DMX queries for planning and decision making. The work presented in this paper covers only a small part of what DMX queries can do. In future, other DMX query types such as “Prediction join”, “Natural prediction join” and “Singleton query” can be considered to take fuller advantage of DMX and extend the research.

V. ACKNOWLEDGEMENT AND LIMITATIONS

All results are based on data provided by the municipal corporation for research purposes only. Hence the results may change if the data mining algorithms and DMX queries are applied to actual data sets.

VI. REFERENCES

[1] Data Mining Extensions (DMX) Reference, SQL Server 2012 Books Online, Microsoft.
[2] http://technet.microsoft.com/en-us/library/ms132058.aspx, Last access date: 15th April, 2014.
[3] http://technet.microsoft.com/en-us/library/ms131992.aspx, Last access date: 15th April, 2014.
[4] Jamie MacLennan, ZhaoHui Tang and Bogdan Crivat, Data Mining with SQL Server 2008, Wiley Publication.
[5] Brian Larson, Delivering Business Intelligence with Microsoft SQL Server 2008.
[6] Sellappan Palaniappan and Rafiah Awang, “Web-Based Heart Disease Decision Support System using Data Mining Classification Modeling Techniques”, in the proceedings of iiWAS2007, pp. 157-167.
[7] MBUYI MUKENDI Eugène, KAFUNDA KATALAYI Pierre and MBELU MUTOBA Bevi, “DATA MINING AND NEURAL NETWORKS II DMX USE FOR RISK ASSESSMENT OF COMPLICATIONS OF ARTERIAL HIGH BLOOD PRESSURE”, IJCSI International Journal of Computer Science Issues, Vol. 9, Issue 5, No 1, September 2012, ISSN (Online): 1694-0814, pp. 377-383.
[8] Waldemar and Konrad, “The use of Data Mining Approach to Predict Control Strategies for Industrial Process”, Automatuka, 2007, pp. 287-293.
[9] P.N. Santosh Kumar, Dr. C. Venugopal and Dr. C. Sunil Kumar, “Applications of Data Mining in Medical Databases”, International Journal of Computer Engineering & Technology (IJCET), Volume 4, Issue 6, 2013, pp. 284-289, ISSN Print: 0976 – 6367, ISSN Online: 0976 – 6375.
[10] Vijay Arputharaj J and Dr. R. Manicka Chezian, “Data Mining with Human Genetics to Enhance Gene Based Algorithm and DNA Database Security”, International Journal of Computer Engineering & Technology (IJCET), Volume 4, Issue 3, 2013, pp. 176-181, ISSN Print: 0976 – 6367, ISSN Online: 0976 – 6375.
[11] Chaitrali S. Dangare and Dr. Sulabha S. Apte, “A Data Mining Approach For Prediction of Heart Disease using Neural Networks”, International Journal of Computer Engineering & Technology (IJCET), Volume 3, Issue 3, 2012, pp. 30-40, ISSN Print: 0976 – 6367, ISSN Online: 0976 – 6375.
