SlideShare a Scribd company logo
1 of 22
Download to read offline
April 2020: Top Read Articles In
Data Mining & Knowledge
Management Process Research
Articles
International Journal of Data Mining & Knowledge
Management Process (IJDKP)
ISSN : 2230 - 9608[Online] ; 2231 - 007X [Print]
http://airccse.org/journal/ijdkp/ijdkp.html
A WEB REPOSITORY SYSTEM FOR DATA MINING IN
DRUG DISCOVERY
Jiali Tang, Jack Wang and Ahmad Reza Hadaegh
Department of Computer Science and Information System, California State University
San Marcos, San Marcos, USA
ABSTRACT
This project is to produce a repository database system of drugs, drug features
(properties), and drug targets where data can be mined and analyzed. Drug targets are
different proteins that drugs try to bind to stop the activities of the protein. Users can
utilize the database to mine useful data to predict the specific chemical properties that will
have the relative efficacy of a specific target and the coefficient for each chemical
property. This database system can be equipped with different data mining
approaches/algorithms such as linear, non-linear, and classification types of data
modelling. The data models have enhanced with the Genetic Evolution (GE) algorithms.
This paper discusses implementation with the linear data models such as Multiple Linear
Regression (MLR), Partial Least Square Regression (PLSR), and Support Vector Machine
(SVM).
.
KEYWORDS
Data Mining, Drug Discovery, Drug Description, Chemoinformatics, and Web
Application
For More Details : https://aircconline.com/ijdkp/V10N1/10120ijdkp01.pdf
Volume Link : http://airccse.org/journal/ijdkp/vol10.html
REFERENCES
[1] Ko, Gene, Reddy, Srinivas, Garg, Rajni, Kumar, Sunil, & Hadaegh, Ahmad, (2012) “Computational
Modelling Methods for QSAR Studies on HIV-1 Integrase Inhibitors (2005-2010),”. Curr Comput
Aided Drug Des. Vol. 8, No 4, pp 255-270.
[2] Thakor, Falguni, Hadaegh, Ahmad, & Zhang, Xiaoyu, (2017), ” Comparative study of Differential
Evolutionary-Binary Particle Swarm Optimization (DE-BPSO) algorithm as a feature selection
technique with different linear regression models for analysis of HIV-1 Integrase Inhibition features of
Aryl β-Diketo Acids”, Proceedings of 9th International Conference on Bioinformatics and
Computational Biology, Honolulu, Hawaii, USA, ISBN: 978–1–943436–07–1, pp 179-184.
[3] Kane Ian, & Hadaegh Ahmad, “Non-linear Quantitative Structure-Activity Relationship (QSAR)
Models for the Prediction of HIV Drug Performance”, (2015), 24th International Conference on
Software Engineering and Data Engineering, pp 63-68. Vol 1, ISBN: 9781510812277, San Diego, CA.
[4] Galvan Richard, Kashani, Maninatalsadat, & Hadaegh, Ahmad, “Improving Pharmacological Research
of HIV-1 Integrase Inhibition Using Differential Evolution-Binary Particle Swarm Optimization and
Non-Linear Adaptive Boosting Random Forest Regression”,(2015), IEEE International Workshop on
Data Integration and Mining San Francisco, Information Reuse and Integration (IRI), IEEE
International Conference, pp 485-490, DOI: 10.1109/IRI.2015.80. INSPEC Accession Number:
15556631. San Francisco, CA.
[5] Kashani, Maninatalsadat, Galvan Richard, & Hadaegh Ahmad, “Improving the Feature Selection for
the Development of Linear Model for Discovery of HIV-1 Integrase Inhibitors”, (2015) ABDA'15
International Conference on Advances in Big Data Analytics. In Proceeding of the 2015 International
Conferences on Advances on Big Data Analyses, pp 150-154. ISBN: 1-60132-411-1, Las Vegas,
Nevada.
[6] Ko, Gene, Garg, Rajni, Kumar, Sunil, Kumar, Bailey, Barbara, & Hadaegh Ahmad, “A Hybridized
Evolutionary Algorithm for Feature Selection of Chemical Descriptors for Computational QSAR
Modeling of HIV-1 Integrase Inhibitors”, (2013), Computational Science Curriculum Development
Forum and Applied Computational Science and Engineering Student Support for Industry, San Diego
State University.
[7] Ko, Gene, Garg, Rajni, Kumar, Sunil, Bailey, Barbara, & Hadaegh Ahmad, “Differential Evolution-
Binary Particle Swarm Optimization for the Analysis of Aryl b-Kiketo Acids for HIV-1 Integres
Inhibition, (2012), WCCI 2012 IEEE World Congress on Computational Intelligence. Brisbane
Australia, pp 1849-1855.
[8] Ko, Gene, Reddy, Srinivas, Kumar, Kumar, Bailey, Barbara, Garg, Rajni, & Hadaegh, Ahmad,
“Evolutionary Computational Modelling of β-Diketo Acids for Virtual Screening of HIV-1 Integrase
Inhibitors”, (2012), IEEE World Congress on Computational Intelligence, Brisbane, Australia.
[9] Ko, Gene, Reddy, Srinivas, Kumar, Kumar, Garg, Rajni, & Hadaegh, Ahmad “Evolutionary
Computational Modelling of β-Diketo Acids for Virtual Screening of HIV-1 Integrase Inhibitors”,
(2012), 243rd National Meeting of the American Chemical Society, San Diego, CA.
[10] Gonzales, Miguel, Turner, Chris, Ko, Gene, & Hadaegh, Ahmad, “Binary Particle Swarm
Optimization Model of Dimeric Aryl Diketo Acid Inhibitors for HIV-1 Integrase” (2012), 243rd
National Meeting of the American Chemical Society, San Diego, CA.
[11] Ko, Gene, Reddy, Srinivas, Kumar, Sunil, Garg, Rajni, & Hadaegh, Ahmad, “Analysis of HIV-1
Integrase Inhibitors Using Computational QSAR Modelling”, (2012), Computational Science
Curriculum Development Forum and Applied Computational Science and Engineering Student
Support for Industry, San Diego State University.
[12] Garg Rajni, Reddy Srinivas, Zhang Xiaoyu, & Hadaegh Ahmad, “MUT-HIV: Mutation database of
HIV proteases”, (2007), American Chemical Society (ACS) 234th National Meeting & Exposition,
Boston, MA USA CINF 42.
[13] MLR: http://www.stat.yale.edu/Courses/1997-98/101/linmult.htm
[14] PLSR: https://www.mathworks.com/help/stats/plsregress.html
[15] https://techdifferences.com/difference-between-descriptive-and-predictive-data-mining.html
[16] Zhong et al. Artificial intelligence in drug design. Sci China Life Sci. 2018 Jul 18. doi:
10.1007/s11427-018-9342-2. [Epub ahead of print]
[17] Varsou Dimitra-Danai, Nikolakopoulos, Spyridon, Tsoumanis Andreas, Melagraki Georgia, &
Afantitis, Antreas, “New Cheminformatics Platform for Drug Discovery and Computational
Toxicology”, (2018), Methods Mol Biol. 2018; 1800:287-311. doi: 10.1007/978-1-4939-7899-1_14
[18] Ekins, Sean, Clark, Alex, Dole, Krishna, Gregory, Kellan, Mcnutt, Andrew, Spektor, Anna,
Weatherall, Charlie, & Litterman, Nadia, “Data Mining and Computational Modeling of High-
Throughput Screening Datasets”, (2018), Methods Mol Biol, 1755:197-221. doi: 10.1007/978-1-4939-
7724-6_14.
[19] Sam Elizabeth, & Athri Prashanth, “Web-based drug repurposing tools: a survey. Brief Bioinform”,
(2017), Oct 6. doi: 10.1093/bib/bbx125. [Epub ahead of print].
[20] Kaur, Charanpreet, & Bhardwaj, Shweta, “DRUG Discovery Using Data Mining International Journal
of Information and Computation Technology”, (2014), ISSN 0974-2239 Volume 4, Number 4, pp 335-
342 © International Research Publications House http://www. irphouse.com /ijict.htm
[21] Minaei-Bidgoli, Behrouz, & Punch, William, “Using Genetic Algorithms for Data Mining
Optimization in an Educational Web-Based System, (2003), Genetic Algorithms Research and
Applications Group (GARAGe) Department of Computer Science & Engineering Michigan State
University 2340 Engineering Building East Lansing, MI 48824.
[22] https://chm.kode-solutions.net/products_dragon.php
[23] AWS LightSail: https://aws.amazon.com/lightsail/?nc2=h_ql_prod_fs_ls
[24] AWS EC2 Server: https://aws.amazon.com/ec2/
A REVIEW ON EVALUATION METRICS FOR DATA
CLASSIFICATION EVALUATIONS
Hossin, M.1
and Sulaiman, M.N.2
1
Faculty of Computer Science & Information Technology, Universiti Malaysia
Sarawak,
94300 Kota Samarahan, Sarawak, Malaysia
2
Faculty of Computer Science & Information Technology, Universiti Putra Malaysia,
43400 UPM Serdang, Selangor, Malaysia
ABSTRACT
Evaluation metric plays a critical role in achieving the optimal classifier during the
classification training. Thus, a selection of suitable evaluation metric is an important key
for discriminating and obtaining the optimal classifier. This paper systematically reviewed
the related evaluation metrics that are specifically designed as a discriminator for
optimizing generative classifier. Generally, many generative classifiers employ accuracy
as a measure to discriminate the optimal solution during the classification training.
However, the accuracy has several weaknesses which are less distinctiveness, less
discriminability, less in formativeness and bias to majority class data. This paper also
briefly discusses other metrics that are specifically designed for discriminating the optimal
solution. The shortcomings of these alternative metrics are also discussed. Finally, this
paper suggests five important aspects that must be taken into consideration in constructing
a new discriminator metric..
KEYWORDS
Evaluation Metric, Accuracy, Optimized Classifier, Data Classification Evaluation
For More Details : http://aircconline.com/ijdkp/V5N2/5215ijdkp01.pdf
Volume Link : http://airccse.org/journal/ijdkp/vol5.html
REFERENCES
[1] A.A. Cardenas and J.S. Baras, “B-ROC curves for the assessment of classifiers over imbalanced
datasets”, in Proc. of the 21st National Conference on Artificial Intelligence Vol. 2, 2006, pp. 1581-1584
[2] R. Caruana and A. Niculescu-Mizil, “Data mining in metric space: an empirical analysis of
supervisedlearning performance criteria”, in Proc. of the 10th ACM SIGKDD International Conference
onKnowledge Discovery and Data Mining (KDD '04), New York, NY, USA, ACM 2004, pp. 69-78.
[3] N.V. Chawla, N. Japkowicz and A. Kolcz, “Editorial: Special issue on learning from imbalanced
datasets”, SIGKDD Explorations, 6 (2004) 1-6.
[4] T. Fawcett, “An Introduction to ROC Analysis”, Pattern Recognition Letters, 27 (2006) 861-874.
[5] J. Davis and M. Goadrich, “The relationship between precision-recall and ROC curves”, in Proc. Ofthe
23rd International Conference on Machine Learning, 2006, pp. 233-240.
[6] C. Drummond, R.C. Holte, “Cost curves: An Improved method for visualizing classifier
performance”, Mach. Learn. 65 (2006) 95-130.
[7] P.A. Flach, P.A., “The Geometry of ROC Space: understanding Machine Learning Metrics throughROC
Isometrics”, in T. Fawcett and N. Mishra (Eds.) Proc. of the 20th Int. Conference on MachineLearning
(ICML 2003), Washington, DC, USA, AAAI Press, 2003, pp. 194-201.
[8] V. Garcia, R.A. Mollineda and J.S. Sanchez, “A bias correction function for classification
performance assessment in two-class imbalanced problems”, Knowledge-Based Systems, 59(2014) 66-74.
[9] S. Garcia and F. Herrera, “Evolutionary training set selection to optimize C4.5 in imbalance
problems”, in Proc. of 8th Int. Conference on Hybrid Intelligent Systems (HIS 2008), Washington,DC, USA,
IEEE Computer Society, 2008, pp.567-572.
[10] N. Garcia-Pedrajas, J. A. Romero del Castillo and D. Ortiz-Boyer, “A cooperative
coevolutionaryalgorithm for instance selection for instance-based learning”. Machine Learning (2010), 78
(2010)381-420.
[11] Q. Gu, L. Zhu and Z. Cai, “Evaluation Measures of the Classification Performance of
ImbalancedDatasets”, in Z. Cai et al. (Eds.) ISICA 2009, CCIS 51. Berlin, Heidelberg: Springer-Verlag,
2009,pp. 461-471.
[12] S. Han, B. Yuan and W. Liu, “Rare Class Mining: Progress and Prospect”, in Proc. of
ChineseConference on Pattern Recognition (CCPR 2009), 2009, pp. 1-5
[13] D. J. Hand and R. J. Till, “A simple generalization of the area under the ROC curve to multiple
classclassification problems”, Machine Learning, 45 (2001) 171-186.
[14] M. Hossin, M. N. Sulaiman, A. Mustapha, and N. Mustapha, “A Novel Performance Metric forBuilding
an Optimized Classifier”, Journal of Computer Science, 7(4) (2011) 582-509.
[15] M. Hossin, M. N. Sulaiman, A. Mustapha, N. Mustapha and R. W. Rahmat, “ OAERP: a BetterMeasure
than Accuracy in Discriminating a Better Solution for Stochastic Classification Training”,Journal of
Artificial Intelligence, 4(3) (2011) 187-196.
[16] M. Hossin, M. N. Sulaiman, A. Mustapha, N. Mustapha and R. W. Rahmat, “A Hybrid
EvaluationMetric for Optimizing Classifier”, in Data Mining and Optimization (DMO), 2011 3rd
Conference on,2011, pp. 165-170.
[17] J. Huang and C. X. Ling, “Using AUC and accuracy in evaluating learning algorithms”,
IEEETransactions on Knowledge Data Engineering, 17 (2005) 299-310.
[18] J. Huang and C. X. Ling, “Constructing new and better evaluation measures for machine learning”, inR.
Sangal, H. Mehta and R. K. Bagga (Eds.) Proc. of the 20th International Joint Conference onArtificial
Intelligence (IJCAI 2007), San Francisco, CA, USA: Morgan Kaufmann Publishers Inc.,2007, pp.859-864.
[19] N. Japkowicz, “Assessment metrics for imbalanced learning”, in Imbalanced Learning:
Foundations,Algorithms, and Applications, Wiley IEEE Press, 2013, pp. 187-210
[20] M. V. Joshi, “On evaluating performance of classifiers for rare classes”, in Proceedings of the
2002IEEE Int. Conference on Data Mining (ICDN 2002) ICDM’02, Washington, D. C., USA:
IEEEComputer Society, 2002, pp. 641-644.
[21] T. Kohonen, Self-Organizing Maps, 3rd ed., Berlin Heidelberg: Springer-Verlag, 2001.
[22] L. I. Kuncheva and J. C. Bezdek, “Nearest Prototype Classification: Clustering, Genetic Algorithms,or
Random Search?” IEEE Transactions on Systems, Man, and Cybernetics-Part C: Application andReviews,
28(1) (1998) 160-164.
[23] N. Lavesson, and P. Davidsson, “Generic Methods for Multi-Criteria Evaluation”, in Proc. of theSiam
Int. Conference on Data Mining, Atlanta, Georgia, USA: SIAM Press, 2008, pp. 541-546.
[24] P. Lingras, and C. J. Butz, “Precision and recall in rough support vector machines”, in Proc. of the2007
IEEE Int. Conference on Granular Computing (GRC 2007), Washington, DC, USA: IEEEComputer Society,
2007, pp.654-654.
[25] D. J. C. MacKay, Information, Theory, Inference and Learning Algorithms. Cambridge, UK:Cambridge
University Press, 2003.
[26] T. M. Mitchell, Machine Learning, USA: MacGraw-Hill, 1997.
[27] R. Prati, G. Batista, and M. Monard, “A survery on graphical methods for classification predictive
performance evaluation”, IEEE Trans. Knowl. Data Eng. 23(2011) 1601-1618.
[28] F. Provost, and P. Domingos, “Tree induction for probability-based ranking”. Machine Learning,
52(2003) 199-215.
[29] A. Rakotomamonyj, “Optimizing area under ROC with SVMs”, in J. Hernandez-Orallo, C. Ferri,
N.Lachiche and P. A. Flach (Eds.) 1st Int. Workshop on ROC Analysis in Artificial Intelligence(ROCAI
2004), Valencia, Spain, 2004, pp. 71-80.
[30] R. Ranawana, and V. Palade, “Optimized precision-A new measure for classifier
performanceevaluation”, in Proc. of the IEEE World Congress on Evolutionary Computation (CEC 2006),
2006,pp. 2254-2261.
[31] S. Rosset, “Model selection via AUC”, in C. E. Brodley (Ed.) Proc. of the 21st Int. Conference
onMachine Learning (ICML 2004), New York, NY, USA: ACM, 2004, pp. 89.
[32] D.B. Skalak, “Prototype and feature selection by sampling and random mutation hill
climbingalgorithm”, in W. W. Cohen and H. Hirsh (Eds.) Proc. of the 11th Int. Conference on
MachineLearning (ICML 1994), New Brunswick, NJ, USA: Morgan Kaufmann, 1994, pp.293-301.
[33] M. Sokolova and G. Lapame, “A systematic analysis of performance measures for classificationtasks”,
Information Processing and Management, 45(2009) 427-437.
[34] P. N. Tan, M. Steinbach and V. Kumar, Introduction to Data Mining, Boston, USA: Pearson
AddisonWesley, 2006.
[35] M. Vuk and T. Curk, “ROC curve, lift chart and calibration plot”, Metodoloˇskizvezki, 3(1) (2006)89-
108.
[36] H. Wallach, “Evaluation metrics for hard classifiers”. Technical Report. (Ed.: Wallach,
2006)http://www.inference.phy.cam.ac.uk/hmw26/papers
[37] S. W. Wilson, “Mining oblique data with XCS”, in P. L. Lanzi, W. Stolzmann and S. W. Wilson(Eds.)
Advances in Learning Classifier Systems: Third Int. Workshop (IWLCS 2000), Berlin,Heidelberg: Springer-
Verlag, 2001, pp. 283-290.
[38] H. Zhang and G. Sun, “Optimal reference subset selection for nearest neighbor classification by
tabusearch”, Pattern Recognition, 35(7) (2002) 1481-1490.
A BUSINESS INTELLIGENCE PLATFORM
IMPLEMENTED IN A BIG DATA SYSTEM
EMBEDDING DATA MINING: A CASE OF STUDY
Alessandro Massaro, Valeria Vitti, Palo Lisco, Angelo Galiano and Nicola Savino
Dyrecta Lab, IT Research Laboratory, Via VescovoSimplicio, 45, 70014 Conversano
(BA), Italy.
(in collaboration with ACI Global S.p.A., VialeSarca, 336 - 20126 Milano, Via Stanislao
Cannizzaro, 83/a - 00156 Roma, Italy)
ABSTRACT
In this work is discussed a case study of a business intelligence –BI- platform developed
within the framework of an industry project by following research and development –
R&D- guidelines of ‘Frascati’. The proposed results are a part of the output of different
jointed projects enabling the BI of the industry ACI Global working mainly in roadside
assistance services. The main project goal is to upgrade the information system, the
knowledge base –KB- and industry processes activating data mining algorithms and big
data systems able to provide gain of knowledge. The proposed work concerns the
development of the highly performing Cassandra big data system collecting data of two
industry location. Data are processed by data mining algorithms in order to formulate a
decision making system oriented on call center human resources optimization and on
customer service improvement. Correlation Matrix, Decision Tree and Random Forest
Decision Tree algorithms have been applied for the testing of the prototype system by
finding a good accuracy of the output solutions. The Rapid Miner tool has been adopted
for the data processing. The work describes all the system architectures adopted for the
design and for the testing phases, providing information about Cassandra performance and
showing some results of data mining processes matching with industry BI strategies..
KEYWORDS
Big Data Systems, Cassandra Big Data, Data Mining, Correlation Matrix, Decision Tree,
Frascati Guideline
Full Text :http://aircconline.com/ijdkp/V9N1/9119ijdkp01.pdf
.
REFERNCES
[1] Khan, R. A. &Quadri, S. M. K. (2012) “Business Intelligence: an Integrated Approach”,
BusinessIntelligence Journal, Vol.5 No.1, pp 64-70.
[2] Chen, H., Chiang, R. H. L. & Storey V. C. (2012) “Business Intelligence and Analytics: from Bi Data to
Big Impact”, MIS Quarterly, Vol. 36, No. 4, pp 1165-1188.
[3] Andronie, M. (2015) “Airline Applications of Business Intelligence Systems”, Incas Bulletin, Vol. 7,No.
3, pp 153 – 160.
[4] Iankoulova, I. (2012) “Business Intelligence for Horizontal Cooperation”, Master Thesis, Univesitiet
Twente.[Online].Availablehttps://www.utwente.nl/en/mbit/finalproject/example_excellent_master_thesi/
master_thesis_bit/IankoulovaID.pdf
[5] Nunes, A. A., Galvão, T. & Cunha, J. F. (2014) “Urban Public Transport Service Co-creation:Leveraging
Passenger’s Knowledge to Enhance Travel Experience”, Procedia Social and BehavioralSciences, Vol.
111, pp 577 – 585.
[6] Fitriana, R., Eriyatno, Djatna, T. (2011) “Progress in Business Intelligence System research: Aliterature
Review”, International Journal of Basic & Applied Sciences IJBAS-IJENS, Vol. 11, No. 03,pp 96-105.
[7] Lia, M. (2015) "Customer Data Analysis Model using Business Intelligence Tools inTelecommunication
Companies", Database Systems Journal, Vol. 6, No. 2, pp 39-43.
[8] Habul, A., Pilav-Velić, A. &Kremić, E. (2012) “Customer Relationship Management and
BusinessIntelligence”, Intech book 2012: Advances in Customer Relationship Management, chapter 2.
[9] Kemper,H.-G., Baars, H. &Lasi, H. (2013) “An Integrated Business Intelligence Framework Closingthe
Gap Between IT Support for Management and for Production”, Springer: Business Intelligenceand
Performance Management Part of the series Advanced Information and Knowledge Processing,pp 13-26,
Chapter 2.
[10] Bara, A., Botha, I., Diaconiţa, V., Lungu, I., Velicanu, A., Velicanu, M. (2009) “A Model forBusiness
Intelligence Systems’ Development”, InformaticaEconomică, Vol. 13, No. 4, pp 99-108.
[11] Negash, S. (2004) “Business Intelligence”, Communications of the Association for
InformationSystems, Vol. 13, pp 177-195.
[12] Nofal, M. I. &Yusof, Z. M. (2013) “Integration of Business Intelligence and Enterprise
ResourcePlanning within Organizations”, Procedia Technology, Vol. 11 ( 2013 ), pp. 658 – 665.
[13] Williams, S. & Williams, N. (2003) “The Business Value of Business Intelligence”,
BusinessIntelligence Journal, FALL 2003, pp 1-11.
[14] Lečić, D. &Kupusinac, A. (2013) “The Impact of ERP Systems on Business Decision-Making”,TEM
Journal, Vol. 2, No. 4, pp 323-326.
[15] Ong, L., Siew, P. H. & Wong, S. F. (2011) “A Five-Layered Business Intelligence
Architecture”,IBIMA Publishing, Communications of the IBIMA,Vol. 2011, Article ID 695619, pp 1-11.
[16] Raymond T. Ng, Arocena, P. C., Barbosa, D., Carenini, G., Gomes, L., Jou, S., Leung, R. A., Milios,E.,
Miller, R. J., Mylopoulos, J., Pottinger, R. A., Tompa, F. & Yu, E. (2013) “Perspectives onBusiness
Intelligence”, A Publication in the Morgan & Claypool Publishers series Synthesis Lectureson Data
Management.
[17] “NTT DATA Connected Car Report: A brief insight on the connected car market, showingpossibilities
and challenges for third-party service providers by means of an application case study”
[Online].Available:https://emea.nttdata.com/fileadmin/web_data/country/de/documents/Manufacturing/S
tudien/2015_Connected_Car_Report_NTT_DATA_ENG.pdf
[18] “Cognizant report: Exploring the Connected Car Cognizant 20-20” [Online].
Available:https://www.cognizant.com/InsightsWhitepapers/Exploring-the-Connected-Car.pdf
International Journal of Data Mining & Knowledge Management Process (IJDKP) Vol.9, No.1, January
201919
[19] Sarangi, P. K., Bano, S., Pant, M. (2014) “Future Trend in Indian Automobile Industry: A
StatisticalApproach”, Journal of Management Sciences And Technology, Vol. 2, No. 1, pp. 28-32.
[20] Bates, H. &Holweg, M. (2007) “Motor Vehicle Recalls: Trends, Patterns and Emerging Issues”,Omega,
Vol. 35, No. 2, pp 202–210.
[21] D’Aloia, M., Russo, M. R., Cice G., Montingelli, A., Frulli, G., Frulli, E., Mancini, F., Rizzi, M.,Longo,
A. (2017) “Big Data Performance and Comparison with Different DB Systems”, InternationalJournal of
Computer Science and Information Technologies, Vol. 8, No. 1, pp 59-63.
[22] Wimmer, H. & Powell, L. M. (2015) “A Comparison of Open Source Tools for Data
Science”,Proceedings of the Conference on Information Systems Applied Research, Wilmington,
NorthCarolina USA.
[23] Al-Khoder, A. &Harmouch, H. (2014) “Evaluating four of the most popular Open Source and FreeData
Mining Tools,” IJASR International Journal of Academic Scientific Research, Vol. 3, No. 1, pp13-23.
[24] Antonio Gulli, Sujit Pal, “Deep Learning with Keras- Implement neural networks with Keras onTheano
and TensorFlow,” BIRMINGHAM – MUMBAI Packt book, ISBN 978-1-78712-842-2, 2017.
[25] Kovalev V., Kalinovsky A., and Kovalev S. Deep Learning with Theano, Torch, Caffe, TensorFlow,and
deeplearning4j: which one is the best in speed and accuracy? In: XIII Int. Conf. on PatternRecognition
and Information Processing, 3-5 October, Minsk, Belarus State University, 2016, pp. 99-103.
[26] Frascati Manual 2015: The Measurement of Scientific, Technological and Innovation
ActivitieGuidelines for Collecting and Reporting Data on Research and Experimental Development.
OECD(2015), ISBN 978-926423901-2 (PDF).
[27] Massaro, A. Maritati, V., Galiano, A., Birardi, V. &Pellicani, L. (2018) “ESB Platform
IntegratinKNIME Data Mining Tool oriented on Industry 4.0 Based on Artificial Neural Network
PredictiveMaintenance”, International Journal of Artificial Intelligence and Applications (IJAIA),
Vol.9,No.3,
[28] Massaro, A., Calicchio, A., Maritati, V., Galiano, A., Birardi, V., Pellicani, L., Gutierrez Millan,
M.,DallaTezza, B., Bianchi, M., Vertua, G., Puggioni, A. (2018) “A Case Study of Innovation of
anInformation Communication system and Upgrade of the Knowledge Base in Industry by
ESB,Artificial Intelligence, and Big Data System Integration”, International Journal of
ArtificialIntelligence and Applications (IJAIA), Vol. 9, No.5, pp. 27-43.
[29] “WSO2” [Online]. Available: https://wso2.com/products/enterprise-service-bus/
[30] “Ubuntu” [Online]. Available: https://www.ubuntu.com/
[31] “Apache Cassandra” [Online]. Available: http://cassandra.apache.org/
[32]“DataStaxEnterpriseOpsCenter”[Online].Available:
https://www.datastax.com/products/datastaxopscenter
[33] “About DataStaxDevCenter” [Online]. Available:
https://docs.datastax.com/en/developer/devcenter/doc/devcenter/dcAbout.html
[34] “Knowi” [Online]. Available: https://www.knowi.com/
[35] “JFreeChart” [Online]. Available: http://www.jfree.org/jfreechart/samples.html
[36] “PuTTY” [Online]. Available: https://www.chiark.greenend.org.uk/~sgtatham/putty/latest.html
[37] “Lightning Fast Data Science for Teams” [Online]. Available: https://rapidminer.com/
[38] Massaro, A., Meuli, G. &Galiano, A. (2018) “Intelligent Electrical Multi Outlets Controlled an
Activated by a Data Mining Engine Oriented to Building Electrical Management”, InternationalJournal
on Soft Computing, Artificial Intelligence and Applications (IJSCAI), Vol.7, No.4, pp 1-20.
[39] Myers, J. L. & Well, A. D. (2003) “Research Design and Statistical Analysis”, (2nd ed.)
LawrenceErlbaum.
.
AUTHOR
Alessandro Massaro: Research & Development Chief of Dyrecta Lab s.r.l.
INCREASED PREDICTION ACCURACY IN THE GAME
OF CRICKETUSING MACHINE LEARNING
Kalpdrum Passi and Niravkumar Pandey
Department of Mathematics and Computer Science
Laurentian University, Sudbury, Canada
ABSTRACT
Player selection is one the most important tasks for any sport and cricket is no exception.
The performance of the players depends on various factors such as the opposition team,
the venue, his current form etc. The team management, the coach and the captain select 11
players for each match from a squad of 15 to 20 players. They analyze different
characteristics and the statistics of the players to select the best playing 11 for each match.
Each batsman contributes by scoring maximum runs possible and each bowler contributes
by taking maximum wickets and conceding minimum runs. This paper attempts to predict
the performance of players as how many runs will each batsman score and how many
wickets will each bowler take for both the teams. Both the problems are targeted as
classification problems where number of runs and number ofwickets are classified in
different ranges. We used naïve bayes, random forest, multiclass SVM and decision tree
classifiers to generate the prediction models for both the problems. Random Forest
classifier wasfound to be the most accurate for both the problems.
KEYWORDS
Naïve Bayes, Random Forest, Multiclass SVM, Decision Trees, Cricket
Full Text : http://aircconline.com/ijdkp/V8N2/8218ijdkp03.pdf
.
REFERNCES
[1] S. Muthuswamy and S. S. Lam, "Bowler Performance Prediction for One-day International Cricket
Using Neural Networks," in Industrial Engineering Research Conference, 2008.
[2] I. P. Wickramasinghe, "Predicting the performance of batsmen in test cricket," Journal of Human Sport
& Excercise, vol. 9, no. 4, pp. 744-751, May 2014.
[3] G. D. I. Barr and B. S. Kantor, "A Criterion for Comparing and Selecting Batsmen in Limited Overs
Cricket," Operational Research Society, vol. 55, no. 12, pp. 1266-1274, December 2004.
[4] S. R. Iyer and R. Sharda, "Prediction of athletes performance using neural networks: An application in
cricket team selection," Expert Systems with Applications, vol. 36, pp. 5510-5522, April 2009.
[5] M. G. Jhanwar and V. Pudi, "Predicting the Outcome of ODI Cricket Matches: A Team Composition
Based Approach," in European Conference on Machine Learning and Principles and Practice of
Knowledge Discovery in Databases (ECMLPKDD 2016 2016), 2016.
[6] H. H. Lemmer, "The combined bowling rate as a measure of bowling performance in cricket," South
African Journal for Research in Sport, Physical Education and Recreation, vol. 24, no. 2, pp. 37-44,
January 2002.
[7] D. Bhattacharjee and D. G. Pahinkar, "Analysis of Performance of Bowlers using Combined Bowling
Rate," International Journal of Sports Science and Engineering, vol. 6, no. 3, pp. 1750-9823, 2012.
[8] S. Mukherjee, "Quantifying individual performance in Cricket - A network analysis of batsmen and
bowlers," Physica A: Statistical Mechanics and its Applications, vol. 393, pp. 624-637, 2014.
[9] P. Shah, "New performance measure in Cricket," ISOR Journal of Sports and Physical Education, vol.
4, no. 3, pp. 28-30, 2017.
[10] D. Parker, P. Burns and H. Natarajan, "Player valuations in the Indian Premier League," Frontier
Economics, vol. 116, October 2008.
[11] C. D. Prakash, C. Patvardhan and C. V. Lakshmi, "Data Analytics based Deep Mayo Predictor for
IPL9," International Journal of Computer Applications, vol. 152, no. 6, pp. 6-10, October 2016.
[12] M. Ovens and B. Bukiet, "A Mathematical Modelling Approach to One-Day Cricket Batting
Orders,"Journal of Sports Science and Medicine, vol. 5, pp. 495-502, 15 December 2006.
[13] R. P. Schumaker, O. K. Solieman and H. Chen, "Predictive Modeling for Sports and Gaming," in
Sports Data Mining, vol. 26, Boston, Massachusetts: Springer, 2010.
[14] M. Haghighat, H. Ratsegari and N. Nourafza, "A Review of Data Mining Techniques for Result
Prediction in Sports," Advances in Computer Science : an International Journal, vol. 2, no. 5, pp. 7-
12,November 2013.
[15] J. Hucaljuk and A. Rakipovik, "Predicting football scores using machine learning techniques," in
International Convention MIPRO, Opatija, 2011.
[16] J. McCullagh, "Data Mining in Sport: A Neural Network Approach," International Journal of Sports
Science and Engineering, vol. 4, no. 3, pp. 131-138, 2012.
[17] "Free web scraping - Download the most powerful web scraper | ParseHub," parsehub, [Online].
Available: https://www.parsehub.com.
[18] "Import.io | Extract data from the web," Import.io, [Online]. Available: https://www.import.io.
[19] T. L. Saaty, The Analytic Hierarchy Process, New York: McGrow Hill, 1980.
[20] T. L. Saaty, "A scaling method for priorities in a hierarchichal structure," Mathematical Psychology,
vol. 15, 1977.
[21] N. V. Chavla, K. W. Bowyer, L. O. Hall and P. W. Kegelmeyer, "SMOTE: Synthetic Minority
Oversampling Technique," Journal of Artificial Intelligence Research, vol. 16, pp. 321-357, June 2002.
[22] J. Han, M. Kamber and J. Pei, Data Mining: Concepts and Techniques, 3rd Edition ed., Waltham:
Elsevier, 2012.
[23] J. R. Quinlan, "Induction of Decision Trees," Machine learning, vol. 1, no. 1, pp. 81-106, 1986.
[24] J. R. Quinlan, C4.5: Programs for Machine Learning, Elsevier, 2015.
[25] L. Breiman, "Random Forests," Machine Learning, vol. 45, no. 1, pp. 5-32, 2001.
[26] T. K. Ho, "The Random Subspace Method for Constructing Decision Forests," IEEE transactions on
pattern analysis and machine intelligence, vol. 20, no. 8, pp. 832-844, August 1998.
[27] L. Breiman, J. Friedman, C. J. Stone and R. A. Olshen, Classification and regression trees, CRC
Press,1984.
[28] B. E. Boser, I. M. Guyon and V. N. Vapnik, "A Training Algorithm for Optimal Margin Classifiers," in
Fifth Annual Workshop on Computational Learning Theory, Pittsburgh, 1992.
[29] C.-C. Chang and C.-J. Lin, "LIBSVM: A library for support vector machines," ACM Transactions on
Intelligent Systems and Technology, vol. 2, no. 3, April 2011.
[30] T. L. Saaty, "A scaling method for priorities in a hierarchichal structure," Mathematical Psychology,
vol. 15, pp. 234-281, 1977.
[31] T. L. Saaty, The Analytical Hierarchy Process, New York: McGraw-Hill, 1980.
[32] N. V. Chavla, K. W. Bowyer, L. O. Hall and W. P. Kegelmeyer, "SMOTE: Synthetic Minority
Oversampling Technique," Artificial Intelligence Research, vol. 16, p. 321–357, June 2002.
AUTHORS
Kalpdrum Passi received his Ph.D. in Parallel Numerical Algorithms from Indian
Institute of Technology, Delhi, India in 1993. He is an Associate Professor,
Department of Mathematics & Co mputer Science, at Laurentian University, Ontario,
Canada. He has published many papers on Parallel Numerical Algorithms in
international journals and conferences. He has collaborative work with faculty in
Canada and US and the work was tested on the CRAY XMP’s and CRAY YMP’s. He
transitioned his research to web technology, and more recently has been involved in machine
learning and data mining applications in bioinformatics, social media and other data science areas.
He obtained funding from NSERC and Laurentian University for his research. He is a member of
the ACM and IEEE Computer Society.
Niravkumar Pandey is pursuing M.Sc. in Computational Science at Laurentian
University, Ontario, Canada. He received his Bachelor of Engineering degree from
Gujarat Technological University, Gujarat, India. Data mining and machine learning are
his primary areas of interest. He is also a cricket enthusiast and is studying applicationsof
machine learning and data mining in cricket analytics for his M.Sc. thesis
Data Mining and Its Applications for Knowledge Management :
A Literature Review from 2007 to 2012
Tipawan Silwattananusarn1and Assoc.Prof.Dr. KulthidaTuamsuk2
1Ph.D. Student in Information Studies Program,
KhonKaen University, Thailand
2Head, Information & Communication Management Program,
KhonKaen University, Thailand
ABSTRACT
Data mining is one of the most important steps of the knowledge discovery in databases
process and is considered as significant subfield in knowledge management. Research in
data mining continues growing in business and in learning organization over coming
decades. This review paper explores the applications of data mining techniques which
have been developed to support knowledge management process. The journal articles
indexed in ScienceDirect Database from 2007 to 2012 are analyzed and classified. The
discussion on the findings is divided into 4 topics: (i) knowledge resource; (ii) knowledge
types and/or knowledge datasets; (iii) data mining tasks; and (iv) data mining techniques
and applications used in knowledge management. The article first briefly describes the
definition of data mining and data mining functionality. Then the knowledge management
rationale and major knowledge management tools integrated in knowledge management
cycle are described. Finally, the applications of data mining techniques in the process of
knowledge management are summarized and discussed..
KEYWORDS
Data mining; Data mining applications; Knowledge management.
For More Details : http://aircconline.com/ijdkp/V2N5/2512ijdkp02.pdf
Volume Link :http://airccse.org/journal/ijdkp/vol2.htmll
REFERENCES
[1] An, X. & Wang, W. (2010). Knowledge management technologies and applications: A literaturereview.
IEEE, 138-141. doi:10.1109/ICAMS.2010.5553046
[2] Berson, A., Smith, S.J. &Thearling, K. (1999). Building Data Mining Applications for CRM. New York:
McGraw-Hill.
[3] Cantú, F.J. &Ceballos, H.G. (2010). A multiagent knowledge and information network approach for
managing research assets. Expert Systems with Applications, 37(7), 5272-
5284.doi:10.1016/j.eswa.2010.01.012
[4] Cheng, H., Lu, Y. &Sheu, C. (2009). An ontology-based business intelligence application in a financial
knowledge management system.Expert Systems with Applications, 36, 3614–
3622.Doi:10.1016/j.eswa.2008.02.047
[5] Dalkir, K. (2005). Knowledge Management in Theory and Practice. Boston: Butterworth-Heinemann.
[6] Dawei, J. (2011). The Application of Date Mining in Knowledge Management.2011 International
Conference on Management of e-Commerce and e-Government, IEEE Computer Society, 7-9.doi:
10.1109/ICMeCG.2011.58
[7] Fayyad, U., Piatetsky-Shapiro, G. & Smyth, P. (1996). From Data Mining to Knowledge Discovery in
Databases.AI Magazine, 17(3), 37-54.
[8] Gorunescu, F. (2011). Data Mining: Concepts, Models, and Techniques. India: Springer.
[9] Han, J. &Kamber, M. (2012). Data Mining: Concepts and Techniques. 3rd.ed. Boston: Morgan
Kaufmann Publishers.
[10] Hwang, H.G., Chang, I.C., Chen, F.J. & Wu, S.Y. (2008). Investigation of the application of KMS for
diseases classifications: A study in a Taiwanese hospital. Expert Systems with Applications, 34(1),
725-733. doi:10.1016/j.eswa.2006.10.018
[11] Lavrac, N., Bohanec, M., Pur, A., Cestnik, B., Debeljak, M. &Kobler, A. (2007).Data mining and
visualization for decision support and modeling of public health-care resources.Journal of Biomedical
Informatics, 40, 438-447. doi:10.1016/j.jbi.2006.10.003
[12] Li, X., Zhu, Z. & Pan, X. (2010). Knowledge cultivating for intelligent decision making in small
&middle businesses.Procedia Computer Science, 1(1), 2479-2488. doi:10.1016/j.procs.2010.04.280
[13] Li, Y., Kramer, M.R., Beulens, A.J.M., Van Der Vorst, J.G.A.J. (2010). A framework for earlywarning
and proactive control systems in food supply chain networks. Computers in Industry, 61, 852–862.
Doi:101.016/j.compind.2010.07.010
[14] Liao, S.H., Chen, C.M., Wu, C.H. (2008). Mining customer knowledge for product line and brand
extension in retailing.Expert Systems with Applications, 34(3), 1763-
1776.doi:10.1016/j.eswa.2007.01.036
[15] Liao, S. (2003). Knowledge management technologies and applications-literature review from 1995 to
2002. Expert Systems with Applications, 25, 155-164. doi:10.1016/S0957-4174(03)00043-5
[16] Liu, D.R. & Lai, C.H. (2011). Mining group-based knowledge flows for sharing task knowledge.
Decision Support Systems,50(2), 370-386. doi:10.1016/j.dss.2010.09.004
[17] Lee, M.R. & Chen, T.T. (2011). Revealing research themes and trends in knowledge management:
From 1995 to 2010. Knowledge-Based Systems.doi:10.1016/j.knosys.2011.11.016
[18] McInerney, C.R. & Koenig, M.E. (2011). Knowledge Management (KM) Processes in Organizations:
Theoretical Foundations and Practice. USA: Morgan & Claypool Publishers.
doi:10.2200/S00323ED1V01Y201012ICR018
[19] McInerney, C. (2002). Knowledge Management and the Dynamic Nature of Knowledge.Journal of the
American Society for Information Science and Technology, 53(12), 1009-1018. doi:10.1002/asi.10109
[20] Ngai, E., Xiu, L. &Chau, D. (2009). Application of data mining techniques in customer relationship
management: A literature review and classification. Expert Systems with Applications, 36, 2592- 2602.
doi:10.1016/j.eswa.2008.02.021
[21] Ruggles, R.L. (ed.). (1997). Knowledge Management Tools. Boston: Butterworth-Heinemann.
[22] Sher, P.J. & Lee, V.C. (2004). Information technology as a facilitator for enhancing dynamic
capabilities through knowledge management.Information& Management, 41, 933-
945.doi:10.1016/j.im.2003.06.004
[23] Tseng, S.M. (2008). The effects of information technology on knowledge management
systems.Expert Systems with Applications, 35, 150-160. doi:10.1016/j.eswa.2007.06.011
[24] Ur-Rahman, N. & Harding, J.A. (2012). Textual data mining for industrial knowledge management and
text classification: A business oriented approach. Expert Systems with Applications, 39, 4729- 4739.
doi:10.1016/j.eswa.2011.09.124
[25] Wang, F. & Fan, H. (2008). Investigation on Technology Systems for Knowledge Management.IEEE,
1-4. doi:10.1109/WiCom.2008.2716
[26] Wang, H. & Wang, S. (2008). A knowledge management approach to data mining process for business
intelligence. Industrial Management & Data Systems, 108(5), 622-634.
[27] Wu, W., Lee, Y.T., Tseng, M.L. & Chiang, Y.H. (2010). Data mining for exploring hidden patterns
between KM and its performance.Knowledge-Based Systems, 23, 397-
401.doi:10.1016/j.knosys.2010.01.014[15] C.-C.Chang, T. D. Kieu, and Y.-C.Chou, "A High
Payload Steganographic Scheme Based on (7, 4) Hamming Code for Digital Images," Proc. of the 2008
International Symposium on Electronic Commerce and Security, pp.16-21, August 2008.
APPLICATION OF SPATIOTEMPORAL ASSOCIATION
RULES ON SOLAR DATA TO SUPPORT SPACE
WEATHER FORECASTING
Carlos Roberto Silveira Junior1
, José Roberto Cecatto2
, Marilde Terezinha Prado
Santos1
and Marcela Xavier Ribeiro1
1
Department of Computing, Federal University of São Carlos, São Carlos, Brazil
2
National Institute of Space Research, São José dos Campos, Brazil
ABSTRACT
It is well known that solar energetic phenomena influence the Space Weather, in special
those directed to the Earth environment. In this context, the analysis of Solar Data is a
challenging task, particularly when are composed of Satellite Image Time Series (SITS). It
is a multidisciplinary domain that generates a massive amount of data (several Gigabytes
per year). It includes image processing, spatiotemporal characteristics, and the processing
of semantic data. Aiming to enhance the SITS analysis, we propose an algorithm called
"Miner of Thematic Spatiotemporal Associations for Images" (MiTSAI), which is anb
extractor of Thematic Spatiotemporal Association Rules (TSARs) from Solar SITS. Here,
a description is given about the details of the modern algorithm MiTSAI, which is an
extractor of Thematic Spatiotemporal Association Rules (TSARs) from solar Satellite
Image Time Series (SITS). In addition, its adaptation to the Space Weather and discussion
about the specific use in favor of forecasting activities are presented. Finally, some results
of its application specifically to solar flare forecasting are also presented. MiTSAI has to
extract interesting new patterns compared with the art-state algorithms.
KEYWORDS
Data Mining, Drug Discovery, Drug Description, Chemoinformatics, and Web
Application
For More Details : https://aircconline.com/ijdkp/V10N2/10220ijdkp01.pdf
Volume Link : http://airccse.org/journal/ijdkp/vol10.html
REFERENCES
[[1] T Abirami-Kongu, P Thangaraj, and P Priakanth-Kongu. Wireless sensor networks fault identification
using data association. Journal of Computer Science, 8(9):1501–1505, 2012.
[2] Sultan Alamri, David Taniar, and Maytham Safar. A taxonomy for moving object queries in
spatialdatabases. Future Generation Computer Systems, 37:232 – 242, 2014. ISSN 0167-739X. doi:
https://doi.org/10.1016/j.future.2014.02.007.
[3] Shadi A. Aljawarneh, Radhakrishna Vangipuram, Veereswara Kumar Puligadda, and Janaki
Vinjamuri. G-spamine: An approach to discover temporal association patterns and trends in internet of
things. Future Generation Computer Systems, 2017. ISSN 0167-739X. doi: https://doi.org/10.1016/
j.future.2017.01.013.
[4] Herbert Bay, Tinne Tuytelaars, and Luc Van Gool. Surf: Speeded up robust features. pages 404–417,
2006.
[5] Monica G Bobra and Sebastien Couvidat. Solar flare prediction using sdo/hmi vector magnetic field
data with a machine-learning algorithm. The Astrophysical Journal, 798(2):135–172, 2015.
[6] I. Burbey and T.L. Martin. A survey on predicting personal mobility. International Journal of
Pervasive Computing and Communications, 8(1):5 – 22, 2012.
[7] M. Chen, S. Mao, and Y. Liu. Big data: A survey. Mobile Networks and Applications, 19(2):171–
209,2014.
[8] Paolo Compieta, Sergio Di Martino, Michela Bertolotto, Filomena Ferrucci, and T Kechadi.
Exploratory spatio-temporal data mining and visualization. Journal of Visual Languages & Computing,
18(3):255–279, 2007.
[9] G Fang and Y Wu. Frequent spatiotemporal association patterns mining based on granular computing.
Informatica (Slovenia), 37(4):443–453, 2013.
[10] A. Hana, Y.T. Sami, and F. Sami. Mining spatiotemporal associations using queries. 2012 International
Conference on Information Technology and e-Services, ICITeS 2012, 2012
[11] R. M. Haralick, K. Shanmugam, and I. Dinstein. Textural features for image classification. IEEE
Transactions on Systems, Man, and Cybernetics, SMC-3(6):610–621, Nov 1973. ISSN 0018-9472.
doi: 10.1109/TSMC.1973.4309314.
[12] J. Huo, J. Zhang, and X. Meng. On co-occurrence pattern discovery from spatio-temporal event stream.
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and
Lecture Notes in Bioinformatics), 8181 LNCS(PART 2):385–395, 2013.
[13] J. Kawale, S. Liess, V. Kumar, U. Lall, and A. Ganguly. Mining time-lagged relationships in
spatiotemporal climate data. pages 130–135, 2012.
[14] Xiangjie Kong, Zhenzhen Xu, Guojiang Shen, Jinzhong Wang, Qiuyuan Yang, and Benshi Zhang.
Urban traffic congestion estimation and prediction based on floating car trajectory data. Future
Generation Computer Systems, 61:97 – 107, 2016. ISSN 0167-739X. doi:
https://doi.org/10.1016/j.future. 2015.11.013.
[15] T.C.W. Landgrebe, A. Merdith, A. Dutkiewicz, and R.D. Mafaler. Relationships between
palaeogeography and opal occurrence in australia: A data-mining approach. Computers and
Geosciences, 56: 76–82, 2013.
[16] Angela Lausch, Andreas Schmidt, and Lutz Tischendorf. Data mining and linked open data: New
perspectives for data analysis in environmental research. Ecological Modelling, 295(0):5 – 17, 2015.
ISSN 0304-3800. doi: http://dx.doi.org/10.1016/j.ecolmodel.2014.09.018.
[17] A. Madraky, Z.A. Othman, and A.R. Hamdan. Analytic methods for spatio-temporal data in a
natureinspired data model. International Review on Computers and Software, 9(3):547–556, 2014.
[18] A.a Mohan and P.Z.b Revesz. Applications of spatio-temporal data mining to north platter river
reservoirs. ACM International Conference Proceeding Series, pages 306–309, 2014.
[19] NOAA. www.solarmonitor.org, April 2016. Last Access April 13, 2016.
[20] K.G. Pillai, R.A. Angryk, J.M. Banda, M.A. Schuh, and T. Wylie. Spatio-temporal co-occurrence
pattern mining in data sets with evolving regions. Proceedings - 12th IEEE International Conference
on Data Mining Workshops, ICDMW 2012, pages 805–812, 2012.
[21] K.G.a Pillai, R.A.b Angryk, and B.b Aydin. A filter-and-refine approach to mine spatiotemporal
cooccurrences. pages 104–113, 2013.
[22] Vangipuram Radhakrishna, Shadi A. Aljawarneh, P.V. Kumar, and V. Janaki. A novel fuzzy similarity
measure and prevalence estimation approach for similarity profiled temporal association pattern
mining. Future Generation Computer Systems, 2017. ISSN 0167-739X. doi: https://doi.org/10.1016/j.
future.2017.03.016.
[23] K Venkateswara Rao, A Govardhan, and KV Chalapati Rao. Spatiotemporal data mining: Issues, tasks
and applications. International Journal of Computer Science & Engineering Survey (IJCSES) Vol,
3:39–52, 2012.
[24] Md. Mamunur Rashid, Iqbal Gondal, and Joarder Kamruzzaman. A technique for parallel sharefrequent
sensor pattern mining from wireless sensor networks. Procedia Computer Science, 29(0):124 – 133,
2014. ISSN 1877-0509. doi: http://dx.doi.org/10.1016/j.procs.2014.05.012. 2014 International
Conference on Computational Science.
[25] Md.M. Rashid, I. Gondal, and J. Kamruzzaman. Mining associated sensor patterns for data stream of
wireless sensor networks. pages 91–98, 2013.
[26] Marcela Xavier Ribeiro, Agma J. M. Traina, and Caetano Traina, Jr. A new algorithm for data
discretization and feature selection. In Proceedings of the 2008 ACM symposium on Applied
computing, SAC ’08, pages 953–954, New York, NY, USA, 2008. ACM. ISBN 978-1-59593-753-7.
doi: 10.1145/1363686.1363905.
[27] James A. Rodger. Toward reducing failure risk in an integrated vehicle health maintenance system: A
fuzzy multi-sensor data fusion kalman filter approach for ivhms. Expert Systems with Applications,
39(10):9821 – 9836, 2012. ISSN 0957-4174. doi: https://doi.org/10.1016/j.eswa.2012.02.171.
[28] W. Sammouri, E. Cafame, L. Oukhellou, and P. Aknin. Mining floating train data sequences for
temporal association rules within a predictive maintenance framework. Lecture Notes in Computer
subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 7987
LNAI:112–126, 2013.
[29] T. Scheffer. Finding association rules that trade support optimally against confidence. pages 9: 381–
395, 1995.
[30] Carlos Roberto Silveira-Junior. github.com/carlossilveirajr/mitsai, April 2017. Last Access April 17,
2017.
[31] Carlos Roberto Silveira-Junior, Danilo Codeco Carvalho, Marilde Terezinha Prado Santos, and
Marcela Xavier Ribeiro. Incremental mining of frequent sequences in environmental sensor data. In
The Twenty-Sixth International FLAIRS Conference (2015), pages 452–455, .
[32] Carlos Roberto Silveira-Junior, Marilde Prado Santos, and Marcela Ribeiro. Stretchy time pattern
mining: A deeper analysis of environment sensor data. pages 468–473, .
[33] Carlos Roberto Silveira-Junior, Marcela Xavier Ribeiro, and Marilde Terezinha Prado Santos. A
flexible architecture to integrate the solar satellite image time series data - the setl architecture. Pages
1–14, 2017. No publish by the time of this paper preparation.
[34] Olga Spatenkovaa and Kirsi Virrantausb. Discovering spatio-temporal relationships in the distribution
of building fires. Fire Safety Journal, 62, Part A:49 – 63, 2013. ISSN 0379-7112. doi: http://dx.
doi.org/10.1016/j.firesaf.2013.07.001. Special Issue on Spatial Analytical Approaches in Urban Fire
Management.
[35] Fenzhen Su, Chenghu Zhou, and Wenzhoung Shi. Geoevent association rule discovery model based on
rough set with marine fishery application. In Geoscience and Remote Sensing Symposium, 2004.
IGARSS ’04. Proceedings. 2004 IEEE International, volume 2, pages 1455–1458 vol.2, Sept 2004.
doi: 10.1109/IGARSS.2004.1368694.
[36] Edi Winarko and John F. Roddick. Armada: An algorithm for discovering richer relative temporal
association rules from interval-based data. Data & Knowledge Engineering, 63(1):76 – 90, 2007 ISSN
0169-023X. doi: http://dx.doi.org/10.1016/j.datak.2006.10.009. Data Warehouse an Knowledge
Discovery, 7th International Congress on Data Warehouse and Knowledge Discovery.
[37] J.S. Yoo and M. Bow. Mining spatial colocation patterns: A different framework. Data Mining and
Knowledge Discovery, 24(1):159–194, 2012.
[38] B. Zaragoza, A. Rabasa, J.J. Rodriguez-Sala, J.T. Navarro, A. Belda, and A. Ramon. Modelling
farmland abandonment: A study combining gis and data mining techniques. volume 155, pages 124 –
132, 2012. doi: http://dx.doi.org/10.1016/j.agee.2012.03.019.
AUTHORS
Carlos Roberto Silveira Junior is a System Architect at Ericsson expert in software development for
telecom systems. He is a Ph.D. Student at Federal University of São Carlos (UFSCar) working with Data
Mining -Artificial Intelligence- with images processing and parallelism. Holds a Master Degree in Data
Mining and ontology at UFSCar, graduated Computer Science at UFSCar. Main areas of interest: data
mining, programming, and distributed system.
José Roberto Cecatto holds a BS in Physics from the Pontifical Catholic University of São Paulo, a
Master’s and a PhD in Astrophysics from the National Institute for Space Research (INPE). He is currently a
researcher at INPE. He has experience in Astronomy, with an emphasis on Radio Astronomy and
instrumental development, acting mainly on the following themes: solar, radio, spectroscopy and
interferometry. Currently, working in Space Climate with activities and development of tools related to the
forecast of solar phenomena.
Marilde Terezinha Prado Santos is an associate professor at the Center for Exact Sciences and
Technology / Department of Computing at the Federal University of São Carlos. PhD in Science with
emphasis in Computational Physics from the University of São Paulo (2000). Master in Computer Science
from the Federal University of São Carlos (1994) and Bachelor in Computer Science from the Pontifical
Catholic University of Rio Grande do Sul (1991). Main areas of interest: engineering and applications of
crisp and fuzzy ontologies, information retrieval, data mining, data integration and semantic web.

More Related Content

What's hot

dkNET Webinar: "The Microphysiology Systems Database (MPS-Db): A Platform For...
dkNET Webinar: "The Microphysiology Systems Database (MPS-Db): A Platform For...dkNET Webinar: "The Microphysiology Systems Database (MPS-Db): A Platform For...
dkNET Webinar: "The Microphysiology Systems Database (MPS-Db): A Platform For...dkNET
 
META DATA QUALITY CONTROL ARCHITECTURE IN DATA WAREHOUSING
META DATA QUALITY CONTROL ARCHITECTURE IN DATA WAREHOUSINGMETA DATA QUALITY CONTROL ARCHITECTURE IN DATA WAREHOUSING
META DATA QUALITY CONTROL ARCHITECTURE IN DATA WAREHOUSINGIJCSEIT Journal
 
iOMICS Clinical & Omnia
iOMICS Clinical & OmniaiOMICS Clinical & Omnia
iOMICS Clinical & OmniaInterpretOmics
 
Decision Forests: FDA's Tool for Data Analysis and Pattern-Recognition Methods
Decision Forests: FDA's Tool for Data Analysis and Pattern-Recognition MethodsDecision Forests: FDA's Tool for Data Analysis and Pattern-Recognition Methods
Decision Forests: FDA's Tool for Data Analysis and Pattern-Recognition MethodsEMMAIntl
 
Data Mining in Health Care
Data Mining in Health CareData Mining in Health Care
Data Mining in Health CareShahDhruv21
 
CEDAR work bench for metadata management
CEDAR work bench for metadata managementCEDAR work bench for metadata management
CEDAR work bench for metadata managementPistoia Alliance
 
Big Data in Disease Management
Big Data in Disease ManagementBig Data in Disease Management
Big Data in Disease ManagementInterpretOmics
 
Synthesizing Big Data into Actionable Knowledge
Synthesizing Big Data into Actionable KnowledgeSynthesizing Big Data into Actionable Knowledge
Synthesizing Big Data into Actionable KnowledgeMaria-Esther Vidal
 
Big data in Healthcare & Life Sciences
Big data in Healthcare & Life SciencesBig data in Healthcare & Life Sciences
Big data in Healthcare & Life SciencesMatthias Vallaey
 
NIH Big Data to Knowledge (BD2K)
NIH Big Data to Knowledge (BD2K)NIH Big Data to Knowledge (BD2K)
NIH Big Data to Knowledge (BD2K)Lance K. Manning
 
Quantitative Medicine Feb 2009
Quantitative Medicine Feb 2009Quantitative Medicine Feb 2009
Quantitative Medicine Feb 2009Ian Foster
 
Big Data in Healthcare: Hype and Hope on the Path to Personalized Medicine
Big Data in Healthcare: Hype and Hope on the Path to Personalized MedicineBig Data in Healthcare: Hype and Hope on the Path to Personalized Medicine
Big Data in Healthcare: Hype and Hope on the Path to Personalized MedicineNew York eHealth Collaborative
 
Reveal Hidden Patterns in Healthcare Data: Graph Analytics and the Opioid Crisis
Reveal Hidden Patterns in Healthcare Data: Graph Analytics and the Opioid CrisisReveal Hidden Patterns in Healthcare Data: Graph Analytics and the Opioid Crisis
Reveal Hidden Patterns in Healthcare Data: Graph Analytics and the Opioid CrisisNeo4j
 
IRJET- A Review of Data Cleaning and its Current Approaches
IRJET- A Review of Data Cleaning and its Current ApproachesIRJET- A Review of Data Cleaning and its Current Approaches
IRJET- A Review of Data Cleaning and its Current ApproachesIRJET Journal
 
Big data for healthcare analytics final -v0.3 miz
Big data for healthcare analytics   final -v0.3 mizBig data for healthcare analytics   final -v0.3 miz
Big data for healthcare analytics final -v0.3 mizYusuf Brima
 
A SURVEY OF LINK MINING AND ANOMALIES DETECTION
A SURVEY OF LINK MINING AND ANOMALIES DETECTIONA SURVEY OF LINK MINING AND ANOMALIES DETECTION
A SURVEY OF LINK MINING AND ANOMALIES DETECTIONIJDKP
 
Big Data applications in Health Care
Big Data applications in Health CareBig Data applications in Health Care
Big Data applications in Health CareLeo Barella
 
DataPharmaNovember2016
DataPharmaNovember2016DataPharmaNovember2016
DataPharmaNovember2016Pfizer
 

What's hot (20)

dkNET Webinar: "The Microphysiology Systems Database (MPS-Db): A Platform For...
dkNET Webinar: "The Microphysiology Systems Database (MPS-Db): A Platform For...dkNET Webinar: "The Microphysiology Systems Database (MPS-Db): A Platform For...
dkNET Webinar: "The Microphysiology Systems Database (MPS-Db): A Platform For...
 
META DATA QUALITY CONTROL ARCHITECTURE IN DATA WAREHOUSING
META DATA QUALITY CONTROL ARCHITECTURE IN DATA WAREHOUSINGMETA DATA QUALITY CONTROL ARCHITECTURE IN DATA WAREHOUSING
META DATA QUALITY CONTROL ARCHITECTURE IN DATA WAREHOUSING
 
iOMICS Clinical & Omnia
iOMICS Clinical & OmniaiOMICS Clinical & Omnia
iOMICS Clinical & Omnia
 
Decision Forests: FDA's Tool for Data Analysis and Pattern-Recognition Methods
Decision Forests: FDA's Tool for Data Analysis and Pattern-Recognition MethodsDecision Forests: FDA's Tool for Data Analysis and Pattern-Recognition Methods
Decision Forests: FDA's Tool for Data Analysis and Pattern-Recognition Methods
 
Data Mining in Health Care
Data Mining in Health CareData Mining in Health Care
Data Mining in Health Care
 
CEDAR work bench for metadata management
CEDAR work bench for metadata managementCEDAR work bench for metadata management
CEDAR work bench for metadata management
 
Big Data in Disease Management
Big Data in Disease ManagementBig Data in Disease Management
Big Data in Disease Management
 
Synthesizing Big Data into Actionable Knowledge
Synthesizing Big Data into Actionable KnowledgeSynthesizing Big Data into Actionable Knowledge
Synthesizing Big Data into Actionable Knowledge
 
Big data in Healthcare & Life Sciences
Big data in Healthcare & Life SciencesBig data in Healthcare & Life Sciences
Big data in Healthcare & Life Sciences
 
NIH Big Data to Knowledge (BD2K)
NIH Big Data to Knowledge (BD2K)NIH Big Data to Knowledge (BD2K)
NIH Big Data to Knowledge (BD2K)
 
Quantitative Medicine Feb 2009
Quantitative Medicine Feb 2009Quantitative Medicine Feb 2009
Quantitative Medicine Feb 2009
 
Big Data in Healthcare: Hype and Hope on the Path to Personalized Medicine
Big Data in Healthcare: Hype and Hope on the Path to Personalized MedicineBig Data in Healthcare: Hype and Hope on the Path to Personalized Medicine
Big Data in Healthcare: Hype and Hope on the Path to Personalized Medicine
 
Reveal Hidden Patterns in Healthcare Data: Graph Analytics and the Opioid Crisis
Reveal Hidden Patterns in Healthcare Data: Graph Analytics and the Opioid CrisisReveal Hidden Patterns in Healthcare Data: Graph Analytics and the Opioid Crisis
Reveal Hidden Patterns in Healthcare Data: Graph Analytics and the Opioid Crisis
 
IRJET- A Review of Data Cleaning and its Current Approaches
IRJET- A Review of Data Cleaning and its Current ApproachesIRJET- A Review of Data Cleaning and its Current Approaches
IRJET- A Review of Data Cleaning and its Current Approaches
 
Big data for healthcare analytics final -v0.3 miz
Big data for healthcare analytics   final -v0.3 mizBig data for healthcare analytics   final -v0.3 miz
Big data for healthcare analytics final -v0.3 miz
 
A SURVEY OF LINK MINING AND ANOMALIES DETECTION
A SURVEY OF LINK MINING AND ANOMALIES DETECTIONA SURVEY OF LINK MINING AND ANOMALIES DETECTION
A SURVEY OF LINK MINING AND ANOMALIES DETECTION
 
Sara Gerke: "AI in Drug Discovery and Clinical Trials"
Sara Gerke: "AI in Drug Discovery and Clinical Trials"Sara Gerke: "AI in Drug Discovery and Clinical Trials"
Sara Gerke: "AI in Drug Discovery and Clinical Trials"
 
Big Data applications in Health Care
Big Data applications in Health CareBig Data applications in Health Care
Big Data applications in Health Care
 
DataPharmaNovember2016
DataPharmaNovember2016DataPharmaNovember2016
DataPharmaNovember2016
 
research publication
research publicationresearch publication
research publication
 

Similar to April 2020 top read articles in data mining & knowledge management process research articles

A Comparative Study on Mushroom Classification using Supervised Machine Learn...
A Comparative Study on Mushroom Classification using Supervised Machine Learn...A Comparative Study on Mushroom Classification using Supervised Machine Learn...
A Comparative Study on Mushroom Classification using Supervised Machine Learn...YogeshIJTSRD
 
A WEB REPOSITORY SYSTEM FOR DATA MINING IN DRUG DISCOVERY
A WEB REPOSITORY SYSTEM FOR DATA MINING IN DRUG DISCOVERYA WEB REPOSITORY SYSTEM FOR DATA MINING IN DRUG DISCOVERY
A WEB REPOSITORY SYSTEM FOR DATA MINING IN DRUG DISCOVERYIJDKP
 
A WEB REPOSITORY SYSTEM FOR DATA MINING IN DRUG DISCOVERY
A WEB REPOSITORY SYSTEM FOR DATA MINING IN DRUG DISCOVERYA WEB REPOSITORY SYSTEM FOR DATA MINING IN DRUG DISCOVERY
A WEB REPOSITORY SYSTEM FOR DATA MINING IN DRUG DISCOVERYIJDKP
 
Nucl. Acids Res.-2014-Howe-nar-gku1244
Nucl. Acids Res.-2014-Howe-nar-gku1244Nucl. Acids Res.-2014-Howe-nar-gku1244
Nucl. Acids Res.-2014-Howe-nar-gku1244Yasel Cruz
 
Embi cri review-2012-final
Embi cri review-2012-finalEmbi cri review-2012-final
Embi cri review-2012-finalPeter Embi
 
February 2024-: Top Read Articles in Computer Science & Information Technology
February 2024-: Top Read Articles in Computer Science & Information TechnologyFebruary 2024-: Top Read Articles in Computer Science & Information Technology
February 2024-: Top Read Articles in Computer Science & Information TechnologyAIRCC Publishing Corporation
 
January 2024 : Top 10 Downloaded Articles in Computer Science & Information ...
January 2024 :  Top 10 Downloaded Articles in Computer Science & Information ...January 2024 :  Top 10 Downloaded Articles in Computer Science & Information ...
January 2024 : Top 10 Downloaded Articles in Computer Science & Information ...AIRCC Publishing Corporation
 
DATA MINING CLASSIFICATION ALGORITHMS FOR KIDNEY DISEASE PREDICTION
DATA MINING CLASSIFICATION ALGORITHMS FOR KIDNEY DISEASE PREDICTION DATA MINING CLASSIFICATION ALGORITHMS FOR KIDNEY DISEASE PREDICTION
DATA MINING CLASSIFICATION ALGORITHMS FOR KIDNEY DISEASE PREDICTION IJCI JOURNAL
 
Comparative Analysis of Classification Algorithms using Weka
Comparative Analysis of Classification Algorithms using WekaComparative Analysis of Classification Algorithms using Weka
Comparative Analysis of Classification Algorithms using Wekaijtsrd
 
Early Identification of Diseases Based on Responsible Attribute using Data Mi...
Early Identification of Diseases Based on Responsible Attribute using Data Mi...Early Identification of Diseases Based on Responsible Attribute using Data Mi...
Early Identification of Diseases Based on Responsible Attribute using Data Mi...IRJET Journal
 
IRJET- Review on Knowledge Discovery and Analysis in Healthcare using Dat...
IRJET-  	  Review on Knowledge Discovery and Analysis in Healthcare using Dat...IRJET-  	  Review on Knowledge Discovery and Analysis in Healthcare using Dat...
IRJET- Review on Knowledge Discovery and Analysis in Healthcare using Dat...IRJET Journal
 
Advances in computer aided drug design
Advances in computer aided drug designAdvances in computer aided drug design
Advances in computer aided drug designVikas Soni
 
MULTI MODEL DATA MINING APPROACH FOR HEART FAILURE PREDICTION
MULTI MODEL DATA MINING APPROACH FOR HEART FAILURE PREDICTIONMULTI MODEL DATA MINING APPROACH FOR HEART FAILURE PREDICTION
MULTI MODEL DATA MINING APPROACH FOR HEART FAILURE PREDICTIONIJDKP
 
GASCAN: A Novel Database for Gastric Cancer Genes and Primers
GASCAN: A Novel Database for Gastric Cancer Genes and PrimersGASCAN: A Novel Database for Gastric Cancer Genes and Primers
GASCAN: A Novel Database for Gastric Cancer Genes and Primersijdmtaiir
 
OPTIMAL EXTRACTION OF INTEGRATED WEB INTERACTIVE CLINICAL INFORMATICS TOWARDS...
OPTIMAL EXTRACTION OF INTEGRATED WEB INTERACTIVE CLINICAL INFORMATICS TOWARDS...OPTIMAL EXTRACTION OF INTEGRATED WEB INTERACTIVE CLINICAL INFORMATICS TOWARDS...
OPTIMAL EXTRACTION OF INTEGRATED WEB INTERACTIVE CLINICAL INFORMATICS TOWARDS...IAEME Publication
 
IMPACT OF DIFFERENT SELECTION STRATEGIES ON PERFORMANCE OF GA BASED INFORMATI...
IMPACT OF DIFFERENT SELECTION STRATEGIES ON PERFORMANCE OF GA BASED INFORMATI...IMPACT OF DIFFERENT SELECTION STRATEGIES ON PERFORMANCE OF GA BASED INFORMATI...
IMPACT OF DIFFERENT SELECTION STRATEGIES ON PERFORMANCE OF GA BASED INFORMATI...ijcsa
 
How Data Commons are Changing the Way that Large Datasets Are Analyzed and Sh...
How Data Commons are Changing the Way that Large Datasets Are Analyzed and Sh...How Data Commons are Changing the Way that Large Datasets Are Analyzed and Sh...
How Data Commons are Changing the Way that Large Datasets Are Analyzed and Sh...Robert Grossman
 
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...IJERD Editor
 

Similar to April 2020 top read articles in data mining & knowledge management process research articles (20)

A Comparative Study on Mushroom Classification using Supervised Machine Learn...
A Comparative Study on Mushroom Classification using Supervised Machine Learn...A Comparative Study on Mushroom Classification using Supervised Machine Learn...
A Comparative Study on Mushroom Classification using Supervised Machine Learn...
 
A WEB REPOSITORY SYSTEM FOR DATA MINING IN DRUG DISCOVERY
A WEB REPOSITORY SYSTEM FOR DATA MINING IN DRUG DISCOVERYA WEB REPOSITORY SYSTEM FOR DATA MINING IN DRUG DISCOVERY
A WEB REPOSITORY SYSTEM FOR DATA MINING IN DRUG DISCOVERY
 
A WEB REPOSITORY SYSTEM FOR DATA MINING IN DRUG DISCOVERY
A WEB REPOSITORY SYSTEM FOR DATA MINING IN DRUG DISCOVERYA WEB REPOSITORY SYSTEM FOR DATA MINING IN DRUG DISCOVERY
A WEB REPOSITORY SYSTEM FOR DATA MINING IN DRUG DISCOVERY
 
Nucl. Acids Res.-2014-Howe-nar-gku1244
Nucl. Acids Res.-2014-Howe-nar-gku1244Nucl. Acids Res.-2014-Howe-nar-gku1244
Nucl. Acids Res.-2014-Howe-nar-gku1244
 
Embi cri review-2012-final
Embi cri review-2012-finalEmbi cri review-2012-final
Embi cri review-2012-final
 
February 2024-: Top Read Articles in Computer Science & Information Technology
February 2024-: Top Read Articles in Computer Science & Information TechnologyFebruary 2024-: Top Read Articles in Computer Science & Information Technology
February 2024-: Top Read Articles in Computer Science & Information Technology
 
January 2024 : Top 10 Downloaded Articles in Computer Science & Information ...
January 2024 :  Top 10 Downloaded Articles in Computer Science & Information ...January 2024 :  Top 10 Downloaded Articles in Computer Science & Information ...
January 2024 : Top 10 Downloaded Articles in Computer Science & Information ...
 
DATA MINING CLASSIFICATION ALGORITHMS FOR KIDNEY DISEASE PREDICTION
DATA MINING CLASSIFICATION ALGORITHMS FOR KIDNEY DISEASE PREDICTION DATA MINING CLASSIFICATION ALGORITHMS FOR KIDNEY DISEASE PREDICTION
DATA MINING CLASSIFICATION ALGORITHMS FOR KIDNEY DISEASE PREDICTION
 
Comparative Analysis of Classification Algorithms using Weka
Comparative Analysis of Classification Algorithms using WekaComparative Analysis of Classification Algorithms using Weka
Comparative Analysis of Classification Algorithms using Weka
 
Early Identification of Diseases Based on Responsible Attribute using Data Mi...
Early Identification of Diseases Based on Responsible Attribute using Data Mi...Early Identification of Diseases Based on Responsible Attribute using Data Mi...
Early Identification of Diseases Based on Responsible Attribute using Data Mi...
 
IRJET- Review on Knowledge Discovery and Analysis in Healthcare using Dat...
IRJET-  	  Review on Knowledge Discovery and Analysis in Healthcare using Dat...IRJET-  	  Review on Knowledge Discovery and Analysis in Healthcare using Dat...
IRJET- Review on Knowledge Discovery and Analysis in Healthcare using Dat...
 
Advances in computer aided drug design
Advances in computer aided drug designAdvances in computer aided drug design
Advances in computer aided drug design
 
MULTI MODEL DATA MINING APPROACH FOR HEART FAILURE PREDICTION
MULTI MODEL DATA MINING APPROACH FOR HEART FAILURE PREDICTIONMULTI MODEL DATA MINING APPROACH FOR HEART FAILURE PREDICTION
MULTI MODEL DATA MINING APPROACH FOR HEART FAILURE PREDICTION
 
GASCAN: A Novel Database for Gastric Cancer Genes and Primers
GASCAN: A Novel Database for Gastric Cancer Genes and PrimersGASCAN: A Novel Database for Gastric Cancer Genes and Primers
GASCAN: A Novel Database for Gastric Cancer Genes and Primers
 
OPTIMAL EXTRACTION OF INTEGRATED WEB INTERACTIVE CLINICAL INFORMATICS TOWARDS...
OPTIMAL EXTRACTION OF INTEGRATED WEB INTERACTIVE CLINICAL INFORMATICS TOWARDS...OPTIMAL EXTRACTION OF INTEGRATED WEB INTERACTIVE CLINICAL INFORMATICS TOWARDS...
OPTIMAL EXTRACTION OF INTEGRATED WEB INTERACTIVE CLINICAL INFORMATICS TOWARDS...
 
IMPACT OF DIFFERENT SELECTION STRATEGIES ON PERFORMANCE OF GA BASED INFORMATI...
IMPACT OF DIFFERENT SELECTION STRATEGIES ON PERFORMANCE OF GA BASED INFORMATI...IMPACT OF DIFFERENT SELECTION STRATEGIES ON PERFORMANCE OF GA BASED INFORMATI...
IMPACT OF DIFFERENT SELECTION STRATEGIES ON PERFORMANCE OF GA BASED INFORMATI...
 
D1803012022
D1803012022D1803012022
D1803012022
 
How Data Commons are Changing the Way that Large Datasets Are Analyzed and Sh...
How Data Commons are Changing the Way that Large Datasets Are Analyzed and Sh...How Data Commons are Changing the Way that Large Datasets Are Analyzed and Sh...
How Data Commons are Changing the Way that Large Datasets Are Analyzed and Sh...
 
Online Resources to Support Open Drug Discovery Systems
Online Resources to Support Open Drug Discovery SystemsOnline Resources to Support Open Drug Discovery Systems
Online Resources to Support Open Drug Discovery Systems
 
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
 

Recently uploaded

An Overview of the Odoo 17 Knowledge App
An Overview of the Odoo 17 Knowledge AppAn Overview of the Odoo 17 Knowledge App
An Overview of the Odoo 17 Knowledge AppCeline George
 
ĐỀ THAM KHẢO KÌ THI TUYỂN SINH VÀO LỚP 10 MÔN TIẾNG ANH FORM 50 CÂU TRẮC NGHI...
ĐỀ THAM KHẢO KÌ THI TUYỂN SINH VÀO LỚP 10 MÔN TIẾNG ANH FORM 50 CÂU TRẮC NGHI...ĐỀ THAM KHẢO KÌ THI TUYỂN SINH VÀO LỚP 10 MÔN TIẾNG ANH FORM 50 CÂU TRẮC NGHI...
ĐỀ THAM KHẢO KÌ THI TUYỂN SINH VÀO LỚP 10 MÔN TIẾNG ANH FORM 50 CÂU TRẮC NGHI...Nguyen Thanh Tu Collection
 
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...Nguyen Thanh Tu Collection
 
Basic Civil Engineering notes on Transportation Engineering & Modes of Transport
Basic Civil Engineering notes on Transportation Engineering & Modes of TransportBasic Civil Engineering notes on Transportation Engineering & Modes of Transport
Basic Civil Engineering notes on Transportation Engineering & Modes of TransportDenish Jangid
 
UChicago CMSC 23320 - The Best Commit Messages of 2024
UChicago CMSC 23320 - The Best Commit Messages of 2024UChicago CMSC 23320 - The Best Commit Messages of 2024
UChicago CMSC 23320 - The Best Commit Messages of 2024Borja Sotomayor
 
會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽
會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽
會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽中 央社
 
DEMONSTRATION LESSON IN ENGLISH 4 MATATAG CURRICULUM
DEMONSTRATION LESSON IN ENGLISH 4 MATATAG CURRICULUMDEMONSTRATION LESSON IN ENGLISH 4 MATATAG CURRICULUM
DEMONSTRATION LESSON IN ENGLISH 4 MATATAG CURRICULUMELOISARIVERA8
 
e-Sealing at EADTU by Kamakshi Rajagopal
e-Sealing at EADTU by Kamakshi Rajagopale-Sealing at EADTU by Kamakshi Rajagopal
e-Sealing at EADTU by Kamakshi RajagopalEADTU
 
BỘ LUYỆN NGHE TIẾNG ANH 8 GLOBAL SUCCESS CẢ NĂM (GỒM 12 UNITS, MỖI UNIT GỒM 3...
BỘ LUYỆN NGHE TIẾNG ANH 8 GLOBAL SUCCESS CẢ NĂM (GỒM 12 UNITS, MỖI UNIT GỒM 3...BỘ LUYỆN NGHE TIẾNG ANH 8 GLOBAL SUCCESS CẢ NĂM (GỒM 12 UNITS, MỖI UNIT GỒM 3...
BỘ LUYỆN NGHE TIẾNG ANH 8 GLOBAL SUCCESS CẢ NĂM (GỒM 12 UNITS, MỖI UNIT GỒM 3...Nguyen Thanh Tu Collection
 
SPLICE Working Group: Reusable Code Examples
SPLICE Working Group:Reusable Code ExamplesSPLICE Working Group:Reusable Code Examples
SPLICE Working Group: Reusable Code ExamplesPeter Brusilovsky
 
The Story of Village Palampur Class 9 Free Study Material PDF
The Story of Village Palampur Class 9 Free Study Material PDFThe Story of Village Palampur Class 9 Free Study Material PDF
The Story of Village Palampur Class 9 Free Study Material PDFVivekanand Anglo Vedic Academy
 
會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文
會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文
會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文中 央社
 
When Quality Assurance Meets Innovation in Higher Education - Report launch w...
When Quality Assurance Meets Innovation in Higher Education - Report launch w...When Quality Assurance Meets Innovation in Higher Education - Report launch w...
When Quality Assurance Meets Innovation in Higher Education - Report launch w...Gary Wood
 
ANTI PARKISON DRUGS.pptx
ANTI         PARKISON          DRUGS.pptxANTI         PARKISON          DRUGS.pptx
ANTI PARKISON DRUGS.pptxPoojaSen20
 
Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...
Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...
Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...EADTU
 
Sternal Fractures & Dislocations - EMGuidewire Radiology Reading Room
Sternal Fractures & Dislocations - EMGuidewire Radiology Reading RoomSternal Fractures & Dislocations - EMGuidewire Radiology Reading Room
Sternal Fractures & Dislocations - EMGuidewire Radiology Reading RoomSean M. Fox
 
male presentation...pdf.................
male presentation...pdf.................male presentation...pdf.................
male presentation...pdf.................MirzaAbrarBaig5
 
24 ĐỀ THAM KHẢO KÌ THI TUYỂN SINH VÀO LỚP 10 MÔN TIẾNG ANH SỞ GIÁO DỤC HẢI DƯ...
24 ĐỀ THAM KHẢO KÌ THI TUYỂN SINH VÀO LỚP 10 MÔN TIẾNG ANH SỞ GIÁO DỤC HẢI DƯ...24 ĐỀ THAM KHẢO KÌ THI TUYỂN SINH VÀO LỚP 10 MÔN TIẾNG ANH SỞ GIÁO DỤC HẢI DƯ...
24 ĐỀ THAM KHẢO KÌ THI TUYỂN SINH VÀO LỚP 10 MÔN TIẾNG ANH SỞ GIÁO DỤC HẢI DƯ...Nguyen Thanh Tu Collection
 
Spring gala 2024 photo slideshow - Celebrating School-Community Partnerships
Spring gala 2024 photo slideshow - Celebrating School-Community PartnershipsSpring gala 2024 photo slideshow - Celebrating School-Community Partnerships
Spring gala 2024 photo slideshow - Celebrating School-Community Partnershipsexpandedwebsite
 

Recently uploaded (20)

An Overview of the Odoo 17 Knowledge App
An Overview of the Odoo 17 Knowledge AppAn Overview of the Odoo 17 Knowledge App
An Overview of the Odoo 17 Knowledge App
 
ĐỀ THAM KHẢO KÌ THI TUYỂN SINH VÀO LỚP 10 MÔN TIẾNG ANH FORM 50 CÂU TRẮC NGHI...
ĐỀ THAM KHẢO KÌ THI TUYỂN SINH VÀO LỚP 10 MÔN TIẾNG ANH FORM 50 CÂU TRẮC NGHI...ĐỀ THAM KHẢO KÌ THI TUYỂN SINH VÀO LỚP 10 MÔN TIẾNG ANH FORM 50 CÂU TRẮC NGHI...
ĐỀ THAM KHẢO KÌ THI TUYỂN SINH VÀO LỚP 10 MÔN TIẾNG ANH FORM 50 CÂU TRẮC NGHI...
 
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...
 
Basic Civil Engineering notes on Transportation Engineering & Modes of Transport
Basic Civil Engineering notes on Transportation Engineering & Modes of TransportBasic Civil Engineering notes on Transportation Engineering & Modes of Transport
Basic Civil Engineering notes on Transportation Engineering & Modes of Transport
 
UChicago CMSC 23320 - The Best Commit Messages of 2024
UChicago CMSC 23320 - The Best Commit Messages of 2024UChicago CMSC 23320 - The Best Commit Messages of 2024
UChicago CMSC 23320 - The Best Commit Messages of 2024
 
會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽
會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽
會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽
 
DEMONSTRATION LESSON IN ENGLISH 4 MATATAG CURRICULUM
DEMONSTRATION LESSON IN ENGLISH 4 MATATAG CURRICULUMDEMONSTRATION LESSON IN ENGLISH 4 MATATAG CURRICULUM
DEMONSTRATION LESSON IN ENGLISH 4 MATATAG CURRICULUM
 
e-Sealing at EADTU by Kamakshi Rajagopal
e-Sealing at EADTU by Kamakshi Rajagopale-Sealing at EADTU by Kamakshi Rajagopal
e-Sealing at EADTU by Kamakshi Rajagopal
 
BỘ LUYỆN NGHE TIẾNG ANH 8 GLOBAL SUCCESS CẢ NĂM (GỒM 12 UNITS, MỖI UNIT GỒM 3...
BỘ LUYỆN NGHE TIẾNG ANH 8 GLOBAL SUCCESS CẢ NĂM (GỒM 12 UNITS, MỖI UNIT GỒM 3...BỘ LUYỆN NGHE TIẾNG ANH 8 GLOBAL SUCCESS CẢ NĂM (GỒM 12 UNITS, MỖI UNIT GỒM 3...
BỘ LUYỆN NGHE TIẾNG ANH 8 GLOBAL SUCCESS CẢ NĂM (GỒM 12 UNITS, MỖI UNIT GỒM 3...
 
SPLICE Working Group: Reusable Code Examples
SPLICE Working Group:Reusable Code ExamplesSPLICE Working Group:Reusable Code Examples
SPLICE Working Group: Reusable Code Examples
 
The Story of Village Palampur Class 9 Free Study Material PDF
The Story of Village Palampur Class 9 Free Study Material PDFThe Story of Village Palampur Class 9 Free Study Material PDF
The Story of Village Palampur Class 9 Free Study Material PDF
 
會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文
會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文
會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文
 
When Quality Assurance Meets Innovation in Higher Education - Report launch w...
When Quality Assurance Meets Innovation in Higher Education - Report launch w...When Quality Assurance Meets Innovation in Higher Education - Report launch w...
When Quality Assurance Meets Innovation in Higher Education - Report launch w...
 
OS-operating systems- ch05 (CPU Scheduling) ...
OS-operating systems- ch05 (CPU Scheduling) ...OS-operating systems- ch05 (CPU Scheduling) ...
OS-operating systems- ch05 (CPU Scheduling) ...
 
ANTI PARKISON DRUGS.pptx
ANTI         PARKISON          DRUGS.pptxANTI         PARKISON          DRUGS.pptx
ANTI PARKISON DRUGS.pptx
 
Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...
Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...
Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...
 
Sternal Fractures & Dislocations - EMGuidewire Radiology Reading Room
Sternal Fractures & Dislocations - EMGuidewire Radiology Reading RoomSternal Fractures & Dislocations - EMGuidewire Radiology Reading Room
Sternal Fractures & Dislocations - EMGuidewire Radiology Reading Room
 
male presentation...pdf.................
male presentation...pdf.................male presentation...pdf.................
male presentation...pdf.................
 
24 ĐỀ THAM KHẢO KÌ THI TUYỂN SINH VÀO LỚP 10 MÔN TIẾNG ANH SỞ GIÁO DỤC HẢI DƯ...
24 ĐỀ THAM KHẢO KÌ THI TUYỂN SINH VÀO LỚP 10 MÔN TIẾNG ANH SỞ GIÁO DỤC HẢI DƯ...24 ĐỀ THAM KHẢO KÌ THI TUYỂN SINH VÀO LỚP 10 MÔN TIẾNG ANH SỞ GIÁO DỤC HẢI DƯ...
24 ĐỀ THAM KHẢO KÌ THI TUYỂN SINH VÀO LỚP 10 MÔN TIẾNG ANH SỞ GIÁO DỤC HẢI DƯ...
 
Spring gala 2024 photo slideshow - Celebrating School-Community Partnerships
Spring gala 2024 photo slideshow - Celebrating School-Community PartnershipsSpring gala 2024 photo slideshow - Celebrating School-Community Partnerships
Spring gala 2024 photo slideshow - Celebrating School-Community Partnerships
 

April 2020 top read articles in data mining & knowledge management process research articles

  • 1. April 2020: Top Read Articles In Data Mining & Knowledge Management Process Research Articles International Journal of Data Mining & Knowledge Management Process (IJDKP) ISSN : 2230 - 9608[Online] ; 2231 - 007X [Print] http://airccse.org/journal/ijdkp/ijdkp.html
  • 2. A WEB REPOSITORY SYSTEM FOR DATA MINING IN DRUG DISCOVERY Jiali Tang, Jack Wang and Ahmad Reza Hadaegh Department of Computer Science and Information System, California State University San Marcos, San Marcos, USA ABSTRACT This project is to produce a repository database system of drugs, drug features (properties), and drug targets where data can be mined and analyzed. Drug targets are different proteins that drugs try to bind to stop the activities of the protein. Users can utilize the database to mine useful data to predict the specific chemical properties that will have the relative efficacy of a specific target and the coefficient for each chemical property. This database system can be equipped with different data mining approaches/algorithms such as linear, non-linear, and classification types of data modelling. The data models have enhanced with the Genetic Evolution (GE) algorithms. This paper discusses implementation with the linear data models such as Multiple Linear Regression (MLR), Partial Least Square Regression (PLSR), and Support Vector Machine (SVM). . KEYWORDS Data Mining, Drug Discovery, Drug Description, Chemoinformatics, and Web Application For More Details : https://aircconline.com/ijdkp/V10N1/10120ijdkp01.pdf Volume Link : http://airccse.org/journal/ijdkp/vol10.html
  • 3. REFERENCES [1] Ko, Gene, Reddy, Srinivas, Garg, Rajni, Kumar, Sunil, & Hadaegh, Ahmad, (2012) “Computational Modelling Methods for QSAR Studies on HIV-1 Integrase Inhibitors (2005-2010),”. Curr Comput Aided Drug Des. Vol. 8, No 4, pp 255-270. [2] Thakor, Falguni, Hadaegh, Ahmad, & Zhang, Xiaoyu, (2017), ” Comparative study of Differential Evolutionary-Binary Particle Swarm Optimization (DE-BPSO) algorithm as a feature selection technique with different linear regression models for analysis of HIV-1 Integrase Inhibition features of Aryl β-Diketo Acids”, Proceedings of 9th International Conference on Bioinformatics and Computational Biology, Honolulu, Hawaii, USA, ISBN: 978–1–943436–07–1, pp 179-184. [3] Kane Ian, & Hadaegh Ahmad, “Non-linear Quantitative Structure-Activity Relationship (QSAR) Models for the Prediction of HIV Drug Performance”, (2015), 24th International Conference on Software Engineering and Data Engineering, pp 63-68. Vol 1, ISBN: 9781510812277, San Diego, CA. [4] Galvan Richard, Kashani, Maninatalsadat, & Hadaegh, Ahmad, “Improving Pharmacological Research of HIV-1 Integrase Inhibition Using Differential Evolution-Binary Particle Swarm Optimization and Non-Linear Adaptive Boosting Random Forest Regression”,(2015), IEEE International Workshop on Data Integration and Mining San Francisco, Information Reuse and Integration (IRI), IEEE International Conference, pp 485-490, DOI: 10.1109/IRI.2015.80. INSPEC Accession Number: 15556631. San Francisco, CA. [5] Kashani, Maninatalsadat, Galvan Richard, & Hadaegh Ahmad, “Improving the Feature Selection for the Development of Linear Model for Discovery of HIV-1 Integrase Inhibitors”, (2015) ABDA'15 International Conference on Advances in Big Data Analytics. In Proceeding of the 2015 International Conferences on Advances on Big Data Analyses, pp 150-154. ISBN: 1-60132-411-1, Las Vegas, Nevada. [6] Ko, Gene, Garg, Rajni, Kumar, Sunil, Kumar, Bailey, Barbara, & Hadaegh Ahmad, “A Hybridized Evolutionary Algorithm for Feature Selection of Chemical Descriptors for Computational QSAR Modeling of HIV-1 Integrase Inhibitors”, (2013), Computational Science Curriculum Development Forum and Applied Computational Science and Engineering Student Support for Industry, San Diego State University. [7] Ko, Gene, Garg, Rajni, Kumar, Sunil, Bailey, Barbara, & Hadaegh Ahmad, “Differential Evolution- Binary Particle Swarm Optimization for the Analysis of Aryl b-Kiketo Acids for HIV-1 Integres Inhibition, (2012), WCCI 2012 IEEE World Congress on Computational Intelligence. Brisbane Australia, pp 1849-1855. [8] Ko, Gene, Reddy, Srinivas, Kumar, Kumar, Bailey, Barbara, Garg, Rajni, & Hadaegh, Ahmad, “Evolutionary Computational Modelling of β-Diketo Acids for Virtual Screening of HIV-1 Integrase Inhibitors”, (2012), IEEE World Congress on Computational Intelligence, Brisbane, Australia. [9] Ko, Gene, Reddy, Srinivas, Kumar, Kumar, Garg, Rajni, & Hadaegh, Ahmad “Evolutionary Computational Modelling of β-Diketo Acids for Virtual Screening of HIV-1 Integrase Inhibitors”, (2012), 243rd National Meeting of the American Chemical Society, San Diego, CA. [10] Gonzales, Miguel, Turner, Chris, Ko, Gene, & Hadaegh, Ahmad, “Binary Particle Swarm Optimization Model of Dimeric Aryl Diketo Acid Inhibitors for HIV-1 Integrase” (2012), 243rd National Meeting of the American Chemical Society, San Diego, CA. [11] Ko, Gene, Reddy, Srinivas, Kumar, Sunil, Garg, Rajni, & Hadaegh, Ahmad, “Analysis of HIV-1 Integrase Inhibitors Using Computational QSAR Modelling”, (2012), Computational Science Curriculum Development Forum and Applied Computational Science and Engineering Student Support for Industry, San Diego State University.
  • 4. [12] Garg Rajni, Reddy Srinivas, Zhang Xiaoyu, & Hadaegh Ahmad, “MUT-HIV: Mutation database of HIV proteases”, (2007), American Chemical Society (ACS) 234th National Meeting & Exposition, Boston, MA USA CINF 42. [13] MLR: http://www.stat.yale.edu/Courses/1997-98/101/linmult.htm [14] PLSR: https://www.mathworks.com/help/stats/plsregress.html [15] https://techdifferences.com/difference-between-descriptive-and-predictive-data-mining.html [16] Zhong et al. Artificial intelligence in drug design. Sci China Life Sci. 2018 Jul 18. doi: 10.1007/s11427-018-9342-2. [Epub ahead of print] [17] Varsou Dimitra-Danai, Nikolakopoulos, Spyridon, Tsoumanis Andreas, Melagraki Georgia, & Afantitis, Antreas, “New Cheminformatics Platform for Drug Discovery and Computational Toxicology”, (2018), Methods Mol Biol. 2018; 1800:287-311. doi: 10.1007/978-1-4939-7899-1_14 [18] Ekins, Sean, Clark, Alex, Dole, Krishna, Gregory, Kellan, Mcnutt, Andrew, Spektor, Anna, Weatherall, Charlie, & Litterman, Nadia, “Data Mining and Computational Modeling of High- Throughput Screening Datasets”, (2018), Methods Mol Biol, 1755:197-221. doi: 10.1007/978-1-4939- 7724-6_14. [19] Sam Elizabeth, & Athri Prashanth, “Web-based drug repurposing tools: a survey. Brief Bioinform”, (2017), Oct 6. doi: 10.1093/bib/bbx125. [Epub ahead of print]. [20] Kaur, Charanpreet, & Bhardwaj, Shweta, “DRUG Discovery Using Data Mining International Journal of Information and Computation Technology”, (2014), ISSN 0974-2239 Volume 4, Number 4, pp 335- 342 © International Research Publications House http://www. irphouse.com /ijict.htm [21] Minaei-Bidgoli, Behrouz, & Punch, William, “Using Genetic Algorithms for Data Mining Optimization in an Educational Web-Based System, (2003), Genetic Algorithms Research and Applications Group (GARAGe) Department of Computer Science & Engineering Michigan State University 2340 Engineering Building East Lansing, MI 48824. [22] https://chm.kode-solutions.net/products_dragon.php [23] AWS LightSail: https://aws.amazon.com/lightsail/?nc2=h_ql_prod_fs_ls [24] AWS EC2 Server: https://aws.amazon.com/ec2/
  • 5. A REVIEW ON EVALUATION METRICS FOR DATA CLASSIFICATION EVALUATIONS Hossin, M.1 and Sulaiman, M.N.2 1 Faculty of Computer Science & Information Technology, Universiti Malaysia Sarawak, 94300 Kota Samarahan, Sarawak, Malaysia 2 Faculty of Computer Science & Information Technology, Universiti Putra Malaysia, 43400 UPM Serdang, Selangor, Malaysia ABSTRACT Evaluation metric plays a critical role in achieving the optimal classifier during the classification training. Thus, a selection of suitable evaluation metric is an important key for discriminating and obtaining the optimal classifier. This paper systematically reviewed the related evaluation metrics that are specifically designed as a discriminator for optimizing generative classifier. Generally, many generative classifiers employ accuracy as a measure to discriminate the optimal solution during the classification training. However, the accuracy has several weaknesses which are less distinctiveness, less discriminability, less in formativeness and bias to majority class data. This paper also briefly discusses other metrics that are specifically designed for discriminating the optimal solution. The shortcomings of these alternative metrics are also discussed. Finally, this paper suggests five important aspects that must be taken into consideration in constructing a new discriminator metric.. KEYWORDS Evaluation Metric, Accuracy, Optimized Classifier, Data Classification Evaluation For More Details : http://aircconline.com/ijdkp/V5N2/5215ijdkp01.pdf Volume Link : http://airccse.org/journal/ijdkp/vol5.html
  • 6. REFERENCES [1] A.A. Cardenas and J.S. Baras, “B-ROC curves for the assessment of classifiers over imbalanced datasets”, in Proc. of the 21st National Conference on Artificial Intelligence Vol. 2, 2006, pp. 1581-1584 [2] R. Caruana and A. Niculescu-Mizil, “Data mining in metric space: an empirical analysis of supervisedlearning performance criteria”, in Proc. of the 10th ACM SIGKDD International Conference onKnowledge Discovery and Data Mining (KDD '04), New York, NY, USA, ACM 2004, pp. 69-78. [3] N.V. Chawla, N. Japkowicz and A. Kolcz, “Editorial: Special issue on learning from imbalanced datasets”, SIGKDD Explorations, 6 (2004) 1-6. [4] T. Fawcett, “An Introduction to ROC Analysis”, Pattern Recognition Letters, 27 (2006) 861-874. [5] J. Davis and M. Goadrich, “The relationship between precision-recall and ROC curves”, in Proc. Ofthe 23rd International Conference on Machine Learning, 2006, pp. 233-240. [6] C. Drummond, R.C. Holte, “Cost curves: An Improved method for visualizing classifier performance”, Mach. Learn. 65 (2006) 95-130. [7] P.A. Flach, P.A., “The Geometry of ROC Space: understanding Machine Learning Metrics throughROC Isometrics”, in T. Fawcett and N. Mishra (Eds.) Proc. of the 20th Int. Conference on MachineLearning (ICML 2003), Washington, DC, USA, AAAI Press, 2003, pp. 194-201. [8] V. Garcia, R.A. Mollineda and J.S. Sanchez, “A bias correction function for classification performance assessment in two-class imbalanced problems”, Knowledge-Based Systems, 59(2014) 66-74. [9] S. Garcia and F. Herrera, “Evolutionary training set selection to optimize C4.5 in imbalance problems”, in Proc. of 8th Int. Conference on Hybrid Intelligent Systems (HIS 2008), Washington,DC, USA, IEEE Computer Society, 2008, pp.567-572. [10] N. Garcia-Pedrajas, J. A. Romero del Castillo and D. Ortiz-Boyer, “A cooperative coevolutionaryalgorithm for instance selection for instance-based learning”. Machine Learning (2010), 78 (2010)381-420. [11] Q. Gu, L. Zhu and Z. Cai, “Evaluation Measures of the Classification Performance of ImbalancedDatasets”, in Z. Cai et al. (Eds.) ISICA 2009, CCIS 51. Berlin, Heidelberg: Springer-Verlag, 2009,pp. 461-471. [12] S. Han, B. Yuan and W. Liu, “Rare Class Mining: Progress and Prospect”, in Proc. of ChineseConference on Pattern Recognition (CCPR 2009), 2009, pp. 1-5 [13] D. J. Hand and R. J. Till, “A simple generalization of the area under the ROC curve to multiple classclassification problems”, Machine Learning, 45 (2001) 171-186. [14] M. Hossin, M. N. Sulaiman, A. Mustapha, and N. Mustapha, “A Novel Performance Metric forBuilding an Optimized Classifier”, Journal of Computer Science, 7(4) (2011) 582-509. [15] M. Hossin, M. N. Sulaiman, A. Mustapha, N. Mustapha and R. W. Rahmat, “ OAERP: a BetterMeasure than Accuracy in Discriminating a Better Solution for Stochastic Classification Training”,Journal of Artificial Intelligence, 4(3) (2011) 187-196. [16] M. Hossin, M. N. Sulaiman, A. Mustapha, N. Mustapha and R. W. Rahmat, “A Hybrid EvaluationMetric for Optimizing Classifier”, in Data Mining and Optimization (DMO), 2011 3rd Conference on,2011, pp. 165-170. [17] J. Huang and C. X. Ling, “Using AUC and accuracy in evaluating learning algorithms”, IEEETransactions on Knowledge Data Engineering, 17 (2005) 299-310. [18] J. Huang and C. X. Ling, “Constructing new and better evaluation measures for machine learning”, inR.
  • 7. Sangal, H. Mehta and R. K. Bagga (Eds.) Proc. of the 20th International Joint Conference onArtificial Intelligence (IJCAI 2007), San Francisco, CA, USA: Morgan Kaufmann Publishers Inc.,2007, pp.859-864. [19] N. Japkowicz, “Assessment metrics for imbalanced learning”, in Imbalanced Learning: Foundations,Algorithms, and Applications, Wiley IEEE Press, 2013, pp. 187-210 [20] M. V. Joshi, “On evaluating performance of classifiers for rare classes”, in Proceedings of the 2002IEEE Int. Conference on Data Mining (ICDN 2002) ICDM’02, Washington, D. C., USA: IEEEComputer Society, 2002, pp. 641-644. [21] T. Kohonen, Self-Organizing Maps, 3rd ed., Berlin Heidelberg: Springer-Verlag, 2001. [22] L. I. Kuncheva and J. C. Bezdek, “Nearest Prototype Classification: Clustering, Genetic Algorithms,or Random Search?” IEEE Transactions on Systems, Man, and Cybernetics-Part C: Application andReviews, 28(1) (1998) 160-164. [23] N. Lavesson, and P. Davidsson, “Generic Methods for Multi-Criteria Evaluation”, in Proc. of theSiam Int. Conference on Data Mining, Atlanta, Georgia, USA: SIAM Press, 2008, pp. 541-546. [24] P. Lingras, and C. J. Butz, “Precision and recall in rough support vector machines”, in Proc. of the2007 IEEE Int. Conference on Granular Computing (GRC 2007), Washington, DC, USA: IEEEComputer Society, 2007, pp.654-654. [25] D. J. C. MacKay, Information, Theory, Inference and Learning Algorithms. Cambridge, UK:Cambridge University Press, 2003. [26] T. M. Mitchell, Machine Learning, USA: MacGraw-Hill, 1997. [27] R. Prati, G. Batista, and M. Monard, “A survery on graphical methods for classification predictive performance evaluation”, IEEE Trans. Knowl. Data Eng. 23(2011) 1601-1618. [28] F. Provost, and P. Domingos, “Tree induction for probability-based ranking”. Machine Learning, 52(2003) 199-215. [29] A. Rakotomamonyj, “Optimizing area under ROC with SVMs”, in J. Hernandez-Orallo, C. Ferri, N.Lachiche and P. A. Flach (Eds.) 1st Int. Workshop on ROC Analysis in Artificial Intelligence(ROCAI 2004), Valencia, Spain, 2004, pp. 71-80. [30] R. Ranawana, and V. Palade, “Optimized precision-A new measure for classifier performanceevaluation”, in Proc. of the IEEE World Congress on Evolutionary Computation (CEC 2006), 2006,pp. 2254-2261. [31] S. Rosset, “Model selection via AUC”, in C. E. Brodley (Ed.) Proc. of the 21st Int. Conference onMachine Learning (ICML 2004), New York, NY, USA: ACM, 2004, pp. 89. [32] D.B. Skalak, “Prototype and feature selection by sampling and random mutation hill climbingalgorithm”, in W. W. Cohen and H. Hirsh (Eds.) Proc. of the 11th Int. Conference on MachineLearning (ICML 1994), New Brunswick, NJ, USA: Morgan Kaufmann, 1994, pp.293-301. [33] M. Sokolova and G. Lapame, “A systematic analysis of performance measures for classificationtasks”, Information Processing and Management, 45(2009) 427-437. [34] P. N. Tan, M. Steinbach and V. Kumar, Introduction to Data Mining, Boston, USA: Pearson AddisonWesley, 2006. [35] M. Vuk and T. Curk, “ROC curve, lift chart and calibration plot”, Metodoloˇskizvezki, 3(1) (2006)89- 108. [36] H. Wallach, “Evaluation metrics for hard classifiers”. Technical Report. (Ed.: Wallach, 2006)http://www.inference.phy.cam.ac.uk/hmw26/papers
  • 8. [37] S. W. Wilson, “Mining oblique data with XCS”, in P. L. Lanzi, W. Stolzmann and S. W. Wilson(Eds.) Advances in Learning Classifier Systems: Third Int. Workshop (IWLCS 2000), Berlin,Heidelberg: Springer- Verlag, 2001, pp. 283-290. [38] H. Zhang and G. Sun, “Optimal reference subset selection for nearest neighbor classification by tabusearch”, Pattern Recognition, 35(7) (2002) 1481-1490.
  • 9. A BUSINESS INTELLIGENCE PLATFORM IMPLEMENTED IN A BIG DATA SYSTEM EMBEDDING DATA MINING: A CASE OF STUDY Alessandro Massaro, Valeria Vitti, Palo Lisco, Angelo Galiano and Nicola Savino Dyrecta Lab, IT Research Laboratory, Via VescovoSimplicio, 45, 70014 Conversano (BA), Italy. (in collaboration with ACI Global S.p.A., VialeSarca, 336 - 20126 Milano, Via Stanislao Cannizzaro, 83/a - 00156 Roma, Italy) ABSTRACT In this work is discussed a case study of a business intelligence –BI- platform developed within the framework of an industry project by following research and development – R&D- guidelines of ‘Frascati’. The proposed results are a part of the output of different jointed projects enabling the BI of the industry ACI Global working mainly in roadside assistance services. The main project goal is to upgrade the information system, the knowledge base –KB- and industry processes activating data mining algorithms and big data systems able to provide gain of knowledge. The proposed work concerns the development of the highly performing Cassandra big data system collecting data of two industry location. Data are processed by data mining algorithms in order to formulate a decision making system oriented on call center human resources optimization and on customer service improvement. Correlation Matrix, Decision Tree and Random Forest Decision Tree algorithms have been applied for the testing of the prototype system by finding a good accuracy of the output solutions. The Rapid Miner tool has been adopted for the data processing. The work describes all the system architectures adopted for the design and for the testing phases, providing information about Cassandra performance and showing some results of data mining processes matching with industry BI strategies.. KEYWORDS Big Data Systems, Cassandra Big Data, Data Mining, Correlation Matrix, Decision Tree, Frascati Guideline Full Text :http://aircconline.com/ijdkp/V9N1/9119ijdkp01.pdf .
  • 10. REFERNCES [1] Khan, R. A. &Quadri, S. M. K. (2012) “Business Intelligence: an Integrated Approach”, BusinessIntelligence Journal, Vol.5 No.1, pp 64-70. [2] Chen, H., Chiang, R. H. L. & Storey V. C. (2012) “Business Intelligence and Analytics: from Bi Data to Big Impact”, MIS Quarterly, Vol. 36, No. 4, pp 1165-1188. [3] Andronie, M. (2015) “Airline Applications of Business Intelligence Systems”, Incas Bulletin, Vol. 7,No. 3, pp 153 – 160. [4] Iankoulova, I. (2012) “Business Intelligence for Horizontal Cooperation”, Master Thesis, Univesitiet Twente.[Online].Availablehttps://www.utwente.nl/en/mbit/finalproject/example_excellent_master_thesi/ master_thesis_bit/IankoulovaID.pdf [5] Nunes, A. A., Galvão, T. & Cunha, J. F. (2014) “Urban Public Transport Service Co-creation:Leveraging Passenger’s Knowledge to Enhance Travel Experience”, Procedia Social and BehavioralSciences, Vol. 111, pp 577 – 585. [6] Fitriana, R., Eriyatno, Djatna, T. (2011) “Progress in Business Intelligence System research: Aliterature Review”, International Journal of Basic & Applied Sciences IJBAS-IJENS, Vol. 11, No. 03,pp 96-105. [7] Lia, M. (2015) "Customer Data Analysis Model using Business Intelligence Tools inTelecommunication Companies", Database Systems Journal, Vol. 6, No. 2, pp 39-43. [8] Habul, A., Pilav-Velić, A. &Kremić, E. (2012) “Customer Relationship Management and BusinessIntelligence”, Intech book 2012: Advances in Customer Relationship Management, chapter 2. [9] Kemper,H.-G., Baars, H. &Lasi, H. (2013) “An Integrated Business Intelligence Framework Closingthe Gap Between IT Support for Management and for Production”, Springer: Business Intelligenceand Performance Management Part of the series Advanced Information and Knowledge Processing,pp 13-26, Chapter 2. [10] Bara, A., Botha, I., Diaconiţa, V., Lungu, I., Velicanu, A., Velicanu, M. (2009) “A Model forBusiness Intelligence Systems’ Development”, InformaticaEconomică, Vol. 13, No. 4, pp 99-108. [11] Negash, S. (2004) “Business Intelligence”, Communications of the Association for InformationSystems, Vol. 13, pp 177-195. [12] Nofal, M. I. &Yusof, Z. M. (2013) “Integration of Business Intelligence and Enterprise ResourcePlanning within Organizations”, Procedia Technology, Vol. 11 ( 2013 ), pp. 658 – 665. [13] Williams, S. & Williams, N. (2003) “The Business Value of Business Intelligence”, BusinessIntelligence Journal, FALL 2003, pp 1-11. [14] Lečić, D. &Kupusinac, A. (2013) “The Impact of ERP Systems on Business Decision-Making”,TEM Journal, Vol. 2, No. 4, pp 323-326. [15] Ong, L., Siew, P. H. & Wong, S. F. (2011) “A Five-Layered Business Intelligence Architecture”,IBIMA Publishing, Communications of the IBIMA,Vol. 2011, Article ID 695619, pp 1-11. [16] Raymond T. Ng, Arocena, P. C., Barbosa, D., Carenini, G., Gomes, L., Jou, S., Leung, R. A., Milios,E., Miller, R. J., Mylopoulos, J., Pottinger, R. A., Tompa, F. & Yu, E. (2013) “Perspectives onBusiness Intelligence”, A Publication in the Morgan & Claypool Publishers series Synthesis Lectureson Data Management. [17] “NTT DATA Connected Car Report: A brief insight on the connected car market, showingpossibilities and challenges for third-party service providers by means of an application case study” [Online].Available:https://emea.nttdata.com/fileadmin/web_data/country/de/documents/Manufacturing/S tudien/2015_Connected_Car_Report_NTT_DATA_ENG.pdf
  • 11. [18] “Cognizant report: Exploring the Connected Car Cognizant 20-20” [Online]. Available:https://www.cognizant.com/InsightsWhitepapers/Exploring-the-Connected-Car.pdf International Journal of Data Mining & Knowledge Management Process (IJDKP) Vol.9, No.1, January 201919 [19] Sarangi, P. K., Bano, S., Pant, M. (2014) “Future Trend in Indian Automobile Industry: A StatisticalApproach”, Journal of Management Sciences And Technology, Vol. 2, No. 1, pp. 28-32. [20] Bates, H. &Holweg, M. (2007) “Motor Vehicle Recalls: Trends, Patterns and Emerging Issues”,Omega, Vol. 35, No. 2, pp 202–210. [21] D’Aloia, M., Russo, M. R., Cice G., Montingelli, A., Frulli, G., Frulli, E., Mancini, F., Rizzi, M.,Longo, A. (2017) “Big Data Performance and Comparison with Different DB Systems”, InternationalJournal of Computer Science and Information Technologies, Vol. 8, No. 1, pp 59-63. [22] Wimmer, H. & Powell, L. M. (2015) “A Comparison of Open Source Tools for Data Science”,Proceedings of the Conference on Information Systems Applied Research, Wilmington, NorthCarolina USA. [23] Al-Khoder, A. &Harmouch, H. (2014) “Evaluating four of the most popular Open Source and FreeData Mining Tools,” IJASR International Journal of Academic Scientific Research, Vol. 3, No. 1, pp13-23. [24] Antonio Gulli, Sujit Pal, “Deep Learning with Keras- Implement neural networks with Keras onTheano and TensorFlow,” BIRMINGHAM – MUMBAI Packt book, ISBN 978-1-78712-842-2, 2017. [25] Kovalev V., Kalinovsky A., and Kovalev S. Deep Learning with Theano, Torch, Caffe, TensorFlow,and deeplearning4j: which one is the best in speed and accuracy? In: XIII Int. Conf. on PatternRecognition and Information Processing, 3-5 October, Minsk, Belarus State University, 2016, pp. 99-103. [26] Frascati Manual 2015: The Measurement of Scientific, Technological and Innovation ActivitieGuidelines for Collecting and Reporting Data on Research and Experimental Development. OECD(2015), ISBN 978-926423901-2 (PDF). [27] Massaro, A. Maritati, V., Galiano, A., Birardi, V. &Pellicani, L. (2018) “ESB Platform IntegratinKNIME Data Mining Tool oriented on Industry 4.0 Based on Artificial Neural Network PredictiveMaintenance”, International Journal of Artificial Intelligence and Applications (IJAIA), Vol.9,No.3, [28] Massaro, A., Calicchio, A., Maritati, V., Galiano, A., Birardi, V., Pellicani, L., Gutierrez Millan, M.,DallaTezza, B., Bianchi, M., Vertua, G., Puggioni, A. (2018) “A Case Study of Innovation of anInformation Communication system and Upgrade of the Knowledge Base in Industry by ESB,Artificial Intelligence, and Big Data System Integration”, International Journal of ArtificialIntelligence and Applications (IJAIA), Vol. 9, No.5, pp. 27-43. [29] “WSO2” [Online]. Available: https://wso2.com/products/enterprise-service-bus/ [30] “Ubuntu” [Online]. Available: https://www.ubuntu.com/ [31] “Apache Cassandra” [Online]. Available: http://cassandra.apache.org/ [32]“DataStaxEnterpriseOpsCenter”[Online].Available: https://www.datastax.com/products/datastaxopscenter [33] “About DataStaxDevCenter” [Online]. Available: https://docs.datastax.com/en/developer/devcenter/doc/devcenter/dcAbout.html [34] “Knowi” [Online]. Available: https://www.knowi.com/ [35] “JFreeChart” [Online]. Available: http://www.jfree.org/jfreechart/samples.html
  • 12. [36] “PuTTY” [Online]. Available: https://www.chiark.greenend.org.uk/~sgtatham/putty/latest.html [37] “Lightning Fast Data Science for Teams” [Online]. Available: https://rapidminer.com/ [38] Massaro, A., Meuli, G. &Galiano, A. (2018) “Intelligent Electrical Multi Outlets Controlled an Activated by a Data Mining Engine Oriented to Building Electrical Management”, InternationalJournal on Soft Computing, Artificial Intelligence and Applications (IJSCAI), Vol.7, No.4, pp 1-20. [39] Myers, J. L. & Well, A. D. (2003) “Research Design and Statistical Analysis”, (2nd ed.) LawrenceErlbaum. . AUTHOR Alessandro Massaro: Research & Development Chief of Dyrecta Lab s.r.l.
  • 13. INCREASED PREDICTION ACCURACY IN THE GAME OF CRICKETUSING MACHINE LEARNING Kalpdrum Passi and Niravkumar Pandey Department of Mathematics and Computer Science Laurentian University, Sudbury, Canada ABSTRACT Player selection is one the most important tasks for any sport and cricket is no exception. The performance of the players depends on various factors such as the opposition team, the venue, his current form etc. The team management, the coach and the captain select 11 players for each match from a squad of 15 to 20 players. They analyze different characteristics and the statistics of the players to select the best playing 11 for each match. Each batsman contributes by scoring maximum runs possible and each bowler contributes by taking maximum wickets and conceding minimum runs. This paper attempts to predict the performance of players as how many runs will each batsman score and how many wickets will each bowler take for both the teams. Both the problems are targeted as classification problems where number of runs and number ofwickets are classified in different ranges. We used naïve bayes, random forest, multiclass SVM and decision tree classifiers to generate the prediction models for both the problems. Random Forest classifier wasfound to be the most accurate for both the problems. KEYWORDS Naïve Bayes, Random Forest, Multiclass SVM, Decision Trees, Cricket Full Text : http://aircconline.com/ijdkp/V8N2/8218ijdkp03.pdf .
  • 14. REFERNCES [1] S. Muthuswamy and S. S. Lam, "Bowler Performance Prediction for One-day International Cricket Using Neural Networks," in Industrial Engineering Research Conference, 2008. [2] I. P. Wickramasinghe, "Predicting the performance of batsmen in test cricket," Journal of Human Sport & Excercise, vol. 9, no. 4, pp. 744-751, May 2014. [3] G. D. I. Barr and B. S. Kantor, "A Criterion for Comparing and Selecting Batsmen in Limited Overs Cricket," Operational Research Society, vol. 55, no. 12, pp. 1266-1274, December 2004. [4] S. R. Iyer and R. Sharda, "Prediction of athletes performance using neural networks: An application in cricket team selection," Expert Systems with Applications, vol. 36, pp. 5510-5522, April 2009. [5] M. G. Jhanwar and V. Pudi, "Predicting the Outcome of ODI Cricket Matches: A Team Composition Based Approach," in European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECMLPKDD 2016 2016), 2016. [6] H. H. Lemmer, "The combined bowling rate as a measure of bowling performance in cricket," South African Journal for Research in Sport, Physical Education and Recreation, vol. 24, no. 2, pp. 37-44, January 2002. [7] D. Bhattacharjee and D. G. Pahinkar, "Analysis of Performance of Bowlers using Combined Bowling Rate," International Journal of Sports Science and Engineering, vol. 6, no. 3, pp. 1750-9823, 2012. [8] S. Mukherjee, "Quantifying individual performance in Cricket - A network analysis of batsmen and bowlers," Physica A: Statistical Mechanics and its Applications, vol. 393, pp. 624-637, 2014. [9] P. Shah, "New performance measure in Cricket," ISOR Journal of Sports and Physical Education, vol. 4, no. 3, pp. 28-30, 2017. [10] D. Parker, P. Burns and H. Natarajan, "Player valuations in the Indian Premier League," Frontier Economics, vol. 116, October 2008. [11] C. D. Prakash, C. Patvardhan and C. V. Lakshmi, "Data Analytics based Deep Mayo Predictor for IPL9," International Journal of Computer Applications, vol. 152, no. 6, pp. 6-10, October 2016. [12] M. Ovens and B. Bukiet, "A Mathematical Modelling Approach to One-Day Cricket Batting Orders,"Journal of Sports Science and Medicine, vol. 5, pp. 495-502, 15 December 2006. [13] R. P. Schumaker, O. K. Solieman and H. Chen, "Predictive Modeling for Sports and Gaming," in Sports Data Mining, vol. 26, Boston, Massachusetts: Springer, 2010. [14] M. Haghighat, H. Ratsegari and N. Nourafza, "A Review of Data Mining Techniques for Result Prediction in Sports," Advances in Computer Science : an International Journal, vol. 2, no. 5, pp. 7- 12,November 2013. [15] J. Hucaljuk and A. Rakipovik, "Predicting football scores using machine learning techniques," in International Convention MIPRO, Opatija, 2011. [16] J. McCullagh, "Data Mining in Sport: A Neural Network Approach," International Journal of Sports Science and Engineering, vol. 4, no. 3, pp. 131-138, 2012. [17] "Free web scraping - Download the most powerful web scraper | ParseHub," parsehub, [Online]. Available: https://www.parsehub.com. [18] "Import.io | Extract data from the web," Import.io, [Online]. Available: https://www.import.io. [19] T. L. Saaty, The Analytic Hierarchy Process, New York: McGrow Hill, 1980.
  • 15. [20] T. L. Saaty, "A scaling method for priorities in a hierarchichal structure," Mathematical Psychology, vol. 15, 1977. [21] N. V. Chavla, K. W. Bowyer, L. O. Hall and P. W. Kegelmeyer, "SMOTE: Synthetic Minority Oversampling Technique," Journal of Artificial Intelligence Research, vol. 16, pp. 321-357, June 2002. [22] J. Han, M. Kamber and J. Pei, Data Mining: Concepts and Techniques, 3rd Edition ed., Waltham: Elsevier, 2012. [23] J. R. Quinlan, "Induction of Decision Trees," Machine learning, vol. 1, no. 1, pp. 81-106, 1986. [24] J. R. Quinlan, C4.5: Programs for Machine Learning, Elsevier, 2015. [25] L. Breiman, "Random Forests," Machine Learning, vol. 45, no. 1, pp. 5-32, 2001. [26] T. K. Ho, "The Random Subspace Method for Constructing Decision Forests," IEEE transactions on pattern analysis and machine intelligence, vol. 20, no. 8, pp. 832-844, August 1998. [27] L. Breiman, J. Friedman, C. J. Stone and R. A. Olshen, Classification and regression trees, CRC Press,1984. [28] B. E. Boser, I. M. Guyon and V. N. Vapnik, "A Training Algorithm for Optimal Margin Classifiers," in Fifth Annual Workshop on Computational Learning Theory, Pittsburgh, 1992. [29] C.-C. Chang and C.-J. Lin, "LIBSVM: A library for support vector machines," ACM Transactions on Intelligent Systems and Technology, vol. 2, no. 3, April 2011. [30] T. L. Saaty, "A scaling method for priorities in a hierarchichal structure," Mathematical Psychology, vol. 15, pp. 234-281, 1977. [31] T. L. Saaty, The Analytical Hierarchy Process, New York: McGraw-Hill, 1980. [32] N. V. Chavla, K. W. Bowyer, L. O. Hall and W. P. Kegelmeyer, "SMOTE: Synthetic Minority Oversampling Technique," Artificial Intelligence Research, vol. 16, p. 321–357, June 2002. AUTHORS Kalpdrum Passi received his Ph.D. in Parallel Numerical Algorithms from Indian Institute of Technology, Delhi, India in 1993. He is an Associate Professor, Department of Mathematics & Co mputer Science, at Laurentian University, Ontario, Canada. He has published many papers on Parallel Numerical Algorithms in international journals and conferences. He has collaborative work with faculty in Canada and US and the work was tested on the CRAY XMP’s and CRAY YMP’s. He transitioned his research to web technology, and more recently has been involved in machine learning and data mining applications in bioinformatics, social media and other data science areas. He obtained funding from NSERC and Laurentian University for his research. He is a member of the ACM and IEEE Computer Society. Niravkumar Pandey is pursuing M.Sc. in Computational Science at Laurentian University, Ontario, Canada. He received his Bachelor of Engineering degree from Gujarat Technological University, Gujarat, India. Data mining and machine learning are his primary areas of interest. He is also a cricket enthusiast and is studying applicationsof machine learning and data mining in cricket analytics for his M.Sc. thesis
  • 16. Data Mining and Its Applications for Knowledge Management : A Literature Review from 2007 to 2012 Tipawan Silwattananusarn1and Assoc.Prof.Dr. KulthidaTuamsuk2 1Ph.D. Student in Information Studies Program, KhonKaen University, Thailand 2Head, Information & Communication Management Program, KhonKaen University, Thailand ABSTRACT Data mining is one of the most important steps of the knowledge discovery in databases process and is considered as significant subfield in knowledge management. Research in data mining continues growing in business and in learning organization over coming decades. This review paper explores the applications of data mining techniques which have been developed to support knowledge management process. The journal articles indexed in ScienceDirect Database from 2007 to 2012 are analyzed and classified. The discussion on the findings is divided into 4 topics: (i) knowledge resource; (ii) knowledge types and/or knowledge datasets; (iii) data mining tasks; and (iv) data mining techniques and applications used in knowledge management. The article first briefly describes the definition of data mining and data mining functionality. Then the knowledge management rationale and major knowledge management tools integrated in knowledge management cycle are described. Finally, the applications of data mining techniques in the process of knowledge management are summarized and discussed.. KEYWORDS Data mining; Data mining applications; Knowledge management. For More Details : http://aircconline.com/ijdkp/V2N5/2512ijdkp02.pdf Volume Link :http://airccse.org/journal/ijdkp/vol2.htmll
  • 17. REFERENCES [1] An, X. & Wang, W. (2010). Knowledge management technologies and applications: A literaturereview. IEEE, 138-141. doi:10.1109/ICAMS.2010.5553046 [2] Berson, A., Smith, S.J. &Thearling, K. (1999). Building Data Mining Applications for CRM. New York: McGraw-Hill. [3] Cantú, F.J. &Ceballos, H.G. (2010). A multiagent knowledge and information network approach for managing research assets. Expert Systems with Applications, 37(7), 5272- 5284.doi:10.1016/j.eswa.2010.01.012 [4] Cheng, H., Lu, Y. &Sheu, C. (2009). An ontology-based business intelligence application in a financial knowledge management system.Expert Systems with Applications, 36, 3614– 3622.Doi:10.1016/j.eswa.2008.02.047 [5] Dalkir, K. (2005). Knowledge Management in Theory and Practice. Boston: Butterworth-Heinemann. [6] Dawei, J. (2011). The Application of Date Mining in Knowledge Management.2011 International Conference on Management of e-Commerce and e-Government, IEEE Computer Society, 7-9.doi: 10.1109/ICMeCG.2011.58 [7] Fayyad, U., Piatetsky-Shapiro, G. & Smyth, P. (1996). From Data Mining to Knowledge Discovery in Databases.AI Magazine, 17(3), 37-54. [8] Gorunescu, F. (2011). Data Mining: Concepts, Models, and Techniques. India: Springer. [9] Han, J. &Kamber, M. (2012). Data Mining: Concepts and Techniques. 3rd.ed. Boston: Morgan Kaufmann Publishers. [10] Hwang, H.G., Chang, I.C., Chen, F.J. & Wu, S.Y. (2008). Investigation of the application of KMS for diseases classifications: A study in a Taiwanese hospital. Expert Systems with Applications, 34(1), 725-733. doi:10.1016/j.eswa.2006.10.018 [11] Lavrac, N., Bohanec, M., Pur, A., Cestnik, B., Debeljak, M. &Kobler, A. (2007).Data mining and visualization for decision support and modeling of public health-care resources.Journal of Biomedical Informatics, 40, 438-447. doi:10.1016/j.jbi.2006.10.003 [12] Li, X., Zhu, Z. & Pan, X. (2010). Knowledge cultivating for intelligent decision making in small &middle businesses.Procedia Computer Science, 1(1), 2479-2488. doi:10.1016/j.procs.2010.04.280 [13] Li, Y., Kramer, M.R., Beulens, A.J.M., Van Der Vorst, J.G.A.J. (2010). A framework for earlywarning and proactive control systems in food supply chain networks. Computers in Industry, 61, 852–862. Doi:101.016/j.compind.2010.07.010 [14] Liao, S.H., Chen, C.M., Wu, C.H. (2008). Mining customer knowledge for product line and brand extension in retailing.Expert Systems with Applications, 34(3), 1763- 1776.doi:10.1016/j.eswa.2007.01.036 [15] Liao, S. (2003). Knowledge management technologies and applications-literature review from 1995 to 2002. Expert Systems with Applications, 25, 155-164. doi:10.1016/S0957-4174(03)00043-5 [16] Liu, D.R. & Lai, C.H. (2011). Mining group-based knowledge flows for sharing task knowledge. Decision Support Systems,50(2), 370-386. doi:10.1016/j.dss.2010.09.004 [17] Lee, M.R. & Chen, T.T. (2011). Revealing research themes and trends in knowledge management: From 1995 to 2010. Knowledge-Based Systems.doi:10.1016/j.knosys.2011.11.016 [18] McInerney, C.R. & Koenig, M.E. (2011). Knowledge Management (KM) Processes in Organizations:
  • 18. Theoretical Foundations and Practice. USA: Morgan & Claypool Publishers. doi:10.2200/S00323ED1V01Y201012ICR018 [19] McInerney, C. (2002). Knowledge Management and the Dynamic Nature of Knowledge.Journal of the American Society for Information Science and Technology, 53(12), 1009-1018. doi:10.1002/asi.10109 [20] Ngai, E., Xiu, L. &Chau, D. (2009). Application of data mining techniques in customer relationship management: A literature review and classification. Expert Systems with Applications, 36, 2592- 2602. doi:10.1016/j.eswa.2008.02.021 [21] Ruggles, R.L. (ed.). (1997). Knowledge Management Tools. Boston: Butterworth-Heinemann. [22] Sher, P.J. & Lee, V.C. (2004). Information technology as a facilitator for enhancing dynamic capabilities through knowledge management.Information& Management, 41, 933- 945.doi:10.1016/j.im.2003.06.004 [23] Tseng, S.M. (2008). The effects of information technology on knowledge management systems.Expert Systems with Applications, 35, 150-160. doi:10.1016/j.eswa.2007.06.011 [24] Ur-Rahman, N. & Harding, J.A. (2012). Textual data mining for industrial knowledge management and text classification: A business oriented approach. Expert Systems with Applications, 39, 4729- 4739. doi:10.1016/j.eswa.2011.09.124 [25] Wang, F. & Fan, H. (2008). Investigation on Technology Systems for Knowledge Management.IEEE, 1-4. doi:10.1109/WiCom.2008.2716 [26] Wang, H. & Wang, S. (2008). A knowledge management approach to data mining process for business intelligence. Industrial Management & Data Systems, 108(5), 622-634. [27] Wu, W., Lee, Y.T., Tseng, M.L. & Chiang, Y.H. (2010). Data mining for exploring hidden patterns between KM and its performance.Knowledge-Based Systems, 23, 397- 401.doi:10.1016/j.knosys.2010.01.014[15] C.-C.Chang, T. D. Kieu, and Y.-C.Chou, "A High Payload Steganographic Scheme Based on (7, 4) Hamming Code for Digital Images," Proc. of the 2008 International Symposium on Electronic Commerce and Security, pp.16-21, August 2008.
  • 19. APPLICATION OF SPATIOTEMPORAL ASSOCIATION RULES ON SOLAR DATA TO SUPPORT SPACE WEATHER FORECASTING Carlos Roberto Silveira Junior1 , José Roberto Cecatto2 , Marilde Terezinha Prado Santos1 and Marcela Xavier Ribeiro1 1 Department of Computing, Federal University of São Carlos, São Carlos, Brazil 2 National Institute of Space Research, São José dos Campos, Brazil ABSTRACT It is well known that solar energetic phenomena influence the Space Weather, in special those directed to the Earth environment. In this context, the analysis of Solar Data is a challenging task, particularly when are composed of Satellite Image Time Series (SITS). It is a multidisciplinary domain that generates a massive amount of data (several Gigabytes per year). It includes image processing, spatiotemporal characteristics, and the processing of semantic data. Aiming to enhance the SITS analysis, we propose an algorithm called "Miner of Thematic Spatiotemporal Associations for Images" (MiTSAI), which is anb extractor of Thematic Spatiotemporal Association Rules (TSARs) from Solar SITS. Here, a description is given about the details of the modern algorithm MiTSAI, which is an extractor of Thematic Spatiotemporal Association Rules (TSARs) from solar Satellite Image Time Series (SITS). In addition, its adaptation to the Space Weather and discussion about the specific use in favor of forecasting activities are presented. Finally, some results of its application specifically to solar flare forecasting are also presented. MiTSAI has to extract interesting new patterns compared with the art-state algorithms. KEYWORDS Data Mining, Drug Discovery, Drug Description, Chemoinformatics, and Web Application For More Details : https://aircconline.com/ijdkp/V10N2/10220ijdkp01.pdf Volume Link : http://airccse.org/journal/ijdkp/vol10.html
  • 20. REFERENCES [[1] T Abirami-Kongu, P Thangaraj, and P Priakanth-Kongu. Wireless sensor networks fault identification using data association. Journal of Computer Science, 8(9):1501–1505, 2012. [2] Sultan Alamri, David Taniar, and Maytham Safar. A taxonomy for moving object queries in spatialdatabases. Future Generation Computer Systems, 37:232 – 242, 2014. ISSN 0167-739X. doi: https://doi.org/10.1016/j.future.2014.02.007. [3] Shadi A. Aljawarneh, Radhakrishna Vangipuram, Veereswara Kumar Puligadda, and Janaki Vinjamuri. G-spamine: An approach to discover temporal association patterns and trends in internet of things. Future Generation Computer Systems, 2017. ISSN 0167-739X. doi: https://doi.org/10.1016/ j.future.2017.01.013. [4] Herbert Bay, Tinne Tuytelaars, and Luc Van Gool. Surf: Speeded up robust features. pages 404–417, 2006. [5] Monica G Bobra and Sebastien Couvidat. Solar flare prediction using sdo/hmi vector magnetic field data with a machine-learning algorithm. The Astrophysical Journal, 798(2):135–172, 2015. [6] I. Burbey and T.L. Martin. A survey on predicting personal mobility. International Journal of Pervasive Computing and Communications, 8(1):5 – 22, 2012. [7] M. Chen, S. Mao, and Y. Liu. Big data: A survey. Mobile Networks and Applications, 19(2):171– 209,2014. [8] Paolo Compieta, Sergio Di Martino, Michela Bertolotto, Filomena Ferrucci, and T Kechadi. Exploratory spatio-temporal data mining and visualization. Journal of Visual Languages & Computing, 18(3):255–279, 2007. [9] G Fang and Y Wu. Frequent spatiotemporal association patterns mining based on granular computing. Informatica (Slovenia), 37(4):443–453, 2013. [10] A. Hana, Y.T. Sami, and F. Sami. Mining spatiotemporal associations using queries. 2012 International Conference on Information Technology and e-Services, ICITeS 2012, 2012 [11] R. M. Haralick, K. Shanmugam, and I. Dinstein. Textural features for image classification. IEEE Transactions on Systems, Man, and Cybernetics, SMC-3(6):610–621, Nov 1973. ISSN 0018-9472. doi: 10.1109/TSMC.1973.4309314. [12] J. Huo, J. Zhang, and X. Meng. On co-occurrence pattern discovery from spatio-temporal event stream. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 8181 LNCS(PART 2):385–395, 2013. [13] J. Kawale, S. Liess, V. Kumar, U. Lall, and A. Ganguly. Mining time-lagged relationships in spatiotemporal climate data. pages 130–135, 2012. [14] Xiangjie Kong, Zhenzhen Xu, Guojiang Shen, Jinzhong Wang, Qiuyuan Yang, and Benshi Zhang. Urban traffic congestion estimation and prediction based on floating car trajectory data. Future Generation Computer Systems, 61:97 – 107, 2016. ISSN 0167-739X. doi: https://doi.org/10.1016/j.future. 2015.11.013. [15] T.C.W. Landgrebe, A. Merdith, A. Dutkiewicz, and R.D. Mafaler. Relationships between palaeogeography and opal occurrence in australia: A data-mining approach. Computers and Geosciences, 56: 76–82, 2013. [16] Angela Lausch, Andreas Schmidt, and Lutz Tischendorf. Data mining and linked open data: New perspectives for data analysis in environmental research. Ecological Modelling, 295(0):5 – 17, 2015. ISSN 0304-3800. doi: http://dx.doi.org/10.1016/j.ecolmodel.2014.09.018.
  • 21. [17] A. Madraky, Z.A. Othman, and A.R. Hamdan. Analytic methods for spatio-temporal data in a natureinspired data model. International Review on Computers and Software, 9(3):547–556, 2014. [18] A.a Mohan and P.Z.b Revesz. Applications of spatio-temporal data mining to north platter river reservoirs. ACM International Conference Proceeding Series, pages 306–309, 2014. [19] NOAA. www.solarmonitor.org, April 2016. Last Access April 13, 2016. [20] K.G. Pillai, R.A. Angryk, J.M. Banda, M.A. Schuh, and T. Wylie. Spatio-temporal co-occurrence pattern mining in data sets with evolving regions. Proceedings - 12th IEEE International Conference on Data Mining Workshops, ICDMW 2012, pages 805–812, 2012. [21] K.G.a Pillai, R.A.b Angryk, and B.b Aydin. A filter-and-refine approach to mine spatiotemporal cooccurrences. pages 104–113, 2013. [22] Vangipuram Radhakrishna, Shadi A. Aljawarneh, P.V. Kumar, and V. Janaki. A novel fuzzy similarity measure and prevalence estimation approach for similarity profiled temporal association pattern mining. Future Generation Computer Systems, 2017. ISSN 0167-739X. doi: https://doi.org/10.1016/j. future.2017.03.016. [23] K Venkateswara Rao, A Govardhan, and KV Chalapati Rao. Spatiotemporal data mining: Issues, tasks and applications. International Journal of Computer Science & Engineering Survey (IJCSES) Vol, 3:39–52, 2012. [24] Md. Mamunur Rashid, Iqbal Gondal, and Joarder Kamruzzaman. A technique for parallel sharefrequent sensor pattern mining from wireless sensor networks. Procedia Computer Science, 29(0):124 – 133, 2014. ISSN 1877-0509. doi: http://dx.doi.org/10.1016/j.procs.2014.05.012. 2014 International Conference on Computational Science. [25] Md.M. Rashid, I. Gondal, and J. Kamruzzaman. Mining associated sensor patterns for data stream of wireless sensor networks. pages 91–98, 2013. [26] Marcela Xavier Ribeiro, Agma J. M. Traina, and Caetano Traina, Jr. A new algorithm for data discretization and feature selection. In Proceedings of the 2008 ACM symposium on Applied computing, SAC ’08, pages 953–954, New York, NY, USA, 2008. ACM. ISBN 978-1-59593-753-7. doi: 10.1145/1363686.1363905. [27] James A. Rodger. Toward reducing failure risk in an integrated vehicle health maintenance system: A fuzzy multi-sensor data fusion kalman filter approach for ivhms. Expert Systems with Applications, 39(10):9821 – 9836, 2012. ISSN 0957-4174. doi: https://doi.org/10.1016/j.eswa.2012.02.171. [28] W. Sammouri, E. Cafame, L. Oukhellou, and P. Aknin. Mining floating train data sequences for temporal association rules within a predictive maintenance framework. Lecture Notes in Computer subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 7987 LNAI:112–126, 2013. [29] T. Scheffer. Finding association rules that trade support optimally against confidence. pages 9: 381– 395, 1995. [30] Carlos Roberto Silveira-Junior. github.com/carlossilveirajr/mitsai, April 2017. Last Access April 17, 2017. [31] Carlos Roberto Silveira-Junior, Danilo Codeco Carvalho, Marilde Terezinha Prado Santos, and Marcela Xavier Ribeiro. Incremental mining of frequent sequences in environmental sensor data. In The Twenty-Sixth International FLAIRS Conference (2015), pages 452–455, . [32] Carlos Roberto Silveira-Junior, Marilde Prado Santos, and Marcela Ribeiro. Stretchy time pattern mining: A deeper analysis of environment sensor data. pages 468–473, . [33] Carlos Roberto Silveira-Junior, Marcela Xavier Ribeiro, and Marilde Terezinha Prado Santos. A flexible architecture to integrate the solar satellite image time series data - the setl architecture. Pages 1–14, 2017. No publish by the time of this paper preparation. [34] Olga Spatenkovaa and Kirsi Virrantausb. Discovering spatio-temporal relationships in the distribution
  • 22. of building fires. Fire Safety Journal, 62, Part A:49 – 63, 2013. ISSN 0379-7112. doi: http://dx. doi.org/10.1016/j.firesaf.2013.07.001. Special Issue on Spatial Analytical Approaches in Urban Fire Management. [35] Fenzhen Su, Chenghu Zhou, and Wenzhoung Shi. Geoevent association rule discovery model based on rough set with marine fishery application. In Geoscience and Remote Sensing Symposium, 2004. IGARSS ’04. Proceedings. 2004 IEEE International, volume 2, pages 1455–1458 vol.2, Sept 2004. doi: 10.1109/IGARSS.2004.1368694. [36] Edi Winarko and John F. Roddick. Armada: An algorithm for discovering richer relative temporal association rules from interval-based data. Data & Knowledge Engineering, 63(1):76 – 90, 2007 ISSN 0169-023X. doi: http://dx.doi.org/10.1016/j.datak.2006.10.009. Data Warehouse an Knowledge Discovery, 7th International Congress on Data Warehouse and Knowledge Discovery. [37] J.S. Yoo and M. Bow. Mining spatial colocation patterns: A different framework. Data Mining and Knowledge Discovery, 24(1):159–194, 2012. [38] B. Zaragoza, A. Rabasa, J.J. Rodriguez-Sala, J.T. Navarro, A. Belda, and A. Ramon. Modelling farmland abandonment: A study combining gis and data mining techniques. volume 155, pages 124 – 132, 2012. doi: http://dx.doi.org/10.1016/j.agee.2012.03.019. AUTHORS Carlos Roberto Silveira Junior is a System Architect at Ericsson expert in software development for telecom systems. He is a Ph.D. Student at Federal University of São Carlos (UFSCar) working with Data Mining -Artificial Intelligence- with images processing and parallelism. Holds a Master Degree in Data Mining and ontology at UFSCar, graduated Computer Science at UFSCar. Main areas of interest: data mining, programming, and distributed system. José Roberto Cecatto holds a BS in Physics from the Pontifical Catholic University of São Paulo, a Master’s and a PhD in Astrophysics from the National Institute for Space Research (INPE). He is currently a researcher at INPE. He has experience in Astronomy, with an emphasis on Radio Astronomy and instrumental development, acting mainly on the following themes: solar, radio, spectroscopy and interferometry. Currently, working in Space Climate with activities and development of tools related to the forecast of solar phenomena. Marilde Terezinha Prado Santos is an associate professor at the Center for Exact Sciences and Technology / Department of Computing at the Federal University of São Carlos. PhD in Science with emphasis in Computational Physics from the University of São Paulo (2000). Master in Computer Science from the Federal University of São Carlos (1994) and Bachelor in Computer Science from the Pontifical Catholic University of Rio Grande do Sul (1991). Main areas of interest: engineering and applications of crisp and fuzzy ontologies, information retrieval, data mining, data integration and semantic web.