Scope & Topics
Data mining and knowledge discovery in databases have been attracting a significant amount of research, industry, and media attention of late. There is an urgent need for a new generation of computational theories and tools to assist researchers in extracting useful information from the rapidly growing volumes of digital data.
Spring gala 2024 photo slideshow - Celebrating School-Community Partnerships
April 2020 top read articles in data mining & knowledge management process research articles
1. April 2020: Top Read Articles In
Data Mining & Knowledge
Management Process Research
Articles
International Journal of Data Mining & Knowledge
Management Process (IJDKP)
ISSN : 2230 - 9608[Online] ; 2231 - 007X [Print]
http://airccse.org/journal/ijdkp/ijdkp.html
2. A WEB REPOSITORY SYSTEM FOR DATA MINING IN
DRUG DISCOVERY
Jiali Tang, Jack Wang and Ahmad Reza Hadaegh
Department of Computer Science and Information System, California State University
San Marcos, San Marcos, USA
ABSTRACT
This project is to produce a repository database system of drugs, drug features
(properties), and drug targets where data can be mined and analyzed. Drug targets are
different proteins that drugs try to bind to stop the activities of the protein. Users can
utilize the database to mine useful data to predict the specific chemical properties that will
have the relative efficacy of a specific target and the coefficient for each chemical
property. This database system can be equipped with different data mining
approaches/algorithms such as linear, non-linear, and classification types of data
modelling. The data models have enhanced with the Genetic Evolution (GE) algorithms.
This paper discusses implementation with the linear data models such as Multiple Linear
Regression (MLR), Partial Least Square Regression (PLSR), and Support Vector Machine
(SVM).
.
KEYWORDS
Data Mining, Drug Discovery, Drug Description, Chemoinformatics, and Web
Application
For More Details : https://aircconline.com/ijdkp/V10N1/10120ijdkp01.pdf
Volume Link : http://airccse.org/journal/ijdkp/vol10.html
3. REFERENCES
[1] Ko, Gene, Reddy, Srinivas, Garg, Rajni, Kumar, Sunil, & Hadaegh, Ahmad, (2012) “Computational
Modelling Methods for QSAR Studies on HIV-1 Integrase Inhibitors (2005-2010),”. Curr Comput
Aided Drug Des. Vol. 8, No 4, pp 255-270.
[2] Thakor, Falguni, Hadaegh, Ahmad, & Zhang, Xiaoyu, (2017), ” Comparative study of Differential
Evolutionary-Binary Particle Swarm Optimization (DE-BPSO) algorithm as a feature selection
technique with different linear regression models for analysis of HIV-1 Integrase Inhibition features of
Aryl β-Diketo Acids”, Proceedings of 9th International Conference on Bioinformatics and
Computational Biology, Honolulu, Hawaii, USA, ISBN: 978–1–943436–07–1, pp 179-184.
[3] Kane Ian, & Hadaegh Ahmad, “Non-linear Quantitative Structure-Activity Relationship (QSAR)
Models for the Prediction of HIV Drug Performance”, (2015), 24th International Conference on
Software Engineering and Data Engineering, pp 63-68. Vol 1, ISBN: 9781510812277, San Diego, CA.
[4] Galvan Richard, Kashani, Maninatalsadat, & Hadaegh, Ahmad, “Improving Pharmacological Research
of HIV-1 Integrase Inhibition Using Differential Evolution-Binary Particle Swarm Optimization and
Non-Linear Adaptive Boosting Random Forest Regression”,(2015), IEEE International Workshop on
Data Integration and Mining San Francisco, Information Reuse and Integration (IRI), IEEE
International Conference, pp 485-490, DOI: 10.1109/IRI.2015.80. INSPEC Accession Number:
15556631. San Francisco, CA.
[5] Kashani, Maninatalsadat, Galvan Richard, & Hadaegh Ahmad, “Improving the Feature Selection for
the Development of Linear Model for Discovery of HIV-1 Integrase Inhibitors”, (2015) ABDA'15
International Conference on Advances in Big Data Analytics. In Proceeding of the 2015 International
Conferences on Advances on Big Data Analyses, pp 150-154. ISBN: 1-60132-411-1, Las Vegas,
Nevada.
[6] Ko, Gene, Garg, Rajni, Kumar, Sunil, Kumar, Bailey, Barbara, & Hadaegh Ahmad, “A Hybridized
Evolutionary Algorithm for Feature Selection of Chemical Descriptors for Computational QSAR
Modeling of HIV-1 Integrase Inhibitors”, (2013), Computational Science Curriculum Development
Forum and Applied Computational Science and Engineering Student Support for Industry, San Diego
State University.
[7] Ko, Gene, Garg, Rajni, Kumar, Sunil, Bailey, Barbara, & Hadaegh Ahmad, “Differential Evolution-
Binary Particle Swarm Optimization for the Analysis of Aryl b-Kiketo Acids for HIV-1 Integres
Inhibition, (2012), WCCI 2012 IEEE World Congress on Computational Intelligence. Brisbane
Australia, pp 1849-1855.
[8] Ko, Gene, Reddy, Srinivas, Kumar, Kumar, Bailey, Barbara, Garg, Rajni, & Hadaegh, Ahmad,
“Evolutionary Computational Modelling of β-Diketo Acids for Virtual Screening of HIV-1 Integrase
Inhibitors”, (2012), IEEE World Congress on Computational Intelligence, Brisbane, Australia.
[9] Ko, Gene, Reddy, Srinivas, Kumar, Kumar, Garg, Rajni, & Hadaegh, Ahmad “Evolutionary
Computational Modelling of β-Diketo Acids for Virtual Screening of HIV-1 Integrase Inhibitors”,
(2012), 243rd National Meeting of the American Chemical Society, San Diego, CA.
[10] Gonzales, Miguel, Turner, Chris, Ko, Gene, & Hadaegh, Ahmad, “Binary Particle Swarm
Optimization Model of Dimeric Aryl Diketo Acid Inhibitors for HIV-1 Integrase” (2012), 243rd
National Meeting of the American Chemical Society, San Diego, CA.
[11] Ko, Gene, Reddy, Srinivas, Kumar, Sunil, Garg, Rajni, & Hadaegh, Ahmad, “Analysis of HIV-1
Integrase Inhibitors Using Computational QSAR Modelling”, (2012), Computational Science
Curriculum Development Forum and Applied Computational Science and Engineering Student
Support for Industry, San Diego State University.
5. A REVIEW ON EVALUATION METRICS FOR DATA
CLASSIFICATION EVALUATIONS
Hossin, M.1
and Sulaiman, M.N.2
1
Faculty of Computer Science & Information Technology, Universiti Malaysia
Sarawak,
94300 Kota Samarahan, Sarawak, Malaysia
2
Faculty of Computer Science & Information Technology, Universiti Putra Malaysia,
43400 UPM Serdang, Selangor, Malaysia
ABSTRACT
Evaluation metric plays a critical role in achieving the optimal classifier during the
classification training. Thus, a selection of suitable evaluation metric is an important key
for discriminating and obtaining the optimal classifier. This paper systematically reviewed
the related evaluation metrics that are specifically designed as a discriminator for
optimizing generative classifier. Generally, many generative classifiers employ accuracy
as a measure to discriminate the optimal solution during the classification training.
However, the accuracy has several weaknesses which are less distinctiveness, less
discriminability, less in formativeness and bias to majority class data. This paper also
briefly discusses other metrics that are specifically designed for discriminating the optimal
solution. The shortcomings of these alternative metrics are also discussed. Finally, this
paper suggests five important aspects that must be taken into consideration in constructing
a new discriminator metric..
KEYWORDS
Evaluation Metric, Accuracy, Optimized Classifier, Data Classification Evaluation
For More Details : http://aircconline.com/ijdkp/V5N2/5215ijdkp01.pdf
Volume Link : http://airccse.org/journal/ijdkp/vol5.html
6. REFERENCES
[1] A.A. Cardenas and J.S. Baras, “B-ROC curves for the assessment of classifiers over imbalanced
datasets”, in Proc. of the 21st National Conference on Artificial Intelligence Vol. 2, 2006, pp. 1581-1584
[2] R. Caruana and A. Niculescu-Mizil, “Data mining in metric space: an empirical analysis of
supervisedlearning performance criteria”, in Proc. of the 10th ACM SIGKDD International Conference
onKnowledge Discovery and Data Mining (KDD '04), New York, NY, USA, ACM 2004, pp. 69-78.
[3] N.V. Chawla, N. Japkowicz and A. Kolcz, “Editorial: Special issue on learning from imbalanced
datasets”, SIGKDD Explorations, 6 (2004) 1-6.
[4] T. Fawcett, “An Introduction to ROC Analysis”, Pattern Recognition Letters, 27 (2006) 861-874.
[5] J. Davis and M. Goadrich, “The relationship between precision-recall and ROC curves”, in Proc. Ofthe
23rd International Conference on Machine Learning, 2006, pp. 233-240.
[6] C. Drummond, R.C. Holte, “Cost curves: An Improved method for visualizing classifier
performance”, Mach. Learn. 65 (2006) 95-130.
[7] P.A. Flach, P.A., “The Geometry of ROC Space: understanding Machine Learning Metrics throughROC
Isometrics”, in T. Fawcett and N. Mishra (Eds.) Proc. of the 20th Int. Conference on MachineLearning
(ICML 2003), Washington, DC, USA, AAAI Press, 2003, pp. 194-201.
[8] V. Garcia, R.A. Mollineda and J.S. Sanchez, “A bias correction function for classification
performance assessment in two-class imbalanced problems”, Knowledge-Based Systems, 59(2014) 66-74.
[9] S. Garcia and F. Herrera, “Evolutionary training set selection to optimize C4.5 in imbalance
problems”, in Proc. of 8th Int. Conference on Hybrid Intelligent Systems (HIS 2008), Washington,DC, USA,
IEEE Computer Society, 2008, pp.567-572.
[10] N. Garcia-Pedrajas, J. A. Romero del Castillo and D. Ortiz-Boyer, “A cooperative
coevolutionaryalgorithm for instance selection for instance-based learning”. Machine Learning (2010), 78
(2010)381-420.
[11] Q. Gu, L. Zhu and Z. Cai, “Evaluation Measures of the Classification Performance of
ImbalancedDatasets”, in Z. Cai et al. (Eds.) ISICA 2009, CCIS 51. Berlin, Heidelberg: Springer-Verlag,
2009,pp. 461-471.
[12] S. Han, B. Yuan and W. Liu, “Rare Class Mining: Progress and Prospect”, in Proc. of
ChineseConference on Pattern Recognition (CCPR 2009), 2009, pp. 1-5
[13] D. J. Hand and R. J. Till, “A simple generalization of the area under the ROC curve to multiple
classclassification problems”, Machine Learning, 45 (2001) 171-186.
[14] M. Hossin, M. N. Sulaiman, A. Mustapha, and N. Mustapha, “A Novel Performance Metric forBuilding
an Optimized Classifier”, Journal of Computer Science, 7(4) (2011) 582-509.
[15] M. Hossin, M. N. Sulaiman, A. Mustapha, N. Mustapha and R. W. Rahmat, “ OAERP: a BetterMeasure
than Accuracy in Discriminating a Better Solution for Stochastic Classification Training”,Journal of
Artificial Intelligence, 4(3) (2011) 187-196.
[16] M. Hossin, M. N. Sulaiman, A. Mustapha, N. Mustapha and R. W. Rahmat, “A Hybrid
EvaluationMetric for Optimizing Classifier”, in Data Mining and Optimization (DMO), 2011 3rd
Conference on,2011, pp. 165-170.
[17] J. Huang and C. X. Ling, “Using AUC and accuracy in evaluating learning algorithms”,
IEEETransactions on Knowledge Data Engineering, 17 (2005) 299-310.
[18] J. Huang and C. X. Ling, “Constructing new and better evaluation measures for machine learning”, inR.
7. Sangal, H. Mehta and R. K. Bagga (Eds.) Proc. of the 20th International Joint Conference onArtificial
Intelligence (IJCAI 2007), San Francisco, CA, USA: Morgan Kaufmann Publishers Inc.,2007, pp.859-864.
[19] N. Japkowicz, “Assessment metrics for imbalanced learning”, in Imbalanced Learning:
Foundations,Algorithms, and Applications, Wiley IEEE Press, 2013, pp. 187-210
[20] M. V. Joshi, “On evaluating performance of classifiers for rare classes”, in Proceedings of the
2002IEEE Int. Conference on Data Mining (ICDN 2002) ICDM’02, Washington, D. C., USA:
IEEEComputer Society, 2002, pp. 641-644.
[21] T. Kohonen, Self-Organizing Maps, 3rd ed., Berlin Heidelberg: Springer-Verlag, 2001.
[22] L. I. Kuncheva and J. C. Bezdek, “Nearest Prototype Classification: Clustering, Genetic Algorithms,or
Random Search?” IEEE Transactions on Systems, Man, and Cybernetics-Part C: Application andReviews,
28(1) (1998) 160-164.
[23] N. Lavesson, and P. Davidsson, “Generic Methods for Multi-Criteria Evaluation”, in Proc. of theSiam
Int. Conference on Data Mining, Atlanta, Georgia, USA: SIAM Press, 2008, pp. 541-546.
[24] P. Lingras, and C. J. Butz, “Precision and recall in rough support vector machines”, in Proc. of the2007
IEEE Int. Conference on Granular Computing (GRC 2007), Washington, DC, USA: IEEEComputer Society,
2007, pp.654-654.
[25] D. J. C. MacKay, Information, Theory, Inference and Learning Algorithms. Cambridge, UK:Cambridge
University Press, 2003.
[26] T. M. Mitchell, Machine Learning, USA: MacGraw-Hill, 1997.
[27] R. Prati, G. Batista, and M. Monard, “A survery on graphical methods for classification predictive
performance evaluation”, IEEE Trans. Knowl. Data Eng. 23(2011) 1601-1618.
[28] F. Provost, and P. Domingos, “Tree induction for probability-based ranking”. Machine Learning,
52(2003) 199-215.
[29] A. Rakotomamonyj, “Optimizing area under ROC with SVMs”, in J. Hernandez-Orallo, C. Ferri,
N.Lachiche and P. A. Flach (Eds.) 1st Int. Workshop on ROC Analysis in Artificial Intelligence(ROCAI
2004), Valencia, Spain, 2004, pp. 71-80.
[30] R. Ranawana, and V. Palade, “Optimized precision-A new measure for classifier
performanceevaluation”, in Proc. of the IEEE World Congress on Evolutionary Computation (CEC 2006),
2006,pp. 2254-2261.
[31] S. Rosset, “Model selection via AUC”, in C. E. Brodley (Ed.) Proc. of the 21st Int. Conference
onMachine Learning (ICML 2004), New York, NY, USA: ACM, 2004, pp. 89.
[32] D.B. Skalak, “Prototype and feature selection by sampling and random mutation hill
climbingalgorithm”, in W. W. Cohen and H. Hirsh (Eds.) Proc. of the 11th Int. Conference on
MachineLearning (ICML 1994), New Brunswick, NJ, USA: Morgan Kaufmann, 1994, pp.293-301.
[33] M. Sokolova and G. Lapame, “A systematic analysis of performance measures for classificationtasks”,
Information Processing and Management, 45(2009) 427-437.
[34] P. N. Tan, M. Steinbach and V. Kumar, Introduction to Data Mining, Boston, USA: Pearson
AddisonWesley, 2006.
[35] M. Vuk and T. Curk, “ROC curve, lift chart and calibration plot”, Metodoloˇskizvezki, 3(1) (2006)89-
108.
[36] H. Wallach, “Evaluation metrics for hard classifiers”. Technical Report. (Ed.: Wallach,
2006)http://www.inference.phy.cam.ac.uk/hmw26/papers
8. [37] S. W. Wilson, “Mining oblique data with XCS”, in P. L. Lanzi, W. Stolzmann and S. W. Wilson(Eds.)
Advances in Learning Classifier Systems: Third Int. Workshop (IWLCS 2000), Berlin,Heidelberg: Springer-
Verlag, 2001, pp. 283-290.
[38] H. Zhang and G. Sun, “Optimal reference subset selection for nearest neighbor classification by
tabusearch”, Pattern Recognition, 35(7) (2002) 1481-1490.
9. A BUSINESS INTELLIGENCE PLATFORM
IMPLEMENTED IN A BIG DATA SYSTEM
EMBEDDING DATA MINING: A CASE OF STUDY
Alessandro Massaro, Valeria Vitti, Palo Lisco, Angelo Galiano and Nicola Savino
Dyrecta Lab, IT Research Laboratory, Via VescovoSimplicio, 45, 70014 Conversano
(BA), Italy.
(in collaboration with ACI Global S.p.A., VialeSarca, 336 - 20126 Milano, Via Stanislao
Cannizzaro, 83/a - 00156 Roma, Italy)
ABSTRACT
In this work is discussed a case study of a business intelligence –BI- platform developed
within the framework of an industry project by following research and development –
R&D- guidelines of ‘Frascati’. The proposed results are a part of the output of different
jointed projects enabling the BI of the industry ACI Global working mainly in roadside
assistance services. The main project goal is to upgrade the information system, the
knowledge base –KB- and industry processes activating data mining algorithms and big
data systems able to provide gain of knowledge. The proposed work concerns the
development of the highly performing Cassandra big data system collecting data of two
industry location. Data are processed by data mining algorithms in order to formulate a
decision making system oriented on call center human resources optimization and on
customer service improvement. Correlation Matrix, Decision Tree and Random Forest
Decision Tree algorithms have been applied for the testing of the prototype system by
finding a good accuracy of the output solutions. The Rapid Miner tool has been adopted
for the data processing. The work describes all the system architectures adopted for the
design and for the testing phases, providing information about Cassandra performance and
showing some results of data mining processes matching with industry BI strategies..
KEYWORDS
Big Data Systems, Cassandra Big Data, Data Mining, Correlation Matrix, Decision Tree,
Frascati Guideline
Full Text :http://aircconline.com/ijdkp/V9N1/9119ijdkp01.pdf
.
10. REFERNCES
[1] Khan, R. A. &Quadri, S. M. K. (2012) “Business Intelligence: an Integrated Approach”,
BusinessIntelligence Journal, Vol.5 No.1, pp 64-70.
[2] Chen, H., Chiang, R. H. L. & Storey V. C. (2012) “Business Intelligence and Analytics: from Bi Data to
Big Impact”, MIS Quarterly, Vol. 36, No. 4, pp 1165-1188.
[3] Andronie, M. (2015) “Airline Applications of Business Intelligence Systems”, Incas Bulletin, Vol. 7,No.
3, pp 153 – 160.
[4] Iankoulova, I. (2012) “Business Intelligence for Horizontal Cooperation”, Master Thesis, Univesitiet
Twente.[Online].Availablehttps://www.utwente.nl/en/mbit/finalproject/example_excellent_master_thesi/
master_thesis_bit/IankoulovaID.pdf
[5] Nunes, A. A., Galvão, T. & Cunha, J. F. (2014) “Urban Public Transport Service Co-creation:Leveraging
Passenger’s Knowledge to Enhance Travel Experience”, Procedia Social and BehavioralSciences, Vol.
111, pp 577 – 585.
[6] Fitriana, R., Eriyatno, Djatna, T. (2011) “Progress in Business Intelligence System research: Aliterature
Review”, International Journal of Basic & Applied Sciences IJBAS-IJENS, Vol. 11, No. 03,pp 96-105.
[7] Lia, M. (2015) "Customer Data Analysis Model using Business Intelligence Tools inTelecommunication
Companies", Database Systems Journal, Vol. 6, No. 2, pp 39-43.
[8] Habul, A., Pilav-Velić, A. &Kremić, E. (2012) “Customer Relationship Management and
BusinessIntelligence”, Intech book 2012: Advances in Customer Relationship Management, chapter 2.
[9] Kemper,H.-G., Baars, H. &Lasi, H. (2013) “An Integrated Business Intelligence Framework Closingthe
Gap Between IT Support for Management and for Production”, Springer: Business Intelligenceand
Performance Management Part of the series Advanced Information and Knowledge Processing,pp 13-26,
Chapter 2.
[10] Bara, A., Botha, I., Diaconiţa, V., Lungu, I., Velicanu, A., Velicanu, M. (2009) “A Model forBusiness
Intelligence Systems’ Development”, InformaticaEconomică, Vol. 13, No. 4, pp 99-108.
[11] Negash, S. (2004) “Business Intelligence”, Communications of the Association for
InformationSystems, Vol. 13, pp 177-195.
[12] Nofal, M. I. &Yusof, Z. M. (2013) “Integration of Business Intelligence and Enterprise
ResourcePlanning within Organizations”, Procedia Technology, Vol. 11 ( 2013 ), pp. 658 – 665.
[13] Williams, S. & Williams, N. (2003) “The Business Value of Business Intelligence”,
BusinessIntelligence Journal, FALL 2003, pp 1-11.
[14] Lečić, D. &Kupusinac, A. (2013) “The Impact of ERP Systems on Business Decision-Making”,TEM
Journal, Vol. 2, No. 4, pp 323-326.
[15] Ong, L., Siew, P. H. & Wong, S. F. (2011) “A Five-Layered Business Intelligence
Architecture”,IBIMA Publishing, Communications of the IBIMA,Vol. 2011, Article ID 695619, pp 1-11.
[16] Raymond T. Ng, Arocena, P. C., Barbosa, D., Carenini, G., Gomes, L., Jou, S., Leung, R. A., Milios,E.,
Miller, R. J., Mylopoulos, J., Pottinger, R. A., Tompa, F. & Yu, E. (2013) “Perspectives onBusiness
Intelligence”, A Publication in the Morgan & Claypool Publishers series Synthesis Lectureson Data
Management.
[17] “NTT DATA Connected Car Report: A brief insight on the connected car market, showingpossibilities
and challenges for third-party service providers by means of an application case study”
[Online].Available:https://emea.nttdata.com/fileadmin/web_data/country/de/documents/Manufacturing/S
tudien/2015_Connected_Car_Report_NTT_DATA_ENG.pdf
11. [18] “Cognizant report: Exploring the Connected Car Cognizant 20-20” [Online].
Available:https://www.cognizant.com/InsightsWhitepapers/Exploring-the-Connected-Car.pdf
International Journal of Data Mining & Knowledge Management Process (IJDKP) Vol.9, No.1, January
201919
[19] Sarangi, P. K., Bano, S., Pant, M. (2014) “Future Trend in Indian Automobile Industry: A
StatisticalApproach”, Journal of Management Sciences And Technology, Vol. 2, No. 1, pp. 28-32.
[20] Bates, H. &Holweg, M. (2007) “Motor Vehicle Recalls: Trends, Patterns and Emerging Issues”,Omega,
Vol. 35, No. 2, pp 202–210.
[21] D’Aloia, M., Russo, M. R., Cice G., Montingelli, A., Frulli, G., Frulli, E., Mancini, F., Rizzi, M.,Longo,
A. (2017) “Big Data Performance and Comparison with Different DB Systems”, InternationalJournal of
Computer Science and Information Technologies, Vol. 8, No. 1, pp 59-63.
[22] Wimmer, H. & Powell, L. M. (2015) “A Comparison of Open Source Tools for Data
Science”,Proceedings of the Conference on Information Systems Applied Research, Wilmington,
NorthCarolina USA.
[23] Al-Khoder, A. &Harmouch, H. (2014) “Evaluating four of the most popular Open Source and FreeData
Mining Tools,” IJASR International Journal of Academic Scientific Research, Vol. 3, No. 1, pp13-23.
[24] Antonio Gulli, Sujit Pal, “Deep Learning with Keras- Implement neural networks with Keras onTheano
and TensorFlow,” BIRMINGHAM – MUMBAI Packt book, ISBN 978-1-78712-842-2, 2017.
[25] Kovalev V., Kalinovsky A., and Kovalev S. Deep Learning with Theano, Torch, Caffe, TensorFlow,and
deeplearning4j: which one is the best in speed and accuracy? In: XIII Int. Conf. on PatternRecognition
and Information Processing, 3-5 October, Minsk, Belarus State University, 2016, pp. 99-103.
[26] Frascati Manual 2015: The Measurement of Scientific, Technological and Innovation
ActivitieGuidelines for Collecting and Reporting Data on Research and Experimental Development.
OECD(2015), ISBN 978-926423901-2 (PDF).
[27] Massaro, A. Maritati, V., Galiano, A., Birardi, V. &Pellicani, L. (2018) “ESB Platform
IntegratinKNIME Data Mining Tool oriented on Industry 4.0 Based on Artificial Neural Network
PredictiveMaintenance”, International Journal of Artificial Intelligence and Applications (IJAIA),
Vol.9,No.3,
[28] Massaro, A., Calicchio, A., Maritati, V., Galiano, A., Birardi, V., Pellicani, L., Gutierrez Millan,
M.,DallaTezza, B., Bianchi, M., Vertua, G., Puggioni, A. (2018) “A Case Study of Innovation of
anInformation Communication system and Upgrade of the Knowledge Base in Industry by
ESB,Artificial Intelligence, and Big Data System Integration”, International Journal of
ArtificialIntelligence and Applications (IJAIA), Vol. 9, No.5, pp. 27-43.
[29] “WSO2” [Online]. Available: https://wso2.com/products/enterprise-service-bus/
[30] “Ubuntu” [Online]. Available: https://www.ubuntu.com/
[31] “Apache Cassandra” [Online]. Available: http://cassandra.apache.org/
[32]“DataStaxEnterpriseOpsCenter”[Online].Available:
https://www.datastax.com/products/datastaxopscenter
[33] “About DataStaxDevCenter” [Online]. Available:
https://docs.datastax.com/en/developer/devcenter/doc/devcenter/dcAbout.html
[34] “Knowi” [Online]. Available: https://www.knowi.com/
[35] “JFreeChart” [Online]. Available: http://www.jfree.org/jfreechart/samples.html
12. [36] “PuTTY” [Online]. Available: https://www.chiark.greenend.org.uk/~sgtatham/putty/latest.html
[37] “Lightning Fast Data Science for Teams” [Online]. Available: https://rapidminer.com/
[38] Massaro, A., Meuli, G. &Galiano, A. (2018) “Intelligent Electrical Multi Outlets Controlled an
Activated by a Data Mining Engine Oriented to Building Electrical Management”, InternationalJournal
on Soft Computing, Artificial Intelligence and Applications (IJSCAI), Vol.7, No.4, pp 1-20.
[39] Myers, J. L. & Well, A. D. (2003) “Research Design and Statistical Analysis”, (2nd ed.)
LawrenceErlbaum.
.
AUTHOR
Alessandro Massaro: Research & Development Chief of Dyrecta Lab s.r.l.
13. INCREASED PREDICTION ACCURACY IN THE GAME
OF CRICKETUSING MACHINE LEARNING
Kalpdrum Passi and Niravkumar Pandey
Department of Mathematics and Computer Science
Laurentian University, Sudbury, Canada
ABSTRACT
Player selection is one the most important tasks for any sport and cricket is no exception.
The performance of the players depends on various factors such as the opposition team,
the venue, his current form etc. The team management, the coach and the captain select 11
players for each match from a squad of 15 to 20 players. They analyze different
characteristics and the statistics of the players to select the best playing 11 for each match.
Each batsman contributes by scoring maximum runs possible and each bowler contributes
by taking maximum wickets and conceding minimum runs. This paper attempts to predict
the performance of players as how many runs will each batsman score and how many
wickets will each bowler take for both the teams. Both the problems are targeted as
classification problems where number of runs and number ofwickets are classified in
different ranges. We used naïve bayes, random forest, multiclass SVM and decision tree
classifiers to generate the prediction models for both the problems. Random Forest
classifier wasfound to be the most accurate for both the problems.
KEYWORDS
Naïve Bayes, Random Forest, Multiclass SVM, Decision Trees, Cricket
Full Text : http://aircconline.com/ijdkp/V8N2/8218ijdkp03.pdf
.
14. REFERNCES
[1] S. Muthuswamy and S. S. Lam, "Bowler Performance Prediction for One-day International Cricket
Using Neural Networks," in Industrial Engineering Research Conference, 2008.
[2] I. P. Wickramasinghe, "Predicting the performance of batsmen in test cricket," Journal of Human Sport
& Excercise, vol. 9, no. 4, pp. 744-751, May 2014.
[3] G. D. I. Barr and B. S. Kantor, "A Criterion for Comparing and Selecting Batsmen in Limited Overs
Cricket," Operational Research Society, vol. 55, no. 12, pp. 1266-1274, December 2004.
[4] S. R. Iyer and R. Sharda, "Prediction of athletes performance using neural networks: An application in
cricket team selection," Expert Systems with Applications, vol. 36, pp. 5510-5522, April 2009.
[5] M. G. Jhanwar and V. Pudi, "Predicting the Outcome of ODI Cricket Matches: A Team Composition
Based Approach," in European Conference on Machine Learning and Principles and Practice of
Knowledge Discovery in Databases (ECMLPKDD 2016 2016), 2016.
[6] H. H. Lemmer, "The combined bowling rate as a measure of bowling performance in cricket," South
African Journal for Research in Sport, Physical Education and Recreation, vol. 24, no. 2, pp. 37-44,
January 2002.
[7] D. Bhattacharjee and D. G. Pahinkar, "Analysis of Performance of Bowlers using Combined Bowling
Rate," International Journal of Sports Science and Engineering, vol. 6, no. 3, pp. 1750-9823, 2012.
[8] S. Mukherjee, "Quantifying individual performance in Cricket - A network analysis of batsmen and
bowlers," Physica A: Statistical Mechanics and its Applications, vol. 393, pp. 624-637, 2014.
[9] P. Shah, "New performance measure in Cricket," ISOR Journal of Sports and Physical Education, vol.
4, no. 3, pp. 28-30, 2017.
[10] D. Parker, P. Burns and H. Natarajan, "Player valuations in the Indian Premier League," Frontier
Economics, vol. 116, October 2008.
[11] C. D. Prakash, C. Patvardhan and C. V. Lakshmi, "Data Analytics based Deep Mayo Predictor for
IPL9," International Journal of Computer Applications, vol. 152, no. 6, pp. 6-10, October 2016.
[12] M. Ovens and B. Bukiet, "A Mathematical Modelling Approach to One-Day Cricket Batting
Orders,"Journal of Sports Science and Medicine, vol. 5, pp. 495-502, 15 December 2006.
[13] R. P. Schumaker, O. K. Solieman and H. Chen, "Predictive Modeling for Sports and Gaming," in
Sports Data Mining, vol. 26, Boston, Massachusetts: Springer, 2010.
[14] M. Haghighat, H. Ratsegari and N. Nourafza, "A Review of Data Mining Techniques for Result
Prediction in Sports," Advances in Computer Science : an International Journal, vol. 2, no. 5, pp. 7-
12,November 2013.
[15] J. Hucaljuk and A. Rakipovik, "Predicting football scores using machine learning techniques," in
International Convention MIPRO, Opatija, 2011.
[16] J. McCullagh, "Data Mining in Sport: A Neural Network Approach," International Journal of Sports
Science and Engineering, vol. 4, no. 3, pp. 131-138, 2012.
[17] "Free web scraping - Download the most powerful web scraper | ParseHub," parsehub, [Online].
Available: https://www.parsehub.com.
[18] "Import.io | Extract data from the web," Import.io, [Online]. Available: https://www.import.io.
[19] T. L. Saaty, The Analytic Hierarchy Process, New York: McGrow Hill, 1980.
15. [20] T. L. Saaty, "A scaling method for priorities in a hierarchichal structure," Mathematical Psychology,
vol. 15, 1977.
[21] N. V. Chavla, K. W. Bowyer, L. O. Hall and P. W. Kegelmeyer, "SMOTE: Synthetic Minority
Oversampling Technique," Journal of Artificial Intelligence Research, vol. 16, pp. 321-357, June 2002.
[22] J. Han, M. Kamber and J. Pei, Data Mining: Concepts and Techniques, 3rd Edition ed., Waltham:
Elsevier, 2012.
[23] J. R. Quinlan, "Induction of Decision Trees," Machine learning, vol. 1, no. 1, pp. 81-106, 1986.
[24] J. R. Quinlan, C4.5: Programs for Machine Learning, Elsevier, 2015.
[25] L. Breiman, "Random Forests," Machine Learning, vol. 45, no. 1, pp. 5-32, 2001.
[26] T. K. Ho, "The Random Subspace Method for Constructing Decision Forests," IEEE transactions on
pattern analysis and machine intelligence, vol. 20, no. 8, pp. 832-844, August 1998.
[27] L. Breiman, J. Friedman, C. J. Stone and R. A. Olshen, Classification and regression trees, CRC
Press,1984.
[28] B. E. Boser, I. M. Guyon and V. N. Vapnik, "A Training Algorithm for Optimal Margin Classifiers," in
Fifth Annual Workshop on Computational Learning Theory, Pittsburgh, 1992.
[29] C.-C. Chang and C.-J. Lin, "LIBSVM: A library for support vector machines," ACM Transactions on
Intelligent Systems and Technology, vol. 2, no. 3, April 2011.
[30] T. L. Saaty, "A scaling method for priorities in a hierarchichal structure," Mathematical Psychology,
vol. 15, pp. 234-281, 1977.
[31] T. L. Saaty, The Analytical Hierarchy Process, New York: McGraw-Hill, 1980.
[32] N. V. Chavla, K. W. Bowyer, L. O. Hall and W. P. Kegelmeyer, "SMOTE: Synthetic Minority
Oversampling Technique," Artificial Intelligence Research, vol. 16, p. 321–357, June 2002.
AUTHORS
Kalpdrum Passi received his Ph.D. in Parallel Numerical Algorithms from Indian
Institute of Technology, Delhi, India in 1993. He is an Associate Professor,
Department of Mathematics & Co mputer Science, at Laurentian University, Ontario,
Canada. He has published many papers on Parallel Numerical Algorithms in
international journals and conferences. He has collaborative work with faculty in
Canada and US and the work was tested on the CRAY XMP’s and CRAY YMP’s. He
transitioned his research to web technology, and more recently has been involved in machine
learning and data mining applications in bioinformatics, social media and other data science areas.
He obtained funding from NSERC and Laurentian University for his research. He is a member of
the ACM and IEEE Computer Society.
Niravkumar Pandey is pursuing M.Sc. in Computational Science at Laurentian
University, Ontario, Canada. He received his Bachelor of Engineering degree from
Gujarat Technological University, Gujarat, India. Data mining and machine learning are
his primary areas of interest. He is also a cricket enthusiast and is studying applicationsof
machine learning and data mining in cricket analytics for his M.Sc. thesis
16. Data Mining and Its Applications for Knowledge Management :
A Literature Review from 2007 to 2012
Tipawan Silwattananusarn1and Assoc.Prof.Dr. KulthidaTuamsuk2
1Ph.D. Student in Information Studies Program,
KhonKaen University, Thailand
2Head, Information & Communication Management Program,
KhonKaen University, Thailand
ABSTRACT
Data mining is one of the most important steps of the knowledge discovery in databases
process and is considered as significant subfield in knowledge management. Research in
data mining continues growing in business and in learning organization over coming
decades. This review paper explores the applications of data mining techniques which
have been developed to support knowledge management process. The journal articles
indexed in ScienceDirect Database from 2007 to 2012 are analyzed and classified. The
discussion on the findings is divided into 4 topics: (i) knowledge resource; (ii) knowledge
types and/or knowledge datasets; (iii) data mining tasks; and (iv) data mining techniques
and applications used in knowledge management. The article first briefly describes the
definition of data mining and data mining functionality. Then the knowledge management
rationale and major knowledge management tools integrated in knowledge management
cycle are described. Finally, the applications of data mining techniques in the process of
knowledge management are summarized and discussed..
KEYWORDS
Data mining; Data mining applications; Knowledge management.
For More Details : http://aircconline.com/ijdkp/V2N5/2512ijdkp02.pdf
Volume Link :http://airccse.org/journal/ijdkp/vol2.htmll
17. REFERENCES
[1] An, X. & Wang, W. (2010). Knowledge management technologies and applications: A literaturereview.
IEEE, 138-141. doi:10.1109/ICAMS.2010.5553046
[2] Berson, A., Smith, S.J. &Thearling, K. (1999). Building Data Mining Applications for CRM. New York:
McGraw-Hill.
[3] Cantú, F.J. &Ceballos, H.G. (2010). A multiagent knowledge and information network approach for
managing research assets. Expert Systems with Applications, 37(7), 5272-
5284.doi:10.1016/j.eswa.2010.01.012
[4] Cheng, H., Lu, Y. &Sheu, C. (2009). An ontology-based business intelligence application in a financial
knowledge management system.Expert Systems with Applications, 36, 3614–
3622.Doi:10.1016/j.eswa.2008.02.047
[5] Dalkir, K. (2005). Knowledge Management in Theory and Practice. Boston: Butterworth-Heinemann.
[6] Dawei, J. (2011). The Application of Date Mining in Knowledge Management.2011 International
Conference on Management of e-Commerce and e-Government, IEEE Computer Society, 7-9.doi:
10.1109/ICMeCG.2011.58
[7] Fayyad, U., Piatetsky-Shapiro, G. & Smyth, P. (1996). From Data Mining to Knowledge Discovery in
Databases.AI Magazine, 17(3), 37-54.
[8] Gorunescu, F. (2011). Data Mining: Concepts, Models, and Techniques. India: Springer.
[9] Han, J. &Kamber, M. (2012). Data Mining: Concepts and Techniques. 3rd.ed. Boston: Morgan
Kaufmann Publishers.
[10] Hwang, H.G., Chang, I.C., Chen, F.J. & Wu, S.Y. (2008). Investigation of the application of KMS for
diseases classifications: A study in a Taiwanese hospital. Expert Systems with Applications, 34(1),
725-733. doi:10.1016/j.eswa.2006.10.018
[11] Lavrac, N., Bohanec, M., Pur, A., Cestnik, B., Debeljak, M. &Kobler, A. (2007).Data mining and
visualization for decision support and modeling of public health-care resources.Journal of Biomedical
Informatics, 40, 438-447. doi:10.1016/j.jbi.2006.10.003
[12] Li, X., Zhu, Z. & Pan, X. (2010). Knowledge cultivating for intelligent decision making in small
&middle businesses.Procedia Computer Science, 1(1), 2479-2488. doi:10.1016/j.procs.2010.04.280
[13] Li, Y., Kramer, M.R., Beulens, A.J.M., Van Der Vorst, J.G.A.J. (2010). A framework for earlywarning
and proactive control systems in food supply chain networks. Computers in Industry, 61, 852–862.
Doi:101.016/j.compind.2010.07.010
[14] Liao, S.H., Chen, C.M., Wu, C.H. (2008). Mining customer knowledge for product line and brand
extension in retailing.Expert Systems with Applications, 34(3), 1763-
1776.doi:10.1016/j.eswa.2007.01.036
[15] Liao, S. (2003). Knowledge management technologies and applications-literature review from 1995 to
2002. Expert Systems with Applications, 25, 155-164. doi:10.1016/S0957-4174(03)00043-5
[16] Liu, D.R. & Lai, C.H. (2011). Mining group-based knowledge flows for sharing task knowledge.
Decision Support Systems,50(2), 370-386. doi:10.1016/j.dss.2010.09.004
[17] Lee, M.R. & Chen, T.T. (2011). Revealing research themes and trends in knowledge management:
From 1995 to 2010. Knowledge-Based Systems.doi:10.1016/j.knosys.2011.11.016
[18] McInerney, C.R. & Koenig, M.E. (2011). Knowledge Management (KM) Processes in Organizations:
18. Theoretical Foundations and Practice. USA: Morgan & Claypool Publishers.
doi:10.2200/S00323ED1V01Y201012ICR018
[19] McInerney, C. (2002). Knowledge Management and the Dynamic Nature of Knowledge.Journal of the
American Society for Information Science and Technology, 53(12), 1009-1018. doi:10.1002/asi.10109
[20] Ngai, E., Xiu, L. &Chau, D. (2009). Application of data mining techniques in customer relationship
management: A literature review and classification. Expert Systems with Applications, 36, 2592- 2602.
doi:10.1016/j.eswa.2008.02.021
[21] Ruggles, R.L. (ed.). (1997). Knowledge Management Tools. Boston: Butterworth-Heinemann.
[22] Sher, P.J. & Lee, V.C. (2004). Information technology as a facilitator for enhancing dynamic
capabilities through knowledge management.Information& Management, 41, 933-
945.doi:10.1016/j.im.2003.06.004
[23] Tseng, S.M. (2008). The effects of information technology on knowledge management
systems.Expert Systems with Applications, 35, 150-160. doi:10.1016/j.eswa.2007.06.011
[24] Ur-Rahman, N. & Harding, J.A. (2012). Textual data mining for industrial knowledge management and
text classification: A business oriented approach. Expert Systems with Applications, 39, 4729- 4739.
doi:10.1016/j.eswa.2011.09.124
[25] Wang, F. & Fan, H. (2008). Investigation on Technology Systems for Knowledge Management.IEEE,
1-4. doi:10.1109/WiCom.2008.2716
[26] Wang, H. & Wang, S. (2008). A knowledge management approach to data mining process for business
intelligence. Industrial Management & Data Systems, 108(5), 622-634.
[27] Wu, W., Lee, Y.T., Tseng, M.L. & Chiang, Y.H. (2010). Data mining for exploring hidden patterns
between KM and its performance.Knowledge-Based Systems, 23, 397-
401.doi:10.1016/j.knosys.2010.01.014[15] C.-C.Chang, T. D. Kieu, and Y.-C.Chou, "A High
Payload Steganographic Scheme Based on (7, 4) Hamming Code for Digital Images," Proc. of the 2008
International Symposium on Electronic Commerce and Security, pp.16-21, August 2008.
19. APPLICATION OF SPATIOTEMPORAL ASSOCIATION
RULES ON SOLAR DATA TO SUPPORT SPACE
WEATHER FORECASTING
Carlos Roberto Silveira Junior1
, José Roberto Cecatto2
, Marilde Terezinha Prado
Santos1
and Marcela Xavier Ribeiro1
1
Department of Computing, Federal University of São Carlos, São Carlos, Brazil
2
National Institute of Space Research, São José dos Campos, Brazil
ABSTRACT
It is well known that solar energetic phenomena influence the Space Weather, in special
those directed to the Earth environment. In this context, the analysis of Solar Data is a
challenging task, particularly when are composed of Satellite Image Time Series (SITS). It
is a multidisciplinary domain that generates a massive amount of data (several Gigabytes
per year). It includes image processing, spatiotemporal characteristics, and the processing
of semantic data. Aiming to enhance the SITS analysis, we propose an algorithm called
"Miner of Thematic Spatiotemporal Associations for Images" (MiTSAI), which is anb
extractor of Thematic Spatiotemporal Association Rules (TSARs) from Solar SITS. Here,
a description is given about the details of the modern algorithm MiTSAI, which is an
extractor of Thematic Spatiotemporal Association Rules (TSARs) from solar Satellite
Image Time Series (SITS). In addition, its adaptation to the Space Weather and discussion
about the specific use in favor of forecasting activities are presented. Finally, some results
of its application specifically to solar flare forecasting are also presented. MiTSAI has to
extract interesting new patterns compared with the art-state algorithms.
KEYWORDS
Data Mining, Drug Discovery, Drug Description, Chemoinformatics, and Web
Application
For More Details : https://aircconline.com/ijdkp/V10N2/10220ijdkp01.pdf
Volume Link : http://airccse.org/journal/ijdkp/vol10.html
20. REFERENCES
[[1] T Abirami-Kongu, P Thangaraj, and P Priakanth-Kongu. Wireless sensor networks fault identification
using data association. Journal of Computer Science, 8(9):1501–1505, 2012.
[2] Sultan Alamri, David Taniar, and Maytham Safar. A taxonomy for moving object queries in
spatialdatabases. Future Generation Computer Systems, 37:232 – 242, 2014. ISSN 0167-739X. doi:
https://doi.org/10.1016/j.future.2014.02.007.
[3] Shadi A. Aljawarneh, Radhakrishna Vangipuram, Veereswara Kumar Puligadda, and Janaki
Vinjamuri. G-spamine: An approach to discover temporal association patterns and trends in internet of
things. Future Generation Computer Systems, 2017. ISSN 0167-739X. doi: https://doi.org/10.1016/
j.future.2017.01.013.
[4] Herbert Bay, Tinne Tuytelaars, and Luc Van Gool. Surf: Speeded up robust features. pages 404–417,
2006.
[5] Monica G Bobra and Sebastien Couvidat. Solar flare prediction using sdo/hmi vector magnetic field
data with a machine-learning algorithm. The Astrophysical Journal, 798(2):135–172, 2015.
[6] I. Burbey and T.L. Martin. A survey on predicting personal mobility. International Journal of
Pervasive Computing and Communications, 8(1):5 – 22, 2012.
[7] M. Chen, S. Mao, and Y. Liu. Big data: A survey. Mobile Networks and Applications, 19(2):171–
209,2014.
[8] Paolo Compieta, Sergio Di Martino, Michela Bertolotto, Filomena Ferrucci, and T Kechadi.
Exploratory spatio-temporal data mining and visualization. Journal of Visual Languages & Computing,
18(3):255–279, 2007.
[9] G Fang and Y Wu. Frequent spatiotemporal association patterns mining based on granular computing.
Informatica (Slovenia), 37(4):443–453, 2013.
[10] A. Hana, Y.T. Sami, and F. Sami. Mining spatiotemporal associations using queries. 2012 International
Conference on Information Technology and e-Services, ICITeS 2012, 2012
[11] R. M. Haralick, K. Shanmugam, and I. Dinstein. Textural features for image classification. IEEE
Transactions on Systems, Man, and Cybernetics, SMC-3(6):610–621, Nov 1973. ISSN 0018-9472.
doi: 10.1109/TSMC.1973.4309314.
[12] J. Huo, J. Zhang, and X. Meng. On co-occurrence pattern discovery from spatio-temporal event stream.
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and
Lecture Notes in Bioinformatics), 8181 LNCS(PART 2):385–395, 2013.
[13] J. Kawale, S. Liess, V. Kumar, U. Lall, and A. Ganguly. Mining time-lagged relationships in
spatiotemporal climate data. pages 130–135, 2012.
[14] Xiangjie Kong, Zhenzhen Xu, Guojiang Shen, Jinzhong Wang, Qiuyuan Yang, and Benshi Zhang.
Urban traffic congestion estimation and prediction based on floating car trajectory data. Future
Generation Computer Systems, 61:97 – 107, 2016. ISSN 0167-739X. doi:
https://doi.org/10.1016/j.future. 2015.11.013.
[15] T.C.W. Landgrebe, A. Merdith, A. Dutkiewicz, and R.D. Mafaler. Relationships between
palaeogeography and opal occurrence in australia: A data-mining approach. Computers and
Geosciences, 56: 76–82, 2013.
[16] Angela Lausch, Andreas Schmidt, and Lutz Tischendorf. Data mining and linked open data: New
perspectives for data analysis in environmental research. Ecological Modelling, 295(0):5 – 17, 2015.
ISSN 0304-3800. doi: http://dx.doi.org/10.1016/j.ecolmodel.2014.09.018.
21. [17] A. Madraky, Z.A. Othman, and A.R. Hamdan. Analytic methods for spatio-temporal data in a
natureinspired data model. International Review on Computers and Software, 9(3):547–556, 2014.
[18] A.a Mohan and P.Z.b Revesz. Applications of spatio-temporal data mining to north platter river
reservoirs. ACM International Conference Proceeding Series, pages 306–309, 2014.
[19] NOAA. www.solarmonitor.org, April 2016. Last Access April 13, 2016.
[20] K.G. Pillai, R.A. Angryk, J.M. Banda, M.A. Schuh, and T. Wylie. Spatio-temporal co-occurrence
pattern mining in data sets with evolving regions. Proceedings - 12th IEEE International Conference
on Data Mining Workshops, ICDMW 2012, pages 805–812, 2012.
[21] K.G.a Pillai, R.A.b Angryk, and B.b Aydin. A filter-and-refine approach to mine spatiotemporal
cooccurrences. pages 104–113, 2013.
[22] Vangipuram Radhakrishna, Shadi A. Aljawarneh, P.V. Kumar, and V. Janaki. A novel fuzzy similarity
measure and prevalence estimation approach for similarity profiled temporal association pattern
mining. Future Generation Computer Systems, 2017. ISSN 0167-739X. doi: https://doi.org/10.1016/j.
future.2017.03.016.
[23] K Venkateswara Rao, A Govardhan, and KV Chalapati Rao. Spatiotemporal data mining: Issues, tasks
and applications. International Journal of Computer Science & Engineering Survey (IJCSES) Vol,
3:39–52, 2012.
[24] Md. Mamunur Rashid, Iqbal Gondal, and Joarder Kamruzzaman. A technique for parallel sharefrequent
sensor pattern mining from wireless sensor networks. Procedia Computer Science, 29(0):124 – 133,
2014. ISSN 1877-0509. doi: http://dx.doi.org/10.1016/j.procs.2014.05.012. 2014 International
Conference on Computational Science.
[25] Md.M. Rashid, I. Gondal, and J. Kamruzzaman. Mining associated sensor patterns for data stream of
wireless sensor networks. pages 91–98, 2013.
[26] Marcela Xavier Ribeiro, Agma J. M. Traina, and Caetano Traina, Jr. A new algorithm for data
discretization and feature selection. In Proceedings of the 2008 ACM symposium on Applied
computing, SAC ’08, pages 953–954, New York, NY, USA, 2008. ACM. ISBN 978-1-59593-753-7.
doi: 10.1145/1363686.1363905.
[27] James A. Rodger. Toward reducing failure risk in an integrated vehicle health maintenance system: A
fuzzy multi-sensor data fusion kalman filter approach for ivhms. Expert Systems with Applications,
39(10):9821 – 9836, 2012. ISSN 0957-4174. doi: https://doi.org/10.1016/j.eswa.2012.02.171.
[28] W. Sammouri, E. Cafame, L. Oukhellou, and P. Aknin. Mining floating train data sequences for
temporal association rules within a predictive maintenance framework. Lecture Notes in Computer
subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 7987
LNAI:112–126, 2013.
[29] T. Scheffer. Finding association rules that trade support optimally against confidence. pages 9: 381–
395, 1995.
[30] Carlos Roberto Silveira-Junior. github.com/carlossilveirajr/mitsai, April 2017. Last Access April 17,
2017.
[31] Carlos Roberto Silveira-Junior, Danilo Codeco Carvalho, Marilde Terezinha Prado Santos, and
Marcela Xavier Ribeiro. Incremental mining of frequent sequences in environmental sensor data. In
The Twenty-Sixth International FLAIRS Conference (2015), pages 452–455, .
[32] Carlos Roberto Silveira-Junior, Marilde Prado Santos, and Marcela Ribeiro. Stretchy time pattern
mining: A deeper analysis of environment sensor data. pages 468–473, .
[33] Carlos Roberto Silveira-Junior, Marcela Xavier Ribeiro, and Marilde Terezinha Prado Santos. A
flexible architecture to integrate the solar satellite image time series data - the setl architecture. Pages
1–14, 2017. No publish by the time of this paper preparation.
[34] Olga Spatenkovaa and Kirsi Virrantausb. Discovering spatio-temporal relationships in the distribution
22. of building fires. Fire Safety Journal, 62, Part A:49 – 63, 2013. ISSN 0379-7112. doi: http://dx.
doi.org/10.1016/j.firesaf.2013.07.001. Special Issue on Spatial Analytical Approaches in Urban Fire
Management.
[35] Fenzhen Su, Chenghu Zhou, and Wenzhoung Shi. Geoevent association rule discovery model based on
rough set with marine fishery application. In Geoscience and Remote Sensing Symposium, 2004.
IGARSS ’04. Proceedings. 2004 IEEE International, volume 2, pages 1455–1458 vol.2, Sept 2004.
doi: 10.1109/IGARSS.2004.1368694.
[36] Edi Winarko and John F. Roddick. Armada: An algorithm for discovering richer relative temporal
association rules from interval-based data. Data & Knowledge Engineering, 63(1):76 – 90, 2007 ISSN
0169-023X. doi: http://dx.doi.org/10.1016/j.datak.2006.10.009. Data Warehouse an Knowledge
Discovery, 7th International Congress on Data Warehouse and Knowledge Discovery.
[37] J.S. Yoo and M. Bow. Mining spatial colocation patterns: A different framework. Data Mining and
Knowledge Discovery, 24(1):159–194, 2012.
[38] B. Zaragoza, A. Rabasa, J.J. Rodriguez-Sala, J.T. Navarro, A. Belda, and A. Ramon. Modelling
farmland abandonment: A study combining gis and data mining techniques. volume 155, pages 124 –
132, 2012. doi: http://dx.doi.org/10.1016/j.agee.2012.03.019.
AUTHORS
Carlos Roberto Silveira Junior is a System Architect at Ericsson expert in software development for
telecom systems. He is a Ph.D. Student at Federal University of São Carlos (UFSCar) working with Data
Mining -Artificial Intelligence- with images processing and parallelism. Holds a Master Degree in Data
Mining and ontology at UFSCar, graduated Computer Science at UFSCar. Main areas of interest: data
mining, programming, and distributed system.
José Roberto Cecatto holds a BS in Physics from the Pontifical Catholic University of São Paulo, a
Master’s and a PhD in Astrophysics from the National Institute for Space Research (INPE). He is currently a
researcher at INPE. He has experience in Astronomy, with an emphasis on Radio Astronomy and
instrumental development, acting mainly on the following themes: solar, radio, spectroscopy and
interferometry. Currently, working in Space Climate with activities and development of tools related to the
forecast of solar phenomena.
Marilde Terezinha Prado Santos is an associate professor at the Center for Exact Sciences and
Technology / Department of Computing at the Federal University of São Carlos. PhD in Science with
emphasis in Computational Physics from the University of São Paulo (2000). Master in Computer Science
from the Federal University of São Carlos (1994) and Bachelor in Computer Science from the Pontifical
Catholic University of Rio Grande do Sul (1991). Main areas of interest: engineering and applications of
crisp and fuzzy ontologies, information retrieval, data mining, data integration and semantic web.