Data mining, or Knowledge Discovery in Databases (KDD), is a new field in information technology that emerged from progress in the creation and maintenance of large databases, combining statistical and artificial-intelligence methods with database management. Data mining is used to recognize hidden patterns and provide relevant information for decision making on complex problems where conventional methods are inefficient or too slow. It can be a powerful tool for predicting future trends and behaviors, and such predictions allow businesses to make proactive, knowledge-driven decisions. Since the automated prospective analyses offered by data mining move beyond the analyses of past events provided by retrospective tools, it can answer business questions that are traditionally time-consuming to resolve. This advantage has attracted interest from government, industry, and commerce. In this paper we use this tool to investigate Euro currency fluctuation. For this investigation we apply three different algorithms: K*, IBk, and MLP, and we extract Euro currency volatility using the same criteria for all algorithms. The dataset used has 21,084 records, collected from daily price fluctuations of the Euro currency in the period of 10/2006 to 04/2010.
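The abstract above compares the lazy learners K* and IBk with a multilayer perceptron (MLP). As an illustration of the instance-based side, the sketch below implements an IBk-style k-nearest-neighbour majority vote in plain Python; the toy daily-change features and "up"/"down" labels are hypothetical stand-ins, not the paper's 21,084-record dataset.

```python
import math
from collections import Counter

def knn_predict(train, labels, query, k=3):
    """IBk-style prediction: majority vote among the k nearest
    training instances under Euclidean distance."""
    dists = sorted(
        (math.dist(x, query), y) for x, y in zip(train, labels)
    )
    votes = Counter(y for _, y in dists[:k])
    return votes.most_common(1)[0][0]

# Hypothetical features: (previous day's change, current day's change)
# labelled with the next day's direction.
train = [(0.1, 0.2), (0.2, 0.1), (-0.1, -0.3),
         (-0.2, -0.1), (0.3, 0.2), (-0.3, -0.2)]
labels = ["up", "up", "down", "down", "up", "down"]

print(knn_predict(train, labels, (0.15, 0.15)))  # prints "up"
```

With larger k the vote smooths over noisy neighbours, which is the main tuning knob the IBk algorithm exposes.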
Different Classification Technique for Data mining in Insurance Industry usin... (IOSRjournaljce)
This paper addresses the issues and techniques for Property/Casualty actuaries applying data mining methods. Data mining means the effective discovery of unknown patterns from large databases. It is an interactive knowledge-discovery procedure which includes data acquisition, data integration, data exploration, model building, and model validation. The paper provides an overview of the data discovery method and introduces some important data mining methods for application to insurance, including cluster discovery approaches.
6 ijaems sept-2015-6-a review of data security primitives in data mining (INFOGAIN PUBLICATION)
This paper discusses various issues and security primitives such as Spatial Data Handling, Privacy Protection of data, Data Load Balancing, and Resource Mining in the area of Data Mining. A 5-stage review process was conducted for 30 research papers published in the period from 1996 to 2013. After an exhaustive review, the key issues found were "Spatial Data Handling, Data Load Balancing, Resource Mining, Visual Data Mining, Data Clusters Mining, Privacy Preservation, Mining of gaps between business tools & patterns, Mining of hidden complex patterns", which have been resolved and explained with proper methodologies. Several solution approaches are discussed across the 30 papers. The paper presents the outcome of the review as a set of findings under each key issue, including the algorithms and methodologies used by researchers, their strengths and weaknesses, and the scope for future work in the area.
A SURVEY ON DATA MINING IN STEEL INDUSTRIES (IJCSES Journal)
In industrial environments, huge amounts of data are generated and collected in databases and data warehouses from all involved areas such as planning, process design, materials, assembly, production, quality, process control, scheduling, fault detection, shutdown, customer relationship management, and so on. Data Mining has become a useful tool for knowledge acquisition in the industrial process of iron and steel making. Due to its rapid growth, various industries have started using data mining technology to search for hidden patterns, which might further supply the system with new knowledge and support new models to enhance production quality, productivity, cost optimization, maintenance, etc. The continuous improvement of all steel production processes regarding the avoidance of quality deficiencies, and the related improvement of production yield, is an essential task for steel producers. A zero-defect strategy is therefore popular today, and several quality assurance techniques are used to maintain it. The present report explains the methods of data mining and describes their application in the industrial environment and especially in the steel industry.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
A gigantic archive of terabytes of data is created every day from modern information systems and digital technologies such as the Internet of Things and cloud computing. Analysis of this massive data requires a great deal of effort at multiple levels to extract knowledge for decision making. Hence, big data analysis is a current area of research and development. The primary goal of this paper is to explore the potential impact of big data challenges and the various tools associated with them. Accordingly, this article provides a platform to explore big data at its various stages. Moreover, it opens a new horizon for researchers to develop solutions based on the challenges and open research issues.
A Comparative Study of Various Data Mining Techniques: Statistics, Decision T... (Editor IJCATR)
In this paper we focus on some techniques for solving data mining tasks, such as Statistics, Decision Trees, and Neural Networks. The new approach has succeeded in defining some new criteria for the evaluation process, and it has obtained valuable results based on what each technique is, the environment in which each technique is used, the advantages and disadvantages of each technique, the consequences of choosing any of these techniques to extract hidden predictive information from large databases, and the methods of implementation of each technique. Finally, the paper presents some valuable recommendations in this field.
A forecasting of stock trading price using time series information based on b... (IJECEIAES)
Big data refers to large sets of structured or unstructured data that are difficult to collect, store, manage, and analyze with existing database management tools, and to the techniques for extracting value from such data and interpreting the results. Big data has three characteristics: the size of the data (volume), the speed of data generation (velocity), and the variety of information forms (variety). Time series data are obtained by collecting and recording data generated in accordance with the flow of time; finding the characteristics implied in such data helps in understanding and analyzing it. The concept of distance is the simplest and most obvious way of dealing with similarities between objects, and the most commonly used and widely known distance measure is the Euclidean distance. This study analyzes the similarity of stock price movements using 793,800 closing prices of 1,323 companies in Korea, with Visual Studio and Excel used as the analysis tools for calculating the Euclidean distance. We selected "000100" as the target domestic company and prepared the data for big data analysis. The analysis found the shortest Euclidean distance to the company with code "143860", with a calculated value of 11.147. Based on these results, the limitations of the study and its theoretical implications are discussed.
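The Euclidean distance the study above applies to closing-price series is straightforward to state in code. The sketch below is a minimal illustration with two short hypothetical price series; the actual study compares full-length series of 600 closing prices per company.

```python
import math

def euclidean_distance(series_a, series_b):
    """Euclidean distance between two equal-length closing-price series."""
    if len(series_a) != len(series_b):
        raise ValueError("series must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(series_a, series_b)))

# Hypothetical closing prices for a target company and one candidate.
target    = [100, 102, 101, 105, 107]
candidate = [ 99, 103, 100, 106, 108]

print(round(euclidean_distance(target, candidate), 3))  # prints 2.236
```

Ranking all candidate companies by this distance and taking the minimum is exactly the nearest-series search the abstract describes.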
This presentation is about data mining and its applications in different fields, showing why data mining is important and how it can impact businesses.
We are living in a world with a vast amount of digital data, known as big data, and the world is becoming ever more connected via the Internet of Things (IoT). The IoT has been a major influence on the big data landscape, and the analysis of such big data takes business competition to the next level of innovation and productivity.
Data Science tutorial for beginner level to advanced level | Data Science pro... (IQ Online Training)
This is a complete tutorial for learning data science from beginner to advanced level. It covers the projects deployed at each level, with examples of data sets and why you should use them.
Data Mining of Project Management Data: An Analysis of Applied Research Studies. (Gurdal Ertek)
Data collected and generated during and after projects, such as data residing in project management software and post-project review documents, can be a major source of actionable insights and competitive advantage. This paper presents a rigorous methodological analysis of the applied research published in the academic literature on the application of data mining (DM) to project management (PM). The objective of the paper is to provide a comprehensive analysis and discussion of where and how data mining is applied to project management data, and to provide practical insights for future research in the field.
https://dl.acm.org/citation.cfm?id=3176714
https://ertekprojects.com/ftp/papers/2017/ertek_et_al_2017_Data_Mining_of_Project_Management_Data.pdf
What is Datamining? Which algorithms can be used for Datamining? (Seval Çapraz)
This presentation covers what data mining is and which techniques and algorithms are available for it, helping you to understand the concepts of data mining.
SEAMLESS AUTOMATION AND INTEGRATION OF MACHINE LEARNING CAPABILITIES FOR BIG ... (ijdpsjournal)
The paper proposes a solution for designing and developing seamless automation and integration of machine learning capabilities for Big Data, with the following requirements: 1) the ability to seamlessly handle and scale very large amounts of unstructured and structured data from diversified and heterogeneous sources; 2) the ability to systematically determine the steps and procedures needed for analyzing Big Data datasets based on data characteristics, domain-expert inputs, and a data pre-processing component; 3) the ability to automatically select the most appropriate libraries and tools to compute and accelerate the machine learning computations; and 4) the ability to perform Big Data analytics with high learning performance but minimal human intervention and supervision. The overall focus is to provide a seamless, automated, and integrated solution which can be effectively used to analyze Big Data with high-frequency and high-dimensional features from different types of data characteristics and different application problem domains, with high accuracy, robustness, and scalability. The paper highlights the research methodologies and activities that we propose Big Data researchers and practitioners conduct in order to develop and support seamless automation and integration of machine learning capabilities for Big Data analytics.
MOVIE SUCCESS PREDICTION AND PERFORMANCE COMPARISON USING VARIOUS STATISTICAL... (ijaia)
Movies are among the most prominent contributors to the global entertainment industry today, and they
are among the biggest revenue-generating industries from a commercial standpoint. It's vital to divide
films into two categories: successful and unsuccessful. To categorize the movies in this research, a variety
of models were utilized, including regression models such as Simple Linear, Multiple Linear, and Logistic
Regression, techniques such as SVM and K-Means clustering, Time Series Analysis, and an Artificial
Neural Network. The models stated above were compared on a variety of factors, including their accuracy
on the training and validation datasets as well as the testing dataset, the availability of new movie
characteristics, and a variety of other statistical metrics. During the course of this study, it was discovered
that certain characteristics have a greater impact on the likelihood of a film's success than others. For
example, the existence of the genre action may have a significant impact on the forecasts, although another
genre, such as sport, may not. The testing dataset for the models and classifiers has been taken from the
IMDb website for the year 2020. The Artificial Neural Network, with an accuracy of 86 percent, is the best
performing model of all the models discussed.
Fuzzy Analytic Hierarchy Based DBMS Selection In Turkish National Identity Ca... (Ferhat Ozgur Catak)
Database Management Systems (DBMS) play an important role in supporting enterprise application development. Selection of the right DBMS is a crucial decision in the software engineering process, and it requires optimizing a number of criteria. Evaluation and selection of a DBMS among several candidates tends to be very complex, involving both quantitative and qualitative issues. A wrong DBMS selection will have a negative effect on the development of enterprise applications; it can turn out to be costly and adversely affect business processes. The following study focuses on the evaluation of a multi-criteria decision problem through the use of fuzzy logic. We demonstrate the methodological considerations regarding group decision making and fuzziness based on the DBMS selection problem, and we develop a new Fuzzy-AHP-based decision model, formulated and proposed to select a DBMS easily. In this decision model, the main criteria and their sub-criteria are first determined for the evaluation; these criteria are then weighted by pair-wise comparison, and the DBMS alternatives are evaluated by assigning a rating scale.
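To make the pair-wise weighting step concrete, here is a minimal sketch of classic (crisp) AHP priority weights computed via the row geometric-mean approximation; the abstract's model is the fuzzy extension of this idea, and the three criteria and comparison values below are hypothetical, not taken from the paper.

```python
import math

def ahp_weights(matrix):
    """Approximate AHP priority weights from a reciprocal pairwise
    comparison matrix using the row geometric-mean method."""
    n = len(matrix)
    gm = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(gm)
    return [g / total for g in gm]

# Hypothetical pairwise comparisons of three DBMS selection criteria
# (cost, performance, vendor support) on Saaty's 1-9 scale.
pairwise = [
    [1,     3,   5],   # cost vs (cost, performance, support)
    [1/3,   1,   2],   # performance
    [1/5, 1/2,   1],   # vendor support
]

w = ahp_weights(pairwise)
print([round(x, 3) for x in w])  # weights sum to 1; cost dominates here
```

In a full (fuzzy) AHP study each entry would be a triangular fuzzy number rather than a crisp ratio, and a consistency ratio check on the matrix would precede any use of the weights.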
OrganicDataNetwork Comprehensiveness & Compatibility of different organic mar... (Raffaele Zanoli)
The presentation is an abridged compilation of the following OrganicDataNetwork publications:
Feldmann, C. and Hamm, U. (2013). Executive summary report on the comprehensiveness and compatibility of organic market data collection methods. University of Kassel, Witzenhausen (D3.2) available at http://orgprints.org/23011/.
Feldmann, C. and Hamm, U. (2013). Report on collection methods: Classification of data collection methods. University of Kassel, Witzenhausen (D3.1) available at http://orgprints.org/23010/.
Authoring system of drill & practice elearning modules for hearing impaired s... (ijcsit)
Hearing Impaired (HI) persons need to keep practicing and repeating their lessons and exercises. The teaching methodology for HI students differs from that for other students: HI students need to be involved in more and more practice using their modes of visual communication, such as sign language, to compensate for their hearing disability. The teaching methodology for HI students recommends demonstration and repetition with slow presentation of instructional material, with the teacher presenting the lesson directly face to face and without visual noise. More reinforcement and encouragement for HI students, as well as fun and enjoyment, should be strongly included in the e-lessons, along with continuous interaction between the teacher and the HI students. Based on these factors, the researchers decided to develop Drill & Practice (D&P) e-learning modules (eLMs) for selected topics such as Mathematics. D&P eLMs of Mathematics for HI persons are the case study of this research, covering both development and evaluation.
The authors selected D&P eLMs because they match the requirements and mechanisms of the teaching methodology for HI students.
The development mechanism is represented by an Authoring System which allows teachers of HI persons to generate an eLM for any selected topic for HI students; they can also generate multiple eLMs in a project.
The evaluation procedure and tools for the experimental eLMs were the viewpoints of experts, gathered through an open questionnaire listing their evaluation comments. Besides the experts' viewpoints, experiments were held in a real environment of HI students to test the D&P eLMs of Mathematics and obtain valuable feedback from them.
Clustering of Deep WebPages: A Comparative Study (ijcsit)
The internet has a massive amount of information, stored in the form of zillions of
webpages. The information that can be retrieved by search engines is huge, and this information constitutes
the ‘surface web’. But the remaining information, which is not indexed by search engines, the ‘deep web’,
is much bigger than the ‘surface web’ and remains largely unexploited.
Several machine learning techniques have been employed to access deep web content. Within
machine learning, topic models provide a simple way to analyze large volumes of unlabeled text. A ‘topic’ is
a cluster of words that frequently occur together; topic models can connect words with similar
meanings and distinguish between words with multiple meanings. In this paper, we cluster deep web
databases using several methods and then perform a comparative study. In the first method, we apply
Latent Semantic Analysis (LSA) over the dataset. In the second method, we use a generative probabilistic
model called Latent Dirichlet Allocation (LDA) for modeling content representative of deep web
databases. Both techniques are applied after preprocessing the set of web pages to extract page
contents and form contents. Further, we propose another version of Latent Dirichlet Allocation (LDA) applied to the
dataset. Experimental results show that the proposed method outperforms the existing clustering methods.
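The pipeline the abstract describes (preprocess pages, build a term representation, reduce with a topic model, then cluster) can be sketched with scikit-learn. The corpus, component count, and cluster count below are illustrative assumptions, not the paper's actual setup:

```python
# Sketch: cluster a toy "deep web database" corpus with LSA + K-means.
# TfidfVectorizer stands in for the paper's page/form-content preprocessing.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.cluster import KMeans

docs = [
    "database query index schema table",    # two "data"-themed pages
    "query table index database records",
    "football match goal striker league",   # two "sports"-themed pages
    "league striker goal football season",
]

tfidf = TfidfVectorizer().fit_transform(docs)        # term weights per page
lsa = TruncatedSVD(n_components=2, random_state=0)   # LSA: low-rank topic space
embedded = lsa.fit_transform(tfidf)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(embedded)
print(labels)  # pages 0,1 share one label; pages 2,3 share the other
```

Swapping `TruncatedSVD` for `sklearn.decomposition.LatentDirichletAllocation` gives the LDA variant of the same pipeline.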
A CLUSTERING TECHNIQUE FOR EMAIL CONTENT MININGijcsit
In today’s world of the internet, with a whole lot of e-documents such as HTML pages, digital libraries, etc. occupying a considerable amount of cyberspace, organizing these documents has become a practical need. Clustering is an important technique that organizes a large number of objects into smaller coherent groups. This helps in the efficient and effective use of these documents for information retrieval and other NLP tasks. Email is one of the most frequently used e-documents by individuals and organizations. Email categorization is one of the major tasks of email mining; categorizing emails into different groups helps easy retrieval and maintenance. Like other e-documents, emails can also be classified using clustering algorithms. In this
paper a similarity measure called Similarity Measure for Text Processing is suggested for email clustering.
The suggested similarity measure takes into account three situations: a feature appears in both emails, a feature appears in only one email, and a feature appears in neither email. The potency of the suggested similarity measure is analyzed on the Enron email dataset to categorize emails. The outcome indicates that the efficiency achieved by the suggested similarity measure is better than that achieved by other measures.
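The three-case idea can be made concrete. The sketch below is a simplified, hypothetical variant of such a measure (the paper's exact formula and its parameters may differ): features present in both emails contribute a closeness score, features present in only one incur a penalty `lam`, and features absent from both are ignored entirely.

```python
import math

def three_case_similarity(x, y, lam=1.0, sigma=1.0):
    """Similarity of two term-frequency vectors using three cases per feature."""
    total, counted = 0.0, 0
    for xi, yi in zip(x, y):
        if xi > 0 and yi > 0:        # feature in both emails: reward closeness
            total += 0.5 * (1 + math.exp(-((xi - yi) / sigma) ** 2))
            counted += 1
        elif xi > 0 or yi > 0:       # feature in exactly one email: penalty
            total -= lam
            counted += 1
        # feature in neither email: ignored entirely
    if counted == 0:
        return 0.0
    f = total / counted
    return (f + lam) / (1 + lam)     # rescale to [0, 1]

print(three_case_similarity([1, 2, 0], [1, 2, 0]))  # identical emails -> 1.0
print(three_case_similarity([1, 0], [0, 1]))        # disjoint emails  -> 0.0
```

Unlike plain cosine similarity, features missing from both emails neither help nor hurt, which is the property the abstract highlights.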
TAXONOMY OF OPTIMIZATION APPROACHES OF RESOURCE BROKERS IN DATA GRIDSijcsit
A novel taxonomy of replica selection techniques is proposed. We studied several data grid approaches
whose data management selection strategies differ. The aim of the study is to identify the
common concepts, observe the approaches' performance, and compare their performance with our strategy.
A rule based approach towards detecting human temperamentijcsit
This paper presents a rule-based system for detecting human temperament. The system was developed to
support an expert psychologist in properly predicting the temperament of an individual as well
as giving advice to the user. The system does this by following specified rules. From these, we have deduced
features that make up known temperament types, from which the system can accurately classify the
user's temperament based on the person's character. Also, our work is solely limited to temperament; any
expert advice sought from and given by the system is limited to this scope.
An Gia Garden apartments, Tân Phú, from only 799 million VND for a 2-bedroom unit. Sales hotline: 0985 889 990TTC Land
AN GIA GARDEN APARTMENTS, TÂN PHÚ: a place you want to come to, a home you want to return to
A TOP-CLASS GATED RESIDENTIAL COMMUNITY IN TÂN PHÚ DISTRICT
***FROM ONLY 799 MILLION VND PER UNIT - 2 BEDROOMS***
****OFFICIALLY ACCEPTING PURCHASE RESERVATIONS - PRIORITY CHOICE OF THE BEST POSITIONS****
Absolute security, with modern technology integrated
Bank financing of up to 70% of the apartment value over 10 years
(100% refund at the sales launch if you do not buy - guaranteed in the reservation slip)
Sales hotline: 0985 889 990
http://canhosaigon365.com/du-an/du-an-can-ho/300-can-ho-an-gia-garden-quan-tan-phu
HYBRID OPTICAL AND ELECTRICAL NETWORK FLOWS SCHEDULING IN CLOUD DATA CENTRESijcsit
Hybrid intra-data-centre networks, with optical and electrical capabilities, have attracted research interest
in recent years. This is attributed to the emergence of new bandwidth-greedy applications and novel
computing paradigms. A key decision in networks of this type is the selection and placement of
suitable flows for switching in the circuit network. Here, we propose an efficient strategy for flow selection and
placement suitable for hybrid intra-cloud data centre networks. We further present techniques for
identifying bottlenecks in a packet network and for selecting flows to switch in the circuit network.
The bottleneck technique is verified on a Software Defined Network (SDN) testbed. We also implemented
the techniques presented here in a scalable simulation experiment to investigate the impact of flow
selection on network performance. Results obtained from the scalable simulation experiment indicate a
considerable improvement in average throughput, lower configuration delay, and stability of offloaded
flows.
OBTAINING SUPER-RESOLUTION IMAGES BY COMBINING LOW-RESOLUTION IMAGES WITH HIG...ijcsit
In this paper, we propose a new algorithm to estimate a super-resolution image from a given low-resolution
image by adding high-frequency information that is extracted from natural high-resolution images in the
training dataset. The selection of high-frequency information from the training dataset is accomplished in
two steps: a nearest-neighbor search algorithm, which can be implemented on the GPU, is used to select
the closest images from the training dataset, and a sparse-representation algorithm is used to estimate weight
parameters to combine the high-frequency information of the selected images. This simple but very powerful
super-resolution algorithm can produce state-of-the-art results. Qualitatively and quantitatively, we
demonstrate that the proposed algorithm outperforms existing state-of-the-art super-resolution algorithms.
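The two-step selection the abstract outlines — nearest-neighbor search over the training set, then a weighted combination of the selected images' high-frequency content — can be sketched as follows. The uniform weights are only a stand-in for the paper's sparse-representation estimate, and the tiny patch arrays are invented for illustration:

```python
import numpy as np

def select_and_combine(query_lr, train_lr, train_hf, k=2):
    """Step 1: brute-force nearest-neighbor search over low-resolution patches.
    Step 2: combine the high-frequency patches of the k closest matches.
    Uniform weights here replace the paper's sparse-representation weights."""
    dists = ((train_lr - query_lr) ** 2).sum(axis=1)  # squared L2 distances
    nearest = np.argsort(dists)[:k]                   # indices of k closest patches
    weights = np.full(k, 1.0 / k)                     # stand-in for sparse coding
    high_freq = (weights[:, None] * train_hf[nearest]).sum(axis=0)
    return nearest, high_freq

# Toy data: 3 low-resolution training patches with matching high-frequency parts.
train_lr = np.array([[0.0, 0.0], [1.0, 1.0], [10.0, 10.0]])
train_hf = np.array([[0.1, 0.1], [0.2, 0.2], [0.9, 0.9]])
idx, hf = select_and_combine(np.array([0.9, 1.1]), train_lr, train_hf, k=2)
print(idx)  # patch 1 is closest, patch 0 next
```

In the paper's setting the rows would be flattened image patches and the combination weights would come from solving a sparse-coding problem rather than being uniform.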
We have concentrated on a range of strategies, methodologies, and distinct fields of research in this article, all of which are useful and relevant in the field of data mining technologies. As we all know, numerous multinational corporations and major corporations operate in various parts of the world. Each location of business may create significant amounts of data. Corporate decision-makers need access to all of these data sources in order to make strategic decisions.
Processing the data generated by transactions that occur every day, which amount to thousands of records per day, requires software that enables users to search for the data they need. Data mining is a solution to this problem. To that end, many large industries began creating software that can perform data processing. Because data mining software from big industry is expensive, communities such as universities eventually made things easier for users who simply want to learn or deepen their knowledge of data mining by creating open-source software. Meanwhile, many commercial vendors market their respective products. WEKA and Salford Systems are both data mining software packages; each has advantages and disadvantages. This study compares them using several attributes, so that users can select which software is more suitable for their daily activities.
A STUDY OF TRADITIONAL DATA ANALYSIS AND SENSOR DATA ANALYTICSijistjournal
The growth of smart and intelligent devices known as sensors generates large amounts of data. Over a time span, these generated data reach such a large volume that they are designated big data, and the repositories that hold them contain unstructured data. Traditional data analytics methods are well developed and widely used to analyze structured data and, to a limited extent, semi-structured data, which involves additional processing overhead. The methods used to analyze unstructured data are different because of the distributed computing approach, whereas centralized processing is possible for structured and semi-structured data. The undertaken work is confined to an analysis of both varieties of methods. The result of this study is intended to introduce the methods available to analyze big data.
In recent years, data mining has evolved into an active area of research because of the previously unknown and interesting knowledge that can be extracted from very large database collections. Data mining is applied to a variety of applications in multiple domains such as business, IT, and many other sectors. In data mining, a major problem that receives great attention from the community is the classification of data. Data should be classified in such a way that the results can be easily verified and easily interpreted by humans. In this paper we study various data mining techniques so that we can find combinations for an enhanced hybrid technique that involves multiple techniques to improve the usability of the application. We study the CHARM algorithm, the CM-SPAM algorithm, the Apriori algorithm, the MOPNAR algorithm, and Top-K Rules.
SEAMLESS AUTOMATION AND INTEGRATION OF MACHINE LEARNING CAPABILITIES FOR BIG ...ijdpsjournal
The paper aims at proposing a solution for designing and developing a seamless automation and
integration of machine learning capabilities for Big Data with the following requirements: 1) the ability to
seamlessly handle and scale very large amounts of unstructured and structured data from diversified and
heterogeneous sources; 2) the ability to systematically determine the steps and procedures needed for
analyzing Big Data datasets based on data characteristics, domain expert inputs, and data pre-processing
component; 3) the ability to automatically select the most appropriate libraries and tools to compute and
accelerate the machine learning computations; and 4) the ability to perform Big Data analytics with high
learning performance, but with minimal human intervention and supervision. The whole focus is to provide
a seamless automated and integrated solution which can be effectively used to analyze Big Data with
high-frequency and high-dimensional features from different types of data characteristics and different
application problem domains, with high accuracy, robustness, and scalability. This paper highlights the
research methodologies and research activities that we propose to be conducted by the Big Data
researchers and practitioners in order to develop and support seamless automation and integration of
machine learning capabilities for Big Data analytics.
Applying K-Means Clustering Algorithm to Discover Knowledge from Insurance Da...theijes
Data mining extracts information not known in advance from enormous quantities of data, which can lead to knowledge; it provides information that helps in making good decisions. The effectiveness of data mining lies in access to knowledge, with the goal of discovering the hidden facts contained in databases through the use of multiple technologies. Clustering is organizing data into clusters or groups such that there is high intra-cluster similarity and low inter-cluster similarity. This paper deals with the K-means clustering algorithm, which groups data based on its characteristics and attributes and performs the clustering by reducing the distances between the data and the cluster centers. The algorithm is applied using the open-source tool WEKA, with an insurance dataset as its input.
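The distance-reduction loop the abstract refers to is Lloyd's algorithm. A minimal pure-Python version, offered as a stand-in for WEKA's SimpleKMeans (the two-column records below are invented, not the insurance dataset):

```python
def kmeans(points, centroids, iters=20):
    """Lloyd's algorithm: assign each point to its nearest centroid,
    then move each centroid to the mean of its assigned points."""
    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for p in points:
            best = min(range(len(centroids)),
                       key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])))
            clusters[best].append(p)
        centroids = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    labels = [min(range(len(centroids)),
                  key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])))
              for p in points]
    return centroids, labels

# Toy records (say, claim amount and customer age), forming two obvious groups:
pts = [(1, 2), (2, 1), (1, 1), (9, 10), (10, 9), (10, 10)]
cents, labels = kmeans(pts, centroids=[(0, 0), (12, 12)])
print(labels)  # first three points in one cluster, last three in the other
```

WEKA's implementation adds practicalities this sketch omits, such as seed selection and handling of nominal attributes.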
Visual and analytical mining of sales transaction data for production plannin...Gurdal Ertek
Recent developments in information technology paved the way for the collection of large amounts of data pertaining to various aspects of an enterprise. The greatest challenge faced in
processing these massive amounts of raw data gathered turns out to be the effective management of data with the ultimate purpose of deriving necessary and meaningful information
out of it. The following paper presents an attempt to illustrate the combination of visual and analytical data mining techniques for planning of marketing and production activities. The
primary phases of the proposed framework consist of filtering, clustering and comparison steps implemented using interactive pie charts, K-Means algorithm and parallel coordinate plots
respectively. A prototype decision support system is developed and a sample analysis session is conducted to demonstrate the applicability of the framework.
http://research.sabanciuniv.edu.
PREDICTION OF STORM DISASTER USING CLOUD MAP-REDUCE METHODAM Publications
Data mining is the process of analyzing data from different perspectives and summarizing it into useful information; the patterns, associations, and relationships among the data can provide that information. Spatial Data Mining (SDM) is a part of data mining mainly used for finding patterns in data that relate to space. In spatial data mining, analysts use geographical or spatial information to obtain results, which requires special techniques to get geographical data into the appropriate formats. SDM is mainly used for earth science data, crime mapping, census data, traffic data, and cancer clusters (i.e., to investigate environmental health hazards). For real-time processing, stream data mining is used for storm prediction over a spatial dataset with the help of a stream data mining strategy, CMR (Cloud Map-Reduce). Here, stream data mining is presented to detect storm disasters in the coastal areas of Central American countries, using a dataset of coastal areas with various regions, i.e., Panama, the Greater Antilles, and the Gulf of Mexico. It also detects whether a region is affected and, if so, which area of the region is affected; this makes it possible to predict the storm before the disaster occurs. Two parameters, processing time and computational load, are used to test the technique and to compare the previous and proposed techniques. The main aim is to predict a storm disaster before it happens with the help of the directionality and velocity of the storm.
An analysis and impact factors on Agriculture field using Data Mining Techniquesijcnes
In computing and information systems, huge amounts of data are held in storage. The task is to extract the specified data from the raw data, and data mining is one technique that can extract it. Data mining techniques are used in many places; techniques such as K-means, K-nearest neighbor, support vector machines, biclustering, the naive Bayes classifier, neural networks, and fuzzy C-means have been applied to agricultural data. There are many factors in agriculture; the main factors for the farmer are climate, soil, and yield prediction. To improve production, farmers must know which crop is suitable for which climate. This paper provides various concepts of data mining and their applications, discusses the research field in agriculture, and examines the different types of factors that have an impact in the agricultural field.
Real World Application of Big Data In Data Mining Toolsijsrd.com
The main aim of this paper is to study the notion of big data and its application in data mining tools such as R, Weka, RapidMiner, KNIME, Mahout, etc. We are awash in a flood of data today. In a broad range of application areas, data is being collected at unmatched scale. Decisions that previously were based on surmise, or on painstakingly constructed models of reality, can now be made based on the data itself. Such big data analysis now drives nearly every aspect of our modern society, including mobile services, retail, manufacturing, financial services, life sciences, and the physical sciences. The paper mainly focuses on different types of data mining tools and their usage for big data in knowledge discovery.
The Survey of Data Mining Applications And Feature Scope IJCSEIT Journal
In this paper we have focused on a variety of techniques, approaches, and different areas of research which
are helpful and mark important fields of data mining technologies. As we are aware, many MNCs
and large organizations operate in different places in different countries, and each place of operation
may generate large volumes of data. Corporate decision makers require access to all such sources to
take strategic decisions. The data warehouse delivers significant business value by improving the
effectiveness of managerial decision-making. In an uncertain and highly competitive business
environment, the value of strategic information systems such as these is easily recognized; however, in
today's business environment, efficiency or speed is not the only key to competitiveness. Huge
amounts of data, on the order of terabytes to petabytes, are now available, and this has drastically changed the areas
of science and engineering. To analyze, manage, and make decisions from such huge amounts of data,
we need the techniques called data mining, which are transforming many fields. This paper presents a
number of applications of data mining and also discusses the scope of data mining that will be helpful for
further research.
Due to the arrival of new technologies, devices, and communication means, the amount of data produced by mankind is growing rapidly every year. This gives rise to the era of big data. The term big data comes with new challenges in inputting, processing, and outputting data. The paper focuses on the limitations of the traditional approach to managing data and on the components that are useful in handling big data. One of the approaches used in processing big data is the Hadoop framework; the paper presents the major components of the framework and the working process within it.
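The working process within the framework — map, then shuffle/sort, then reduce — can be illustrated without a cluster. This toy word count mirrors the data flow Hadoop distributes across nodes; it is a conceptual sketch only, not the Hadoop API:

```python
from collections import defaultdict

def map_phase(record):
    """Mapper: emit (word, 1) for every word in an input line."""
    return [(word, 1) for word in record.split()]

def shuffle(pairs):
    """Shuffle/sort: group all emitted values by key, as Hadoop does
    between the map and reduce phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reducer: aggregate the grouped values for one key."""
    return key, sum(values)

lines = ["big data needs big tools", "hadoop processes big data"]
mapped = [pair for line in lines for pair in map_phase(line)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(counts["big"])  # 3
```

In real Hadoop the mappers and reducers run on different machines over HDFS blocks; the per-phase logic, however, is exactly this shape.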
Introduction to feature subset selection methodIJSRD
Data mining is a computational process for discovering patterns in large data sets. Among its important techniques is classification, which has recently received great attention in the database community. Classification can solve problems in different fields such as medicine, industry, business, and science. PSO (Particle Swarm Optimization) is an optimization method based on social behaviour. Feature Selection (FS) involves finding a subset of prominent features to improve predictive accuracy and to remove redundant features. Rough Set Theory (RST) is a mathematical tool that deals with the uncertainty and vagueness of decision systems.
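A filter-style baseline makes the FS idea concrete: score each feature against the class and keep the top-ranked subset. This correlation filter is only a simple stand-in for the PSO- and rough-set-based selection the paper discusses, and the toy dataset is invented:

```python
import math

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when a feature has zero variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def select_features(rows, labels, k):
    """Rank features by |correlation with the class| and keep the top k."""
    n_features = len(rows[0])
    scores = [abs(pearson([r[j] for r in rows], labels)) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: -scores[j])[:k]

# Toy data: feature 0 tracks the label; feature 1 is noise, feature 2 is constant.
rows = [(0, 1, 5), (0, 0, 5), (1, 1, 5), (1, 0, 5)]
labels = [0, 0, 1, 1]
print(select_features(rows, labels, k=1))  # [0]
```

A wrapper method like PSO instead searches over feature subsets and scores each candidate subset by the accuracy of a classifier trained on it, which is costlier but can capture feature interactions a filter misses.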
Data Mining in Telecommunication Industryijsrd.com
Telecommunication companies today operate in a highly competitive and challenging environment. A vast volume of data is generated from various operational systems, and these data are used for solving many business problems that require urgent handling. These data include call detail data, customer data, and network data. Data mining methods and business intelligence technology are widely used for handling business problems in this industry. The goal of this paper is to provide a broad review of data mining concepts.
Similar to Survey of the Euro Currency Fluctuation by Using Data Mining (20)
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Ramesh Iyer
In today's fast-changing business world, companies that fail to adapt and embrace new ideas often struggle to keep up with the competition. However, fostering a culture of innovation takes much work: it takes vision, leadership, and a willingness to take risks in the right proportion. Sachin Dev Duggal, co-founder of Builder.ai, has perfected the art of this balance, creating a company culture where creativity and growth are nurtured at each stage.
The Art of the Pitch: WordPress Relationships and SalesLaura Byrne
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if something changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips, and strategies for successful relationship building that leads to closing the deal.
Essentials of Automations: Optimizing FME Workflows with ParametersSafe Software
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
Key Trends Shaping the Future of Infrastructure.pdfCheryl Hung
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfPaige Cruz
Monitoring and observability aren’t traditionally found in software curriculums and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is a part of your current company’s observability stack.
While the dev and ops silo continues to crumble….many organizations still relegate monitoring & observability as the purview of ops, infra and SRE teams. This is a mistake - achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party, and will share these foundational concepts to build on:
GraphRAG is All You need? LLM & Knowledge GraphGuy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
State of ICS and IoT Cyber Threat Landscape Report 2024 previewPrayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio, cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors, and newer malware including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
Welocme to ViralQR, your best QR code generator.ViralQR
Welcome to ViralQR, your best QR code generator available on the market!
At ViralQR, we design static and dynamic QR codes. Our mission is to make business operations easier and customer engagement more powerful through the use of QR technology. Be it a small-scale business or a huge enterprise, our easy-to-use platform provides multiple choices that can be tailored according to your company's branding and marketing strategies.
Our Vision
We are here to make the process of creating QR codes easy and smooth, thus enhancing customer interaction and making business more fluid. We very strongly believe in the ability of QR codes to change the world for businesses in their interaction with customers and are set on making that technology accessible and usable far and wide.
Our Achievements
Ever since its inception, we have successfully served many clients by offering QR codes in their marketing, service delivery, and collection of feedback across various industries. Our platform has been recognized for its ease of use and amazing features, which helped a business to make QR codes.
Our Services
At ViralQR, here is a comprehensive suite of services that caters to your very needs:
Static QR Codes: Create free static QR codes. These QR codes are able to store significant information such as URLs, vCards, plain text, emails and SMS, Wi-Fi credentials, and Bitcoin addresses.
Dynamic QR codes: These also have all the advanced features but are subscription-based. They can directly link to PDF files, images, micro-landing pages, social accounts, review forms, business pages, and applications. In addition, they can be branded with CTAs, frames, patterns, colors, and logos to enhance your branding.
Pricing and Packages
Additionally, there is a 14-day free offer to ViralQR, which is an exceptional opportunity for new users to take a feel of this platform. One can easily subscribe from there and experience the full dynamic of using QR codes. The subscription plans are not only meant for business; they are priced very flexibly so that literally every business could afford to benefit from our service.
Why choose us?
ViralQR will provide services for marketing, advertising, catering, retail, and the like. The QR codes can be posted on fliers, packaging, merchandise, and banners, as well as to substitute for cash and cards in a restaurant or coffee shop. With QR codes integrated into your business, improve customer engagement and streamline operations.
Comprehensive Analytics
Subscribers of ViralQR receive detailed analytics and tracking tools in light of having a view of the core values of QR code performance. Our analytics dashboard shows aggregate views and unique views, as well as detailed information about each impression, including time, device, browser, and estimated location by city and country.
So, thank you for choosing ViralQR; we have an offer of nothing but the best in terms of QR code services to meet business diversity!
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Survey of the Euro Currency Fluctuation by Using Data Mining
1. International Journal of Computer Science & Information Technology (IJCSIT) Vol 5, No 4, August 2013
DOI : 10.5121/ijcsit.2013.5409 121
Survey of the Euro Currency Fluctuation by Using
Data Mining
M. Baan, E. Saadati, and M. Nasiri
Department of Computer Engineering, Iran University of Science and Technology,
Narmak, 16846-13114, Tehran, Iran
baan3117@gmail.com, it.saadati@gmail.com, nasiri_m@iust.ac.ir
ABSTRACT
Data mining or Knowledge Discovery in Databases (KDD) is a new field in information technology that
emerged because of progress in creation and maintenance of large databases by combining statistical
and artificial intelligence methods with database management. Data mining is used to recognize hidden
patterns and provide relevant information for decision making on complex problems where conventional
methods are inefficient or too slow. Data mining can be used as a powerful tool to predict future trends and
behaviors, and this prediction allows making proactive, knowledge-driven decisions in businesses. Since
the automated prospective analyses offered by data mining move beyond the analyses of past events
provided by retrospective tools, it can answer the business questions which are traditionally time
consuming to resolve. Based on this great advantage, it provides more interest for the government,
industry and commerce. In this paper we have used this tool to investigate the Euro currency fluctuation.
For this investigation, we have three different algorithms: K*, IBK and MLP and we have extracted
Euro currency volatility by using the same criteria for all used algorithms. The used dataset has
21,084 records and is collected from daily price fluctuations in the Euro currency in the period
of 10/2006 to 04/2010.
KEYWORDS
Euro Currency Fluctuation, Data Mining, Stock Market, Knowledge Discovery in Databases.
1 Introduction
Data mining, or knowledge discovery in databases (KDD), is a new science; given countries'
progress in the field of IT, the penetration of computer systems into industry, and the
creation of large databases by government departments, banks, and the private sector, the need to use it is
deeply felt. Data mining is the discovery of knowledge and reliable information hidden in
databases or, to express it better, the machine analysis of data to find useful, new, and reliable patterns in
large databases. Data mining is also commonly used in small databases; small business firms can take
great advantage of the resulting patterns in their strategic decisions. Data mining's application can be
summed up in the following statement:
data mining gives information for intelligent decisions that you make regarding difficult
dilemmas in the workplace. Neural networks have been used for prediction in various fields; given
the extent of the prediction area and its issues, collecting all possible resources and articles
published in this field is a daunting task.
Example applications of neural networks to forecasting in the period between 1996 and 2005 can be
found in Refs. [1]-[8]. Readers interested in other application areas of neural networks are
referred to Fadlalla and Lin (2001) [9] and Wong (1998) [10] for financial applications, Zhang et
al. (1998) [11] for the general forecasting area, and Krycha and Wagner (1999) [12] for management
science; abundant further references can be found on the Internet. As these surveys show, a very
large number of diverse prediction problems have been solved with neural networks, across business
fields including accounting (predicted financial earnings, earnings surprises, bankruptcy and
commercial losses), finance (predicted direction, index, returns, risk, rate of change, futures,
stock and commodity prices, and efficiency in the stock market), marketing (selected customers,
market share, classification, market trends), economics (business cycles, recessions, customer
spending, inflation, industrial production of goods and bonds), production and operations
(predicted demand for electricity consumption, highway traffic, investment, success and size of
new-product projects, sale of goods and retail), international trade (predicted performance of
joint-investment projects, currency exchange rates), housing (predicted demand and prices), and
the environment (predicted ozone concentration levels and air quality) [13]. Many of the problems
mentioned above are classified as prediction issues; but applications such as bankruptcy
prediction are not necessarily pure prediction applications, since for matters like bankruptcy not
only the outcome but also its timing matters. Hence such issues, for example the probability of
bankruptcy in the future, can also be counted among prediction problems.
2 Data Mining
The act of data mining is divided into several marked stages. In this paper we confine ourselves
to introducing and briefly describing each of these steps:
1. Define the problem.
2. Data recognition:
• Data warehouse formation: at this stage a continuous, integrated environment is formed, on
which the next steps, and the data mining itself, are performed. In general a data warehouse is
dynamic: a continuously collected and classified, constantly changing body of data that is ready
to explore.
• Data selection: at this stage, in order to reduce the cost of the data mining operations, the
data to be studied are selected from the database; these are the data about which data mining
aims to give results.
• Data conversion: the transformations necessary for the data mining operations are determined
and performed on the data. These transformations may be as simple and concise as a
byte-to-integer conversion, or very complex, time-consuming, and costly, such as defining new
attributes or extracting and converting data from string values.
3. Pre-processing.
4. Explore the data: at this stage the data mining itself is done. The data are explored using
data mining techniques, the knowledge hidden in them is extracted, and models are built.
5. Interpret the results: at this stage the results and patterns offered by the data mining tools
are investigated and the useful results are determined. Genetic algorithms and neural networks
are also used in data mining. Neural networks are useful for solving large, complex problems
because of their efficiency. Genetic algorithms are applied in data mining to search for and
build an optimal model among candidate models: the initial models are placed on chromosomes that
compete to transfer their traits to the next generation, and the best and most worthy models are
presented to the user [14][15].
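The selection, conversion, and exploration steps described above can be sketched as follows; the helper names and the trivial "pattern" (a mean price) are hypothetical illustrations, not the paper's implementation:

```python
# Minimal sketch of the KDD stages described above (hypothetical helpers).

def select_data(records, attributes):
    """Data selection: keep only the attributes relevant to the mining task."""
    return [{a: r[a] for a in attributes} for r in records]

def transform(records):
    """Data conversion: e.g. cast string prices to floats before mining."""
    return [{k: float(v) if k == "price" else v for k, v in r.items()}
            for r in records]

def mine(records):
    """Explore the data: here, a trivial 'pattern' -- the mean price."""
    prices = [r["price"] for r in records]
    return sum(prices) / len(prices)

raw = [{"date": "2006-10-02", "price": "1.27", "volume": 120},
       {"date": "2006-10-03", "price": "1.28", "volume": 95}]
selected = select_data(raw, ["date", "price"])
model = mine(transform(selected))
```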
3 Data collection
Our data set includes 21,084 records from 10/2006 to 4/2010 [16], gathered from daily price
fluctuations of the Euro on the forex market. Each record includes the following fields:
• Time and date (day, month, year): each day consists of 24 records, one per hour.
• Open: the Euro price at the start of the hour; for example, the price at 12:01.
• Close: the Euro price at the end of the hour; for example, the price at 12:59.
• Low: the lowest Euro price within that hour.
• High: the highest Euro price within that hour.
• Average price: the average price within that hour.
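As a sketch, one hourly record of this series could be represented as follows; the field names are illustrative, not taken from the original dataset, and the average uses the common OHLC proxy, which the paper does not specify:

```python
from dataclasses import dataclass

# One hourly record of the Euro price series described above (24 per day).
@dataclass
class HourlyRecord:
    day: int
    month: int
    year: int
    hour: int
    open: float    # price at the start of the hour (e.g. 12:01)
    close: float   # price at the end of the hour (e.g. 12:59)
    low: float     # lowest price within the hour
    high: float    # highest price within the hour

    @property
    def average(self) -> float:
        # A common proxy for the hourly average when only OHLC is stored.
        return (self.open + self.close + self.low + self.high) / 4.0

r = HourlyRecord(day=2, month=10, year=2006, hour=12,
                 open=1.27, close=1.28, low=1.26, high=1.29)
```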
4 Data Pre-processing
In order to obtain better results when mining the data, we applied the following changes:
• Added a weekday category, to evaluate the excitement of the first days of the week, after
weekends and holidays, in the market.
• Rounded the data, to get a better result.
• Added a season category, to evaluate market fluctuations in different seasons.
• Separated day, month, and year, for better access to the data.
• Since Open and Close are nearly equal, differing only in the decimal range, we decided to
remove the Close price to simplify the data.
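A minimal sketch of these preprocessing steps; the season boundaries and the rounding precision are assumptions, not specified in the paper:

```python
import datetime

# Sketch of the preprocessing listed above: derive the weekday and season
# categories, round the price, and drop the Close field.
def preprocess(day, month, year, open_price):
    d = datetime.date(year, month, day)
    weekday = d.strftime("%A")                  # weekday category
    # Assumed meteorological seasons (Dec-Feb winter, etc.).
    season = ["winter", "spring", "summer", "autumn"][(month % 12) // 3]
    rounded_open = round(open_price, 2)         # round to simplify the data
    # The Close price is dropped: it is nearly equal to the next Open.
    return {"day": day, "month": month, "year": year,
            "weekday": weekday, "season": season, "open": rounded_open}

row = preprocess(2, 10, 2006, 1.27345)
```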
5 Neural Network
Neural networks involve three concepts: (1) a data analysis system; (2) neurons, or nerve cells;
(3) networks, groups of neurons working together. In a classical definition, neural networks are
a set of simple, connected processing elements. The processing elements in neural networks are
much simpler than conventional processors, with numerous differences [17]. Each neuron connects
directly to a number of other neurons and is independent; the weights of the connections
determine their relationships, and the data are stored in the weights. A neural network has the
following features:
1. Distributed processing units.
2. No separate memory component: information is saved in the set of weights.
3. Loss or failure of parts of the network does not cause the whole network to fail, so it is
resistant to noise and hardware failures.
Compared with classical artificial neural networks, the Support Vector Machine is a relatively
new method that in recent years has shown better performance than perceptron neural networks. It
finds the optimal separating line for the data by quadratic programming, a well-known method for
solving constrained optimization problems; the Support Vector Machine is basically a linear
machine. From human neurons to artificial neurons: by setting aside some of the critical
properties of neurons and their internal communications, a primary model of a neuron can be
simulated on a computer.
6 Neural Networks Structure
A neural network consists of layers of units and the weights between them; the behavior of the
network depends on the relationships between its members. In general, neural networks have three
layers of neurons:
1. Input layer: raw information is fed to the network here.
2. Hidden layer: the activity of this layer is determined by the inputs and the weights between
the input and hidden units; these weights decide when a hidden unit should be activated.
3. Output layer: the behavior of the output units depends on the activity of the hidden units
and the weights between the hidden and output units [18].
There are also single-layer and multi-layer organizations. In the single-layer organization, the
most used, all units are connected in one layer; the multi-layer organization has greater
computational potential. Units in multi-layer networks are numbered by layer instead of with an
overall numbering [17]. Two layers of a network communicate with each other through weights, that
is, connections. Neural networks have several types of connection or weighted link:
• Feed-forward: most links are of this type. Signals travel in only one direction; there is no
feedback (loop) from output to input, and the output of a layer has no effect on that same layer.
• Backward (feedback) [17].
• Peripheral: the output nodes of each layer are used as input nodes.
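A minimal sketch of the three-layer forward pass described above; the weight values are arbitrary illustrations, not trained parameters:

```python
import math

# Forward pass of a tiny three-layer network (input -> hidden -> output),
# showing how the weights between layers determine the output.
def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, w_hidden, w_output):
    # Hidden-unit activity depends on the inputs and input->hidden weights.
    hidden = [sigmoid(sum(w * x for w, x in zip(ws, inputs)))
              for ws in w_hidden]
    # Output depends on hidden activity and hidden->output weights.
    return sigmoid(sum(w * h for w, h in zip(w_output, hidden)))

y = forward([0.5, -1.0],
            w_hidden=[[0.2, 0.8], [-0.5, 0.3]],
            w_output=[1.0, -1.0])
```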
7 K* algorithm
K* is an instance-based learner that uses an entropic distance measure. Applying it requires
selecting values for the parameters x (for real attributes) and s (for symbolic attributes), and
a way of using the values returned by the distance measure to give a prediction. These otherwise
arbitrary parameters must be chosen for each dimension, and the behavior of the distance measure
as they change is interesting; consider, for example, how the probability function for symbolic
attributes changes with s.
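A simplified reading of K*'s transformation probability for a symbolic attribute: assume a symbol stays unchanged with probability s and is otherwise replaced uniformly among the n possible values. This is only a sketch of the idea, not the full entropic measure:

```python
import math

# Simplified K*-style transformation probability for a symbolic attribute
# with n possible values, governed by the parameter s.
def p_star_symbolic(a, b, s, n):
    """Probability of transforming symbol a into symbol b."""
    if a == b:
        return s + (1.0 - s) / n   # stay put, or be replaced by itself
    return (1.0 - s) / n           # be replaced by a different symbol

def k_star_distance(a, b, s, n):
    # K* defines distance as the negative log of the transformation
    # probability, so identical symbols are "closer" than different ones.
    return -math.log2(p_star_symbolic(a, b, s, n))

d_same = k_star_distance("Mon", "Mon", 0.9, 7)
d_diff = k_star_distance("Mon", "Tue", 0.9, 7)
```

Note that the probabilities over all n target symbols sum to one, so the measure is a proper distribution for any s in [0, 1].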
8 IBK algorithm
IBk is an implementation of the k-nearest-neighbors classifier that employs the distance metric
discussed above. If more than one neighbor is selected, the predictions of the neighbors can be
weighted according to their distance to the test instance [19].
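A sketch of distance-weighted k-nearest-neighbor prediction in the spirit of IBk; the Euclidean distance and the 1/distance weighting are illustrative choices, not necessarily the exact options used in the paper:

```python
import math

# Distance-weighted k-nearest-neighbours: each of the k nearest training
# instances votes for its class with weight 1/(distance + eps).
def knn_predict(train, query, k=3, eps=1e-9):
    """train: list of (feature_vector, label); query: feature_vector."""
    dist = lambda a, b: math.dist(a, b)           # Euclidean distance
    neighbours = sorted(train, key=lambda t: dist(t[0], query))[:k]
    votes = {}
    for features, label in neighbours:
        votes[label] = votes.get(label, 0.0) + 1.0 / (dist(features, query) + eps)
    return max(votes, key=votes.get)

# Toy example: classify the price direction from a single price feature.
train = [([1.26], "down"), ([1.27], "down"), ([1.31], "up"), ([1.32], "up")]
label = knn_predict(train, [1.30], k=3)
```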
9 Decision Tree; Association Rules
Two different formulas have been implemented for converting between weight and distance. The
number of training samples kept by the classifier may be limited to a window size by setting the
corresponding option; as new samples are added, the oldest samples are deleted so that the
training set stays at the window size [20][21].
10 Results and Discussion
In this paper we used three classification algorithms to forecast Euro currency fluctuation: a
multi-layer perceptron (MLP) neural network, K*, and IBk. The input layer, fed by five
attributes, has the duty of receiving the raw information for the network. Table 1 compares the
same criteria for each algorithm. The first index is the root mean squared error (RMSE); for good
models the RMSE is 0.05 or less, while models whose RMSE is 0.1 or more perform weakly. The
second index is the relative absolute error (RAE), expressed as a percentage; lower values
indicate a better model.
Criterion                     IBk     K*      MLP
Root Mean Squared Error       0.051   0.42    0.52
Relative Absolute Error (%)   1.078   82.75   100.1189

Table 1. Comparison of the RMSE and RAE criteria
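The two criteria in Table 1 can be computed as follows; the series values here are illustrative, not the paper's data:

```python
import math

# Root mean squared error and relative absolute error, the two criteria
# compared in Table 1.
def rmse(actual, predicted):
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted))
                     / len(actual))

def rae(actual, predicted):
    """Relative absolute error as a percentage: the model's total absolute
    error divided by the error of always predicting the mean."""
    mean = sum(actual) / len(actual)
    num = sum(abs(a - p) for a, p in zip(actual, predicted))
    den = sum(abs(a - mean) for a in actual)
    return 100.0 * num / den

actual = [1.27, 1.28, 1.30, 1.31]
predicted = [1.27, 1.29, 1.29, 1.31]
```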
Association rules and classification rules induced from decision trees were also used to draw
conclusions. Some samples of the association and classification rules produced by the C5
algorithm in the Clementine software are shown below; analyses of five of the rules are
discussed. First rule:
If it is the first month and the first day, the price decreases.
If it is the second day and the average price is less than 1.745: on Monday and Thursday the
price decreases; on Tuesday, if the average price is 432 or less the price decreases, and if it
is greater than 432 it increases (the threshold 432 is quoted as in the source); on Wednesday
the price increases; on Friday the price decreases.
If the average price is greater than 1.745, the price increases.
On day nine, on Monday, Tuesday, Wednesday, Thursday, and Friday, the price decreases.
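The first rule set can be transcribed as a plain decision function. This is our reading of the quoted rules, not code from the paper; the Tuesday sub-rule is omitted because its threshold (432) does not match the scale of the 1.745 threshold in the source:

```python
# The first extracted rule set, transcribed as a plain decision function.
def first_rule_set(month, day, weekday, avg_price):
    if month == 1 and day == 1:
        return "decrease"
    if day == 2:
        if avg_price < 1.745:
            if weekday in ("Monday", "Thursday", "Friday"):
                return "decrease"
            if weekday == "Wednesday":
                return "increase"
        else:
            return "increase"
    if day == 9 and weekday in ("Monday", "Tuesday", "Wednesday",
                                "Thursday", "Friday"):
        return "decrease"
    return "unknown"   # no quoted rule covers this case
```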
Second rule:
On days 10, 15, 17, 20, 21, 22, and 29, the price increases.
If it is the first month, 9.48 percent of the purchases are made, of which 18.01 percent fall on
Monday. Trading volume in January was higher; it seems that after the holidays, the relative
movement at the end of the year causes the market to show good mobility, and the 9.48 percent of
trades made in January confirms it. Statistics show that on Mondays in January more trading takes
place than during the rest of the week. On Fridays we also see a relative increase in trading
volume and price fluctuation, which can be attributed to the announcement of the U.S.
unemployment statistics at that time. These statistics strongly affect the trend of the
dollar-to-euro exchange rate, since they indicate the strength of the U.S. economy and its
short-term prospects.
Successful trading is not simple and easy; it requires several components, such as extensive
knowledge, an understanding of market conditions, confidence, and composure. In currency markets,
timing and entering the trade on time are the most important factors in a successful deal, but
determining the right time to trade is sometimes unknown. Never expect every transaction to earn
a profit: trading in foreign exchange markets is based on conjecture and estimates and can cause
losses. Nevertheless, these transactions can be exciting, tempting, and even addictive. The more
dependent you are on your money and investment, the more difficult it is to make decisions about
it with a comfortable mind; your money is valuable, so you should never trade with the money you
need to live. Before a transaction you should know the market situation: whether the trend is
upward, downward, or neutral; whether it is strong or weak; and whether it started some time ago
or is new. Obtaining a clear and accurate picture of the market position leads to successful
transactions. Many traders attempt a transaction without specifying the time to exit it. There is
no doubt that the first goal is profit, but the trader should focus exactly on predicting the
market movements. For a transaction, it is essential to carefully consider and determine the
anticipated market moves within a certain period of time. One of the points to be considered here
is the exit time of the deal; the importance of this issue is that it prepares your mind for it.
Although specifying the exact exit time of a transaction is not possible, specifying it before
entering the trade is very important. If your number of transactions during the day is high,
technical analysis on daily charts is less important, and it is better to use 30-minute or
one-hour charts. Moreover, you should know the start and end of the working hours of the
financial and economic centers of the world and keep these times in mind when trading, since
volatility, liquidity, and market movements change noticeably with the hours. One can speculate
in synchronization with the potential of the market, but it may be too early or too late to do
so; attention to the timing of the transaction affects the result. News announced in the market,
such as the CPI (Consumer Price Index), retail sales figures, or a central bank's decision to
increase interest rates, can stabilize the previous market movement. Timing the transaction means
knowing what to expect before the transaction and being able to specify it in advance; technical
analysis can help you detect when and how prices will change. If you have doubts about whether a
deal is correct and are not sure, do not enter the market. Generally, it is more rational to
adjust and size transactions in a way that allows re-entering the market and trading with other
currencies. In short, do not deal in quantities large enough to destroy your account; in slang
words, do not put all your eggs in one basket. What the majority of the market plans for a
situation and movement, or would do in the future, is called the market trend, which means you
will be successful if you trade in the right direction of the market. It should be mentioned that
this is a very simple and basic picture, as a trend may stop at any time and move in the opposite
direction. Technical and fundamental analysis can determine when the trend started and whether it
is strong or weak. Market expectations indicate the tendencies of most traders and analysts of
the market and the news that will be announced in the near future. If they expect interest rates
to rise, then they will; few changes in market movements will be seen, because the market has
already reacted to and prepared for this change, but if the news is announced contrary to the
forecasts, the market will inevitably react strongly to it. Attractive association rules
extracted:
• If it is 13th day and 3.15 percent of the purchase is done then 28.87 percent is on Tuesday.
• If it is 4th month and 7.57 percent of the purchase is done then 22.63
percent is on Wednesday.
• If it is 5th month and 3.75 percent of the purchase is done then 22.88 percent is on Friday.
• If it is 12th day and 3.38 percent of the purchase is done then 22.88 percent
is on Friday.
• If it is first month and 9.48 percent of the purchase is done then 18.01 percent is on
Monday.
• If it is 3rd day and 3.28 percent of the purchase is done then 16.64 percent
is on Friday.
• If it is 12th day and 3.38 percent of the purchase is done then 16.69 percent is on Wednesday.
• If it is 4th day and 3.39 percent of the purchase is done then 16.81 percent
is on Tuesday.
• If it is 11th day and 3.39 percent of the purchase is done then 16.81 percent is on Tuesday.
• If it is 3rd month and 5.01 percent of the purchase is done then 17.98
percent is on Thursday.
• If it is second month and 4.57 percent of the purchase is done then 17.96 percent is on
Wednesday.
• If it is first month and 4.46 percent of the purchase is done then 17.75
percent is on Friday.
• If it is second month and 9.14 percent of the purchase is done then 20.29 percent is on Friday.
• If it is first month and 9.48 percent of the purchase is done then in 47.07
percent the price decreases.
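Rules of this form pair a support figure ("x percent of the purchase is done") with a confidence figure ("y percent is on day Z"). A sketch of how these two measures are computed, on illustrative counts rather than the paper's data:

```python
# Support and confidence of an association rule of the form
# (month = 1) => (weekday = Monday), as in the listed rules.
def support(transactions, predicate):
    """Fraction of all transactions matching the predicate."""
    return sum(1 for t in transactions if predicate(t)) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Among transactions matching the antecedent, fraction that also
    match the consequent."""
    covered = [t for t in transactions if antecedent(t)]
    return sum(1 for t in covered if consequent(t)) / len(covered)

data = [{"month": 1, "weekday": "Monday"},
        {"month": 1, "weekday": "Tuesday"},
        {"month": 2, "weekday": "Monday"},
        {"month": 1, "weekday": "Monday"}]
s = support(data, lambda t: t["month"] == 1)
c = confidence(data, lambda t: t["month"] == 1,
               lambda t: t["weekday"] == "Monday")
```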
11 Conclusion
The K* algorithm has a smaller root squared error than MLP, and the MLP algorithm has a larger
relative absolute error than the rest. To achieve a better classification, classification
techniques based on association rules, i.e. associative classification, can be used. The main
purpose of classification is prediction in terms of a class, while the discovery of association
rules describes relationships between items in a transaction database. In associative
classification, the classifier is built on a subset of association rules called class association
rules, in which the consequent of every rule is a class attribute. The classifier applies the
rule, or the set of rules, compatible with an object in order to predict the object's label.
Results of work performed in this field show that associative classification performs better than
machine-learning classification algorithms [22]. However, association rule algorithms face
challenges, such as determining the minimum thresholds for the extracted association rules:
first, the algorithm produces a large number of rules, and the storage, retrieval, pruning, and
sorting of these rules is difficult; second, finding the best subset of rules for building a
strong and accurate classifier is challenging [22].
In recent years a number of associative classification algorithms have been presented, such as
CPAR, CMAR (Li, Han, and Pei, 2001), MCAR (Thabtah, Cowling, and Peng, 2005), and MMAC (Thabtah,
Cowling, and Peng, 2004). These algorithms use various methods in the discovery, ranking,
pruning, prediction, and evaluation of interestingness. To construct a classifier with
association rule algorithms, the complete set of association rules is first found from the
training data set, and a subset is then selected to build the classifier. There are different
methods for selecting such a subset; for example, in the CBA (Liu, Hsu, and Ma, 1998) [23] and
CMAR algorithms, the selection is based on a coverage heuristic: the complete set of class
association rules is evaluated on the training data sets, and the rules that cover a certain
number of training instances are considered. Once the classifier is built, its predictive power
is evaluated on test data by predicting class labels. Several associative classification (AC)
techniques presented in recent years [22] take different approaches to discovering frequent item
sets, extracting and sorting classification rules, pruning redundant and harmful rules that lead
to wrong classifications, and classifying new instances.
The first finding of this study is an appreciation of the complexity of the mechanisms behind
stock price changes. Neural network models in recent research have succeeded in predicting the
indicators. A neural network designed to predict the index from lagged input data, such as lags
of economic factors together with some of their own lags, performs better than a neural network
whose only input is the index; but this advantage was not found when the lagged index inputs were
removed. This shows that the macroeconomic variables associated with the stock market indices in
this study do not determine a clear relationship, and that the indicators are affected most by
their own historical values. Adding macroeconomic variables to the model only partially increases
the descriptive power of the models and plays no decisive role. The contents stated above
indicate that a psychological climate governs stock market price changes: stock market prices are
not yet determined by fundamentals; rather, as the chartist theory holds, yesterday's price
changes determine today's price changes.
References
1. Dhar and Chou (2001) et al. (1996); earnings surprises,
2. Yang (1999), Zhang et al. (1999), Mckee and Greenstein; Business failure, bankruptcy, or
financial health.
3. Kohzadi et al. (1996), Yao et al. (2000); Stock and commodity prices.
4. Agrawal and Schorling (1996), West et al. (1997), Aiken and Bsat (1999), Wang (1999), Jiang
et al. (2000), Vellido et al. (1999) ; Selected customers, market share, packaging market and
market trends.
5. Chiang et al. (1996), Indro et al. (1999); Investment Performance.
6. Wang and Leu (1996), Wittkemper and Steiner (1996), Desai and Bharati (1998), Saad et al.
(1998), Qi (1999), Leung et al. (2000b), Chen et al. (2003) ; Predicted direction, Index, returns,
risk, Rate of change, Futures, stock and commodity prices, efficiency, etc., in the stock market.
7. Pang-Ning Tan, Michael Steinbach, Vipin Kumar; Introduction to Data Mining, Pearson Addison
Wesley (2005).
8. Ian H. Witten and Eibe Frank, Morgan Kaufmann; Data Mining: Practical Machine
Learning Tools and Techniques (2005).
9. Fadlalla, A., Lin, C. (2001). An Analysis of the Application of Neural Networks in
Finance. Interfaces, Vol. 33, No. 4, 112.
10. Wong, B. and Selvi, Y.,(1998), Neural network applications in finance: A review and analysis of
literature (1990-1996), Information & Management, 34, pp. 129- 139
11. Zhang and Hu (1998), Leung et al. (2000a), Nag and Mitra (2000); Currency exchange rate.
12. Krycha, K., and U. Wagner (1999). Applications of artificial neural networks in management
science: a survey. Journal of Retailing and Consumer Services, 6, 185-203.
13. G. Peter Zhang, (2004). Neural Networks in Business Forecasting, Idea Group Publishing,
USA.
14. Han J.; Pei J.; Yin Y.; (2000). Mining frequent patterns without candidate generation. In
Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data. Dallas,
TX: ACM Press, pp. 1-12.
15. Li W.; Han J.; Pei J.; (2001). CMAR: Accurate and efficient classification based on
multiple-class association rules. In Proceedings of the International Conference on Data Mining
(ICDM '01), San Jose, CA, pp. 369-376.
16. Used Data from: http://ch.saxobank.com
17. Arezoo Aghaie, Ali Saeedi, (2009), Using Bayesian Networks for Bankruptcy Prediction:
Empirical Evidence from Iranian Companies, ISBN: 978-0-7695-3595-1.
18. Hassanali, Saeid, Teimoriasl, 2007, Tehran stock exchange index prediction using artificial
neural networks, The Iranian Accounting and Auditing Review, ISSN 1024-8161.
19. M. Nasiri, B. Minaei, A. Hadian, 2007, Comparison: Find Closest distance, SVM and C.5, Third
Conference on Data Mining, Iran University of Science and Technology.
20. Thabtah, F.; Cowling P.; Peng Y.; (2005). MCAR: Multi-class classification based on
association rule approach. In Proceedings of the 3rd IEEE International Conference on Computer
Systems and Applications, Cairo, Egypt, pp. 17.
21. Thabtah F.; Cowling P.; Peng Y.; (2004). MMAC: A new multi-class, multi-label associative
classification approach. In Proceedings of the 4th IEEE International Conference on Data Mining
(ICDM '04), Brighton, UK, pp. 217-224.
22. Pourhassan, Atabak, Minaei-Bidgoli, 2007, Classification using the combined association
rules and classification, Amir Kabir University.
23. Liu B.; Hsu W.; Ma Y.; (1998). Integrating classification and association rule mining. In
Proceedings of the International Conference on Knowledge Discovery and Data Mining. New York,
NY: AAAI Press, pp. 80-86.