IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academicians, Field Engineers, Scholars, and Students of related fields of Engineering and Technology.
This document summarizes a survey on data mining. It discusses how data mining helps extract useful business information from large databases and build predictive models. Commonly used data mining techniques are discussed, including artificial neural networks, decision trees, genetic algorithms, and nearest neighbor methods. An ideal data mining architecture is proposed that fully integrates data mining tools with a data warehouse and OLAP server. Examples of profitable data mining applications are provided in industries such as pharmaceuticals, credit cards, transportation, and consumer goods. The document concludes that while data mining is still developing, it has wide applications across domains to leverage knowledge in data warehouses and improve customer relationships.
This document provides an overview of big data by discussing its background and definitions. It describes how data has grown exponentially in recent years due to factors like the internet, cloud computing, and internet of things. Big data is defined as data that cannot be processed by traditional technologies due to its huge size, speed of growth, and variety of data types. The document outlines several common definitions of big data, including the 3Vs (volume, velocity, variety) and 4Vs (volume, variety, velocity, value) models. It aims to provide readers with a comprehensive understanding of the emerging field of big data.
A Novel Framework for Big Data Processing in a Data-driven Society - AnthonyOtuonye
This document summarizes a journal article that proposes a novel big data processing framework. It begins by defining big data and noting the rapid rise in data from sources like social media, sensors, and the internet. It then describes challenges with analyzing this large, complex data. The paper introduces a three-tier big data mining structure that analyzes data from multiple sources on a single platform and provides real-time social feedback. It adopts the HACE theorem to characterize big data's size, heterogeneity, complexity and evolving nature. The framework uses Hadoop's MapReduce for distributed parallel processing. The study aims to fully leverage big data's benefits and enhance large-scale data management and analysis for governments and businesses.
Data mining is the process of discovering patterns in large data sets using methods at the intersection of machine learning, statistics, and database systems. It is an interdisciplinary subfield of computer science and statistics whose overall goal is to extract information from a data set and transform it into a comprehensible structure for further use. The process of digging through data to discover hidden connections and predict future trends has a long history. Sometimes referred to as 'knowledge discovery in databases', the term data mining wasn't coined until the 1990s. What was old is new again, as data mining technology keeps evolving to keep pace with the limitless potential of big data and affordable computing power. Over the last decade, advances in processing power and speed have enabled us to move beyond manual, tedious and time-consuming practices to quick, easy and automated data analysis. The more complex the data sets collected, the more potential there is to uncover relevant insights. Rupashi Koul, "Overview of Data Mining", published in International Journal of Trend in Scientific Research and Development (IJTSRD), ISSN: 2456-6470, Volume 4, Issue 4, June 2020. URL: https://www.ijtsrd.com/papers/ijtsrd31368.pdf; paper page: https://www.ijtsrd.com/engineering/computer-engineering/31368/overview-of-data-mining/rupashi-koul
Big data refers to extremely large data sets that are too large to be processed using traditional data processing applications. It is characterized by high volume, variety, and velocity. Examples of big data sources include social media, jet engines, stock exchanges, and more. Big data can be structured, unstructured, or semi-structured. Key characteristics include volume, variety, velocity, and variability. Analyzing big data can provide benefits like improved customer service, better operational efficiency, and more informed decision making for organizations in various industries.
A Model Design of Big Data Processing using HACE Theorem - AnthonyOtuonye
This document presents a model for big data processing using the HACE theorem. It proposes a three-tier data mining structure to provide accurate, real-time social feedback for understanding society. The model adopts Hadoop's MapReduce for big data mining and uses k-means and Naive Bayes algorithms for clustering and classification. The goal is to address challenges of big data and assist governments and businesses in using big data technology.
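As a rough illustration of the two algorithms the model adopts (k-means for clustering, Naive Bayes for classification), the following single-machine Python sketch runs both on synthetic data with scikit-learn. It is not the paper's Hadoop/MapReduce implementation, and the feature matrix and labels are invented for the example.

```python
# Illustrative sketch only: k-means clustering and Naive Bayes classification
# on synthetic data, outside the Hadoop/MapReduce setting described above.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(42)
X = rng.normal(size=(300, 4))                # stand-in feature matrix
y = (X[:, 0] + X[:, 1] > 0).astype(int)      # stand-in class labels

clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
classifier = GaussianNB().fit(X, y)

print("cluster sizes:", np.bincount(clusters))
print("training accuracy:", round(classifier.score(X, y), 3))
```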
Real World Application of Big Data In Data Mining Tools - ijsrd.com
The main aim of this paper is to study the notion of Big Data and its application in data mining tools such as R, Weka, RapidMiner, KNIME, Mahout, etc. We are awash in a flood of data today. In a broad range of application areas, data is being collected at unmatched scale. Decisions that previously were based on surmise, or on painstakingly constructed models of reality, can now be made based on the data itself. Such Big Data analysis now drives nearly every aspect of our modern society, including mobile services, retail, manufacturing, financial services, life sciences, and physical sciences. The paper mainly focuses on different types of data mining tools and their usage in big data for knowledge discovery.
A forecasting of stock trading price using time series information based on b... - IJECEIAES
Big data is a large set of structured or unstructured data that existing database management tools struggle to collect, store, manage, and analyze, along with the techniques for extracting value from these data and interpreting the results. Big data has three characteristics: the size of the data (volume), the speed of data generation (velocity), and the variety of information forms (variety). Time series data are obtained by collecting and recording data generated over the flow of time; identifying the characteristics implied in such series helps in understanding and analyzing them. The concept of distance is the simplest and most obvious way to deal with similarity between objects, and the most commonly used and widely known distance measure is the Euclidean distance. This study analyzes the similarity of stock price flows using 793,800 closing prices of 1,323 companies in Korea, with Visual Studio and Excel used as the analysis tools for calculating the Euclidean distance. The company with code "000100" was selected as the target domestic company for the big data analysis. The analysis found that the shortest Euclidean distance was to the company with code "143860", with a calculated value of 11.147. Based on these results, the limitations of the study and theoretical implications are discussed.
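The study itself computes the distances in Visual Studio and Excel over 793,800 closing prices; the short Python sketch below only illustrates the Euclidean distance calculation on toy price series. The z-score normalization and the candidate series are illustrative assumptions, not values from the paper.

```python
import numpy as np

def euclidean_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Euclidean distance between two equal-length price series."""
    return float(np.sqrt(np.sum((a - b) ** 2)))

def zscore(x: np.ndarray) -> np.ndarray:
    """Standardize a series so that price level does not dominate the distance."""
    return (x - x.mean()) / x.std()

# Toy closing-price series: a target company and two candidates
target = np.array([100, 102, 101, 105, 107], dtype=float)
candidates = {
    "A": np.array([99, 101, 100, 104, 108], dtype=float),
    "B": np.array([200, 180, 190, 170, 160], dtype=float),
}

for code, series in candidates.items():
    d = euclidean_distance(zscore(target), zscore(series))
    print(code, round(d, 3))   # the smaller distance marks the more similar flow
```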
Data has become an indispensable part of every economy, industry, organization, business function and individual. Big Data is a term used to identify datasets whose size is beyond the ability of typical database software tools to store, manage and analyze. Big Data introduces unique computational and statistical challenges, including scalability and storage bottlenecks, noise accumulation, spurious correlation and measurement errors. These challenges are distinctive and require new computational and statistical paradigms. This paper presents a literature review of Big Data mining and its issues and challenges, with emphasis on the distinguishing features of Big Data. It also discusses some methods for dealing with big data.
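To make the spurious-correlation challenge concrete, the small numpy experiment below (not from the paper) shows that when there are far more features than samples, some purely random feature will correlate strongly with a purely random target.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 10_000                       # few samples, very many features
X = rng.standard_normal((n, p))         # features, independent of the target
y = rng.standard_normal(n)              # target, pure noise

# Pearson correlation of every feature with the target
Xc = (X - X.mean(axis=0)) / X.std(axis=0)
yc = (y - y.mean()) / y.std()
corr = Xc.T @ yc / n

# Typically prints a value above 0.5 even though no feature is related to y.
print(f"max |correlation| across {p} irrelevant features: {np.abs(corr).max():.2f}")
```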
The International Journal of Engineering & Science is aimed at providing a platform for researchers, engineers, scientists, or educators to publish their original research results, to exchange new ideas, to disseminate information in innovative designs, engineering experiences and technological skills. It is also the Journal's objective to promote engineering and technology education. All papers submitted to the Journal will be blind peer-reviewed. Only original articles will be published.
The papers for publication in The International Journal of Engineering & Science are selected through rigorous peer reviews to ensure originality, timeliness, relevance, and readability.
Big data is a prominent term which characterizes the growth and availability of data in all three formats: structured, unstructured and semi-structured. Structured data is located in fixed fields of a record or file and is found in relational databases and spreadsheets, whereas unstructured data includes text and multimedia content. The primary objective of the big data concept is to describe extreme volumes of data sets, both structured and unstructured. It is further defined with three "V" dimensions, namely Volume, Velocity and Variety, to which two more "V"s have been added, Value and Veracity. Volume denotes the size of the data, Velocity refers to the speed of data generation and processing, Variety describes the types of data, Value refers to the business value derived from the data, and Veracity describes the quality and understandability of the data. Nowadays, big data has become one of the preferred research areas in the field of computer science. Many open research problems exist in big data, and good solutions have been proposed by researchers, yet new techniques and algorithms for big data analysis still need to be developed in order to obtain optimal solutions. In this paper, a detailed study of big data, its basic concepts, history, applications, techniques, research issues and tools is presented.
This document discusses dimensionality reduction techniques for data mining. It introduces principal component analysis (PCA) as an unsupervised linear algorithm that reduces the dimensionality of a dataset while retaining most of its information. PCA finds new variables, or principal components, that are smaller in number than the original variables. It provides a geometric and algebraic description of PCA. The document also describes a proposed data mining system architecture to examine a university course database. The architecture includes components for data warehousing, online analytical processing tools, and a graphical interface.
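A minimal numpy sketch of PCA as described above: center the data and take the leading right singular vectors as the principal components. The random input matrix is a placeholder, not the university course database discussed in the document.

```python
import numpy as np

def pca(X: np.ndarray, n_components: int):
    """Unsupervised linear dimensionality reduction via SVD of the centered data."""
    X_centered = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]                    # principal directions
    explained_variance = (S ** 2) / (len(X) - 1)      # variance along each direction
    return X_centered @ components.T, components, explained_variance[:n_components]

X = np.random.default_rng(1).normal(size=(200, 10))  # placeholder data
scores, components, variance = pca(X, n_components=2)
print(scores.shape)   # (200, 2): the reduced representation
```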
New approaches of Data Mining for the Internet of things with systems: Litera... - IRJET Journal
This document provides a literature review of new approaches to data mining for the Internet of Things (IoT). It discusses data mining from three perspectives: data view, method view, and application view. In the data view, it examines data mining functions like classification, clustering, association analysis, time series analysis, and anomaly detection. In the method view, it reviews machine learning, statistical, and deep learning techniques used for data mining. In the application view, it discusses data mining applications in industries like business, healthcare, and government. It also discusses challenges of big data mining for IoT and proposes a framework for large-scale IoT data mining using open source tools.
Big Data Paradigm, Challenges, Analysis, and Application - Uyoyo Edosio
Big Data Paradigm: Analysis, Application and Challenges
This document discusses big data, including its definition in terms of volume, variety and velocity; how it is analyzed using machine learning algorithms and distributed storage and processing; applications in various domains like healthcare, transportation and consumer products; and challenges like privacy, noisy data, skills shortage and immature tools. The conclusion recommends further research on hardware, algorithms and computational methods to effectively manage and gain insights from increasingly large data volumes.
Big Data is a new technology and science for making well-informed decisions in business or any other scientific discipline using huge volumes of data from new, heterogeneous sources. Such new sources include blogs, online media, social networks, sensor networks, image data and other forms of data which vary in volume, structure, format and other factors. Big Data applications are increasingly adopted in all science and engineering domains, including space science, biomedical sciences, and astronomical and deep-space studies. The major challenges of big data mining lie in data accessing and processing, data privacy and mining algorithms. This paper covers what big data is, data mining with big data, the challenges in big data mining, and the currently available solutions to meet those challenges.
Big data is a term characterized by increasing volume, velocity, variety and veracity. All these characteristics make processing big data a complex task, so such data must be processed differently, for example with the MapReduce framework. When an organization exchanges data in order to mine useful information from it, the privacy of the data becomes an important problem. In the past, several privacy-preserving algorithms have been proposed; of these, anonymizing the data has been the most efficient. Anonymizing a dataset can be done with several operations, such as generalization, suppression, anatomy, specialization, permutation and perturbation. These algorithms, however, are suited to datasets that do not have the characteristics of big data. To preserve the privacy of large datasets, an algorithm was recently proposed that applies the top-down specialization approach for anonymizing the dataset, with scalability increased by applying the MapReduce framework. In this paper we survey the growth of big data, its characteristics, the MapReduce framework and the privacy-preserving mechanisms, and propose future directions for our research.
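A minimal single-machine sketch of two of the anonymization operations listed above, generalization and suppression, on invented records. The distributed top-down specialization algorithm and its MapReduce scaling are not reproduced here.

```python
from dataclasses import dataclass

@dataclass
class Record:
    age: int
    zipcode: str
    disease: str                                  # sensitive attribute, left untouched

def generalize_age(age: int, width: int = 10) -> str:
    """Generalization: replace an exact age with an interval, e.g. 34 -> '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

def suppress_zip(zipcode: str, keep: int = 3) -> str:
    """Suppression: mask the trailing digits, e.g. '56001' -> '560**'."""
    return zipcode[:keep] + "*" * (len(zipcode) - keep)

records = [Record(34, "56001", "flu"), Record(37, "56004", "asthma")]
anonymized = [(generalize_age(r.age), suppress_zip(r.zipcode), r.disease) for r in records]
print(anonymized)
```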
This document provides an introduction to data mining concepts and techniques. It discusses why data mining is needed due to the massive growth of data. It defines data mining as the extraction of interesting patterns from large datasets. The document outlines the key steps in the knowledge discovery process and how data mining fits within business intelligence applications. It also describes different types of data that can be mined and popular data mining algorithms.
Data mining is the process of discovering useful patterns from large amounts of data using statistical, mathematical, and artificial intelligence techniques. It involves applying these techniques to extract and identify useful information from large datasets. Data mining draws from multiple disciplines including statistics, pattern recognition, mathematical modeling, information systems, and machine learning. It has various applications in domains such as customer relationship management, banking, retailing, manufacturing, insurance, software, government, travel, and healthcare. The CRISP-DM process provides a standard methodology for data mining projects involving six steps: business understanding, data understanding, data preparation, modeling, evaluation, and deployment.
Big Data Paradigm - Analysis, Application and Challenges - Uyoyo Edosio
This document discusses big data, including its exponential growth driven by the internet and cheaper computing. It defines big data based on its volume, velocity, and variety (the 3 Vs). Hadoop and MapReduce are presented as tools for analyzing large and diverse datasets. The challenges of big data analysis are also discussed, such as noise, privacy issues, and infrastructure costs.
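Since MapReduce appears throughout these summaries, the snippet below is a tiny single-process imitation of the map, shuffle, and reduce phases that Hadoop runs in parallel across a cluster. The word-count task and the two documents are the usual textbook illustration, not content from the slides.

```python
from collections import defaultdict
from functools import reduce

documents = ["big data needs big tools", "data mining finds patterns in data"]

# Map phase: emit (word, 1) pairs
mapped = [(word, 1) for doc in documents for word in doc.split()]

# Shuffle phase: group the emitted values by key
groups = defaultdict(list)
for key, value in mapped:
    groups[key].append(value)

# Reduce phase: sum the counts for each key
counts = {key: reduce(lambda a, b: a + b, values) for key, values in groups.items()}
print(counts)
```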
A review on Visualization Approaches of Data mining in heavy spatial databases - IOSR Journals
This document provides a review of visualization approaches for data mining in heavy spatial databases. It begins with introducing concepts of data mining, data visualization, spatial data warehouses, and spatial data mining. It then discusses the literature on the process of visualization including mapping, selection, presentation, interactivity, usability, and evaluation. Challenges of visualization are also reviewed. Different types of visualization methods and their applications in data mining are described, including information visualization, software visualization, scientific visualization, and spatial data mining.
This document provides an overview of big data by exploring its definition, origins, characteristics and applications. It defines big data as large data sets that cannot be processed by traditional software tools due to size and complexity. The origin of the big data concept is attributed to Doug Laney, who in 2001 defined the 3Vs of big data: volume, velocity and variety. A variety of sectors where big data is used are discussed, including social media, science, retail and government. The document concludes by stating that we are in the age of big data due to new capabilities to analyze large data sets quickly and cost effectively.
Top 10 Read articles in Web & semantic technology - dannyijwest
This document discusses the future of cloud computing in government IT. It begins by defining cloud computing and examining its growing adoption across both private and public sectors globally. The document then outlines challenges for governments transitioning to cloud computing, including workforce and computing resource issues. The author presents a six-step strategy for government agencies to migrate to the cloud. Finally, it explores implications of the continued cloud computing revolution for public sector organizations and the IT community.
Data Science Innovations: Democratisation of Data and Data Science - suresh sood
Data Science Innovations: Democratisation of Data and Data Science covers the opportunity of citizen data science lying at the convergence of natural language generation and discoveries in data made by professionals who are not data scientists.
This is just a brief overview, not a full explanation.
Data Mining:-
Data mining is the process of analyzing data from different perspectives and summarizing it into useful information.
Data mining is a process used by big companies and organizations to handle, manage and analyze big data.
Data mining is primarily used today by companies with a strong consumer focus - retail, financial, communication, and marketing organizations. It enables these companies to determine relationships among internal factors such as price, product positioning, or staff skills, and external factors.
Companies use it to increase their revenue, cut their costs, or both.
While large-scale information technology has been evolving separate transaction and analytical systems, data mining provides the link between the two.
Data mining software analyzes relationships and patterns in stored transaction data based on open-ended user queries.
SOFTWARE FOR DATA MINING:
Microsoft SQL Server 2005.
Microsoft SQL Server 2008.
Oracle Data Mining, etc.
One supermarket in Canada used the data mining capability of Oracle software to analyze local buying patterns. They discovered that when men bought food for the home on Saturdays and Sundays, they also tended to buy beer; on other days of the week they did not usually buy beer.
The shopkeeper told his workers to stock a sufficient amount of beer on Saturdays and Sundays, and in this way the shop's income increased.
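The supermarket story above is a classic association-rule pattern. The toy snippet below shows how support and confidence would be computed for a rule such as {food} -> {beer}; the transactions are invented for illustration, not the store's actual data.

```python
transactions = [
    {"food", "beer"}, {"food", "beer"}, {"food"},
    {"food", "beer"}, {"milk"}, {"food", "milk"},
]

def support(itemset: set, transactions: list) -> float:
    """Fraction of transactions containing every item in the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent: set, consequent: set, transactions: list) -> float:
    """How often the consequent appears when the antecedent does."""
    return support(antecedent | consequent, transactions) / support(antecedent, transactions)

print("support({food, beer}) =", support({"food", "beer"}, transactions))          # 0.5
print("confidence(food -> beer) =", confidence({"food"}, {"beer"}, transactions))  # 0.6
```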
BI refers to applications and technologies which are used to gather information about a company's operations.
Data mining is an important part of business intelligence.
Some basic examples of the use of data mining are given below:
1) Finance:
In finance, data mining is used for credit card analysis.
2) Astronomy:
The Palomar Observatory discovered 22 quasars with the help of data mining.
3) Telecommunication:
In telecommunication, data mining is used for analyzing call records.
4) Offices:
In offices, it is used to manage data and records of the staff.
Following are some of the types of data mining:
Association rules are used for store layout, etc.
Classification is used for weather prediction, etc. (a small sketch follows this list).
Clustering is used for graphical representation of the universe.
Sequential pattern mining is used for medical diagnosis.
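As a hedged sketch of the classification item above, here is a tiny decision tree trained on made-up humidity and pressure readings to predict rain; it is not a real meteorological model.

```python
from sklearn.tree import DecisionTreeClassifier

# Features: [humidity %, pressure hPa]; label: 1 = rain, 0 = no rain (invented data)
X = [[85, 1002], [90, 998], [40, 1020], [55, 1015], [95, 995], [30, 1025]]
y = [1, 1, 0, 0, 1, 0]

model = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
print(model.predict([[88, 1000], [45, 1018]]))   # likely [1 0]: rain, then no rain
```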
Properties of the cake layer in the ultrafiltration of polydisperse colloidal... - eSAT Publishing House
The document discusses a study analyzing the properties of cake layers formed during ultrafiltration of polydisperse silica colloid dispersions. Scanning electron microscopy (SEM) and atomic force microscopy (AFM) were used to analyze fouled membranes and directly compare the particle arrangement and average thickness of cake layers formed under different conditions. Results from SEM and AFM analysis corresponded in terms of particle arrangement and average cake thickness. A resistance model was used to estimate specific cake resistance from filtration rate and cake layer thickness analysis via SEM.
Reconfigurable and versatile bil rc architecture design with an area and powe... - eSAT Publishing House
The document describes a proposed modification to the BilRC (Bilkent Reconfigurable Computer) architecture to improve its performance and efficiency. The key aspects of the proposed architecture include using distributed local memory for each processing element, a separate memory for data and instructions, and a "pass-through" operation that allows easier mapping. Simulation results show the proposed BilRC architecture reduces power consumption by 92% and area utilization compared to the original BilRC architecture.
Low complexity digit serial FIR filter by multiple constant multiplication al... - eSAT Publishing House
Design and implementation of secured scan based attacks on ic’s by using on c... - eSAT Publishing House
The document proposes a scan-protection scheme that provides testing facilities both during chip assembly and over the course of the circuit's life. It compares expected and actual test responses inside the circuit using an efficient principle to scan-in both input vectors and expected responses. This avoids needing to disconnect scan chains after manufacturing testing for security purposes. The proposed approach compares test responses within the chip at the vector level rather than providing the value of each individual scan bit. This ensures security while not relying on costly external test infrastructure. Simulation results show the proposed scheme occupies less area on the chip than existing approaches.
This document presents an investigation of the behavior of 3-degree-of-freedom spring-mass-damper systems subjected to transient loads. Two configurations of the spring-mass-damper system are modeled and their governing equations are derived. The velocities of the oscillators are estimated by solving the governing equations for a loss factor of 0.15. The kinetic and potential energies are calculated using the mass, velocity and stiffness of the oscillators to estimate the total energy. It is found that when the load changes from full to partial over time, there is a significant increase in displacement and velocity, dissipating more energy.
A survey on incremental relaying protocols in cooperative communication - eSAT Publishing House
This document summarizes recent work on incremental relaying protocols in cooperative communication. It begins with an introduction to cooperative communication and relaying protocols such as fixed relaying, selection relaying, and incremental relaying. Incremental relaying is based on feedback and allows relays to retransmit only when needed based on SNR or errors. The document then reviews several specific incremental relaying protocols proposed in other works, including Incremental selection amplify and forward, Joint incremental selection relaying, Fractional Incremental relaying, Selective Fractional Incremental relaying, and Efficient Incremental relaying.
This document provides an overview of data mining and discusses:
1. The large amounts of data being generated every day and the need for data mining to analyze this data and discover useful patterns.
2. The data mining process, including data cleaning, integration, selection, transformation, mining, pattern evaluation and knowledge presentation.
3. Common data mining techniques like classification, estimation, prediction, association rules and clustering.
4. Data mining models like decision trees and neural networks that are used to analyze data and discover patterns.
Abstract: In many fields, such as industry, commerce, government, and education, knowledge discovery and data mining can be immensely valuable to the subject of Artificial Intelligence. Because of the recent increase in demand for KDD techniques, such as those used in machine learning, databases, statistics, knowledge acquisition, data visualisation, and high performance computing, knowledge discovery and data mining have grown in importance. By employing standard formulas for computational correlations, we hope to create an integrated technique that can be used to filter social information on the web and find parallels between the similar tastes of diverse users in a variety of settings.
In the information age, data has become vital, so it is important to understand data in order to face future information challenges. This paper deals with the importance of data mining, explaining the concepts and life cycle involved and presenting the gist of the topic in a user-friendly way. It further develops the different stages of data mining and its extended application in practical business platforms.
Isolating values from big data with the help of four V's (eSAT Journals)
Abstract
Big Data refers to the massive amounts of data that accumulate over time and are difficult to analyze and handle using common database management tools. It includes business transactions, e-mail messages, photos, surveillance videos and activity logs, as well as unstructured text posted on the Web, such as blogs and social media. Big Data has shown a lot of potential in real-world industry and in the research community, and we highlight its power and potential in solving real-world problems. However, it is imperative to understand Big Data through the lens of the 4 Vs; the 4th V, 'Value', is the desired output for industry challenges and issues. We provide a brief survey of the 4 Vs of Big Data in order to understand Big Data and the Value concept in general. Finally, we conclude by presenting our vision of improved healthcare, a product of Big Data utilization, as future work for researchers and students.
Keywords: Big Data, Surveillance videos, blogs, social media, four Vs.
The document provides a literature review on data mining. It discusses data mining concepts such as classification and prediction. Data mining has roots in machine learning, statistics, and artificial intelligence. It involves extracting patterns from large datasets. The document outlines several uses and functions of data mining, including classification, clustering, and anomaly detection. It also gives examples of data mining applications in fields like medicine, banking, insurance, and electronic commerce.
Fundamentals of data mining and its applications (Subrat Swain)
Data mining involves applying intelligent methods to extract patterns from large data sets. It is used to discover useful knowledge from a variety of data sources. The overall goal is to extract human-understandable knowledge that can be used for decision-making.
The document discusses the data mining process, which typically involves problem definition, data exploration, data preparation, modeling, evaluation, and deployment. It also covers data mining software tools and techniques for ensuring privacy, such as randomization and k-anonymity. Finally, it outlines several applications of data mining in fields like industry, science, music, and more.
IRJET – Scope of Big Data Analytics in Industrial Domain (IRJET Journal)
This document discusses the scope of big data analytics in industrial domains. It begins by defining big data and its key characteristics, known as the "7 V's" - volume, velocity, variety, variability, veracity, value, and volatility. It then discusses how big data is generated in various fields like social media, search engines, healthcare, online shopping, and stock exchanges. The document focuses on how big data analytics can be applied in industrial Internet of Things (IoT) to extract meaningful information from large and continuous data streams generated by IoT devices using machine learning techniques.
This document discusses data mining techniques for big data. It defines big data as large, complex collections of data from various sources that contain both structured and unstructured data. Big data is growing rapidly due to data from sources like social media, sensors, and digital content. Data mining can extract useful insights from big data by discovering patterns and relationships. The document outlines common data mining techniques like classification, prediction, clustering and association rule mining that can be applied to big data. It also discusses challenges of big data like its huge volume, variety of data types, and rapid growth that require new data management approaches.
A Deep Dissertion Of Data Science Related Issues And Its Applications (Tracy Hill)
This document summarizes a paper on data science that discusses its definition, processes, applications, and open research issues. It defines data science as extracting, collecting, and analyzing data for business or technical purposes. The paper describes the typical data science process as involving data wrangling, analysis, and communication. It discusses applications of data science in areas like business analytics, prediction, and healthcare. Finally, it outlines open research issues involving integrating data science with emerging technologies like the Internet of Things, cloud computing, and quantum computing.
A STUDY - KNOWLEDGE DISCOVERY APPROACHES AND ITS IMPACT WITH REFERENCE TO COGNI... (ijistjournal)
As we all know, in the current era the term Internet of Things (IoT) is booming in the technology market and everyone is talking about smart cities, especially in India; with the keyword smart city, IoT comes along with it. IoT is a small word, but a very big responsibility falls on the shoulders of the technical person to work with it and extract data from it. IoT connects multiple things, both living and non-living, and that communication generates a huge amount of data, so this paper discusses the tools and techniques used for knowledge discovery from it.
Internet of Things (IoT) and knowledge discovery are two sides of the same coin and go together; in the absence of one, there is no use for the other. The paper also covers the types of data and data-generating sources, knowledge discovery from that data, the tools useful for discovering knowledge, the techniques to be followed for discovering meaningful information from huge amounts of data, and their impact.
This document summarizes a research paper on using big data methodologies with IoT and its applications. It discusses how big data analytics is being used across various fields like engineering, data management, and more. It also discusses how IoT enables the collection of massive amounts of data from sensors and devices. Machine learning techniques are used to analyze this big data from IoT and enable communication between devices. The document provides examples of domains where big data and IoT are being applied, such as healthcare, energy, transportation, and others. It analyzes the similarities and differences in how big data techniques are used across these IoT domains.
Semantic Web Mining of Un-structured Data: Challenges and Opportunities (CSCJournals)
The management of unstructured data is acknowledged as one of the most critical unsolved problems in data management and business intelligence fields in current times. The major reason for this unresolved problem is primarily because of the actuality that the methods, systems and related tools that have established themselves so successfully converting structured information into business intelligence, simply are ineffective when we try to implement the same on unstructured information. New methods and approaches are very much necessary. It is a known realism that huge amount of information is shared by the organizations across the world over the web. It is, however, significant to observe that this information explosion across the globe has resulted in opening a lot of new avenues to create tools for data management and business intelligence primarily focusing on unstructured data. In this paper, we explore the challenges being faced by information system developers during mining of unstructured data in the context of semantic web and web mining. Opportunities in the wake of these challenges are discussed towards the end of the paper.
This document discusses data mining and provides an overview of the topic. It begins by defining data mining as the process of analyzing large amounts of data to discover hidden patterns and rules. The goal is to analyze this data and summarize it into useful information that can be used to make decisions.
It then describes some common data mining techniques like decision trees, neural networks, and clustering. It also discusses the typical stages of a data mining project, including business understanding, data preparation, modeling, evaluation, and deployment.
Finally, it provides examples of applications for data mining, such as in healthcare to identify patterns in patient data, education to improve learning outcomes, and manufacturing to enhance product quality.
A Review On Data Mining From Past To The Future (Kaela Johnson)
This document discusses trends in data mining from the past to the future. It begins by discussing how data mining has evolved from early techniques that focused on numerical data from single databases to current techniques that can handle diverse data types and sources. It outlines several current areas of data mining like hypertext mining, ubiquitous data mining, multimedia mining and spatial/time series mining. It also discusses how computing advances like parallel, distributed and grid technologies have enabled mining large heterogeneous datasets. Finally, it states that the paper will review historical trends, current trends, and explore future trends in data mining.
Artificial intelligence has been a buzzword impacting every industry in the world. With the rise of such advanced technology, there will always be a question regarding its impact on our social life, environment and economy, and thus on all efforts towards sustainable development. In the information era, enormous amounts of data have become available to decision makers. Big data refers to datasets that are not only big, but also high in variety and velocity, which makes them difficult to handle using traditional tools and techniques. Due to the rapid growth of such data, solutions need to be studied and provided in order to handle and extract value and knowledge from these datasets for different industries and business operations. Numerous use cases have shown that AI can ensure an effective supply of information to citizens, users and customers in times of crisis. This paper aims to analyse some of the different methods and scenarios that can be applied to AI and big data, as well as the opportunities provided by their application in various business operations and crisis management domains.
1. Web Mining – Web mining is an application of data mining for di.docx (braycarissa250)
1. Web Mining – Web mining is an application of data mining for discovering data patterns from the web. Web mining is of three categories – content mining, structure mining and usage mining. Content mining detects patterns from data collected by the search engine. Structure mining examines the data which is related to the structure of the website while usage mining examines data from the user’s browser. The data collected through web mining is evaluated and analyzed using techniques like clustering, classification, and association. It is a very good topic for the thesis in data mining.
2. Predictive Analytics – Predictive Analytics is a set of statistical techniques to analyze the current and historical data to predict the future events. The techniques include predictive modeling, machine learning, and data mining. In large organizations, predictive analytics help businesses to identify risks and opportunities in their business. Both structured and unstructured data is analyzed to detect patterns. Predictive Analysis is a lengthy process and consist of seven stages which are project defining, data collection, data analysis, statistics, modeling, deployment, and monitoring. It is an excellent choice for research and thesis.
3. Oracle Data Mining – Oracle Data Mining, also referred as ODM, is a component of Oracle Advanced Analytics Database. It provides powerful data mining algorithms to assist the data analysts to get valuable insights from data to predict the future standards. It helps in predicting the customer behavior which will ultimately help in targeting the best customer and cross-selling. SQL functions are used in the algorithm to mine data tables and views. It is also a good choice for thesis and research in data mining and database.
4. Clustering – Clustering is a process in which data objects are divided into meaningful sub-classes known as clusters. Objects with similar characteristics are aggregated together in a cluster. There are distinct models of clustering such as centralized, distributed. In centroid-based clustering, a vector value is assigned to each cluster. There are various applications of clustering in data mining such as market research, image processing, and data analysis. It is also used in credit card fraud detection.
5. Text mining – Text mining or text data mining is a process to extract high-quality information from the text. It is done through patterns and trends devised using statistical pattern learning. Firstly, the input data is structured. After structuring, patterns are derived from this structured data and finally, the output is evaluated and interpreted. The main applications of text mining include competitive intelligence, E-Discovery, National Security, and social media monitoring. It is a trending topic for the thesis in data mining.
6. Fraud Detection – The number of frauds in daily life is increasing in sectors like banking, finance, and government. Accurate detection of fraud is a challenge. Da.
The document discusses tools and techniques for big data analytics, including A/B testing, crowdsourcing, machine learning, and data mining. It provides an overview of the big data analysis pipeline, including data acquisition, information extraction, integration and representation, query processing and analysis, and interpretation. The document also discusses fields where big data is relevant like industry, healthcare, and research. It analyzes tools like A/B testing, machine learning, and data mining techniques in more detail.
Efficient Data Filtering Algorithm for Big Data Technology in Telecommunicati... (Onyebuchi nosiri)
Efficient data filtering for Big Data in telecommunications is a concept aimed at effectively filtering desired information for preventive purposes. The challenges posed by the unprecedented rise in the volume, variety and velocity of information have necessitated exploring various methods for Big Data, i.e. data sets so large and complex that traditional data processing tools and technologies cannot cope with them. A process for examining such data to uncover hidden patterns was developed by devising an algorithm comprising several stages, including an artificial neural network, a backtracking algorithm, depth-first search, branch and bound, dynamic programming and an error check. The algorithm gave rise to a flowchart, with each block representing a sub-algorithm.
IJRET: International Journal of Research in Engineering and Technology | eISSN: 2319-1163 | pISSN: 2321-7308 | Volume: 03, Issue: 08, Aug-2014 | Available @ http://www.ijret.org
A COMPREHENSIVE SURVEY ON DATA MINING

Kautkar Rohit A1
1M. Tech Scholar, Computer Science and Technology, Maharashtra Institute of Technology (MIT), Aurangabad, Maharashtra, India

Abstract
Today the internet is a significant place for exchanging data such as text, images, audio and video, and for sharing information, preferably in digital form. Using the internet gives access to an immense amount of data, which may be unstructured, structured or semi-structured, so we now store and process data of enormous size and complexity. Researchers from the University of Berkeley estimate that about one Exabyte (1 million Terabytes) of data is generated every year, of which a large portion is in digital form. For example, 100 hours of video are uploaded to YouTube every minute (statistics from YouTube). The question is how to analyze such ample data effectively and efficiently; the answer is data mining, the process of analyzing and studying large quantities of data to discover useful patterns and knowledge. Today we have an accumulation of large amounts of data but a deficiency of knowledge. This paper surveys the rapidly emerging field of data mining, also known as knowledge discovery from data (KDD). We present the fundamentals of data mining, several strategies for analyzing data (classification, estimation, prediction, association rules, clustering, etc.), the data mining process, and its types such as web, content and structure mining. After covering these basics, we present different data mining models such as decision trees and neural networks along with their foundations, and we discuss real-world applications and the future scope of this dynamic and extraordinary discipline.

Keywords: Internet, Data, Unstructured Data, Structured Data, Semi-Structured Data, Data Mining, Knowledge Discovery from Data (KDD)
1. INTRODUCTION
Today the internet is a crucial place for exchanging information and for sharing and dealing in digital content such as text, audio, video and images. Social networking sites such as Facebook (www.facebook.com) and LinkedIn, and micro-blogging sites such as Twitter (www.twitter.com), play a crucial role in sharing digital content, and they produce and process a huge quantity of data every day, in fact every minute. For example, according to public statistics more than 48 hours of video content are uploaded every minute and billions of views are rendered every day on YouTube (www.youtube.com). Researchers from the University of Berkeley estimate that about 1 Exabyte (1 million Terabytes) of data is generated every year, of which a large portion is available in digital form [3]. For ongoing updates, users also connect to social networking sites such as Facebook and Twitter, so the amount of data produced is quite vast, may run to millions of megabytes (or more) and is complex in structure. This calls for highly efficient, advanced tools and techniques for analyzing and processing such data. Analyzing and processing the data makes it possible to comprehend useful information and knowledge about it. The term "data mining" was coined in the 1990s [12]; searching for knowledge in data is nothing but data mining [13]. Mining is important because it reveals the versatile trends that live in the data [18].
Consider the Google search engine, which accepts and processes a huge amount of information every minute. Merely holding a large accumulation of data without any knowledge about it is of little use; on the other hand, if we employ data mining tools and techniques we may find practicable patterns and trends. Google demonstrates current trends by geographical area, which is an example of gaining knowledge from user-submitted queries. Search engines and social networking sites such as Google and Facebook produce, store and process huge amounts of data in terms of volume, velocity and variety, denoted as big data. Big data integrates structured, semi-structured and unstructured data, so in reality data mining works on big data that is huge in size and complex in structure. The challenging issue is that it keeps increasing continuously and rapidly, for several reasons: using the internet means accessing, processing and creating data. For example, using Facebook involves sharing or uploading images and videos, which is nothing but the creation of data; likewise, posting a tweet on Twitter contributes to the creation of data. For companies that deal with huge amounts of data every day, the field of data mining is vital.
Data over the internet is available in different types. Figure 1 depicts these types: structured, semi-structured and unstructured. When we have a vast amount of such data it constitutes big data, which creates challenges in every aspect. The sources of the data may be internal or external. Such data is analyzed with well-defined tools and techniques referred to as data mining, which allows discovering useful trends and patterns in the data.
Fig 1: Types of Data

A simple example of a pattern is that milk tends to be purchased with bread, or a computer with antivirus software [13]. According to the Gartner Group (www.gartner.com), data mining is the process of discovering meaningful correlations, patterns and trends by sifting through large amounts of data stored in repositories, using pattern recognition technologies as well as statistical and mathematical techniques [12]. We have large amounts of data available but little knowledge concerning it, and data mining provides a way to extract knowledge from that data. For example, Wal-Mart, a megastore, records millions of transactions every day, giving rise to huge data [13]. If we do not have a satisfactory way to analyze it, its utility is wasted; its usefulness can be enhanced by analyzing it with different tools and techniques. At present data mining has established its importance in the field of information retrieval and processing, and there is no doubt that it will keep rising in the near future. Data mining is also called knowledge discovery in databases (KDD), i.e. the process of exploring useful patterns and knowledge from data; knowledge discovery is the treatment of analyzing data for future projection [21]. To satisfy the demand for analyzing the available large-volume data sets more efficiently, data mining draws on diverse fields such as machine learning, information retrieval, statistics and visualization. As Figure 2 shows, there are two primary intentions of the data mining process: prediction and description. Prediction determines unknown or future values of interest, whereas description identifies patterns and describes the data [6]. The Meta Group estimated that the data mining market would grow from $50 million in 1996 to $800 million by 2000 [11].
As shown in Figure 3, the consolidation of diverse fields gives rise to the new, dynamic area of data mining. The data mining concept is essentially formulated from statistics and machine learning, and it also consolidates artificial intelligence, information retrieval, visualization, etc. We can combine data mining with machine learning techniques, for example to detect fraudulent usage of cellular telephones based on profiling customers [29]. Fundamentally, machines learn from data collected in the past, just as humans learn from past experience and use that experience to perform better. This is similar to supervised learning, also referred to as classification, which uses classification algorithms such as decision trees or naive Bayes [14]. When a genuinely useful pattern is detected, its usefulness can be determined with the help of statistics.

We may also visualize the data to be analyzed; data visualization is helpful for discovering interesting patterns easily. Computers offer an easy way to view data in a more powerful and efficient manner, and increasingly powerful visualization tools are now available [18]. Visualization tools can help form an approximate idea of the kind of data and its quality, and anticipate where the useful patterns will be located [30].

At the start of the paper we stated that the internet is a place for exchanging digital content over the web. A gigantic amount of information is available there, and finding the exact information a user is interested in is a challenge. Search is the most prominent practical application of the web, and the field engaged in this problem is text mining. The target of an information retrieval system is to elicit documents related to a user query (a set of keywords) from the huge amount of available information; the user's information need is provided in the form of a query to search engines [14]. Text analysis is performed for ordering documents: after candidate documents are found, their degree of matching with the keyword query is used to rank them [19]. Different commercial algorithms are used by search engines. Basically, information retrieval (IR) systems employ information agents [30] which search for appropriate documents. The challenge is to retrieve documents relevant to the user query with a fast response. Also, web pages are not just text; they also contain hyperlinks and anchor text [14].

A data mining system can be very complex or simple, as it integrates different arenas. All subfields are important in data mining, as they allow constructing solutions to more complex problems and support a variety of data mining systems [13]. Data mining systems can be classified based on measures such as the kind of database used for mining, the kinds of knowledge mined, and the levels of abstraction of the knowledge mined. A data mining system can also be constructed based on what type of pattern we are exploring, regular or irregular [13]. Statistics are useful for assessing the data, which may be continuous or discrete. Different measures are used for continuous data, such as the average (mean), median and mode; discrete data can be measured with histograms, which basically represent counts of single values [31].
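To make that last point concrete, here is a minimal sketch (not part of the original paper) that computes the mean, median and mode of a small continuous attribute and a histogram of a discrete attribute; the sample values are assumed purely for illustration.

```python
import statistics
from collections import Counter

# Continuous attribute: a few salary values (illustrative, loosely modelled
# on Table 1 later in the paper)
salaries = [40000, 52000, 25000, 45000, 45000]
print("mean:  ", statistics.mean(salaries))
print("median:", statistics.median(salaries))
print("mode:  ", statistics.mode(salaries))

# Discrete attribute: a histogram is simply a count per single value
items_bought = ["TV", "TV", "Laptop", "Laptop", "TV"]
print("histogram:", Counter(items_bought))
```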
Fig 2: Data Mining Process Goal
Fig 3: Association of data mining with other fields

2. RELATED WORK
Various data mining techniques allow extracting previously unidentified relationships among data items from large data collections, which are useful for decision making [24]. Techniques for performing clustering on supervised data have been proposed [23]. Data mining techniques can play an important role in rule refinement even if the sample size is limited [8]. Artificial neural network models have been proposed to imitate the human brain, and decision trees allow segmenting data into appropriate groups; both artificial neural networks and decision trees have an idea of where to look for patterns in data. The use of market basket analysis and clustering techniques does not require any prior knowledge about relationships in the data; knowledge is discovered when these techniques are applied to the data. Market basket analysis tools sift through data to let retailers know which products are being purchased together, and clusters prove most useful when they are integrated into a marketing strategy [8]. Different marketing schemes show that a firm's profit can be increased by focusing different marketing resources [25]. The Apriori algorithm [26] was proposed for frequent pattern mining and uses prior knowledge about the data. Different approaches to automatic construction of web pages using log file data have been proposed [27]; to handle page caching by the browser, the expected location of web pages is used, and an algorithm automatically discovers pages in a website whose location differs from where visitors expect to find them [28].
The concept of a KDDM process model was first devised in 1989. Initial knowledge discovery systems provided only a single data mining technique, such as a decision tree or clustering algorithm, with very limited support for the overall process framework; that is, they used a single exclusive technique such as association rules, clustering or a decision tree. Advances in Knowledge Discovery and Data Mining describes the process as highly interactive and complex [24], and also suggests that a KDDM process may combine several data mining techniques for processing and observing more complex patterns [10].

3. DATA MINING STRATEGIES AND PROCESS
3.1 Data Mining Strategies
The following strategies are used for analyzing data items:
1. Classification, in which items are ordered/classified into predefined target classes.
2. Estimation, which allows determining the value of an unknown output variable.
3. Prediction, which allows determining a future outcome; it is similar to classification and estimation.
4. Association rules, which allow analyzing big data sets to detect useful patterns and relationships between data items. A great many applications make use of the association rule concept.
5. Clustering, which allows grouping items according to similar properties and behaviors. Various algorithms are available for clustering, such as K-means (see the sketch after this list).
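As an illustration of strategy 5, the following minimal sketch applies the K-means algorithm named above to a small set of assumed customer records; the data, the cluster count and the use of scikit-learn are illustrative assumptions, not details taken from the paper.

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy customer records: [age, annual spend in thousands] (assumed data)
customers = np.array([
    [25, 20], [27, 22], [45, 80], [48, 75], [33, 40], [31, 38],
])

# Group the records into three clusters of similar behaviour
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(customers)

print("Cluster label per customer:", kmeans.labels_)
print("Cluster centres:\n", kmeans.cluster_centers_)
```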
3.2 Data Mining Process
Exploring useful patterns and knowledge from the available data involves a sequence of steps, as depicted in Figure 4 [13]:
1. Data Cleaning
Data is collected from internal or external sources and may contain noise and inconsistent information. The data cleaning phase removes noisy and inconsistent data; when the data set is large, data cleaning is a time-demanding process.
2. Data integration
The data may be scattered across different locations, so the data integration phase blends it from different sources. Phases 1 and 2 are considered preprocessing phases, and the resulting data may (optionally) be stored in a data warehouse.
Fig 4: Data Mining Process [22]
3. Data selection
This step retrieves the data to be analyzed from the database; depending on the problem domain, different data sources can be selected.
4. Data transformation
The data is transformed into a form appropriate for processing and analysis.
5. Data mining
The actual extraction of useful knowledge and patterns from the data is achieved in this step.
6. Pattern evaluation
To identify truly interesting patterns, various measures such as lift, support and confidence can be used.
7. Knowledge presentation
Visualization and knowledge representation techniques can be used to present the results.

4. DATA MINING MODELS
Many different models are available for addressing data mining problems, such as decision trees, neural networks, naive Bayes classifiers and lazy learners.

4.1 Decision Trees
A decision tree, also called a classification tree, distinguishes data and classifies it into different subdivisions or groups. As shown in figure 5, it follows a divide-and-conquer scheme and permits drawing decisions by testing different potential conditions. Each node of the tree tests an attribute value, and conditions can be tested with several relational and comparison operators. Decision trees can be used for both description and prediction, and various algorithms such as ID3, C4.5 and C5 are available. A decision tree acts upon data that can be presented as a table in which each row is an instance, as shown in Table 1. Useful patterns can then be extracted from the table for learning, as shown in figure 5.
Fig 5: Decision Tree

Decision trees are attractive because they are rule based; they are used to explore relationships among data items. A decision tree is a structure used to divide a large data set into smaller sets according to rules, and these rules can be expressed in plain English [31]. A small illustrative sketch follows Table 1.

Table 1: Table with Different Instances of Data
Sr.No | Employee Name | Age | Gender | Salary | Item Buy
1     | Ram           | 30  | M      | 40000  | TV
2     | Chetan        | 35  | M      | 52000  | {TV -> VCD Player}
3     | Prasad        | 25  | M      | 25000  | {Laptop -> Antivirus Software}
4     | Sangeeta      | 45  | F      | 45000  | {Laptop -> Antivirus Software}
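The short Python sketch below shows how a data set shaped like Table 1 might be fed to a decision tree learner. It uses scikit-learn's DecisionTreeClassifier on a handful of made-up records; the feature encoding and the purchase labels are assumptions for illustration, not the data actually analyzed in this paper.

from sklearn.tree import DecisionTreeClassifier, export_text

# Illustrative records in the spirit of Table 1: [age, gender (0 = F, 1 = M), salary].
X = [[30, 1, 40000], [35, 1, 52000], [25, 1, 25000], [45, 0, 45000]]
y = ["TV", "TV", "Laptop", "Laptop"]  # hypothetical item bought

tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# The learned splits can be read back as simple rules.
print(export_text(tree, feature_names=["age", "gender", "salary"]))
print(tree.predict([[28, 1, 30000]]))  # predict the item for a new customer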
4.2 Neural Network
The concept of a neural network is fundamentally inspired by the structure of the human brain. Neural networks are used to acquire knowledge from data sets and to improve the performance of a system. In a neural network, knowledge is represented by interconnected processors called neurons; each neuron carries a weight value and a defined function [1, 5]. As indicated in figure 6, there are different input vectors (x1, x2, ..., xm) with associated weights (w1, w2, ..., wm) and a possible outcome, positive or negative. The neurons are trained to react to each input vector: the vectors from the training set are applied to the network and, if the output is correct, no alteration is made; otherwise the weights and bias are adjusted. Once the network is fully trained, it classifies a newly applied input vector using the knowledge it has acquired, and different classifiers can be used to interpret the output. A minimal perceptron-style sketch follows.
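The adjust-the-weights-when-wrong behaviour described above corresponds to the classic perceptron update rule. The Python sketch below is a minimal, illustrative implementation for two-dimensional inputs with labels of +1 or -1; the learning rate and training data are assumptions for the example.

# Minimal perceptron sketch: weights and bias are nudged only when the output is wrong.
def train_perceptron(samples, labels, epochs=20, rate=0.1):
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), target in zip(samples, labels):
            output = 1 if (w[0] * x1 + w[1] * x2 + b) >= 0 else -1
            if output != target:            # adjust only on a mistake
                w[0] += rate * target * x1
                w[1] += rate * target * x2
                b += rate * target
    return w, b

# Illustrative linearly separable data (logical OR with -1/+1 labels).
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [-1, 1, 1, 1]
print(train_perceptron(samples, labels))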
Fig 6: Neural Network

4.3 Association Rules
Association rules express relationships between different attributes and their values, and many useful associations can be discovered by analyzing a data set. An association rule uses the concepts of support and confidence to determine its usefulness, and such patterns are practical for probing the buying habits of customers in business analysis. The support of items X1 and Y1 is the percentage of transactions in the data set that contain both items, i.e. X1 U Y1; the confidence of the rule X1 -> Y1 is the percentage of transactions containing X1 that also contain Y1 [13]. A rule is considered genuinely useful if its support and confidence exceed the thresholds defined for that data set, which can be fixed by a domain expert [13].

4.4 Naive Bayes
Naive Bayes constructs a model that predicts the probability of a particular outcome. Possible outcomes are predicted by finding patterns and correlations between the data instances, after which a suitable data mining model is chosen to represent the patterns and relations.

4.5 Machine Learning and Statistics
The data mining field extensively uses concepts from machine learning and statistics for mining complex patterns. Statistics and machine learning are closely related fields and in many respects depend on each other: statistics is typically used for constructing a hypothesis, which is then used in machine learning for the generalization process.

5. TYPES OF MINING
5.1 Web Mining
Web mining extracts information from web documents and services using data mining techniques. The term was coined by Etzioni [4].
The Internet comprises an immense volume of data and is the largest data source for acquiring and sharing information. Web mining is concerned with discovering useful information from the web's hyperlink structure, web page contents and web usage data. There are three classes, as shown in figure 7 [4]:
Fig 7: Web Mining Types

5.1.1 Web Structure Mining
Web structure mining is concerned with discovering useful knowledge from hyperlinks. Hyperlinks essentially represent the structure of the web, so here we are interested both in the structure of web documents and in the structure of the hyperlinks between them [4]. It also applies concepts from social network analysis, in which a graph is used to represent the structure; for example, on Twitter the graph-based database FlockDB can be used to represent users and their followers.

5.1.2 Web Content Mining
Web content mining extracts useful patterns and knowledge from web page contents. It is based on the traditional mining process, which allows web pages (treated as items) to be classified and clustered according to the type of their content [4]. Two different standpoints are available in web content mining:
1) Information Retrieval 2) DB View
The information retrieval view targets semi-structured and unstructured data and permits querying the data to find useful information, whereas the database view targets structured information with a proper organization for storage and queries.

5.1.3 Web Usage Mining
Web usage mining centres on techniques that predict user behavior whenever the user interacts with the web, so the data is viewed in terms of interactivity with a web application. Web usage mining typically analyzes web logs and server logs, and the usage data can be represented with relational tables or graphs. Usage mining applies concepts such as association rules and machine learning, and it is important in many settings such as web application construction and marketing. A small log-analysis sketch follows.
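As a hedged illustration of mining server logs, the Python fragment below counts page requests and builds crude per-user page sets from a few made-up log lines; the simplified "user-id path" log format and the field positions are assumptions for the example only.

from collections import Counter, defaultdict

# Hypothetical, simplified access-log lines: "user-id requested-path".
log_lines = [
    "u1 /home", "u1 /products", "u2 /home",
    "u1 /products", "u2 /contact", "u3 /home",
]

page_hits = Counter()
pages_per_user = defaultdict(set)
for line in log_lines:
    user, path = line.split()
    page_hits[path] += 1
    pages_per_user[user].add(path)

print(page_hits.most_common(2))   # most frequently requested pages
print(dict(pages_per_user))       # pages visited by each user (a crude profile)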
Web usage mining is useful for learning user profiles in order to provide a better personalized experience, and it also supports dynamic recommendation systems by studying user access patterns [19]. A privacy issue arises, however, because the user profile data is used for clustering and analysis.

6. APPLICATIONS OF DATA MINING
Data mining is applicable in broad and diverse areas; its tools and techniques play an important role in fields where data is integral, highly sensitive and important [3].
1. Retail Industry
A huge amount of data is generated and stored, comprising transaction details of sales, customer shopping records and so on. The stored data can be used to ascertain the buying habits of customers, improve the business, and manage shelf space.
2. Telecommunication Industry
In the telecommunication industry, data mining can be useful for improving service quality and making better use of resources.
3. Biological Data Analysis
Data mining is useful here for interpreting biological sequences and structures.
4. Semantic web
Web pages are analyzed and organized according to the type of their contents, using the Resource Description Framework (RDF) to classify them. RDF is used on many websites, such as Orkut and Facebook, for tagging purposes.
5. Business Trends
Data mining can be used to serve customers more accurately, quickly and efficiently.
6. Financial Data Analysis
Data collected from different sources in the financial sector can be analyzed using data mining.
7. Sports
Many games are planned and played worldwide every day, generating a vast amount of data. The data about each game and player can be analyzed and used to predict player performance.
8. Manufacturing Process
In manufacturing, data mining is useful for finding faults in the manufacturing process so that they can be corrected; the data is collected from the manufacturing system.

7. CONCLUSIONS
This survey paper examined the highly dynamic and substantial area of data mining. It covered the basics of data mining, why data mining is important, and different data mining strategies such as classification, estimation, prediction, clustering and association rules; many applications today rely on association rules for mining various trends. The data mining process was described in terms of the phases required to find useful patterns and knowledge. The paper also introduced various data mining models: decision trees, which support decision making; neural networks, which mimic the structure of the human brain and mine patterns more accurately; and association rules, which explore relationships between data items. Naive Bayes, machine learning and statistics are the original ancestors of the data mining field. An important portion of the paper described the types of web mining, namely structure mining, usage mining and content mining, and finally it presented the diverse areas in which data mining is used. We also conclude that analyzing huge amounts of highly complex data is a challenging task, and data mining supplies a way to deal with such data efficiently and effectively. Data mining is one of the most remarkable and challenging fields today, and there is no doubt that it will continue to play a vital role in many areas.

ACKNOWLEDGEMENTS
I am enormously grateful to Prof. Mrs. K.V. Bhosale, Prof. V. Kala, Prof. R.B. Mapari, and Prof. Sonavane of the Computer Science and Technology Department, MIT, Aurangabad. I am also thankful to Prof. M. M. Goswami, Prof. P. Ghotkar, Prof. Mrs. R.D. Kalambe, Prof. Mrs. M.S. Arade, Prof. Miss. N. S. Nikale and Prof. Mrs. S. P. Dudhe of Government Polytechnic, Nashik, and genuinely thankful to Prof. J. Gorane and Prof. N. A. Anwat, who guided me in writing this work.

REFERENCES
[1] Singh, Y., & Chauhan, A. S. (2009). Neural Networks In Data Mining. Journal Of Theoretical And Applied Information Technology, 5(6), Pages 36-42.
[2] Keim, D. A. (2002). Information Visualization and Visual Data Mining. Visualization and Computer Graphics, IEEE Transactions On, Volume 8(1), Pages 1-8.
[3] Silwattananusarn, T., & Tuamsuk, K. (2012). Data Mining and Its Applications for Knowledge Management: A Literature Review from 2007 to 2012. Arxiv Preprint Arxiv: 1210.2872.
[4] Kosala, R., & Blockeel, H. (2000). Web Mining Research: A Survey. ACM Sigkdd Explorations Newsletter, Volume 2(1), Pages 1-15.
[5] Lu, H., Setiono, R., & Liu, H. (1996). Effective Data Mining Using Neural Networks. Knowledge and Data Engineering, IEEE Transactions On, Volume 8(6), Pages 957-961.
[6] Chaudhary, R., Singh, P., & Mahajan, R. A Survey on Data Mining Techniques.
[7] Jain, N., & Srivastava, V. (2013). Data Mining Techniques: A Survey Paper. IJRET: International Journal of Research in Engineering and Technology, Volume 2(11).
[8] Hilage, T. A., & Kulkarni, R. V. (2012). Review of Literature on Data Mining. IJRRAS, Volume 10(1), Pages 107-114.
[9] Mann, A. K., & Kaur, N. (2013). Survey Paper on Clustering Techniques. International Journal of Science, Engineering and Technology Research, Volume 2(4), Pp-0803.
[10] Kurgan, L. A., & Musilek, P. (2006). A Survey Of Knowledge Discovery And Data Mining Process Models. The Knowledge Engineering Review, Volume 21(01), Pages 1-24.
[11] Goebel, M., & Gruenwald, L. (1999). A Survey of Data Mining and Knowledge Discovery Software Tools. ACM SIGKDD Explorations Newsletter, Volume 1(1), Pages 20-33.
[12] Sharma, M. Data Mining: A Literature Survey.
[13] Han, J., & Kamber, M. (2006). Data Mining, Southeast Asia Edition: Concepts and Techniques. Morgan Kaufmann.
[14] Liu, B. (2007). Web Data Mining. Springer-Verlag Berlin Heidelberg.
[15] Bramer, M. (2013). Principles of Data Mining. Springer.
[16] Witten, I. H., & Frank, E. (2005). Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann.
[17] Wattenhofer, M., Wattenhofer, R., & Zhu, Z. (2012, June). The Youtube Social Network. In ICWSM.
[18] Hand, D. J., Mannila, H., & Smyth, P. (2001). Principles of Data Mining. MIT Press.
[19] Markov, Z., & Larose, D. T. (2007). Data Mining the Web: Uncovering Patterns in Web Content, Structure, and Usage. John Wiley & Sons.
[20] Larose, D. T. (2006). Data Mining Methods & Models. John Wiley & Sons.
[21] Bhosle, K. (2010). KDS for Sericulture Cocoon Production. International Journal of Computer Applications, Volume 1(18), Pages 86-90.
[22] Raval, K. M. Data Mining Techniques. International Journal of Advanced Research in Computer Science and Software Engineering, Volume 2(10), Pages 439-442.
[23] Ahmed, S., Coenen, F., & Leng, P. (2006). Tree-based partitioning of data for association rule mining. Knowledge and Information Systems, Volume 10(3), Pages 315-331.
[24] Fayyad, U. M. (1996). Data mining and knowledge discovery: Making sense out of data. IEEE Intelligent Systems, Volume 11(5), Pages 20-25.
[25] Peppers, D. (1993). The one to one future: Building relationship one customer at a time.
[26] Agrawal, R., & Srikant, R. (1994, September). Fast algorithms for mining association rules. In Proc. 20th int. conf. very large data bases, VLDB (Volume. 1215, Pages. 487-499).
[27] Perkowitz, M., & Etzioni, O. (1999, July). Adaptive web sites: Conceptual cluster mining. In IJCAI (Volume. 99, Pages. 264-269).
[28] Srikant, R., & Yang, Y. (2001, April). Mining web logs to improve website organization. In Proceedings of the 10th International Conference on World Wide Web (Pages 430-437). ACM.
[29] Fawcett, T., & Provost, F. J. (1996, August). Combining Data Mining and Machine Learning for Effective User Profiling. In KDD (Pages. 8-13).
[30] Sumathi, S., & Sivanandam, S. N. (2006). Introduction to data mining and its applications. Springer.
[31] Berry, M. J., & Linoff, G. (1997). Data mining techniques: for marketing, sales, and customer support. John Wiley & Sons, Inc.
BIOGRAPHIES
Kautkar Rohit A is pursuing an M. Tech from MIT, Aurangabad. For the last three years he has been working as a lecturer in Computer Technology at Government Polytechnic, Nasik.
He completed his B. Tech (Computer Science and Engineering) at SGGSIE&T, Nanded, and a Diploma in Information Technology at Government Polytechnic, Nasik. His areas of interest are Data Mining, Big Data, Natural Language Processing and web-related fields.