This document reviews recent research on network big data, including different data types (online network data, mobile/IoT data, geography data, spatial temporal data, streaming/real-time data, and visual data), storage models, privacy and security issues, analysis methods, and applications. It discusses the trends in network big data moving from simple online social network data to include more data sources and data that has spatial, temporal, and real-time components. The challenges of analyzing large and diverse datasets from networks are also addressed.
Content an Insight to Security Paradigm for BigData on Cloud: Current Trend a...IJECEIAES
The sucesssive growth of collabrative applications producing Bigdata on timeline leads new opprutinity to setup commodities on cloud infrastructure. Mnay organizations will have demand of an efficient data storage mechanism and also the efficient data analysis. The Big Data (BD) also faces some of the security issues for the important data or information which is shared or transferred over the cloud. These issues include the tampering, losing control over the data, etc. This survey work offers some of the interesting, important aspects of big data including the high security and privacy issue. In this, the survey of existing research works for the preservation of privacy and security mechanism and also the existing tools for it are stated. The discussions for upcoming tools which are needed to be focused on performance improvement are discussed. With the survey analysis, a research gap is illustrated, and a future research idea is presented
Impact of big data congestion in IT: An adaptive knowledgebased Bayesian networkIJECEIAES
Recent progress on real-time systems are growing high in information technology which is showing importance in every single innovative field. Different applications in IT simultaneously produce the enormous measure of information that should be taken care of. In this paper, a novel algorithm of adaptive knowledge-based Bayesian network is proposed to deal with the impact of big data congestion in decision processing. A Bayesian system show is utilized to oversee learning arrangement toward all path for the basic leadership process. Information of Bayesian systems is routinely discharged as an ideal arrangement, where the examination work is to find a development that misuses a measurably inspired score. By and large, available information apparatuses manage this ideal arrangement by methods for normal hunt strategies. As it required enormous measure of information space, along these lines it is a tedious method that ought to be stayed away from. The circumstance ends up unequivocal once huge information include in hunting down ideal arrangement. A calculation is acquainted with achieve quicker preparing of ideal arrangement by constraining the pursuit information space. The proposed algorithm consists of recursive calculation intthe inquiry space. The outcome demonstrates that the ideal component of the proposed algorithm can deal with enormous information by processing time, and a higher level of expectation rates.
Insights on critical energy efficiency approaches in internet-ofthings applic...IJECEIAES
This document summarizes approaches to improving energy efficiency in internet-of-things (IoT) applications. It discusses how cross-layer schemes can be used to optimize network performance and energy efficiency in IoT systems. It also examines optimization-based approaches using machine learning techniques. While various studies have identified methods to improve energy efficiency, open issues remain that need to be addressed to further enhance IoT system performance and outcomes.
A forecasting of stock trading price using time series information based on b...IJECEIAES
Big data is a large set of structured or unstructured data that can collect, store, manage, and analyze data with existing database management tools. And it means the technique of extracting value from these data and interpreting the results. Big data has three characteristics: The size of existing data and other data (volume), the speed of data generation (velocity), and the variety of information forms (variety). The time series data are obtained by collecting and recording the data generated in accordance with the flow of time. If the analysis of these time series data, found the characteristics of the data implies that feature helps to understand and analyze time series data. The concept of distance is the simplest and the most obvious in dealing with the similarities between objects. The commonly used and widely known method for measuring distance is the Euclidean distance. This study is the result of analyzing the similarity of stock price flow using 793,800 closing prices of 1,323 companies in Korea. Visual studio and Excel presented calculate the Euclidean distance using an analysis tool. We selected “000100” as a target domestic company and prepared for big data analysis. As a result of the analysis, the shortest Euclidean distance is the code “143860” company, and the calculated value is “11.147”. Therefore, based on the results of the analysis, the limitations of the study and theoretical implications are suggested.
Social network analysis is a method of big data analysis which reveals the nature
of connections between objects, including implicit connections. This is a tool of interest
since it can be applied to large data sets, manual processing of which is very laborintensive,
while automated processing through self-learning linguistic engines requires
a lot of resources. In this regard a study was carried out: it was aimed at development
and testing of social network analysis tools and creating a research algorithm which is
applicable to solve a wide range of analytical and search tasks. The current image of
Russia and its activities in the Arctic was chosen as a case.
The research algorithm helps to discover implicit patterns and trends, relate
information flows and events with relevant newsworthy events and news stories to form
a “clear” view of the study object and key actors which this object is associated with.
The work contributes to filling the gap in scientific literature, caused by insufficient
development of applied issues of using social network analysis to solve managerial
tasks, while theoretical papers, which describe the theory and methodology of such an
analysis, are abundant.
Feature based similarity search in 3 d object databasesunyil96
This document discusses feature-based similarity search in 3D object databases. It surveys existing feature-based methods for 3D object retrieval and proposes a taxonomy for these methods. The document also presents experimental results that compare the effectiveness of some surveyed methods. Similarity search in 3D object databases has applications in fields like computer-aided design, medicine, molecular biology, and military applications. Defining similarity among 3D objects and designing algorithms to implement such similarity definitions is a difficult problem that researchers have worked to solve.
TREND-BASED NETWORKING DRIVEN BY BIG DATA TELEMETRY FOR SDN AND TRADITIONAL N...ijngnjournal
Organizations face a challenge of accurately analyzing network data and providing automated action
based on the observed trend. This trend-based analytics is beneficial to minimize the downtime and
improve the performance of the network services, but organizations use different network management
tools to understand and visualize the network traffic with limited abilities to dynamically optimize the
network. This research focuses on the development of an intelligent system that leverages big data
telemetry analysis in Platform for Network Data Analytics (PNDA) to enable comprehensive trendbased networking decisions. The results include a graphical user interface (GUI) done via a web
application for effortless management of all subsystems, and the system and application developed in
this research demonstrate the true potential for a scalable system capable of effectively benchmarking
the network to set the expected behavior for comparison and trend analysis. Moreover, this research
provides a proof of concept of how trend analysis results are actioned in both a traditional network and
a software-defined network (SDN) to achieve dynamic, automated load balancing.
This document summarizes a research paper on using big data methodologies with IoT and its applications. It discusses how big data analytics is being used across various fields like engineering, data management, and more. It also discusses how IoT enables the collection of massive amounts of data from sensors and devices. Machine learning techniques are used to analyze this big data from IoT and enable communication between devices. The document provides examples of domains where big data and IoT are being applied, such as healthcare, energy, transportation, and others. It analyzes the similarities and differences in how big data techniques are used across these IoT domains.
Content an Insight to Security Paradigm for BigData on Cloud: Current Trend a...IJECEIAES
The sucesssive growth of collabrative applications producing Bigdata on timeline leads new opprutinity to setup commodities on cloud infrastructure. Mnay organizations will have demand of an efficient data storage mechanism and also the efficient data analysis. The Big Data (BD) also faces some of the security issues for the important data or information which is shared or transferred over the cloud. These issues include the tampering, losing control over the data, etc. This survey work offers some of the interesting, important aspects of big data including the high security and privacy issue. In this, the survey of existing research works for the preservation of privacy and security mechanism and also the existing tools for it are stated. The discussions for upcoming tools which are needed to be focused on performance improvement are discussed. With the survey analysis, a research gap is illustrated, and a future research idea is presented
Impact of big data congestion in IT: An adaptive knowledgebased Bayesian networkIJECEIAES
Recent progress on real-time systems are growing high in information technology which is showing importance in every single innovative field. Different applications in IT simultaneously produce the enormous measure of information that should be taken care of. In this paper, a novel algorithm of adaptive knowledge-based Bayesian network is proposed to deal with the impact of big data congestion in decision processing. A Bayesian system show is utilized to oversee learning arrangement toward all path for the basic leadership process. Information of Bayesian systems is routinely discharged as an ideal arrangement, where the examination work is to find a development that misuses a measurably inspired score. By and large, available information apparatuses manage this ideal arrangement by methods for normal hunt strategies. As it required enormous measure of information space, along these lines it is a tedious method that ought to be stayed away from. The circumstance ends up unequivocal once huge information include in hunting down ideal arrangement. A calculation is acquainted with achieve quicker preparing of ideal arrangement by constraining the pursuit information space. The proposed algorithm consists of recursive calculation intthe inquiry space. The outcome demonstrates that the ideal component of the proposed algorithm can deal with enormous information by processing time, and a higher level of expectation rates.
Insights on critical energy efficiency approaches in internet-ofthings applic...IJECEIAES
This document summarizes approaches to improving energy efficiency in internet-of-things (IoT) applications. It discusses how cross-layer schemes can be used to optimize network performance and energy efficiency in IoT systems. It also examines optimization-based approaches using machine learning techniques. While various studies have identified methods to improve energy efficiency, open issues remain that need to be addressed to further enhance IoT system performance and outcomes.
A forecasting of stock trading price using time series information based on b...IJECEIAES
Big data is a large set of structured or unstructured data that can collect, store, manage, and analyze data with existing database management tools. And it means the technique of extracting value from these data and interpreting the results. Big data has three characteristics: The size of existing data and other data (volume), the speed of data generation (velocity), and the variety of information forms (variety). The time series data are obtained by collecting and recording the data generated in accordance with the flow of time. If the analysis of these time series data, found the characteristics of the data implies that feature helps to understand and analyze time series data. The concept of distance is the simplest and the most obvious in dealing with the similarities between objects. The commonly used and widely known method for measuring distance is the Euclidean distance. This study is the result of analyzing the similarity of stock price flow using 793,800 closing prices of 1,323 companies in Korea. Visual studio and Excel presented calculate the Euclidean distance using an analysis tool. We selected “000100” as a target domestic company and prepared for big data analysis. As a result of the analysis, the shortest Euclidean distance is the code “143860” company, and the calculated value is “11.147”. Therefore, based on the results of the analysis, the limitations of the study and theoretical implications are suggested.
Social network analysis is a method of big data analysis which reveals the nature
of connections between objects, including implicit connections. This is a tool of interest
since it can be applied to large data sets, manual processing of which is very laborintensive,
while automated processing through self-learning linguistic engines requires
a lot of resources. In this regard a study was carried out: it was aimed at development
and testing of social network analysis tools and creating a research algorithm which is
applicable to solve a wide range of analytical and search tasks. The current image of
Russia and its activities in the Arctic was chosen as a case.
The research algorithm helps to discover implicit patterns and trends, relate
information flows and events with relevant newsworthy events and news stories to form
a “clear” view of the study object and key actors which this object is associated with.
The work contributes to filling the gap in scientific literature, caused by insufficient
development of applied issues of using social network analysis to solve managerial
tasks, while theoretical papers, which describe the theory and methodology of such an
analysis, are abundant.
Feature based similarity search in 3 d object databasesunyil96
This document discusses feature-based similarity search in 3D object databases. It surveys existing feature-based methods for 3D object retrieval and proposes a taxonomy for these methods. The document also presents experimental results that compare the effectiveness of some surveyed methods. Similarity search in 3D object databases has applications in fields like computer-aided design, medicine, molecular biology, and military applications. Defining similarity among 3D objects and designing algorithms to implement such similarity definitions is a difficult problem that researchers have worked to solve.
TREND-BASED NETWORKING DRIVEN BY BIG DATA TELEMETRY FOR SDN AND TRADITIONAL N...ijngnjournal
Organizations face a challenge of accurately analyzing network data and providing automated action
based on the observed trend. This trend-based analytics is beneficial to minimize the downtime and
improve the performance of the network services, but organizations use different network management
tools to understand and visualize the network traffic with limited abilities to dynamically optimize the
network. This research focuses on the development of an intelligent system that leverages big data
telemetry analysis in Platform for Network Data Analytics (PNDA) to enable comprehensive trendbased networking decisions. The results include a graphical user interface (GUI) done via a web
application for effortless management of all subsystems, and the system and application developed in
this research demonstrate the true potential for a scalable system capable of effectively benchmarking
the network to set the expected behavior for comparison and trend analysis. Moreover, this research
provides a proof of concept of how trend analysis results are actioned in both a traditional network and
a software-defined network (SDN) to achieve dynamic, automated load balancing.
This document summarizes a research paper on using big data methodologies with IoT and its applications. It discusses how big data analytics is being used across various fields like engineering, data management, and more. It also discusses how IoT enables the collection of massive amounts of data from sensors and devices. Machine learning techniques are used to analyze this big data from IoT and enable communication between devices. The document provides examples of domains where big data and IoT are being applied, such as healthcare, energy, transportation, and others. It analyzes the similarities and differences in how big data techniques are used across these IoT domains.
Information Fusion Methods for Location Data AnalysisAlket Cecaj
This document outlines a thesis on using data fusion methods for location data analysis. The thesis will examine using aggregated call detail records and social media data to detect and describe events in urban areas. It will also analyze re-identifying anonymized call detail records by fusing them with social network data. The document discusses different location data types, outlier detection methods, evaluating event detection precision and recall, and improving results by combining datasets. It also covers calculating user uniqueness from mobility data and the probabilistic approach to re-identification.
Re-identification of Anomized CDR datasets using Social networlk DataAlket Cecaj
This document discusses re-identifying users across anonymized datasets using social network data. It summarizes related work showing high re-identification rates just from gender, ZIP code and date of birth. The study aims to use social network data to re-identify users in an anonymized call detail records dataset. It describes matching users between the datasets using time and location parameters and probabilistic modeling. The results show the potential and limits of re-identifying users across multiple mobility datasets, and future work is needed to refine the model and address privacy concerns when correlating multiple datasets.
CRITICAL INFRASTRUCTURE CYBERSECURITY CHALLENGES: IOT IN PERSPECTIVEIJNSA Journal
A technology platform that is gradually bridging the gap between object visibility and remote accessibility is the Internet of Things (IoT). Rapid deployment of this application can significantly transform the health, housing, and power (distribution and generation) sectors, etc. It has considerably changed the power sector regarding operations, services optimization, power distribution, asset management and aided in engaging customers to reduce energy consumption. Despite its societal opportunities and the benefits it presents, the power generation sector is bedeviled with many security challenges on the critical infrastructure. This review discusses the security challenges posed by IoT in power generation and critical infrastructure. To achieve this, the authors present the various IoT applications, particularly on the grid infrastructure, from an empirical literature perspective. The authors concluded by discussing how the various entities in the sector can overcome these security challenges to ensure an exemplary future IoT implementation on the power critical infrastructure value chain.
Data fusion for city live event detectionAlket Cecaj
Event detection in urban context by using aggregated mobile activity as for example CDR data and social network data in this case geo-referenced Twitter data. The experiments show that the two datasets - CDR and social data - used, complement each other by providing better event detection results and event detscription.
A tutorial on secure outsourcing of large scalecomputation for big dataredpel dot com
A tutorial on secure outsourcing of large scalecomputation for big data
for more ieee paper / full abstract / implementation , just visit www.redpel.com
The AIRCC's International Journal of Computer Science and Information Technology (IJCSIT) is devoted to fields of Computer Science and Information Systems. The IJCSIT is a open access peer-reviewed scientific journal published in electronic form as well as print form. The mission of this journal is to publish original contributions in its field in order to propagate knowledge amongst its readers and to be a reference publication.
Data mining is the process of discovering patterns in large data sets involving methods at the intersection of machine learning, statistics, and database systems. 1 Data mining is an interdisciplinary sub field of computer science and statistics with an overall goal to extract from a data set and transform the information into a comprehensible structure for further use. 1 2 3 4 The process of digging through data to discover hidden connections and predict future trends has a long history. Sometimes referred to as 'knowledge discovery' in databases, the term data mining wasn't coined until the 1990s. What was old is new again, as data mining technology keeps evolving to keep pace with the limitless potential of big data and affordable computing power. Over the last decade, advances in processing power and speed have enabled us to move beyond manual, tedious and time consuming practices to quick, easy and automated data analysis. The more complex the data sets collected, the more potential there is to uncover relevant insights. Rupashi Koul "Overview of Data Mining" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-4 | Issue-4 , June 2020, URL: https://www.ijtsrd.com/papers/ijtsrd31368.pdf Paper Url :https://www.ijtsrd.com/engineering/computer-engineering/31368/overview-of-data-mining/rupashi-koul
The document discusses tools and techniques for big data analytics, including A/B testing, crowdsourcing, machine learning, and data mining. It provides an overview of the big data analysis pipeline, including data acquisition, information extraction, integration and representation, query processing and analysis, and interpretation. The document also discusses fields where big data is relevant like industry, healthcare, and research. It analyzes tools like A/B testing, machine learning, and data mining techniques in more detail.
The AIRCC's International Journal of Computer Science and Information Technology (IJCSIT) is devoted to fields of Computer Science and Information Systems. The IJCSIT is a open access peer-reviewed scientific journal published in electronic form as well as print form. The mission of this journal is to publish original contributions in its field in order to propagate knowledge amongst its readers and to be a reference publication.
Recent changes to the existing power grid are expected to influence the way energy is provided and consumed by customers. Advanced Metering Infrastructure (AMI) is a tool to incorporate these changes for modernizing the electricity grid. Growing energy needs are forcing government agencies and utility companies to move towards AMI systems as part of larger smart grid initiatives. The smart grid promises to enable a more reliable, sustainable, and efficient power grid by taking advantage of information and communication technologies. However, this information-based power grid can reveal sensitive private information from the user’s perspective due to its ability to gather highly granular power consumption data. This has resulted in limited consumer acceptance and proliferation of the smart grid. Hence, it is crucial to design a mechanism to prevent the leakage of such sensitive consumer usage information in
smart grid. Among different solutions for preserving consumer privacy in Smart Grid Network(SGN), private data aggregation techniques have received a tremendous focus from security researchers. Existing privacy-preserving aggregation mechanisms in SGNs utilize cryptographic techniques, specifically homomorphic properties of public-key cryptosystems. Such homomorphic approaches are bandwidthintensive (due to large output blocks they generate), and in most cases, are computationally complex. In this paper, we present a novel and efficient CDMA-based approach to achieve privacy-preserving aggregation in SGNs by utilizing random perturbation of power consumption data and with limited use of traditional cryptography. We evaluate and validate the efficiency and performance of our proposed privacy preserving data aggregation scheme through extensive statistical analyses and simulations.
Due to technological advances, vast data sets (e.g. big data) are increasing now days. Big Data a new term; is used
to identify the collected datasets. But due to their large size and complexity, we cannot manage with our current
methodologies or data mining software tools to extract those datasets. Such datasets provide us with unparalleled
opportunities for modelling and predicting of future with new challenges. So as an awareness of this and
weaknesses as well as the possibilities of these large data sets, are necessary to forecast the future. Today’s we
have an overwhelming growth of data in terms of volume, velocity and variety on web. Moreover this, from a
security and privacy views, both area have an unpredictable growth. So Big Data challenge is becoming one of the
most exciting opportunities for researchers in upcoming years.
Hence this paper discuss about this topic in a broad overview like; its current status; controversy; and challenges to
forecast the future. This paper defines at some of these problems, using illustrations with applications from various
areas. Finally this paper discuss secure management and privacy of big data as one of essential issues.
Achieving Sustainable Development Goals using Computer VisionAkshat Gupta
This paper discusses how computer vision and image classification techniques can help achieve the UN's Sustainable Development Goals by promoting green economy initiatives. It reviews research on using image binarization and feature extraction for content-based image classification. The paper finds that a local thresholding technique called Niblack's method achieves the highest accuracy for classifying images in a sample dataset. Promoting technologies like this could encourage online transactions and digitization, reducing environmental impact while achieving economic and social benefits in line with sustainable development.
This document summarizes an article that discusses the integration of data warehouse and big data technologies. It outlines some of the key differences between data warehouses and big data, such as data warehouses being structured data focused while big data handles structured and unstructured data. However, it argues that both aim to support data exploration and decision making, so they should be integrated. It reviews existing integration proposals and highlights common elements between the technologies that could serve as a basis for full integration. Finally, it proposes an approach to integrating data warehouses and big data.
Evaluation of a Multiple Regression Model for Noisy and Missing Data IJECEIAES
The standard data collection problems may involve noiseless data while on the other hand large organizations commonly experience noisy and missing data, probably concerning data collected from individuals. As noisy and missing data will be significantly worrisome for occasions of the vast data collection then the investigation of different filtering techniques for big data environment would be remarkable. A multiple regression model where big data is employed for experimenting will be presented. Approximation for datasets with noisy and missing data is also proposed. The statistical root mean squared error (RMSE) associated with correlation coefficient (COEF) will be analyzed to prove the accuracy of estimators. Finally, results predicted by massive online analysis (MOA) will be compared to those real data collected from the following different time. These theoretical predictions with noisy and missing data estimation by simulation, revealing consistency with the real data are illustrated. Deletion mechanism (DEL) outperforms with the lowest average percentage of error.
Clouds provide a powerful computing platform that enables individuals and organizations to perform variety levels of tasks such as: use of online storage space, adoption of business applications, development of customized computer software, and creation of a “realistic” network environment. In previous years, the number of people using cloud services has dramatically increased and lots of data has been stored in cloud computing environments. In the meantime, data breaches to cloud services are also increasing every year due to hackers who are always trying to exploit the security vulnerabilities of the architecture of cloud. In this paper, three cloud service models were compared; cloud security risks and threats were investigated based on the nature of the cloud service models. Real world cloud attacks were included to demonstrate the techniques that hackers used against cloud computing systems. In addition,countermeasures to cloud security breaches are presented.
The AIRCC's International Journal of Computer Science and Information Technology (IJCSIT) is devoted to fields of Computer Science and Information Systems. The IJCSIT is a open access peer-reviewed scientific journal published in electronic form as well as print form. The mission of this journal is to publish original contributions in its field in order to propagate knowledge amongst its readers and to be a reference publication.
CLOUD COMPUTING IN THE PUBLIC SECTOR: MAPPING THE KNOWLEDGE DOMAINijmpict
This document summarizes a research study that mapped the knowledge domain of public sector cloud computing. The study utilized 188 research publications from the Web of Science database and the CiteSpace visualization software. Key findings included:
1) The top publishing countries were Canada, the US, Australia, China, and the UK. The top institutions were from Malaysia, Australia, and the UK.
2) Author collaboration was limited, though a few nascent research clusters were identified. The most published authors were from the Middle East, Malaysia, Australia, and China. The most co-cited authors focused on cloud standards and technologies.
3) The top co-cited journals were Communications of the ACM, NIST Special
A survey of big data and machine learning IJECEIAES
This paper presents a detailed analysis of big data and machine learning (ML) in the electrical power and energy sector. Big data analytics for smart energy operations, applications, impact, measurement and control, and challenges are presented in this paper. Big data and machine learning approaches need to be applied after analyzing the power system problem carefully. Determining the match between the strengths of big data and machine learning for solving the power system problem is of utmost important. They can be of great help to plan and operate the traditional grid/smart grid (SG). The basics of big data and machine learning are described in detailed manner along with their applications in various fields such as electrical power and energy, health care and life sciences, government, telecommunications, web and digital media, retailers, finance, e-commerce and customer service, etc. Finally, the challenges and opportunities of big data and machine learning are presented in this paper.
Data-Driven Discovery Science with FAIR Knowledge GraphsMichel Dumontier
Data-Driven Discovery Science with FAIR Knowledge Graphs
Despite the existence of vast amounts of biomedical data, these remain difficult to find and to productively reuse in machine learning and other Artificial Intelligence technologies. In this talk, I will discuss the role of the FAIR Guiding Principles to make AI-ready biomedical data, and their representation as knowledge graphs not only enables powerful ontology-backed semantic queries, but also can be used to predict missing information, as well as to check the quality of knowledge collected.
The main idea of the talk is to introduce the FAIR principles (what they are and what they are not), and how their application with semantic web technologies (ontologies/linked data) creates improved possibilities for large scale data integration, answering sophisticated questions using automated reasoners, and predicting new relations/validating data using graph embeddings. The audience will gain insight into the state of the art in a carefully presented manner that introduces principles, approaches, and outcomes relevant to Health AI.
An efficient approach on spatial big data related to wireless networks and it...eSAT Journals
Abstract
Spatial big data acts as a important key role in wireless networks applications. In that spatial and spatio temporal problems contains the distinct role in big data and it’s compared to common relational problems. If we are solving those problems means describing the three applications for spatial big data. In each applications imposing the specific design and we are developing our work on highly scalable parallel processing for spatial big data in Hadoop frameworks by using map reduce computational model. Our results show that enables highly scalable implementations of algorithms using Hadoop for the purpose of spatial data processing problems. Inspite of developing these implementations requires specialized knowledge and user friendly.
Keywords: Spatial Big Data, Hadoop, Wireless Networks, Map reduce
This document discusses challenges and outlooks related to big data. It begins with an introduction describing how big data is being collected and analyzed in various fields such as science, education, healthcare, urban planning, and more. It then outlines the key phases in big data analysis: data acquisition and recording, information extraction and cleaning, data integration and representation, query processing and analysis, and result interpretation. For each phase, it discusses challenges and how existing techniques can be applied or extended to address big data issues. Some of the major challenges discussed are data scale, heterogeneity, lack of structure, privacy, timeliness, provenance, and visualization across the entire big data analysis pipeline.
Information Fusion Methods for Location Data AnalysisAlket Cecaj
This document outlines a thesis on using data fusion methods for location data analysis. The thesis will examine using aggregated call detail records and social media data to detect and describe events in urban areas. It will also analyze re-identifying anonymized call detail records by fusing them with social network data. The document discusses different location data types, outlier detection methods, evaluating event detection precision and recall, and improving results by combining datasets. It also covers calculating user uniqueness from mobility data and the probabilistic approach to re-identification.
Re-identification of Anomized CDR datasets using Social networlk DataAlket Cecaj
This document discusses re-identifying users across anonymized datasets using social network data. It summarizes related work showing high re-identification rates just from gender, ZIP code and date of birth. The study aims to use social network data to re-identify users in an anonymized call detail records dataset. It describes matching users between the datasets using time and location parameters and probabilistic modeling. The results show the potential and limits of re-identifying users across multiple mobility datasets, and future work is needed to refine the model and address privacy concerns when correlating multiple datasets.
CRITICAL INFRASTRUCTURE CYBERSECURITY CHALLENGES: IOT IN PERSPECTIVEIJNSA Journal
A technology platform that is gradually bridging the gap between object visibility and remote accessibility is the Internet of Things (IoT). Rapid deployment of this application can significantly transform the health, housing, and power (distribution and generation) sectors, etc. It has considerably changed the power sector regarding operations, services optimization, power distribution, asset management and aided in engaging customers to reduce energy consumption. Despite its societal opportunities and the benefits it presents, the power generation sector is bedeviled with many security challenges on the critical infrastructure. This review discusses the security challenges posed by IoT in power generation and critical infrastructure. To achieve this, the authors present the various IoT applications, particularly on the grid infrastructure, from an empirical literature perspective. The authors concluded by discussing how the various entities in the sector can overcome these security challenges to ensure an exemplary future IoT implementation on the power critical infrastructure value chain.
Data fusion for city live event detectionAlket Cecaj
Event detection in urban context by using aggregated mobile activity as for example CDR data and social network data in this case geo-referenced Twitter data. The experiments show that the two datasets - CDR and social data - used, complement each other by providing better event detection results and event detscription.
A tutorial on secure outsourcing of large scalecomputation for big dataredpel dot com
A tutorial on secure outsourcing of large scalecomputation for big data
for more ieee paper / full abstract / implementation , just visit www.redpel.com
The AIRCC's International Journal of Computer Science and Information Technology (IJCSIT) is devoted to fields of Computer Science and Information Systems. The IJCSIT is a open access peer-reviewed scientific journal published in electronic form as well as print form. The mission of this journal is to publish original contributions in its field in order to propagate knowledge amongst its readers and to be a reference publication.
Data mining is the process of discovering patterns in large data sets involving methods at the intersection of machine learning, statistics, and database systems. 1 Data mining is an interdisciplinary sub field of computer science and statistics with an overall goal to extract from a data set and transform the information into a comprehensible structure for further use. 1 2 3 4 The process of digging through data to discover hidden connections and predict future trends has a long history. Sometimes referred to as 'knowledge discovery' in databases, the term data mining wasn't coined until the 1990s. What was old is new again, as data mining technology keeps evolving to keep pace with the limitless potential of big data and affordable computing power. Over the last decade, advances in processing power and speed have enabled us to move beyond manual, tedious and time consuming practices to quick, easy and automated data analysis. The more complex the data sets collected, the more potential there is to uncover relevant insights. Rupashi Koul "Overview of Data Mining" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-4 | Issue-4 , June 2020, URL: https://www.ijtsrd.com/papers/ijtsrd31368.pdf Paper Url :https://www.ijtsrd.com/engineering/computer-engineering/31368/overview-of-data-mining/rupashi-koul
The document discusses tools and techniques for big data analytics, including A/B testing, crowdsourcing, machine learning, and data mining. It provides an overview of the big data analysis pipeline, including data acquisition, information extraction, integration and representation, query processing and analysis, and interpretation. The document also discusses fields where big data is relevant like industry, healthcare, and research. It analyzes tools like A/B testing, machine learning, and data mining techniques in more detail.
The AIRCC's International Journal of Computer Science and Information Technology (IJCSIT) is devoted to fields of Computer Science and Information Systems. The IJCSIT is a open access peer-reviewed scientific journal published in electronic form as well as print form. The mission of this journal is to publish original contributions in its field in order to propagate knowledge amongst its readers and to be a reference publication.
Recent changes to the existing power grid are expected to influence the way energy is provided and consumed by customers. Advanced Metering Infrastructure (AMI) is a tool to incorporate these changes for modernizing the electricity grid. Growing energy needs are forcing government agencies and utility companies to move towards AMI systems as part of larger smart grid initiatives. The smart grid promises to enable a more reliable, sustainable, and efficient power grid by taking advantage of information and communication technologies. However, this information-based power grid can reveal sensitive private information from the user’s perspective due to its ability to gather highly granular power consumption data. This has resulted in limited consumer acceptance and proliferation of the smart grid. Hence, it is crucial to design a mechanism to prevent the leakage of such sensitive consumer usage information in
smart grid. Among different solutions for preserving consumer privacy in Smart Grid Network(SGN), private data aggregation techniques have received a tremendous focus from security researchers. Existing privacy-preserving aggregation mechanisms in SGNs utilize cryptographic techniques, specifically homomorphic properties of public-key cryptosystems. Such homomorphic approaches are bandwidthintensive (due to large output blocks they generate), and in most cases, are computationally complex. In this paper, we present a novel and efficient CDMA-based approach to achieve privacy-preserving aggregation in SGNs by utilizing random perturbation of power consumption data and with limited use of traditional cryptography. We evaluate and validate the efficiency and performance of our proposed privacy preserving data aggregation scheme through extensive statistical analyses and simulations.
Due to technological advances, vast data sets (e.g. big data) are increasing now days. Big Data a new term; is used
to identify the collected datasets. But due to their large size and complexity, we cannot manage with our current
methodologies or data mining software tools to extract those datasets. Such datasets provide us with unparalleled
opportunities for modelling and predicting of future with new challenges. So as an awareness of this and
weaknesses as well as the possibilities of these large data sets, are necessary to forecast the future. Today’s we
have an overwhelming growth of data in terms of volume, velocity and variety on web. Moreover this, from a
security and privacy views, both area have an unpredictable growth. So Big Data challenge is becoming one of the
most exciting opportunities for researchers in upcoming years.
Hence this paper discuss about this topic in a broad overview like; its current status; controversy; and challenges to
forecast the future. This paper defines at some of these problems, using illustrations with applications from various
areas. Finally this paper discuss secure management and privacy of big data as one of essential issues.
Achieving Sustainable Development Goals using Computer VisionAkshat Gupta
This paper discusses how computer vision and image classification techniques can help achieve the UN's Sustainable Development Goals by promoting green economy initiatives. It reviews research on using image binarization and feature extraction for content-based image classification. The paper finds that a local thresholding technique called Niblack's method achieves the highest accuracy for classifying images in a sample dataset. Promoting technologies like this could encourage online transactions and digitization, reducing environmental impact while achieving economic and social benefits in line with sustainable development.
This document summarizes an article that discusses the integration of data warehouse and big data technologies. It outlines some of the key differences between data warehouses and big data, such as data warehouses being structured data focused while big data handles structured and unstructured data. However, it argues that both aim to support data exploration and decision making, so they should be integrated. It reviews existing integration proposals and highlights common elements between the technologies that could serve as a basis for full integration. Finally, it proposes an approach to integrating data warehouses and big data.
Evaluation of a Multiple Regression Model for Noisy and Missing Data IJECEIAES
The standard data collection problems may involve noiseless data while on the other hand large organizations commonly experience noisy and missing data, probably concerning data collected from individuals. As noisy and missing data will be significantly worrisome for occasions of the vast data collection then the investigation of different filtering techniques for big data environment would be remarkable. A multiple regression model where big data is employed for experimenting will be presented. Approximation for datasets with noisy and missing data is also proposed. The statistical root mean squared error (RMSE) associated with correlation coefficient (COEF) will be analyzed to prove the accuracy of estimators. Finally, results predicted by massive online analysis (MOA) will be compared to those real data collected from the following different time. These theoretical predictions with noisy and missing data estimation by simulation, revealing consistency with the real data are illustrated. Deletion mechanism (DEL) outperforms with the lowest average percentage of error.
Clouds provide a powerful computing platform that enables individuals and organizations to perform variety levels of tasks such as: use of online storage space, adoption of business applications, development of customized computer software, and creation of a “realistic” network environment. In previous years, the number of people using cloud services has dramatically increased and lots of data has been stored in cloud computing environments. In the meantime, data breaches to cloud services are also increasing every year due to hackers who are always trying to exploit the security vulnerabilities of the architecture of cloud. In this paper, three cloud service models were compared; cloud security risks and threats were investigated based on the nature of the cloud service models. Real world cloud attacks were included to demonstrate the techniques that hackers used against cloud computing systems. In addition,countermeasures to cloud security breaches are presented.
The AIRCC's International Journal of Computer Science and Information Technology (IJCSIT) is devoted to fields of Computer Science and Information Systems. The IJCSIT is a open access peer-reviewed scientific journal published in electronic form as well as print form. The mission of this journal is to publish original contributions in its field in order to propagate knowledge amongst its readers and to be a reference publication.
CLOUD COMPUTING IN THE PUBLIC SECTOR: MAPPING THE KNOWLEDGE DOMAINijmpict
This document summarizes a research study that mapped the knowledge domain of public sector cloud computing. The study utilized 188 research publications from the Web of Science database and the CiteSpace visualization software. Key findings included:
1) The top publishing countries were Canada, the US, Australia, China, and the UK. The top institutions were from Malaysia, Australia, and the UK.
2) Author collaboration was limited, though a few nascent research clusters were identified. The most published authors were from the Middle East, Malaysia, Australia, and China. The most co-cited authors focused on cloud standards and technologies.
3) The top co-cited journals were Communications of the ACM, NIST Special
A survey of big data and machine learning IJECEIAES
This paper presents a detailed analysis of big data and machine learning (ML) in the electrical power and energy sector. Big data analytics for smart energy operations, applications, impact, measurement and control, and challenges are presented in this paper. Big data and machine learning approaches need to be applied after analyzing the power system problem carefully. Determining the match between the strengths of big data and machine learning for solving the power system problem is of utmost important. They can be of great help to plan and operate the traditional grid/smart grid (SG). The basics of big data and machine learning are described in detailed manner along with their applications in various fields such as electrical power and energy, health care and life sciences, government, telecommunications, web and digital media, retailers, finance, e-commerce and customer service, etc. Finally, the challenges and opportunities of big data and machine learning are presented in this paper.
Data-Driven Discovery Science with FAIR Knowledge GraphsMichel Dumontier
Data-Driven Discovery Science with FAIR Knowledge Graphs
Despite the existence of vast amounts of biomedical data, these remain difficult to find and to productively reuse in machine learning and other Artificial Intelligence technologies. In this talk, I will discuss the role of the FAIR Guiding Principles to make AI-ready biomedical data, and their representation as knowledge graphs not only enables powerful ontology-backed semantic queries, but also can be used to predict missing information, as well as to check the quality of knowledge collected.
The main idea of the talk is to introduce the FAIR principles (what they are and what they are not), and how their application with semantic web technologies (ontologies/linked data) creates improved possibilities for large scale data integration, answering sophisticated questions using automated reasoners, and predicting new relations/validating data using graph embeddings. The audience will gain insight into the state of the art in a carefully presented manner that introduces principles, approaches, and outcomes relevant to Health AI.
An efficient approach on spatial big data related to wireless networks and it...eSAT Journals
Abstract
Spatial big data acts as a important key role in wireless networks applications. In that spatial and spatio temporal problems contains the distinct role in big data and it’s compared to common relational problems. If we are solving those problems means describing the three applications for spatial big data. In each applications imposing the specific design and we are developing our work on highly scalable parallel processing for spatial big data in Hadoop frameworks by using map reduce computational model. Our results show that enables highly scalable implementations of algorithms using Hadoop for the purpose of spatial data processing problems. Inspite of developing these implementations requires specialized knowledge and user friendly.
Keywords: Spatial Big Data, Hadoop, Wireless Networks, Map reduce
This document discusses challenges and outlooks related to big data. It begins with an introduction describing how big data is being collected and analyzed in various fields such as science, education, healthcare, urban planning, and more. It then outlines the key phases in big data analysis: data acquisition and recording, information extraction and cleaning, data integration and representation, query processing and analysis, and result interpretation. For each phase, it discusses challenges and how existing techniques can be applied or extended to address big data issues. Some of the major challenges discussed are data scale, heterogeneity, lack of structure, privacy, timeliness, provenance, and visualization across the entire big data analysis pipeline.
Novel holistic architecture for analytical operation on sensory data relayed...IJECEIAES
With increasing adoption of the sensor-based application, there is an exponential rise of the sensory data that eventually take the shape of the big data. However, the practicality of executing high end analytical operation over the resource-constrained big data has never being studied closely. After reviewing existing approaches, it is explored that there is no cost-effective schemes of big data analytics over large scale sensory data processiing that can be directly used as a service. Therefore, the propsoed system introduces a holistic architecture where streamed data after performing extraction of knowedge can be offered in the form of services. Implemented in MATLAB, the proposed study uses a very simplistic approach considering energy constrained of the sensor nodes to find that proposed system offers better accuracy, reduced mining duration (i.e. faster response time), and reduced memory dependencies to prove that it offers cost effective analytical solution in contrast to existing system.
Assignment Of Sensing Tasks To IoT Devices Exploitation Of A Social Network ...Dustin Pytko
The document proposes exploiting a Social Internet of Things (SIoT) paradigm to assign sensing tasks to IoT devices in a mobile crowdsensing (MCS) scenario. A new algorithm is presented to fairly allocate resources and sensing tasks among devices while extending device lifetime. The algorithm creates an energy consumption profile for each device and task that is shared over the SIoT network. Emulations showed the proposed approach extended the time for the first device's battery to deplete by over 40% compared to alternative approaches.
Big Data in Bioinformatics & the Era of Cloud ComputingIOSR Journals
This document discusses the challenges of big data in bioinformatics and how cloud computing can address them. It notes that high-throughput experiments are generating huge amounts of biological data from fields like genomics and proteomics. Storing and analyzing this "big data" requires massive computational resources that are costly for individual organizations. However, cloud computing provides elastic, on-demand access to storage and processing power at an affordable cost. This allows bioinformatics data to be securely stored and shared on the cloud to enable collaborative analysis and overcome issues of data transfer, storage limitations, and infrastructure maintenance.
Trend-Based Networking Driven by Big Data Telemetry for Sdn and Traditional N...josephjonse
Organizations face a challenge of accurately analyzing network data and providing automated action based on the observed trend. This trend-based analytics is beneficial to minimize the downtime and improve the performance of the network services, but organizations use different network management tools to understand and visualize the network traffic with limited abilities to dynamically optimize the network. This research focuses on the development of an intelligent system that leverages big data telemetry analysis in Platform for Network Data Analytics (PNDA) to enable comprehensive trendbased networking decisions. The results include a graphical user interface (GUI) done via a web application for effortless management of all subsystems, and the system and application developed in this research demonstrate the true potential for a scalable system capable of effectively benchmarking the network to set the expected behavior for comparison and trend analysis. Moreover, this research provides a proof of concept of how trend analysis results are actioned in both a traditional network and a software-defined network (SDN) to achieve dynamic, automated load balancing
Trend-Based Networking Driven by Big Data Telemetry for Sdn and Traditional N...josephjonse
This summarizes a research paper on developing an intelligent system that leverages big data telemetry analysis to enable trend-based networking decisions. The system collects data from traditional and SDN networks using SNMP and OpenFlow. Logstash filters the data and sends it to PNDA's Kafka interface. A Jupyter notebook streams the data to OpenTSDB. Benchmarking identifies trends, and the system takes automated action like load balancing across the networks when trends are detected. A GUI provides centralized management. The system demonstrates using big data analytics to monitor networks and make proactive, automated decisions based on observed trends.
SPECIAL SECTION ON HEALTHCARE BIG DATAReceived August 16, .docxrosiecabaniss
SPECIAL SECTION ON HEALTHCARE BIG DATA
Received August 16, 2016, accepted September 11, 2016, date of publication September 26, 2016,
date of current version October 15, 2016.
Digital Object Identifier 10.1109/ACCESS.2016.2613278
Mobile Cloud Computing Model and Big Data
Analysis for Healthcare Applications
LO’AI A. TAWALBEH1,2, (Senior Member, IEEE), RASHID MEHMOOD3, (Senior Member, IEEE),
ELHADJ BENKHLIFA4, AND HOUBING SONG5, (Member, IEEE)
1Department of Computer Engineering, Jordan University of science and Technology, Irbid 22110, Jordan
2Department of Computer Engineering, Umm Al-Qura University, Mecca 21955, Saudi Arabia
3High Performance Computing Centre, King Abdulaziz University, Jeddah 21589, Saudi Arabia
4Staffordshire University, Stoke-on-Trent ST4 2DE, U.K.
5Department of Electrical and Computer Engineering, West Virginia University, Morgantown, WV 26506, USA
Corresponding author: L. A. Tawalbeh ([email protected])
This work was supported by the Long-Term National Science Technology and Innovation Plan (LT-NSTIP) under Grant 13-ELE2527-10
and in part by the King Abdulaziz City for Science and Technology (KACST), Saudi Arabia.
ABSTRACT Mobile devices are increasingly becoming an indispensable part of people’s daily life,
facilitating to perform a variety of useful tasks. Mobile cloud computing integrates mobile and cloud
computing to expand their capabilities and benefits and overcomes their limitations, such as limited memory,
CPU power, and battery life. Big data analytics technologies enable extracting value from data having four
Vs: volume, variety, velocity, and veracity. This paper discusses networked healthcare and the role of mobile
cloud computing and big data analytics in its enablement. The motivation and development of networked
healthcare applications and systems is presented along with the adoption of cloud computing in healthcare.
A cloudlet-based mobile cloud-computing infrastructure to be used for healthcare big data applications
is described. The techniques, tools, and applications of big data analytics are reviewed. Conclusions are
drawn concerning the design of networked healthcare systems using big data and mobile cloud-computing
technologies. An outlook on networked healthcare is given.
INDEX TERMS Healthcare systems, big data analytics, mobile cloud computing, cloudlet infrastructure,
health applications.
I. INTRODUCTION
Recently, there have been many advances in information and
communication technologies that have been transforming the
world; the world is increasingly becoming a small neighbor-
hood. Among these technologies are the cloud computing, the
wireless communications (3G/4G/5G), and the competitive
mobile devices industry. The mobile devices can provide
variety of services to facilitate our living style [1]. They are
integrated in our daily routine to help performing variety
of tasks such as location determination, time management,
image processing, booking hotels, selling and buying onli ...
Understand the Idea of Big Data and in Present ScenarioAI Publications
Big data analytics and deep learning are two of data science's most promising areas of convergence. The importance of Big Data has grown recently as several organizations, both public and commercial, have been amassing large amounts of region-specific data that may provide useful information on topics like as national information, advanced security, blackmail area, development, and prosperity informatics. For Big Data Analytics, where data is often unstructured and unlabeled, Deep Learning's ability to analyze and learn from large amounts of data on its own is a crucial feature. In this review, we look at how Deep Learning can be used to solve some of the most pressing problems in Big Data Analytics, including model isolation from large data sets, semantic querying, data marking, smart data recovery, and the automation of discriminative tasks.
SPECIAL SECTION ON HEALTHCARE BIG DATAReceived August 16, ChereCheek752
SPECIAL SECTION ON HEALTHCARE BIG DATA
Received August 16, 2016, accepted September 11, 2016, date of publication September 26, 2016,
date of current version October 15, 2016.
Digital Object Identifier 10.1109/ACCESS.2016.2613278
Mobile Cloud Computing Model and Big Data
Analysis for Healthcare Applications
LO’AI A. TAWALBEH1,2, (Senior Member, IEEE), RASHID MEHMOOD3, (Senior Member, IEEE),
ELHADJ BENKHLIFA4, AND HOUBING SONG5, (Member, IEEE)
1Department of Computer Engineering, Jordan University of science and Technology, Irbid 22110, Jordan
2Department of Computer Engineering, Umm Al-Qura University, Mecca 21955, Saudi Arabia
3High Performance Computing Centre, King Abdulaziz University, Jeddah 21589, Saudi Arabia
4Staffordshire University, Stoke-on-Trent ST4 2DE, U.K.
5Department of Electrical and Computer Engineering, West Virginia University, Morgantown, WV 26506, USA
Corresponding author: L. A. Tawalbeh ([email protected])
This work was supported by the Long-Term National Science Technology and Innovation Plan (LT-NSTIP) under Grant 13-ELE2527-10
and in part by the King Abdulaziz City for Science and Technology (KACST), Saudi Arabia.
ABSTRACT Mobile devices are increasingly becoming an indispensable part of people’s daily life,
facilitating to perform a variety of useful tasks. Mobile cloud computing integrates mobile and cloud
computing to expand their capabilities and benefits and overcomes their limitations, such as limited memory,
CPU power, and battery life. Big data analytics technologies enable extracting value from data having four
Vs: volume, variety, velocity, and veracity. This paper discusses networked healthcare and the role of mobile
cloud computing and big data analytics in its enablement. The motivation and development of networked
healthcare applications and systems is presented along with the adoption of cloud computing in healthcare.
A cloudlet-based mobile cloud-computing infrastructure to be used for healthcare big data applications
is described. The techniques, tools, and applications of big data analytics are reviewed. Conclusions are
drawn concerning the design of networked healthcare systems using big data and mobile cloud-computing
technologies. An outlook on networked healthcare is given.
INDEX TERMS Healthcare systems, big data analytics, mobile cloud computing, cloudlet infrastructure,
health applications.
I. INTRODUCTION
Recently, there have been many advances in information and
communication technologies that have been transforming the
world; the world is increasingly becoming a small neighbor-
hood. Among these technologies are the cloud computing, the
wireless communications (3G/4G/5G), and the competitive
mobile devices industry. The mobile devices can provide
variety of services to facilitate our living style [1]. They are
integrated in our daily routine to help performing variety
of tasks such as location determination, time management,
image processing, booking hotels, selling and buying onli ...
1) The document discusses an efficient analysis of geo-social media data to make real-time decisions using big data techniques. It proposes a system to analyze large amounts of data from various social networks to understand issues like natural disasters, diseases, and infrastructure development.
2) The system architecture processes social network data like location, time, and user to analyze topics in real-time. It uses a MapReduce framework to divide and process the large datasets concurrently.
3) Experimental results show the system can analyze tweets to provide users information about locations and monitor topics like natural calamities, diseases, accidents, products, and transportation in real-time.
A Deep Dissertion Of Data Science Related Issues And Its ApplicationsTracy Hill
This document summarizes a paper on data science that discusses its definition, processes, applications, and open research issues. It defines data science as extracting, collecting, and analyzing data for business or technical purposes. The paper describes the typical data science process as involving data wrangling, analysis, and communication. It discusses applications of data science in areas like business analytics, prediction, and healthcare. Finally, it outlines open research issues involving integrating data science with emerging technologies like the Internet of Things, cloud computing, and quantum computing.
SEMANTIC TECHNIQUES FOR IOT DATA AND SERVICE MANAGEMENT: ONTOSMART SYSTEMijwmn
In 2020 more than50 billions devices will be connected over the Internet. Every device will be connected to
anything, anyone, anytime and anywhere in the world of Internet of Thing or IoT. This network will
generate tremendous unstructured or semi structured data that should be shared between different
devices/machines for advanced and automated service delivery in the benefits of the user’s daily life. Thus,
mechanisms for data interoperability and automatic service discovery and delivery should be offered.
Although many approaches have been suggested in the state of art, none of these researches provide a fully
interoperable, light, flexible and modular Sensing/Actuating as service architecture. Therefore, this paper
introduces a new Semantic Multi Agent architecture named OntoSmart for IoT data and service
management through service oriented paradigm. It proposes sensors/actuators and scenarios independent
flexible context aware and distributed architecture for IoT systems, in particular smart home systems.
SEMANTIC TECHNIQUES FOR IOT DATA AND SERVICE MANAGEMENT: ONTOSMART SYSTEMijwmn
In 2020 more than50 billions devices will be connected over the Internet. Every device will be connected to
anything, anyone, anytime and anywhere in the world of Internet of Thing or IoT. This network will
generate tremendous unstructured or semi structured data that should be shared between different
devices/machines for advanced and automated service delivery in the benefits of the user’s daily life. Thus,
mechanisms for data interoperability and automatic service discovery and delivery should be offered.
Although many approaches have been suggested in the state of art, none of these researches provide a fully
interoperable, light, flexible and modular Sensing/Actuating as service architecture. Therefore, this paper
introduces a new Semantic Multi Agent architecture named OntoSmart for IoT data and service
management through service oriented paradigm. It proposes sensors/actuators and scenarios independent
flexible context aware and distributed architecture for IoT systems, in particular smart home systems.
SEMANTIC TECHNIQUES FOR IOT DATA AND SERVICE MANAGEMENT: ONTOSMART SYSTEMijwmn
In 2020 more than50 billions devices will be connected over the Internet. Every device will be connected to
anything, anyone, anytime and anywhere in the world of Internet of Thing or IoT. This network will
generate tremendous unstructured or semi structured data that should be shared between different
devices/machines for advanced and automated service delivery in the benefits of the user’s daily life. Thus,
mechanisms for data interoperability and automatic service discovery and delivery should be offered.
Although many approaches have been suggested in the state of art, none of these researches provide a fully
interoperable, light, flexible and modular Sensing/Actuating as service architecture. Therefore, this paper
introduces a new Semantic Multi Agent architecture named OntoSmart for IoT data and service
management through service oriented paradigm. It proposes sensors/actuators and scenarios independent
flexible context aware and distributed architecture for IoT systems, in particular smart home systems.
This document discusses big data challenges for e-governance systems in distributed systems. It proposes a map reducing architecture model to provide the basics for building interoperable big data infrastructure for e-governance. The model includes stages for data management, access control, and security. It explains how this can be implemented using distributed structures and a provisioning model. Key challenges addressed include the exponential growth of data from various sources, need for data sharing across organizations, and ensuring security, access control, and data integrity in distributed systems.
A Novel Integrated Framework to Ensure Better Data Quality in Big Data Analyt...IJECEIAES
With advent of Big Data Analytics, the healthcare system is increasingly adopting the analytical services that is ultimately found to generate massive load of highly unstructured data. We reviewed the existing system to find that there are lesser number of solutions towards addressing the problems of data variety, data uncertainty, and data speed. It is important that an errorfree data should arrive in analytics. Existing system offers single-hand solution towards single platform. Therefore, we introduced an integrated framework that has the capability to address all these three problems in one execution time. Considering the synthetic big data of healthcare, we carried out the investigation to find that our proposed system using deep learning architecture offers better optimization of computational resources. The study outcome is found to offer comparatively better response time and higher accuracy rate as compared to existing optimization technqiues that is found and practiced widely in literature.
This document discusses trends in big data analytics. It covers the rapidly increasing scale of data from sources like social media, sensors, and scientific experiments. This poses challenges for data storage, processing, and analysis. Current hardware platforms for analytics include distributed data centers with compute and storage resources. Emerging hardware trends involve using non-volatile memory technologies closer to processors to improve performance and efficiency. Software systems need to be highly scalable and support a variety of analytic workloads. Overall, big data analytics requires new techniques and architectures to effectively handle massive, diverse datasets.
Big data is a prominent term which characterizes the improvement and availability of data in all three
formats like structure, unstructured and semi formats. Structure data is located in a fixed field of a record
or file and it is present in the relational data bases and spreadsheets whereas an unstructured data file
includes text and multimedia contents. The primary objective of this big data concept is to describe the
extreme volume of data sets i.e. both structured and unstructured. It is further defined with three “V”
dimensions namely Volume, Velocity and Variety, and two more “V” also added i.e. Value and Veracity.
Volume denotes the size of data, Velocity depends upon the speed of the data processing, Variety is
described with the types of the data, Value which derives the business value and Veracity describes about
the quality of the data and data understandability. Nowadays, big data has become unique and preferred
research areas in the field of computer science. Many open research problems are available in big data
and good solutions also been proposed by the researchers even though there is a need for development of
many new techniques and algorithms for big data analysis in order to get optimal solutions. In this paper,
a detailed study about big data, its basic concepts, history, applications, technique, research issues and
tools are discussed.
Big data is a prominent term which characterizes the improvement and availability of data in all three formats like structure, unstructured and semi formats. Structure data is located in a fixed field of a record or file and it is present in the relational data bases and spreadsheets whereas an unstructured data file includes text and multimedia contents. The primary objective of this big data concept is to describe the extreme volume of data sets i.e. both structured and unstructured. It is further defined with three “V” dimensions namely Volume, Velocity and Variety, and two more “V” also added i.e. Value and Veracity. Volume denotes the size of data, Velocity depends upon the speed of the data processing, Variety is described with the types of the data, Value which derives the business value and Veracity describes about the quality of the data and data understandability. Nowadays, big data has become unique and preferred research areas in the field of computer science. Many open research problems are available in big data and good solutions also been proposed by the researchers even though there is a need for development of many new techniques and algorithms for big data analysis in order to get optimal solutions. In this paper, a detailed study about big data, its basic concepts, history, applications, technique, research issues and tools are discussed.
Similar to Next generation big data analytics state of the art (20)
Satta matka fixx jodi panna all market dpboss matka guessing fixx panna jodi kalyan and all market game liss cover now 420 matka office mumbai maharashtra india fixx jodi panna
Call me 9040963354
WhatsApp 9040963354
SATTA MATKA DPBOSS KALYAN MATKA RESULTS KALYAN CHART KALYAN MATKA MATKA RESULT KALYAN MATKA TIPS SATTA MATKA MATKA COM MATKA PANA JODI TODAY BATTA SATKA MATKA PATTI JODI NUMBER MATKA RESULTS MATKA CHART MATKA JODI SATTA COM INDIA SATTA MATKA MATKA TIPS MATKA WAPKA ALL MATKA RESULT LIVE ONLINE MATKA RESULT KALYAN MATKA RESULT DPBOSS MATKA 143 MAIN MATKA KALYAN MATKA RESULTS KALYAN CHART
SATTA MATKA DPBOSS KALYAN MATKA RESULTS KALYAN MATKA MATKA RESULT KALYAN MATKA TIPS SATTA MATKA MATKA COM MATKA PANA JODI TODAY BATTA SATKA MATKA PATTI JODI NUMBER MATKA RESULTS MATKA CHART MATKA JODI SATTA COM INDIA SATTA MATKA MATKA TIPS MATKA WAPKA ALL MATKA RESULT LIVE ONLINE MATKA RESULT KALYAN MATKA RESULT DPBOSS MATKA 143 MAIN MATKA KALYAN MATKA RESULTS KALYAN CHART KALYAN CHART
High-Quality IPTV Monthly Subscription for $15advik4387
Experience high-quality entertainment with our IPTV monthly subscription for just $15. Access a vast array of live TV channels, movies, and on-demand shows with crystal-clear streaming. Our reliable service ensures smooth, uninterrupted viewing at an unbeatable price. Perfect for those seeking premium content without breaking the bank. Start streaming today!
https://rb.gy/f409dk
The Role of White Label Bookkeeping Services in Supporting the Growth and Sca...YourLegal Accounting
Effective financial management is important for expansion and scalability in the ever-changing US business environment. White Label Bookkeeping services is an innovative solution that is becoming more and more popular among businesses. These services provide a special method for managing financial duties effectively, freeing up companies to concentrate on their main operations and growth plans. We’ll look at how White Label Bookkeeping can help US firms expand and develop in this blog.
The report *State of D2C in India: A Logistics Update* talks about the evolving dynamics of the d2C landscape with a particular focus on how brands navigate the complexities of logistics. Third Party Logistics enablers emerge indispensable partners in facilitating the growth journey of D2C brands, offering cost-effective solutions tailored to their specific needs. As D2C brands continue to expand, they encounter heightened operational complexities with logistics standing out as a significant challenge. Logistics not only represents a substantial cost component for the brands but also directly influences the customer experience. Establishing efficient logistics operations while keeping costs low is therefore a crucial objective for brands. The report highlights how 3PLs are meeting the rising demands of D2C brands, supporting their expansion both online and offline, and paving the way for sustainable, scalable growth in this fast-paced market.
During the budget session of 2024-25, the finance minister, Nirmala Sitharaman, introduced the “solar Rooftop scheme,” also known as “PM Surya Ghar Muft Bijli Yojana.” It is a subsidy offered to those who wish to put up solar panels in their homes using domestic power systems. Additionally, adopting photovoltaic technology at home allows you to lower your monthly electricity expenses. Today in this blog we will talk all about what is the PM Surya Ghar Muft Bijli Yojana. How does it work? Who is eligible for this yojana and all the other things related to this scheme?
❼❷⓿❺❻❷❽❷❼❽ Dpboss Matka Result Satta Matka Guessing Satta Fix jodi Kalyan Final ank Satta Matka Dpbos Final ank Satta Matta Matka 143 Kalyan Matka Guessing Final Matka Final ank Today Matka 420 Satta Batta Satta 143 Kalyan Chart Main Bazar Chart vip Matka Guessing Dpboss 143 Guessing Kalyan night
Efficient PHP Development Solutions for Dynamic Web ApplicationsHarwinder Singh
Unlock the full potential of your web projects with our expert PHP development solutions. From robust backend systems to dynamic front-end interfaces, we deliver scalable, secure, and high-performance applications tailored to your needs. Trust our skilled team to transform your ideas into reality with custom PHP programming, ensuring seamless functionality and a superior user experience.
Prescriptive analytics BA4206 Anna University PPTFreelance
Business analysis - Prescriptive analytics Introduction to Prescriptive analytics
Prescriptive Modeling
Non Linear Optimization
Demonstrating Business Performance Improvement
SATTA MATKA DPBOSS KALYAN MATKA RESULTS KALYAN CHART KALYAN MATKA MATKA RESULT KALYAN MATKA TIPS SATTA MATKA MATKA COM MATKA PANA JODI TODAY BATTA SATKA MATKA PATTI JODI NUMBER MATKA RESULTS MATKA CHART MATKA JODI SATTA COM INDIA SATTA MATKA MATKA TIPS MATKA WAPKA ALL MATKA RESULT LIVE ONLINE MATKA RESULT KALYAN MATKA RESULT DPBOSS MATKA 143 MAIN MATKA KALYAN MATKA RESULTS KALYAN CHART
Satta Matka Dpboss Kalyan Matka Results Kalyan Chart
Next generation big data analytics state of the art
1. 1551-3203 (c) 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TII.2017.2650204, IEEE
Transactions on Industrial Informatics
> REPLACE THIS LINE WITH YOUR PAPER IDENTIFICATION NUMBER (DOUBLE-CLICK HERE TO EDIT) < 1
Abstract—The term big data occurs more frequently now than
ever before. A large number of fields and subjects, ranging from
everyday life to traditional research fields (i.e., geography and
transportation, biology and chemistry, medicine and
rehabilitation) involve big data problems. The popularizing of
various types of network has diversified types, issues, and
solutions for big data more than ever before. In this paper, we
review recent research in data types, storage models, privacy, data
security, analysis methods, and applications related to network
big data. Finally, we summarize the challenges and development
of big data to predict current and future trends
Index Terms—Big data, Network, Massive Data
I. INTRODUCTION
e are living in an era of big data—an age characterized
by fast collection of ubiquitous information. Big data
incorporates endless amounts of information [56]. In
many industries, it is growing, providing a means to improve
and streamline business. Many fields and sectors, ranging from
economic and business activities to public administration, from
national security to scientific research in many areas, are
involved in big data problems [59]. Big data has changed the
world in terms of predicting customer behavior. The birth of big
data cannot avoid mentioning another current popular
term—social networks—and the relation between the two is
obvious, yet complicated. Big data and social networks are
interdependent, because most of today’s data are generated
from social networking sites, but big data is not always useful.
The actual challenge of big data is not in collecting it, but in
managing it as well as making sense of it [63]. When we work
on big data, it is crucial to determine whether the benefits
outweigh the costs of storage and maintenance. Several tools
are being designed to better understand the role of huge
amounts of data in improving business. Researchers and
practitioners are trying to look into the future of big data to
extract more benefits.
Big data is used in several research areas related to
healthcare, location-based services, satellite information usage,
online advertising, and retail marketing. In the coming years,
the Internet of Things (IoT) will increase the amount of data in
the world, and an exponential rise in big data will be seen [1].
Zhihan Lv is with the Virtual Environments and Computer Graphics
group at University College London, UK (e-mail: lvzhihan@gmail.com).
Houbing Song is with the Department of Electrical and Computer
Engineering, West Virginia University, Montgomery, WV, USA (e-mail:
h.song@ieee.org).
Pablo Basanta-Val is with the Telematics Engineering Department at
Universidad Carlos III de Madrid, Spain (e-mail: pbasanta@it.uc3m.es).
Anthony Steed is with the Department of Computer Science,
University College London, UK (A.Steed@ucl.ac.uk).
Minho Jo (Corresponding Author) is with the Department of
Computer Convergence Software, Korea University, Sejong Metro City,
S. Korea (e-mail: minhojo@korea.ac.kr).
There have been a few review papers on big data in general [2]
[18] [26] [56] [59] [63] [66] [67] and in diverse specialized
fields, e.g., biology [61], healthcare [62], geography [65], and
the Internet [60]. Unlike previous reviews in the literature, this
paper looks at recent advances in big data from a different view
and with different classifications, i.e., data types, storage
models, analysis models, privacy, data security, and
applications, as shown in Figure 1.
Fig.1. Classification of big data literature
The rest of the paper is organized as follows. Section II presents
data types. Section III gives the storage model. Section IV
discusses the privacy and data security. Section V and Section
VI present various big data analysis methods, and applications,
respectively. Section VII concludes this paper with future
research topics.
II. DATA TYPES
The era of big data has produced a variety of datasets from
different sources in different domains. These datasets consist of
multiple modalities, each of which has a different
representation, distribution, scale, and density. How to unlock
the power of knowledge from disparate datasets is of
paramount importance in big data research, essentially
distinguishing big data from traditional data mining tasks [45].
A. Online Network Data
One of the main focuses of network big data is online social
network (OSN) data, such as Facebook [58] and Second Life
[68]. This focus has expanded with developments in data
analysis.
Many studies have been performed with online social
networks using knowledge about representative characteristics
at the macro level, for instance, small world features. However,
factors for the features of potential micro-processes are not well
represented in these studies. The simplification of math expo is
in general viewed as a model by scholars. One study adheres to
an additional strategy that results in the microscopic process of
knowledge acting in accordance with real online actions [2].
Next-Generation Big Data Analytics: State of the
Art, Challenges, and Future Research Topics
W
2. 1551-3203 (c) 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TII.2017.2650204, IEEE
Transactions on Industrial Informatics
> REPLACE THIS LINE WITH YOUR PAPER IDENTIFICATION NUMBER (DOUBLE-CLICK HERE TO EDIT) < 2
Knowledge in this respect can be applied to carry out the choice
of easily handled mathematical models in the generation and
transformation of online networks.
Another study proposes a method for judging the intensity
and category of social relations by relying on their spatial–
temporal interrelations [5]. The result suggests that the method
discussed in the paper is capable of successfully recognizing
different social relations between different substances. With the
nonstop development of social networks, more and more
sociologists have become involved in the study of big data.
B. Mobile and IoT Data
Another trend in network big data is the analysis of mobile
and IoT data.
With the development of 5G technology, converged mobile
networks have resulted in significant improvements in
machine-to-machine communications performance. Integrated
mobile webs share unlicensed spectrum bands in cellulite
networks, such as Long Term Evolution-Advanced, by using
cognitive radio technology. This network generates large
volumes of data, compared to former mobile networks [3].
In addition to the increased volume of mobile data, the IoT
also generates large amounts of data in this new context.
Despite this large volume of data, the sensing elements of
wireless body area networks (WBANs) to a certain degree
restricts power use [25][54]. The majority of researchers attach
importance to energy efficiency in media access control (MAC)
agreements in lengthening the lifetime of the sensors. One
study addresses the recognition of classifications of power
consumption attacks in MAC agreements in WBANs. It
describes the straightforward operation of the attacks, resulting
in power consumption in a variety of MAC agreements [4].
This work is a good reference for research on the power
efficiency of MAC agreements in WBANs of the future.
Understanding the connection and interaction of mobile
OSN data has been continuously broadened, although network
big data in the IoT is a relatively new field. The data structure of
the analysis is more likely to be Not Only Structured Query
Language (NoSQL), which is adopted by many IoT systems.
Some studies design the expected functions of a big loader and
a convenient loading NoSQL system. The system allows the
standard conceptual program to be loaded and lets the standard
sources from which the data are supposed to be collected meet
its requirements; finally, this study provides feasible strategies
for the choice of NoSQL system where the conceptual program
can be arranged well [6].
C. Geography Data
OSN data will soon include geographic data along with OSN
interaction, for example, geo-tag real-time geographic data [65].
Location-based data will soon expand beyond terrain.
One study addresses the gauntlets of major forms of
technology for three-dimensional (3D) interaction and
volume-rendering technology on the basis of GPU technology.
This work explores visual software for the hydrological
environment based on data orientation. In addition, it produces
ocean plans, contour mapping of surfaces, element field
mapping, and dynamic simulation of the existing field [7]. To
better present features in space and achieve real-time upgrading
of a large amount of hydrological environment data, the study
constructs nodes on the spot for the control of geometry to
achieve dynamic mapping of high properties.
With full-speed development of the 3D digital city, research
has shifted from model establishment of the 3D city and the
setting up of geo-databases to 3D geo-database services and
maintenance. There is also a paper putting forward an
event-driven spatiotemporal database model in which 3D city
models from the past and present are connected to sentence
meaning and representation of the state, touched off by
changing events defined in advance [8]. In addition, the
dissertation also presents a homologous dynamic renewal
method taking an adaptive matching algorithm as the premise.
The objective, depending on the compound matching of
sentence-meaning, properties, and positions in space, is to carry
out dynamic renewal for complex 3D city models without being
controlled by others.
D. Spatial Temporal Data
Accompanied by online streaming services, network big data
changes from simple OSN data to spatial–temporal data. In
recent years, the volume of available data from space has
increased substantially. For example, in November 2013,
NASA announced the release of hundreds of terabytes of Earth
remote-sensing datasets.
Data are classified in many categories on the basis of features
and differences. Since the differences in data determine the
success of the analysis, they play an essential role. Different
features are also applied to search for the same features. Some
studies attach importance to time-changing data and data with
time sequences. With respect to network big data, some
same-feature searching methods for time-changing data were
discussed and explored [12].
Data in large databases can be retrieved by data mining. In
the case of time-changing data, when time becomes connected,
the data are mined in terms of both time and space. The
exploration of data mining in terms of both time and space has
had a great influence on the study of data derived from mobile
devices [9]. Reality mining is the exploration of social behavior
according to data retrieved from mobile phones. That is to say,
it depends on the data collected by sensors in mobile phones,
security cameras, RFID readers, and so on. NetViz, Gephi, and
Weka have been applied in the conversion and analysis of
Facebook data.
Spatial big data, in general, is involved in vector data, raster
data and network data. The difficulty with using databases from
the perspective of space is that there are many obstacles at the
collateral level. One study reviewed the data algorithms that are
in general use, in particular for cases where the number of
satellite images continuously increases [10]. It proposes a
system under which Hadoop is employed to realize a
MapReduce model, thus making it possible to enhance the
category of large-scale remote sensing images and magnify the
power of big data as applied to space. The research holds that a
novel system architecture is capable of providing support for
3. 1551-3203 (c) 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TII.2017.2650204, IEEE
Transactions on Industrial Informatics
> REPLACE THIS LINE WITH YOUR PAPER IDENTIFICATION NUMBER (DOUBLE-CLICK HERE TO EDIT) < 3
alternate data and data in space [11]. This is shown as follows:
(1) the study represents an analysis framework for expanding
data, (2) it illustrates novel structures and algorithms that
leverage modern hardware, such as Solid State Drives(SSD),
and (3) it extends database systems in space, thus lending
support for an evaluation in space through three novel
constituents.
E. Streaming and Real-time Data
Accompanied by the rise in online streaming services,
network big data has evolved from spatial–temporal data to
real-time spatial–temporal data.
Network surveys in general require ongoing data analysis
owing to constant renewal of reports and statistics over
large-capacity data streams. In one study, researchers
introduced DBStream, a system based on SQL that relies on
surveys for continuous data analysis [15]. They also discuss the
respective properties of DBStream and the collateral data
handling engine Spark. It is suggested that, on some occasions,
a single DBStream node is capable of surpassing a set of 10
Spark nodes due to the renewed network survey capacity.
The epoch of big data has begun, and much of the data are
used to analyze the risks of a variety of industrial applications.
There are technological trials in the collection of big data in a
complex indoor industrial environment. Indoor wireless sensor
network (WSN) technology is capable of overcoming such
restrictions by gathering big data obtained from source nodes.
The data are transferred to a data center, at present. In the study,
representative housing, bureaus, and manufacturing
environments were selected [13]. Through analysis of tested
data, it is possible to obtain signal transferring features of an
indoor WSN. On the basis of these features, a big data
collection algorithm that relies on an indoor WSN was put
forward for the analysis of industrial risk processing.
City traffic also changes in real time. Traffic data are
regarded as worthwhile resources in networks of vehicle.
Highlighting the significance of a survey of big data, an
effective framework was put forward for current-time network
data in vehicle networks [14]. The system, in fact, reflects the
newest trends in big-data paradigms. The framework put
forward is composed of concentrated data memory principles
for a series of processes, and dispersed data memory principles
for stream processing in real time.
The present big data streams from social networks and other
associated sensor networks display the potential for relying on
each other, thus enabling a special approach to the analysis of
extended figures. Data from these figures are often gathered
from data servers in various geographic locations, making it
appropriate for dispersed handling in the cloud. While many
measurements for large-scale immobile figure analysis have
been brought forward, providing current-time analysis of the
dynamics of social correlations requires novel methods that
leverage increased stream handling and figure analysis in
flexible cloud environments. Agnihotri and Sharma [16] put
forward feasible measurement that depends on a stream
handling engine referred to as Floe; on top of this framework, it
is possible to implement current-time data handling and figure
renewal to carry out analysis of figures with low delay in
large-scale, constantly changing social networks. The scope
contains multiple fields, involving supervision, anti-terrorist
applications, and public health supervision.
Currently, space-borne sensors channel nearly constant
streams of Earth-survey datasets. These tremendous
multi-modal streams increase at a rapid rate, presently reaching
several petabytes of satellite files. An extended platform for
both geography and space was devised, developed, and
assessed for online and current-time gains of worthwhile
content from big Earth-survey data [15]. The key features of the
analysis platform are the Rasdaman Array Database
Management System for big raster data memory, and the Open
Geospatial Consortium Web Coverage Processing Service for
data inquiry. The system was verified for self-acting handling
of high resolving–power satellite data, as well as for major
geographic and dimensional, environmental, agricultural, and
water engineering use, e.g., precise agriculture, water quality
control, and land-cover mapping.
Besides, there are some pieces of work that may inspire
research in their related fields [69-74].
F. Visual Data
In the era of big data, ever increasing amounts of image data
have posed significant challenges to modern image analysis
and retrieval. Wu et al. [47] proposed weakly semi-supervised
deep learning for the multi-label image annotation (WeSed)
approach, which was inspired by recent advances in deep
learning research. In WeSed, a novel weakly weighted pairwise
ranking loss is effectively utilized to handle weakly labeled
images, while a triplet similarity loss is employed to harness
unlabeled images. WeSed enables users to train a deep
convolutional neural network (CNN) with images from social
networks, where images are either only weakly labeled, have
several labels, or are unlabeled. An efficient algorithm was also
designed to sample high-quality image triplets from large
image datasets to fine-tune the CNN.
It is of great importance to efficiently and effectively index
images with semantic keywords, particularly when confronted
with the fast-evolving properties of the web. Yang et al. [48]
proposed an unsupervised hashing approach, namely, robust
discrete hashing (RDSH), to facilitate large-scale semantic
indexing of image data. Specifically, RDSH simultaneously
learns discrete binary codes as well as robust hash functions
within a unified model. Extensive experiments were conducted
on various real-world image datasets to show its superiority to
state-of-the-art approaches in large-scale semantic indexing.
G. Challenges in Data
Each different data domain raises particular challenges that,
properly addressed, may have an important impact on
next-generation big data systems. In the first place, online
network data is still waiting for better models, with increased
support from sociologists. Mobile data and the IoT, which are
generating large amounts of data, would benefit from the
adoption of a big data infrastructure able to store and process
information in current IoT infrastructures. As for geography
data, a major trend seems to be to offer efficient integration
4. 1551-3203 (c) 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TII.2017.2650204, IEEE
Transactions on Industrial Informatics
> REPLACE THIS LINE WITH YOUR PAPER IDENTIFICATION NUMBER (DOUBLE-CLICK HERE TO EDIT) < 4
among geographic data with records from the OSN domain,
demanding efficient infrastructures able to meet the speed
requirements typical of these domains. One challenge imposed
by spatial data is the definition of proper mining algorithms that
can be applied to special data; those algorithms could benefit
from more efficient time-changing data. Besides, streaming and
real-time data challenge current infrastructures, transforming
current offline applications into online ecosystems, thus
requiring the development of new algorithms that take into
account offline and online data. Lastly, the ever-increasing
amounts of image data challenge current learning algorithms to
extract information, and also demand new algorithms to
semantically classify and index images.
In all these cases, all approaches would benefit from increased
performance in big data infrastructures. By increasing
efficiency in the different data domains, one can see how the
amount of functionality improves; this is of special interest for
the next generation of big data systems, which should integrate
online and offline data efficiently.
III. STORAGE MODEL
In the era of big data, the most difficult problems that remain
to be solved are how to efficiently deal with large quantities and
varieties of data. There are many analytical theories and models.
In this section, recent discoveries in big data storage and
analysis models are surveyed.
The acquisition of voluminous data depends on a variety of
users and devices, as well as powerful data centers to store and
process the data. For this reason, establishing an unimpeded
network infrastructure is urgently needed; this infrastructure
would make it possible to gather geologically distributed and
rapidly generated data and send them to data centers for end
users. In one study, participants witnessed the various
challenges in establishing such a network framework [18]. This
study presents the components of the network that must be
established: networks relating to the original data, bridges to
connect and transmit them to data centers, and intricate
networks distributed within the data centers, as well as
independent data centers.
Another study mainly addressed possible issues when using
big data in specific locations, taking social network sites as one
example [19]. The survey shows that users do not choose data
arbitrarily when using this data network. On the contrary, most
consciously try to find a particular site from data centers. This
shows that one can identify the necessary data within a vast and
complicated network.
Load counterpoising technologies, like work usurping, play a
major role in dispersed assignment–arranging systems; these
systems possess various kinds of managers who determine the
arrangement result [20] and facilitate work usurping by
applying both contributed and borrowed alignments. Tasks are
divided into several queues according to size and site. Skills are
organized in MATRIX, a dispersed mission manager for
mission computing. The researchers leveraged dispersed
key-value memory to manage and measure the mission
metadata, mission reliability, and data locality.
A structure that has a dispersed memory level local to the
compute nodes was suggested [21]. This level is in charge of
the majority of input/output (I/O) handlings, and economizes
excessive amounts of data movement between compute and
memory origins. The study describes and implements a system
antetype of this structure, which requires the FusionFS
dispersed document system to sustain metadata-concentrated
and write-concentrated operation, both of which play an
essential role in I/O representation of scientific utilizations.
FusionFS was developed and assessed based on 16K compute
nodes of an IBM Blue Gene/P supercomputer; this suggests an
order of scale-representation enhancement over other document
systems, such as General Parallel File System (GPFS), Parallel
Virtual File System (PVFS), and Hadoop Distributed File
System (HDFS).
Another study shows the collection of dispersed key-value
memory mechanisms in clouds and supercomputers [43].
Specifically, ZHT, a zero-hop dispersed key-value memory
system, is shown rearranging the requests of advanced
computing systems. This system is meant to establish an
alternative to other systems, such as collateral and dispersed
document systems, dispersed working controlling systems, and
collateral planning systems. The study also mentions systems
that have incorporated ZHT in real applications, namely,
FusionFS (a distributed file system), IStore (a storage system
with erasure coding), MATRIX (distributed scheduling),
Slurm++ (distributed High Performance Computing(HPC) job
launch), and Fabriq (distributed message queue management).
Furthermore, it was also shown in another paper that certain
superior computing systems are apt to adopt ZHT for their
requests [22].
The absorbing ability with respect to data networks was
discussed in other papers [23]. The research explored
immediate aerial reception to effect immediate receiving
re-establishment using two procedure line programs to initiate a
multicast tree for a wireless multi-hop meshwork. Based on
different means, the research aims at making every site possess
a recognition capability, so all of them have the ability to
absorb the smallest conveying power through perceiving,
studying, behaving, and determining.
Peer-to-peer (P2P) overlay networks were also used in 3D
geographic big data searches [31]. Major achievements include
drawing geographic and virtual space based on P2P coverage of
the network space; and the spaces are classified by the
quaternary tree approach. The geographic code is determined
by hash value, being applied to index the user list, landform
information, and the message of the model target. It is possible
to devise and realize the sharing of data based on enhancement
of the Kademlia meshwork pattern. In this pattern, an XOR
algorithm is applied to calculate the range of cyberspace. The
pattern, to a certain degree, enhances the hit rate of
three-dimensional geographic data exploration under a P2P
coverage meshwork.
As for storage models, the main challenge seems to be to
efficiently deal with larger amounts of data. In most cases, the
lack of ultra-scalable solutions is hindering the processing of
other data sources, causing inefficiency. The challenge is to
build a more scalable big data technology able to offer data
5. 1551-3203 (c) 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TII.2017.2650204, IEEE
Transactions on Industrial Informatics
> REPLACE THIS LINE WITH YOUR PAPER IDENTIFICATION NUMBER (DOUBLE-CLICK HERE TO EDIT) < 5
gathering and distribution among nodes geographically
dispersed across the world. Any improvement in this challenge
is to going to have a direct impact on a variety of developed
applications, which otherwise may suffer from technological
constraints.
IV. PRIVACY AND DATA SECURITY
Because of the scope of big data, safety and privacy
protection is a crucial problem [53]. There may be risks of
privacy violations at each step. There are many methods for
privacy protection, e.g., encryption.
The popularity of big data depends on a complete
understanding of the safety problems inherent within the
system. Safety is a new concern, and this thesis mainly
introduces the concept of privacy using new problems, and
focuses on efficiency and privacy protection. This study
specializes in the structure of big data analytics, demonstrating
the requirements for privacy protection; in addition, it explains
the safety protection cosine similarity agreement in data mining
and requirements.
In body area networks, wearable sensors collect data that is
often sensitive and must be protected. Compared to former
methods, this method provides more reliable and available
privacy protection. The experiments prove that this method has
sufficient privacy protection, even when hackers have adequate
knowledge of the system.
System problems that are related to network intrusion
forecasts are discussed. The article focuses on problems dealing
with big data categories, employing the techniques of
geometric representation learning and modern networks. In
particular, to overcome network traffic problems, this thesis
focuses on the problems associated with the technologies of
supervised learning, representation learning, machine life-long
learning, and big data (Hive and Could, etc.).
An International Data Corporation survey showed that, in
2011, 1.8 trillion gigabytes of data were created and copied, and
that amount is duplicated every two years. In the coming
decade, the total amount of data center–managed information
will be 50 times larger; however, professional IT staff will only
grow by 1.5 times. Conventional tools are not able to process
and deal with the information contained in this amount of data,
nor can they ensure security.
Because of the competitive market, customer management
has become a means to achieving competitive advantage. The
present churn prediction models do not work efficiently in a big
data environment. Additionally, the deciders often face
inaccurate management. In response to these challenges, a
clustering algorithm referred to as semantic-driven was put
forward. The experiment results showed that semantic-driven
subtractive clustering method (SDSCM) possesses stronger
semantic strength than the subtractive and fuzzy means. Thus,
based on the Hadoop MapReduce framework, the SDSCM
algorithm was realized.
Current state-of-the-art applications would benefit from
definition of a clear hierarchy for data security and privacy,
with common off-the-shelf policies designed for big data.
Those policies should be able to efficiently deal with the
definition of what is allowed and what is not, and should take
into account diverse scenarios, ranging from BANs to large
data servers that process larger amounts of data. The
accomplishment of this goal in an organized way will promote
the adoption of high-quality big data systems and applications.
V. ANALYSIS METHODS
At present, the methods used for big data analysis are
MapReduce-related. For data control in the past, instruments
for analyzing data were insufficient in depot and exploring
systems. The models used by big data researchers are usually
inspired by mathematical ease of exposition [60]. By virtue of
the essence of big data, it is memorized in a dispersed document
system framework. Hadoop and HDFS by Apache are
extensively applied in memorizing and controlling big data.
The exploration of big data is fraught with obstacles, since it is
related to large dispersed document systems that are
supposedly featured by mistake endurance, agility, and the
ability to be extended [65]. MapReduce is extensively applied
for productivity exploration of big data [31]. Conventional
database management system technologies including Joins and
Searching and others, like drawing exploration, are utilized in
the division and integration of big data [27]. These technologies
are applied in MapReduce.
The authors of [28] suggests a variety of approaches to
address the issues that stem from a MapReduce structure over a
Hadoop Distributed File System. MapReduce is the technology
that utilizes document searching for drawing, classifying,
shuffling, and decreasing. Researchers have performed
research on MapReduce technologies for big data applications
based on HDFS [28]. One paper also studied MapReduce in a
mobile environment [29].
Another study presents a security framework for large-space
multimedia files [30]. A hybrid safety cloud memory structure
is proposed that relies on the Internet of Things. It applied the
idea of multimedia defense on the basis of role-visiting
domination. In addition, the study also made use of a program
that takes the integration of the multimedia data state and
role-visiting domination as the premise. The Internet of Things
is applied to assess whether circuits are associated and whether
the equipment is running naturally to improve visiting
efficiency, which ensures the safety of multimedia files.
Many researchers have attempted to produce systematic
ways of applying a wide spectrum of advanced machine
learning (ML) programs to industrial-scale problems. Xing et al.
proposed a general-purpose framework, Petuum, which
systematically addresses data- and model-parallel challenges in
large-scale ML by observing that many ML programs are
fundamentally optimization-centric and admit error-tolerant,
iterative-convergent algorithmic solutions [46]. This presents
unique opportunities for an integrative system design, such as
bounded-error network synchronization and dynamic
scheduling based on an ML program structure.
Here, one of the cornerstones is the MapReduce model that
offers support to most analyses. The MapReduce model was
designed to process offline data in a batch-processing engine;
however, next-generation analytics, which also have online
6. 1551-3203 (c) 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TII.2017.2650204, IEEE
Transactions on Industrial Informatics
> REPLACE THIS LINE WITH YOUR PAPER IDENTIFICATION NUMBER (DOUBLE-CLICK HERE TO EDIT) < 6
requirements, should benefit from a lightweight version of
MapReduce and the implantation of distributed stream
processing engines.
VI. APPLICATIONS
As data and interactions are generated in every form of
human behavior, big data is used in almost all aspects of life.
Big data increasingly benefits both research and industrial
fields, such as healthcare, financial services, and commercial
recommendations. Big data is used primarily to predict certain
messages, such as transient power [34] and stock prices [35].
A. Transient Power Prediction
The prediction of transient power is valid in both distributed
and streaming data. Machine learning was used in the study. In
the classifier cultivation stage, researchers regard the
tremendous amount of data from the past as a dispersed study
target, and establish evaluation principles regularly. Zhiwei et
al. [34] designed a naive Bayes–category approach based on
MapReduce handling, creating a map-and-decrease procedures
method for calculating the chance rate of being tested in
advance and the chance rate for conditions in dispersed means.
B. User Behavior Prediction
Many of the network big data predictions are based on data
from OSNs. Big data is used for predictions based on ranked
data, such as elections, car performance, and other areas in
business and politics. One study discussed modeling and
analysis approaches to democracy, as well as various cases of
big data from elections; scenarios in established democracies
such as the United States and Canada, and new democracies
such as Tunisia, were studied [36]. Another study gathered and
explored user practices on Facebook. The model is capable of
arranging entities with effectiveness and efficiency (for
example, presidential candidates, specialized sport groups, and
musical bands) according to their popularity [37].
C. Healthcare data storage and analysis
Big data in health and biology to tackle the challenges in new
models [61][62] is becoming significant. One study introduced
two uses of mHealth, which gathers electronic medical records
that are used for health services terminals [38]. One is a blended
system that enhances the user experience in high-pressured
oxygen halls using VR glasses, which creates the feeling of
being inside it. The other is a sound interaction game that is
used by patients as a possible measurement for supplementary
recovery tools. It is possible to analyze recordings of the sounds
made by patients to assess long-term recovery results and
further forecast the recovery process.
D. Content Recommendation
One study presents a movie recommendation system based on
scores provided by users. In view of the movie evaluation
system, the impacts of access control and multimedia security
are analyzed, and a secure hybrid cloud storage architecture is
presented [41]. Mobile-edge computing technology is used in
the public cloud, which guarantees high-efficiency
requirements for the transmission of multimedia content [55].
The processes of the system, including registration, user login,
role assignment, data encryption, and data decryption, are also
described.
Personalized travel sequence recommendation was proposed
in another study [49], which uses travelogues,
community-contributed photos, and heterogeneous metadata
(e.g., tags, geo-location, and date taken) associated with the
photos. This method is not only personalized to user travel
interests but is also able to recommend a travel sequence rather
than individual points of interest (POIs). To recommend
personalized POI sequences, first, well-known routes are
ranked according to similarity between user packages and route
packages. Then, top-ranked routes are further optimized by
similar user travel records. Representative images with
viewpoints and seasonal diversity of POIs are shown to offer a
more comprehensive impression.
E. Smart City
A 3D Shenzhen city web platform based on a network virtual
reality geographic information system (GIS) was put forward
[39]. A 3D worldwide browser is applied to load different kinds
of required data from a city, such as 3D construction model data,
inhabitants' messages, and traffic data from the past and present.
These data are used to analyze and visualize city information on
a 3D platform. A large number of messages are capable of
being visualized on this platform, and a navigational project,
taking the GIS as the premise, makes it possible to obtain a
variety of data sources that are securable.
The enhancive requirement for fluidity has resulted in great
changes in fundamental facilities in transportation [42].
Possessing certain features, such as a large scale, diversified
foreseeability, and timeliness, city traffic data represent the
scope of big data [40]. Traffic visual analysis systems based on
a virtual reality GIS represent the standard by which traffic data
are controlled and developed. Aside from the fundamental GIS
mutual functions, the system put forward also contains smart
functions for visual analysis and forecast accuracy.
There is also a study that addresses the concept of smart and
connected communities (SCCs), which are no longer solely
defined as smart cities [44][51][52]. Big data analytics in
cyber-physical systems, which are engineered systems that are
built from, and depend upon, the seamless integration of
computational algorithms and physical components, will
enable the move from the IoT to real-time control and towards
the SCC [49]. SCCs were conceived to represent earlier
requirements (e.g., protection and redevelopment) in a
cooperative way, and the requirements for current living
(habitability) and planning for the future (sustainability). In
consequence, the final objective of SCCs is to improve
habitability, protection, redevelopment, and the attainability of
a desirable society. This study uses mobile crowdsourcing and
cyber-physical cloud computing for these two essential IoT
technologies.
F. Challenges in Applications
There is also a common challenge in infrastructure-support
applications in terms of efficiency. The more efficient the
underlying infrastructure, the larger the number of facilities the
next generation will support. For all domains (power prediction,
user behavior, healthcare, content recommendation systems,
and the smart city), a more efficient infrastructure is crucial in
7. 1551-3203 (c) 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TII.2017.2650204, IEEE
Transactions on Industrial Informatics
> REPLACE THIS LINE WITH YOUR PAPER IDENTIFICATION NUMBER (DOUBLE-CLICK HERE TO EDIT) < 7
order to support efficient machine learning algorithms and to
develop new ones. The models may scale efficiently with the
amount of data represented in the big data ecosystem, as well as
with the algorithms in charge of offering enhanced
performance.
VII. FUTURE RESEARCH TOPICS
We reviewed data types, analysis methods, data security, and
applications related to network big data. This review shows that
the data retrieval process is focused more and more on
streaming and multiple sensor data. The analysis method
mainly relied on a variant of MapReduce and machine learning.
Data security is a potential problem in the era of big data.
The current research outcomes have indicated that data are no
longer just data [57][67][68]. The value in the updates in big
data lies in the data types, analysis algorithms, or new products.
In the past few years, the growth in big data has been closely
related to mobile and smart devices. The increasing popularity
of the Internet of Things has also generated new types of big
data, and various types of networking facilitate the
interconnection of multivariate networking data. The relevant
smart applications for big data have integrated media,
communications, social networking, and sensors. The
expectations for data collection are also getting critical, with
only useful data being collected to solve urgent issues. With
regard to the development of big data, current facilities have
provided greater convenience and mobility, allowing more
flexible and effective processes for terminal devices and
material collection. The digitization of various types of
information has led the circulation, exchange, processing, and
application of the information towards more organized
standards and structures. The application of data has become
more direct and is moving into real time. The digitized trades
have completely changed human trading behaviors and capital
flows, enabling trading data to be maintained and applied to
analyzing economic principles, as well as offering a reference
for future business model designs. The interactions between
humans and machines generate abundant big data, from which
the potential can be extracted to design a fit mode for human
lives. The cost of acquiring data has been lowered, benefiting
real-time collection and processing of big data, changing the
relationship between decision-making and information, and
increasing the chances of extracting the right model from the
data. Data technology has been widely accepted as an
optimization tool, or is intended for complete innovation. Data
collection, updates, recognition, and correlation will become
more automatic. Since a lot of countries have started to adopt
new data security technology and new data protection laws,
supervision of big data security will be stricter. As for data
security, the public is more concerned with protection of
personal privacy, rather than trade secrets. Besides,
governments enjoy having the most data (other than media and
social media apps), with their data covering resources, finance,
transportation, security, medical care, the environment, food,
and so on. The open data policies of governments matter
critically for the development of the entire data industry. All
points where big data lands are linked with the industries, and
those industries (fully influenced by the Internet), such as
finance, medical care, and e-commerce, can easily be digitized.
Big data has been gradually applied to seek solutions for each
industry.
ACKNOWLEDGMENT
This work was partially supported by “Open3D: Collaborative
Editing for 3D Virtual Worlds” (EPSRC (EP/M013685/1)) and
“Distributed Java Infrastructure for Real-Time Big-data”
(CAS14/00118). It was also partially funded by eMadrid
(S2013/ICE-2715), HERMES-SMARTDRIVER
(TIN2013-46801-C4-2-R), and AUDACity
(TIN2016-77158-C4-1-R).
REFERENCES
[1] Min, C., M. Shiwen, and L. Yunhao. "Big Data: A Survey. Mobile
Networks & Applications." Apr2014 19, no. 2: 171-209.
[2] Snijders, Chris, Uwe Matzat, and Ulf-Dietrich Reips. "" Big Data": big
gaps of knowledge in the field of internet science." International Journal
of Internet Science 7, no. 1 (2012): 1-5.
[3] Jo, Minho, Taras Maksymyuk, Rodrigo L. Batista, Tarcisio F. Maciel,
Andre LF de Almeida, and Mykhailo Klymash. "A survey of converging
solutions for heterogeneous mobile networks." IEEE Wireless
Communications 21, no. 6 (2014): 54-62.
[4] Jo, Minho, Longzhe Han, Nguyen Duy Tan, and Hoh Peter In. "A survey:
energy exhausting attacks in MAC protocols in WBANs."
Telecommunication Systems 58, no. 2 (2015): 153-164.
[5] Baratchi, Mitra, Nirvana Meratnia, and Paul JM Havinga. "On the use of
mobility data for discovery and description of social ties." Proceedings of
the 2013 IEEE/ACM International Conference on Advances in Social
Networks Analysis and Mining. ACM, 2013.
[6] Mesiti, Marco, and Stefano Valtolina. "Towards a user-friendly loading
system for the analysis of Big data in the internet of things." In Computer
Software and Applications Conference Workshops (COMPSACW), 2014
IEEE 38th International, pp. 312-317. IEEE, 2014.
[7] Silva, Thiago H., Pedro OS Vaz De Melo, Jussara M. Almeida, and
Antonio AF Loureiro. "Large-scale study of city dynamics and urban
social behavior using participatory sensing." IEEE Wireless
Communications 21, no. 1 (2014): 42-51.
[8] Guo, Han, Xiaoming Li, Weixi Wang, Zhihan Lv, Chen Wu, and Weiping
Xu. "An event-driven dynamic updating method for 3D geo-databases."
Geo-spatial Information Science (2016): 1-8.
[9] Su, Tianyun, Zhu Cao, Zhihan Lv, Chang Liu, and Xinfang Li.
"Multi-dimensional visualization of large-scale marine hydrological
environmental data." Advances in Engineering Software 95 (2016): 7-15.
[10] Jyoti Sunil, and Chelpa Lingam. "Reality mining based on Social
Network Analysis." In Communication, Information & Computing
Technology (ICCICT), 2015 International Conference on, pp. 1-6. IEEE,
2015.
[11] Chebbi, W. Boulila and I. R. Farah, "Improvement of satellite image
classification: Approach based on Hadoop/MapReduce," 2016 2nd
International Conference on Advanced Technologies for Signal and
Image Processing (ATSIP), Monastir, Tunisia, 2016, pp. 31-34.
[12] M. Sarwat, "Interactive and Scalable Exploration of Big Spatial Data -- A
Data Management Perspective," 2015 16th IEEE International
Conference on Mobile Data Management, Pittsburgh, PA, 2015, pp.
263-270.
[13] Ding, Xuejun, Yong Tian, and Yan Yu. "A real-time Big data gathering
algorithm based on indoor wireless sensor networks for risk analysis of
industrial operations." IEEE Transactions on Industrial Informatics 12, no.
3 (2016): 1232-1242.
[14] Simmonds, R. M., Paul Watson, Jonathan Halliday, and Paolo Missier.
"A Platform for Analysing Stream and Historic Data with Efficient and
Scalable Design Patterns." In 2014 IEEE World Congress on Services, pp.
174-181. IEEE, 2014.
[15] Bär, Arian, Alessandro Finamore, Pedro Casas, Lukasz Golab, and Marco
Mellia. "Large-scale network traffic monitoring with DBStream, a system
for rolling Big data analysis." In Big Data (Big Data), 2014 IEEE
International Conference on, pp. 165-170. IEEE, 2014.
[16] Agnihotri, Nishant, and Aman Kumar Sharma. "Proposed algorithms for
effective real time stream analysis in Big data." In 2015 Third
8. 1551-3203 (c) 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TII.2017.2650204, IEEE
Transactions on Industrial Informatics
> REPLACE THIS LINE WITH YOUR PAPER IDENTIFICATION NUMBER (DOUBLE-CLICK HERE TO EDIT) < 8
International Conference on Image Information Processing (ICIIP), pp.
348-352. IEEE, 2015.
[17] Jaradat, Shatha, Nima Dokoohaki, and Mihhail Matskin. "OLLDA: A
Supervised and Dynamic Topic Mining Framework in Twitter." In 2015
IEEE International Conference on Data Mining Workshop (ICDMW), pp.
151354-1359. IEEE, 2015.
[18] Yi, Xiaomeng, Fangming Liu, Jiangchuan Liu, and Hai Jin. "Building a
network highway for Big data: architecture and challenges." IEEE
Network 28, no. 4 (2014): 5-13.
[19] Hargittai, Eszter. "Is bigger always better? Potential biases of Big data
derived from social network sites." The ANNALS of the American
Academy of Political and Social Science 659, no. 1 (2015): 63-76.
[20] Wang, Ke, Xraobing Zhou, Tonglin Li, Dongfang Zhao, Michael Lang,
and Ioan Raicu. "Optimizing load balancing and data-locality with
data-aware scheduling." In Big Data (Big Data), 2014 IEEE International
Conference on, pp. 119-128. IEEE, 2014.
[21] Zhao, Dongfang, Zhao Zhang, Xiaobing Zhou, Tonglin Li, Ke Wang,
Dries Kimpe, Philip Carns, Robert Ross, and Ioan Raicu. "Fusionfs:
Toward supporting data-intensive scientific applications on extreme-scale
high-performance computing systems." In Big Data (Big Data), 2014
IEEE International Conference on, pp. 61-70. IEEE, 2014.
[22] Li, Tonglin, Xiaobing Zhou, Kevin Brandstatter, Dongfang Zhao, Ke
Wang, Anupam Rajendran, Zhao Zhang, and Ioan Raicu. "ZHT: A
light-weight reliable persistent dynamic scalable zero-hop distributed
hash table." In Parallel & distributed processing (IPDPS), 2013 IEEE 27th
international symposium on, pp. 775-787. IEEE, 2013.
[23] Kim, Jaein, Nacwoo Kim, Byungtak Lee, Joonho Park, Kwangik Seo, and
Hunyoung Park. "RUBA: Real-time unstructured network Big data
framework." In 2013 International Conference on ICT Convergence
(ICTC), pp. 518-522. IEEE, 2013.
[24] Lu, Rongxing, Hui Zhu, Ximeng Liu, Joseph K. Liu, and Jun Shao.
"Toward efficient and privacy-preserving computing in Big data era."
IEEE Network 28, no. 4 (2014): 46-50.
[25] Lin, Chi, Zihao Song, Houbing Song, Yanhong Zhou, Yi Wang, and
Guowei Wu. "Differential privacy preserving in Big data analytics for
connected health." Journal of medical systems 40, no. 4 (2016): 1-9.
[26] Suthaharan, Shan. "Big data classification: Problems and challenges in
network intrusion prediction with machine learning." ACM
SIGMETRICS Performance Evaluation Review 41, no. 4 (2014): 70-73.
[27] R. Gentz, H. T. Wai, A. Scaglione and A. Leshem, "Detection of data
injection attacks in decentralized learning," 2015 49th Asilomar
Conference on Signals, Systems and Computers, Pacific Grove, CA, 2015,
pp. 350-354.
[28] Arora, Saurabh, and Inderveer Chana. "A survey of clustering techniques
for network Big data." In Confluence The Next Generation Information
Technology Summit (Confluence), 2014 5th International Conference-,
pp. 59-65. IEEE, 2014.
[29] Manikandan, Shankar Ganesh, and Siddarth Ravi. "Network Big data
using apache hadoop." In IT Convergence and Security (ICITCS), 2014
International Conference on, pp. 1-4. IEEE, 2014.
[30] Xueqi, Li Guojie Cheng. "Research Status and Scientific Thinking of Big
Data [J]." Bulletin of Chinese Academy of Sciences 6 (2012): 000.
[31] Yin, Tengfei, Yong Han, Yong Chen, and Ge Chen. "WebVR——web
virtual reality engine based on P2P network." Journal of Networks 6, no. 7
(2011): 990-998.
[32] Elagib, Sara B., Atahur Rahman Najeeb, Aisha H. Hashim, and Rashidah
F. Olanrewaju. "Network Big data Solutions Using MapReduce
Framework." In Computer and Communication Engineering (ICCCE),
2014 International Conference on, pp. 127-130. IEEE, 2014.
[33] J. Wang, Y. Tang, M. Nguyen and I. Altintas, "A Scalable Data Science
Workflow Approach for Big Data Bayesian Network Learning," Big Data
Computing (BDC), 2014 IEEE/ACM International Symposium on,
London, 2014, pp. 16-25.
[34] Zhiwei, Huang, Gao Tian, Zhang Huaving, Han Xu, Cao Junwei, Hu
Ziheng, Yao Senjing, and Zhu Zhengguo. "Transient power quality
assessment based on network Big data." In 2014 China International
Conference on Electricity Distribution (CICED), pp. 1308-1312. IEEE,
2014.
[35] Skuza, Michał, and Andrzej Romanowski. "Sentiment analysis of Twitter
data within Big data distributed environment for stock prediction." In
Computer Science and Information Systems (FedCSIS), 2015 Federated
[36] Akaichi, Jalel. "The road to democracy: modeling and analysis of an
election Big data." In eDemocracy & eGovernment (ICEDEG), 2015
Second International Conference on, pp. 14-15. IEEE, 2015.
[37] Zhang, Kunpeng, Doug Downey, Zhengzhang Chen, Yusheng Xie, Yu
Cheng, Ankit Agrawal, Wei-keng Liao, and Alok Choudhary. "A
probabilistic graphical model for brand reputation assessment in social
networks." In Proceedings of the 2013 IEEE/ACM International
Conference on Advances in Social Networks Analysis and Mining, pp.
223-230. ACM, 2013.
[38] Lv, Zhihan, Javier Chirivella, and Pablo Gagliardo. "Bigdata Oriented
Multimedia Mobile Health Applications." Journal of medical systems 40,
no. 5 (2016): 1-10.
[39] Lv, Zhihan, Xiaoming Li, Baoyun Zhang, Weixi Wang, Yuanyuan Zhu,
Jinxing Hu, and Shengzhong Feng. "Managing big city information based
on WebVRGIS." IEEE Access 4 (2016): 407-415.
[40] Li, Xiaoming, Zhihan Lv, Weixi Wang, Baoyun Zhang, Jinxing Hu, Ling
Yin, and Shengzhong Feng. "WebVRGIS based traffic analysis and
visualization system." Advances in Engineering Software 93 (2016): 1-8.
[41] Yang, Jiachen, Huanling Wang, Zhihan Lv, Wei Wei, Houbing Song,
Melike Erol-Kantarci, Burak Kantarci, and Shudong He. "Multimedia
recommendation and transmission system based on cloud platform."
Future Generation Computer Systems (2016).
[42] Dimitrakopoulos, George, and Panagiotis Demestichas. "Intelligent
transportation systems." IEEE Vehicular Technology Magazine 5, no. 1
(2010): 77-84.
[43] Li, Tonglin, Xiaobing Zhou, Ke Wang, Dongfang Zhao, Iman Sadooghi,
Zhao Zhang, and Ioan Raicu. "A convergence of key‐value storage
systems from clouds to supercomputers." Concurrency and Computation:
Practice and Experience 28, no. 1 (2016): 44-69.
[44] Bi, Wenjie, Meili Cai, Mengqi Liu, and Guo Li. "A Big data clustering
algorithm for mitigating the risk of customer churn." IEEE Transactions
on Industrial Informatics 12, no. 3 (2016): 1270-1281.
[45] Zheng, Yu. "Methodologies for cross-domain data fusion: An
overview." IEEE transactions on Big Data 1.1 (2015): 16-34.
[46] Xing, E. P., Ho, Q., Dai, W., Kim, J. K., Wei, J., Lee, S., ... & Yu, Y.
(2015). Petuum: A new platform for distributed machine learning on Big
data. IEEE Transactions on Big Data, 1(2), 49-67.
[47] Wu, F., Wang, Z., Zhang, Z., Yang, Y., Luo, J., Zhu, W., & Zhuang, Y.
(2015). Weakly Semi-Supervised Deep Learning for Multi-Label Image
Annotation.IEEE Transactions on Big Data, 1(3), 109-122.
[48] Yang, Y., Shen, F., Shen, H. T., Li, H., & Li, X. (2015). Robust Discrete
Spectral Hashing for Large-Scale Image Semantic Indexing. IEEE
Transactions on Big Data, 1(4), 162-171.
[49] Jiang, S., Qian, X., Mei, T., & Fu, Y. (2016). Personalized Travel
Sequence Recommendation on Multi-Source Big Social Media. IEEE
Transactions on Big Data, 2(1), 43-56.
[50] H. Song, D. Rawat, S. Jeschke, C. Brecher, Cyber-Physical Systems:
Foundations, Principles and Applications, ISBN 978-0-12-803801-7,
Academic Press, Boston, MA, USA, 2016
[51] Y. Sun, H. Song, A. J. Jara, & R. Bie (2016). Internet of Things and Big
Data Analytics for Smart and Connected Communities. IEEE Access, 4,
766-773.
[52] H. Song, S. Ravi, T. Sookoor, S. Jeschke, Smart Cities:
Foundations, Principles and Applications, ISBN: 978-1119226390,
Wiley, Hoboken, NJ, USA, 2017.
[53] H. Song, G. Fink, S. Jeschke, Security and Privacy in Cyber-Physical
Systems: Foundations and Applications, Wiley, England, UK, 2017.
[54] Zhang, Y., Sun, L., Song, H., & Cao, X. (2014). Ubiquitous wsn for
healthcare: Recent advances and future prospects. IEEE Internet of
Things Journal, 1(4), 311-318.
[55] L. A. Tawalbeh, W. Bakheder and H. Song, "A Mobile Cloud Computing
Model Using the Cloudlet Scheme for Big Data Applications," 2016
IEEE First International Conference on Connected Health: Applications,
Systems and Engineering Technologies (CHASE), Washington, DC, 2016,
pp. 73-77.
[56] Lohr, S. (2012). The age of Big data. New York Times, 11.
[57] Lazer, D., Kennedy, R., King, G., & Vespignani, A. (2014). The parable
of Google flu: traps in Big data analysis. Science, 343(6176), 1203-1205.
[58] Menon, A. (2012, September). Big data@ facebook. In Proceedings of
the 2012 workshop on Management of Big data systems (pp. 31-32).
ACM.
[59] Chen, C. P., & Zhang, C. Y. (2014). Data-intensive applications,
challenges, techniques and technologies: A survey on Big
Data. Information Sciences, 275, 314-347.
[60] Snijders, C., Matzat, U., & Reips, U. D. (2012). " Big Data": big gaps of
knowledge in the field of internet science. International Journal of
Internet Science, 7(1), 1-5.
9. 1551-3203 (c) 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TII.2017.2650204, IEEE
Transactions on Industrial Informatics
> REPLACE THIS LINE WITH YOUR PAPER IDENTIFICATION NUMBER (DOUBLE-CLICK HERE TO EDIT) < 9
[61] Marx, V. (2013). Biology: The big challenges of Big
data. Nature, 498(7453), 255-260.
[62] Raghupathi, W., & Raghupathi, V. (2014). Big data analytics in
healthcare: promise and potential. Health Information Science and
Systems, 2(1), 1.
[63] Jagadish, H. V., Gehrke, J., Labrinidis, A., Papakonstantinou, Y., Patel, J.
M., Ramakrishnan, R., & Shahabi, C. (2014). Big data and its technical
challenges. Communications of the ACM, 57(7), 86-94.
[64] Hu, H., Wen, Y., Chua, T. S., & Li, X. (2014). Toward scalable systems
for big data analytics: A technology tutorial. IEEE Access, 2, 652-687.
[65] Kitchin, R. (2014). The real-time city? Big data and smart urbanism.
GeoJournal, 79(1), 1-14.
[66] Kambatla, K., Kollias, G., Kumar, V., & Grama, A. (2014). Trends in big
data analytics. Journal of Parallel and Distributed Computing, 74(7),
2561-2573.
[67] Fan, J., Han, F., & Liu, H. (2014). Challenges of big data
analysis. National science review, 1(2), 293-314.
[68] Boellstorff, T. (2015). Coming of age in Second Life: An anthropologist
explores the virtually human. Princeton University Press.
[69] Leonardo Aniello , Roberto Baldoni , Leonardo Querzoni, Adaptive
online scheduling in storm, Proceedings of the 7th ACM international
conference on Distributed event-based systems, June 29-July 03, 2013,
Arlington, Texas, USA [doi>10.1145/2488222.2488267]
[70] Pablo Basanta-Val, Norberto Fernández García, Andy J. Wellings, Neil C.
Audsley: Improving the predictability of distributed stream processors.
Future Generation Comp. Syst. 52: 22-36 (2015) .
[71] Pablo Basanta-Val, Marisol García-Valls: A Distributed Real-Time
Java-Centric Architecture for Industrial Systems. IEEE Trans. Industrial
Informatics 10(1): 27-34 (2014)
[72] Mariluz Congosto, Damaris Fuentes-Lorenzo, Luis Sánchez:
Microbloggers as Sensors for Public Transport Breakdowns. IEEE
Internet Computing 19(6): 18-25 (2015)
[73] P. Basanta-Val; N. C. Audsley; A. Wellings; I. Gray; N.
Fernandez-Garcia, "Architecting Time-Critical Big-Data Systems," in
IEEE Transactions on Big Data, vol.PP, no.99, pp.1-1. doi:
10.1109/TBDATA.2016.2622719
[74] T. Higuera-Toledano, "Java Technologies for Cyber-Physical Systems,"
in IEEE Transactions on Industrial Informatics , vol.PP, no.99, pp.1-1
doi: 10.1109/TII.2016.2630801
Zhihan Lv is an engineer and researcher in
virtual/augmented reality and multimedia,
with a major in mathematics and computer
science, having plenty of work experience
in virtual reality and augmented reality
projects, and is engaged in the application
of computer visualization and computer
vision. His research application fields range
widely, from everyday life to traditional research fields
(geography, biology, medicine). During the past few years, he
has successfully completed several projects on PCs, Websites,
Smartphones, and Smart glasses.
Houbing Song (M’12-SM’14) received his
PhD in electrical engineering from the
University of Virginia, Charlottesville, VA,
in 2012. In August 2012, he joined the
Department of Electrical and Computer
Engineering, West Virginia University,
Montgomery, WV, where he is currently
the Golden Bear Scholar, an Assistant
Professor, and the Founding Director of the Security and
Optimization for Networked Globe Laboratory (SONG Lab,
www.SONGLab.us). He is the editor of four books, including
Smart Cities: Foundations, Principles and Applications,
Hoboken, NJ: Wiley, 2017, Security and Privacy in
Cyber-Physical Systems: Foundations, Principles and
Applications, Chichester, UK: Wiley, 2017, Cyber-Physical
Systems: Foundations, Principles and Applications, Boston,
MA: Academic Press, 2016, and Industrial Internet of Things:
Cybermanufacturing Systems, Cham, Switzerland: Springer,
2016. He is the author of more than 100 articles. His research
interests lie in the Cyber-Physical Systems, Internet of Things,
edge computing, big data analytics, wireless communications,
and networking.
Pablo Basanta-Val is an Associate
Professor at the Telematics Engineering
Department at Universidad Carlos III de
Madrid, Spain. He is also a member of the
UC3M-BS Institute of Financial Big Data
(IFiBiD) at Universidad Carlos III de
Madrid, Spain. He was a member of the
Distributed Real-Time Systems Lab starting in 2001, and since
2014, he has been enrolled in the Web Semantic Lab. His main
interests are efficient big data and the design of efficient
middleware to support big data applications. He has
participated in a number of European projects, such as ARTIST,
ARTIST NoE, ARTIST Design, iLAND, as well as several
national projects. He collaborates in IEEE/ ACM/ Elsevier/
Wiley journals, such as IEEE Transactions on Industrial
Informatics, ACM Transactions on Embedded Systems
Computing, IEEE Transactions on Parallel and Distributed
Computing.
Anthony Steed is a Professor in the Virtual
Environments and Computer Graphics group
in the Department of Computer Science,
University College London. He is currently
head of the group. His research area is
real-time interactive virtual environments,
with a particular interest in mixed-reality
systems, large-scale models, and
collaboration between immersive facilities.
Minho Jo (SM’16) received a BA from the
Department of Industrial Engineering,
Chosun University, South Korea, in 1984,
and a PhD from the Department of Industrial
and Systems Engineering, Lehigh University,
USA, in 1994. He is currently a Professor
with the Department of Computer
Convergence Software, Korea University, Sejong Metro City,
South Korea. He is one of the founders of the Samsung
Electronics LCD Division. He is the Founder and
Editor-in-Chief of KSII Transactions on Internet and
Information Systems [SCI (ISI) and SCOPUS]. He served as
Editor of IEEE Network. He is an Editor of IEEE Wireless
Communications, and an Associate Editor of the IEEE
INTERNET OF THINGS JOURNAL, Security and
Communication Networks, Ad-Hoc & Sensor Wireless
Networks, and Wireless Communications and Mobile
Computing. He was Vice President of the Institute of
Electronics and Information Engineers (IEIE) and is now Vice
President of the Korean Society for Internet Information (KSII).
His current research of interests include IoT, big data in IoT, AI
in IoT, LET-Unlicensed, cognitive radio, mobile cloud
computing, network security, heterogeneous networks in 5G,
wireless energy harvesting, optimization in wireless
communications, and massive MIMO.