A tutorial on secure outsourcing of large scale computation for big data
SPECIAL SECTION ON THEORETICAL FOUNDATIONS FOR BIG DATA APPLICATIONS: CHALLENGES AND OPPORTUNITIES

Received March 10, 2016, accepted March 25, 2016, date of publication April 4, 2016, date of current version April 21, 2016.

Digital Object Identifier 10.1109/ACCESS.2016.2549982
A Tutorial on Secure Outsourcing of Large-scale Computations for Big Data

SERGIO SALINAS1, (Member, IEEE), XUHUI CHEN2, (Student Member, IEEE), JINLONG JI2, (Student Member, IEEE), AND PAN LI2, (Member, IEEE)

1Department of Electrical Engineering and Computer Science, Wichita State University, Wichita, KS 67260, USA
2Department of Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, OH 44106, USA

Corresponding author: P. Li (lipan@case.edu)

This work was supported by the Division of Computer and Network Systems through the U.S. National Science Foundation under Grant CNS-1149786 and CNS-1343220.
ABSTRACT Today's society is collecting a massive and exponentially growing amount of data that can potentially revolutionize scientific and engineering fields, and promote business innovations. With the advent of cloud computing, in order to analyze data in a cost-effective and practical way, users can outsource their computing tasks to the cloud, which offers access to vast computing resources on an on-demand and pay-per-use basis. However, since users' data contains sensitive information that needs to be kept secret for ethical, security, or legal reasons, many users are reluctant to adopt cloud computing. To this end, researchers have proposed techniques that enable users to offload computations to the cloud while protecting their data privacy. In this paper, we review the recent advances in the secure outsourcing of large-scale computations for big data analysis. We first introduce the two most fundamental and common computational problems, i.e., linear algebra and optimization, and then provide an extensive review of data privacy preserving techniques. After that, we explain how researchers have exploited the data privacy preserving techniques to construct secure outsourcing algorithms for large-scale computations.

INDEX TERMS Big data, privacy, cloud computing, linear algebra, optimization.
I. INTRODUCTION

The amount of data that is being collected by today's society is exponentially growing. The digital world will increase from 4.4 zettabytes (trillion gigabytes) in 2013 to 44 zettabytes by 2020, more than doubling every two years [1]. Such massive data holds the potential to rapidly advance scientific and engineering knowledge and promote business innovations [2]–[5]. For example, e-commerce companies can improve product recommendations by mining billions of customer transactions [6]; power engineers can monitor the electric grid in real time based on the enormous amount of sensor data [7]; and financial firms can increase the returns of their portfolios by analyzing the daily deluge of stock market data. Obviously, we have massive data in all these fields, which needs to be stored, managed, and, more importantly, analyzed. However, users face an enormous challenge in trying to analyze such huge amounts of data in a timely and cost-effective way.

Specifically, due to the limited computing and RAM (random access memory) capacity of traditional hardware, users are unable to analyze large-scale data sets in a feasible amount of time. In fact, analyzing massive data usually requires vast and sophisticated computing infrastructure. For example, governments and universities have successfully employed supercomputers to solve very heavy computing tasks, such as predicting climate change and simulating protein folding. Unfortunately, because of the high cost of supercomputers, e.g., hundreds of millions of dollars or even more, most users lack access to adequate computing facilities for big data applications. Even small in-house computing clusters are too expensive, and their computing resources may still be insufficient [8], [9].

Recently, cloud computing has been proposed as an efficient and economical way for resource-limited users to analyze massive data sets. In this computing paradigm, cloud users outsource their computing tasks to a cloud server [10]–[14], which contains a large amount of computing resources and offers them on an on-demand and pay-per-use basis [15]. Therefore, users can share the cloud's resources with each other, and avoid purchasing, installing, and maintaining sophisticated and expensive computing hardware and software. For instance, the video streaming company Netflix
employs the cloud to accommodate users' huge and highly dynamic demand in a cost-effective way [16].

Although users recognize the advantages of cloud computing, many of them are reluctant to adopt it due to privacy concerns [17]. More specifically, in many cases, users' data is very sensitive and should be kept secret from the cloud for ethical, security, or legal reasons [18]–[20]. For example, outsourcing product recommendations in e-commerce can give unauthorized access to users' shopping habits [21]; a power company's data may reveal the topology of the system, thus enabling attacks on the electric grid [22]; and outsourcing portfolio optimization may compromise financial firms' market research. In fact, users' private data is vulnerable to malicious cloud service providers, who can directly snoop into their users' data, and third-party adversaries, who can launch a number of attacks against the cloud [23]–[27]. Therefore, to enable scientists and engineers to revolutionize their fields through the analysis of large-scale data, it is important to design secure outsourcing tools that preserve their data privacy.
We notice that two types of fundamental mathematical computations frequently appear in large-scale data analytics and computing: linear algebra and optimization. In particular, linear algebra operations are arguably the most fundamental and frequently used computations in big data analysis. For example, least-squares estimation, which is widely used for linear regression, involves the solution of an overdetermined linear system of equations; and iterative solutions for non-linear systems, such as the Newton-Raphson method, usually employ solving a linear system of equations as a subroutine. Besides, to perform more sophisticated analysis, optimization is frequently used in big data applications. For example, scheduling routes in intelligent transportation systems can be done by solving a linear program; and support vector machines, a commonly used machine learning algorithm, employ both linear and quadratic programs.
In this tutorial, we review secure outsourcing techniques for large-scale data analytics and computing. Specifically, we first introduce linear algebra and optimization problems and their respective solution methods. We then review the most common privacy-preserving techniques for large-scale data sets, and explain how researchers have used them to securely outsource linear algebra and optimization computations.
The rest of this paper is organized as follows. In Section II, we introduce the two most fundamental computational problems employed in large-scale data analysis. In Section III, we introduce the cloud computing system architecture and the threat model. In Section IV, we present privacy-preserving techniques for secure outsourcing of large-scale computations. Section V explains how linear algebra and optimization computations can be outsourced to the cloud while preserving data privacy. We finally conclude this paper in Section VI.
II. FUNDAMENTAL COMPUTATIONAL PROBLEMS IN BIG DATA

In this section, we present the two most fundamental computational problems, i.e., linear algebra and optimization, employed in large-scale data analysis.
A. LINEAR ALGEBRA

Many problems that involve large-scale data analysis are fundamentally based on solving linear systems of equations (LSEs). For example, in image processing, power flow estimation, and portfolio optimization, least-squares problems can be reformulated as an LSE; in social network analysis and web search engine design, the eigenvector centrality problem is naturally posed as an LSE; and linear regression, which is a very common statistical analysis technique, can also be computed through solving an LSE. We formally define an LSE as follows:

    A x = b,    (1)

where A ∈ R^{m×n} is the coefficient matrix, x ∈ R^{n×1} is the solution vector, and b ∈ R^{m×1} is the constant vector. We assume that A is square, i.e., m = n, non-singular, symmetric, and positive-definite. In fact, given an arbitrary coefficient matrix A, the user can always find an equivalent LSE by employing Ā = A^T A as the coefficient matrix and b̄ = A^T b as the constant vector, where matrix Ā is guaranteed to be non-singular, symmetric, and positive-definite.
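To make the symmetrization step concrete, here is a minimal numpy sketch (our illustration, not code from the paper) that converts an overdetermined LSE into the equivalent square, symmetric, positive-definite system via Ā = A^T A and b̄ = A^T b:

import numpy as np

# Minimal sketch (ours, not from the paper): symmetrize a full-column-rank
# LSE A x = b into the equivalent system A_bar x = b_bar,
# with A_bar = A^T A and b_bar = A^T b.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))    # overdetermined: 5 equations, 3 unknowns
x_true = rng.standard_normal(3)
b = A @ x_true                     # consistent right-hand side

A_bar = A.T @ A                    # square, symmetric, positive-definite here
b_bar = A.T @ b

x = np.linalg.solve(A_bar, b_bar)  # recovers x_true
assert np.allclose(x, x_true)

Note that forming A^T A explicitly squares the condition number, so production code usually prefers a least-squares routine such as np.linalg.lstsq; the sketch only mirrors the reformulation described above.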
1) MATRIX INVERSION

To solve the large-scale LSEs in (1) by matrix inversion, a user first finds matrix A^{-1} such that

    A A^{-1} = I,    (2)

where I ∈ R^{n×n} is the identity matrix, and then computes x = A^{-1} b.

By computing A^{-1} once and storing it, the user can efficiently solve several LSEs that share the same coefficient matrix A, but have different constant vectors b. Besides, matrix inversion often appears as a stand-alone computation. For instance, in large-scale data visualization, matrix inversion is employed in 3-D rendering processes, such as screen-to-world ray casting and world-to-subspace-to-world object transformations [28].

Unfortunately, since the most efficient matrix inversion algorithms are based on matrix multiplications, the complexity of computing A^{-1} is too high for the user, particularly in big data applications.
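The reuse pattern described above can be sketched in a few lines of numpy (again our illustration, not the paper's code): the O(n^3) inversion is paid once, and each new constant vector costs only a matrix-vector product.

import numpy as np

# Minimal sketch (ours, not from the paper): pay the O(n^3) inversion once,
# then solve many LSEs sharing A at O(n^2) per new constant vector b.
rng = np.random.default_rng(1)
n = 4
M = rng.standard_normal((n, n))
A = M.T @ M + n * np.eye(n)        # symmetric positive-definite by construction

A_inv = np.linalg.inv(A)           # computed and stored once
for _ in range(3):
    b = rng.standard_normal(n)
    x = A_inv @ b                  # cheap per right-hand side
    assert np.allclose(A @ x, b)

In practice a stored factorization (e.g., scipy.linalg.cho_factor) is numerically safer than an explicit inverse, but the access pattern is the same.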
2) ITERATIVE SOLUTION METHODS

Iterative methods can solve LSEs with lower computational complexity by avoiding matrix multiplications. We briefly review two of the most frequently used iterative solution methods: the Jacobi method and the conjugate gradient method (CGM).
The main idea of the Jacobi method is to iteratively update the solution vector based on a transformed coefficient matrix. Specifically, we replace A with D + R in (1), where D and R are the matrices containing the diagonal and the off-diagonal elements of A, respectively. Then, by rearranging terms, we reach the following Jacobi iteration to solve (1), i.e.,

    x^{k+1} = T x^k + c,    (3)

where T = −D^{-1} R and c = D^{-1} b.

To carry out the Jacobi method, the user assigns random values to the initial solution vector x^0, and computes (3) until the following condition is met:

    ||x^k − x^{k+1}||_2 ≤ ε,    (4)

where || · ||_2 is the 2-norm and 0 < ε < 1 is the tolerance parameter. If the coefficient matrix A is diagonally dominant, the Jacobi iteration converges in K ≪ n iterations. Otherwise, the algorithm may diverge [29].
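A compact numpy sketch of iteration (3) with stopping rule (4) might look as follows (our illustration under the diagonally dominant assumption, not code from the paper):

import numpy as np

# Minimal sketch (ours, not from the paper): Jacobi iteration (3) with
# stopping rule (4); assumes A is diagonally dominant so it converges.
def jacobi(A, b, eps=1e-10, max_iter=10_000):
    d = np.diag(A)                     # diagonal of A, i.e., D
    R = A - np.diag(d)                 # off-diagonal elements of A
    T = -R / d[:, None]                # T = -D^{-1} R
    c = b / d                          # c = D^{-1} b
    x = np.zeros_like(b, dtype=float)  # initial guess (random also works)
    for _ in range(max_iter):
        x_next = T @ x + c
        if np.linalg.norm(x - x_next) <= eps:   # condition (4)
            return x_next
        x = x_next
    raise RuntimeError("Jacobi did not converge; is A diagonally dominant?")

A = np.array([[4.0, 1.0],
              [2.0, 5.0]])             # strictly diagonally dominant
b = np.array([1.0, 2.0])
assert np.allclose(jacobi(A, b), np.linalg.solve(A, b))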
Another iterative solution method is the conjugate gra-
dient method (CGM), which offers guaranteed convergence
if the coefficient matrix is symmetric and positive-definite.
To illustrate the main idea of the CGM, we first note that solv-
ing (1) is equivalent to solving the following unconstrained
optimization program
min f (x) =
1
2
x Ax − bx (5)
when A is nonsingular, symmetric and positive-definite [30].
The CGM employs a set of vectors $P = \{p_0, p_1, \dots, p_n\}$
that are conjugate with respect to $A$, that is, at iteration $k$ the
following condition is met:
$p_k^{\top}Ap_i = 0$ for $i = 0, \dots, k-1$. (6)
Using the conjugacy property of vectors in P, we can find
the solution in at most n steps by computing a sequence of
solution approximations as follows:
$x_{k+1} = x_k + \alpha_k p_k$ (7)
where $\alpha_k$ is the one-dimensional minimizer of (5)
along $x_k + \alpha p_k$. The minimizer $\alpha_k$ is given by
$\alpha_k = \frac{-r_k^{\top}p_k}{p_k^{\top}Ap_k}$ (8)
where $r_k = Ax_k - b$ is called the residual.
Moreover, we can iteratively find the residual based on (7)
as follows:
$r_{k+1} = Ax_{k+1} - b = A(x_k + \alpha_k p_k) - b = r_k + \alpha_k Ap_k$. (9)
The CGM finds a new conjugate vector $p_{k+1}$ at iteration $k$
by a linear combination of the negative residual, i.e., the
steepest descent direction of $f(x)$, and the current conjugate
vector $p_k$, that is,
$p_{k+1} = -r_{k+1} + \beta_{k+1}p_k$ (10)
where $\beta_{k+1}$ is chosen in such a way that $p_{k+1}$ and $p_k$ meet
condition (6).
Since $x_k$ minimizes $f(x)$ along $p_k$, it can be shown that
$r_k^{\top}p_i = 0$ for $i = 0, 1, \dots, k-1$ [30]. Using this fact and
equation (10), a more efficient computation for (8) can be
found, namely,
$\alpha_k = \frac{-r_k^{\top}(-r_k + \beta_k p_{k-1})}{p_k^{\top}Ap_k} = \frac{r_k^{\top}r_k}{p_k^{\top}Ap_k}$.
Similarly, using (9), we can find the formulation for $\beta_{k+1}$:
$\beta_{k+1} = \frac{r_{k+1}^{\top}r_{k+1}}{r_k^{\top}r_k}$.
The iterations stop when the following stopping criterion is met:
$r_k^{\top}r_k \le \nu\|b\|_2$, (11)
where $\nu$ is a tolerance value.
To summarize the above, the CGM algorithm is as follows.
At iteration $k = 0$, we have
$r_0 = Ax_0 - b$ (12)
$p_0 = -r_0$ (13)
$k = 0$ (14)
and at iteration $k \ge 0$ we have the following iterative
equations:
$\alpha_k = \frac{r_k^{\top}r_k}{p_k^{\top}Ap_k}$ (15)
$r_{k+1} = r_k + \alpha_k Ap_k$ (16)
$x_{k+1} = x_k + \alpha_k p_k$ (17)
$\beta_{k+1} = \frac{r_{k+1}^{\top}r_{k+1}}{r_k^{\top}r_k}$ (18)
$p_{k+1} = -r_{k+1} + \beta_{k+1}p_k$ (19)
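For concreteness, the following NumPy sketch (ours) implements iterations (12)-(19) directly and checks the result against a direct solver on a small symmetric positive-definite instance:

```python
import numpy as np

def conjugate_gradient(A, b, nu=1e-16):
    """CGM following (12)-(19); A must be symmetric positive-definite."""
    x = np.zeros(len(b))                 # initial guess x_0
    r = A @ x - b                        # (12): initial residual
    p = -r                               # (13): first conjugate direction
    for _ in range(len(b)):              # at most n steps in exact arithmetic
        if r @ r <= nu * np.linalg.norm(b):   # stopping rule (11)
            break
        Ap = A @ p
        alpha = (r @ r) / (p @ Ap)       # (15)
        x = x + alpha * p                # (17)
        r_new = r + alpha * Ap           # (16)
        beta = (r_new @ r_new) / (r @ r) # (18)
        p = -r_new + beta * p            # (19)
        r = r_new
    return x

rng = np.random.default_rng(2)
M = rng.standard_normal((50, 50))
A = M.T @ M + 50 * np.eye(50)            # symmetric positive-definite test matrix
b = rng.standard_normal(50)
print(np.allclose(conjugate_gradient(A, b), np.linalg.solve(A, b)))
```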
Compared to other methods, e.g., matrix inversion, Gaussian
elimination, and QR decomposition, the Jacobi iteration and
the CGM offer feasible algorithms for extremely large-scale
systems.
B. OPTIMIZATION
In large-scale data analysis, many computations can be
posed as an optimization problem that has a linear objec-
tive and affine constraints, i.e., a linear program (LP), or a
quadratic objective and affine constraints, i.e., a quadratic
program (QP). For example, in machine learning, support
vector machines employ both linear and quadratic pro-
grams [31]; in financial analysis, linear programs have been
successfully used to manage stock portfolios and evaluate
risk; and in power grids, linear optimization models are
used to control generation dispatch and perform real-time
monitoring [32]. In what follows, we introduce linear and
quadratic programs, and describe their respective Lagrange
dual problems.
1) LINEAR PROGRAM
An optimization problem with linear objective and linear
constraints, i.e., an LP, is defined as follows [33]:
$\min_x\ z = c^{\top}x$ (20a)
subject to $Ax = b$ (20b)
$x \ge 0$ (20c)
where $c \in \mathbb{R}^n$ is the cost vector and $x \in \mathbb{R}^n$ is the variable
vector. Matrix $A \in \mathbb{R}^{m \times n}$, vector $b$, and (20c) describe a
set of affine constraints. The user aims to find the optimal
solution $x^*$ such that the objective in (20a) is minimized and
the constraints (20b) and (20c) are satisfied. We assume that
$A$ is full-rank and the constraints are feasible.
Linear programs can be efficiently solved through iterative
solution methods such as the Simplex algorithm [33] and
interior point methods [34].
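As a small illustration (ours), the example below solves an instance of (20) with SciPy's linprog routine, assuming SciPy is available; the problem data are arbitrary:

```python
import numpy as np
from scipy.optimize import linprog

# Toy instance of (20): minimize c^T x  subject to  Ax = b, x >= 0.
c = np.array([1.0, 2.0, 0.0])
A = np.array([[1.0, 1.0, 1.0]])      # one equality constraint
b = np.array([1.0])

res = linprog(c, A_eq=A, b_eq=b, bounds=[(0, None)] * 3)
print(res.x, res.fun)                # optimal x* and objective z*
```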
2) QUADRATIC PROGRAM
A QP is an optimization problem that has a quadratic objec-
tive and affine constraints [34], i.e.,
$\min_x\ f(x) = \frac{1}{2}x^{\top}Qx - b^{\top}x$ (21a)
subject to $Ax \le c$ (21b)
where $x \in \mathbb{R}^{n \times 1}$ is the optimization variable, and $Q \in \mathbb{R}^{n \times n}$
and $b \in \mathbb{R}^{n \times 1}$ are the quadratic and affine coefficients,
respectively. The matrix $A \in \mathbb{R}^{m \times n}$ and the vector $c \in \mathbb{R}^{m \times 1}$
define the set of affine constraints. Similar to an LP,
the user intends to find the optimal solution $x^*$ such that the
objective in (21a) is minimized and the constraints (21b) are
satisfied. We assume that the coefficient matrix $Q$ is symmetric
and positive-definite, and hence the QP in (21) has a unique
solution. We also assume that $A$ is full-rank.
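For illustration (ours), the sketch below solves a small instance of (21) with the cvxpy modeling package, assuming it and a default QP solver are installed; the instance is constructed so that (21b) is feasible:

```python
import cvxpy as cp
import numpy as np

# Toy instance of (21): minimize (1/2) x^T Q x - b^T x  subject to  Ax <= c.
rng = np.random.default_rng(3)
n, m = 4, 6
M = rng.standard_normal((n, n))
Q = M.T @ M + n * np.eye(n)          # symmetric positive-definite, as assumed
b = rng.standard_normal(n)
A = rng.standard_normal((m, n))
x0 = rng.standard_normal(n)
c = A @ x0 + 1.0                     # built from a known point, so (21b) is feasible

x = cp.Variable(n)
objective = cp.Minimize(0.5 * cp.quad_form(x, Q) - b @ x)
problem = cp.Problem(objective, [A @ x <= c])
problem.solve()
print(x.value, problem.value)        # unique minimizer x* and optimal objective
```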
3) THE LAGRANGE DUAL PROBLEM
The Lagrange dual problem is an equivalent optimization
problem that is often easier to solve and provides useful
insights into the original problem. For example, as we will see
later, the Lagrange dual problem usually has a much simpler
constraint set that allows us to employ more efficient solution
methods. Therefore, we can solve the Lagrange dual problem
to either verify the solution to the original problem or solve
the original problem. In what follows, we derive the Lagrange
dual problems for both LPs and QPs.
The Lagrange Dual Problem for LPs: To find the dual
problem of (20), we first form the Lagrangian $L: \mathbb{R}^{n \times 1} \times \mathbb{R}^{m \times 1} \times \mathbb{R}^{n \times 1} \to \mathbb{R}$ as follows:
$L(x, \lambda, \nu) = c^{\top}x + \lambda^{\top}(Ax - b) - \nu^{\top}x$ (22)
where $\lambda \in \mathbb{R}^{m \times 1}$ and $\nu \in \mathbb{R}^{n \times 1}$ are the equality and inequality
dual variable vectors, respectively.
We define the Lagrange dual function $g: \mathbb{R}^{m \times 1} \times \mathbb{R}^{n \times 1} \to \mathbb{R}$
as the infimum value of (22) over $x$ (for any $\lambda \in \mathbb{R}^{m \times 1}$
and $\nu \in \mathbb{R}^{n \times 1}$), i.e.,
$g(\lambda, \nu) = \inf_x L(x, \lambda, \nu) = -b^{\top}\lambda + \inf_x (c + A^{\top}\lambda - \nu)^{\top}x$. (23)
To solve the Lagrange dual function, we observe that (23) is
bounded below when the coefficient of $x$ is equal to zero, and
otherwise it is unbounded, that is,
$g(\lambda, \nu) = \begin{cases} -b^{\top}\lambda, & A^{\top}\lambda - \nu + c = 0, \\ -\infty, & \text{otherwise.} \end{cases}$ (24)
By using the first case of (24) as the objective function, and
requiring $A^{\top}\lambda - \nu + c = 0$ and $\nu$ to be nonnegative, we arrive
at the dual problem of (20):
$\max_{\lambda,\nu}\ g = -b^{\top}\lambda$
subject to $A^{\top}\lambda - \nu + c = 0$
$\nu \ge 0$ (25)
which can be further simplified into
$\max_{\lambda}\ g = -b^{\top}\lambda$
subject to $A^{\top}\lambda + c \ge 0$. (26)
Since we have assumed that $A$ is full-rank and that the
affine constraints are feasible, strong duality holds [34]
and thus
$c^{\top}x^* = -b^{\top}\lambda^*$. (27)
The Lagrange Dual Problem for QPs: Since directly
solving the optimization problem in (21) requires expensive
solution methods such as interior point methods, it is often
preferable to solve the Lagrangian dual problem, which only
has non-negativity constraints and can be more efficiently
solved.
Following a similar procedure to the above, we first form
the Lagrangian $L: \mathbb{R}^{n \times 1} \times \mathbb{R}^{m \times 1} \to \mathbb{R}$ as follows [34]:
$L(x, \lambda) = f(x) + \lambda^{\top}(Ax - c)$ (28)
where $\lambda \in \mathbb{R}^{m \times 1}$ is the vector of dual variables, and $A$ and $c$
are defined in (21).
We define the Lagrange dual function $g: \mathbb{R}^{m \times 1} \to \mathbb{R}$
as the minimum value of the Lagrangian over $x$
(for $\lambda \in \mathbb{R}^{m \times 1}$), i.e.,
$g(\lambda) = \inf_x L(x, \lambda)$. (29)
To solve the Lagrange dual function, we take the derivative of
equation (28) with respect to $x$ and set it to zero as follows:
$x^{\top}Q - b^{\top} + \lambda^{\top}A = 0$ (30)
where we have used the definition of $f(\cdot)$ in (21a). Taking the
transpose of (30) and solving for $x$, we get
$x = Q^{-1}(b - A^{\top}\lambda)$. (31)
Then, by plugging (31) into (28) and considering that $Q$ is
symmetric, we can rewrite the Lagrange dual function as
$g(\lambda) = -\frac{1}{2}\lambda^{\top}AQ^{-1}A^{\top}\lambda - \lambda^{\top}(c - AQ^{-1}b) - \frac{1}{2}b^{\top}Q^{-1}b$. (32)
FIGURE 1. A typical architecture for secure outsourcing in big data analysis.
By using equation (32) as the objective function and requiring
$\lambda$ to be non-negative, we arrive at the dual problem of the
QP in (21), i.e.,
$\min_{\lambda}\ g(\lambda) = \frac{1}{2}\lambda^{\top}P\lambda + \lambda^{\top}r$ (33a)
subject to $\lambda \ge 0$ (33b)
where $P = AQ^{-1}A^{\top}$ is positive-definite and symmetric,
and $r = c - AQ^{-1}b$. We denote the solution to (33)
as $\lambda^*$.
Since the problem (21) is convex and the affine constraints
are feasible, strong duality holds [34] and we have
$x^* = Q^{-1}(b - A^{\top}\lambda^*)$. (34)
Moreover, to solve the optimization problem in (33),
we can use the Gauss-Seidel iteration as follows:
$\lambda_j(t+1) = \arg\min_{\lambda_j \ge 0}\ g(\lambda_1(t+1), \dots, \lambda_{j-1}(t+1), \lambda_j, \lambda_{j+1}(t), \dots, \lambda_m(t))$ (35)
where j ∈ [1, m]. Intuitively, the algorithm updates λj (for all
j ∈ [1, m]) one at a time, and uses the most recent updates as
they become available.
Let $(\lambda_1(t+1), \dots, \lambda_{j-1}(t+1), \lambda_j, \lambda_{j+1}(t), \dots, \lambda_m(t))$ be
denoted as $\lambda^{(j)}(t)$. We can solve equation (35) analytically by
taking the partial derivative of $g(\lambda^{(j)}(t))$ in (33a) with respect
to $\lambda_j$ and setting it to zero:
$\frac{\partial g(\lambda^{(j)}(t))}{\partial \lambda_j} = r_j + p_j\lambda^{(j)}(t) = 0$
where $r_j$ is the $j$th element of $r$, and $p_j$ is the $j$th row of $P$.
Then, the unconstrained minimum of $g(\cdot)$ along the
$j$th coordinate starting from $\lambda^{(j)}(t)$ is attained at
$\lambda_j(t+1) = \lambda_j(t) - \frac{1}{p_{j,j}}\left(r_j + p_j\lambda^{(j)}(t)\right) \quad \forall j \in [1, m]$
where t is the iteration index.
Thus, taking into account the non-negativity constraint
and holding all the other variables constant, we get that the
Gauss-Seidel iteration, when λj is updated, is
$\lambda_j(t+1) = \max\{0, \lambda_j(t+1)\} = \max\left\{0, \lambda_j(t) - \frac{1}{p_{j,j}}\left(r_j + p_j\lambda^{(j)}(t)\right)\right\}$, (36)
$\lambda_i(t+1) = \lambda_i(t) \quad \forall i \ne j$. (37)
After all the $\lambda_i(t+1)$'s ($1 \le i \le m$) are updated, the
Gauss-Seidel iteration proceeds to the next round. The iterations
continue until the following stopping criterion is satisfied:
$\|\lambda(t+1) - \lambda(t)\|_2 \le \epsilon$ (38)
where $0 < \epsilon < 1$ is the tolerance parameter.
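The projected Gauss-Seidel iteration (36)-(38) is simple to implement. The following NumPy sketch (ours) solves a small instance of (33):

```python
import numpy as np

def gauss_seidel_qp_dual(P, r, eps=1e-10, max_iter=10_000):
    """Projected Gauss-Seidel (36)-(37) for min (1/2) l^T P l + l^T r, l >= 0."""
    lam = np.zeros(len(r))
    for _ in range(max_iter):
        lam_old = lam.copy()
        for j in range(len(r)):            # one coordinate at a time, always
            step = (r[j] + P[j] @ lam) / P[j, j]   # using the freshest values
            lam[j] = max(0.0, lam[j] - step)
        if np.linalg.norm(lam - lam_old) <= eps:   # stopping rule (38)
            break
    return lam

rng = np.random.default_rng(4)
G = rng.standard_normal((5, 5))
P = G @ G.T + 5 * np.eye(5)                # positive-definite, as assumed in (33)
r = rng.standard_normal(5)
print(gauss_seidel_qp_dual(P, r))          # non-negative minimizer of (33)
```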
III. SYSTEM ARCHITECTURE
In this section, we present the most common cloud computing
architecture for secure outsourcing of large-scale computa-
tions, and describe the threat model.
A. SYSTEM ARCHITECTURE
As shown in Fig. 1, the system architecture for secure
outsourcing of large-scale computations consists of a
resource-constrained user and a remote resource-rich cloud
server. The user aims to solve one of the large-scale problems
described in Section II. However, due to the massive data
involved in the problem, the user is unable to solve it by itself
in a feasible amount of time. Therefore, the user outsources
its most computationally expensive tasks to the cloud.
B. THREAT MODELS
The literature assumes two threat models for the cloud,
i.e., a semi-honest model and a malicious model. In the semi-
honest model, the cloud attempts to learn the user’s private
data from its outsourced data and from the results of its
own computations. In the malicious model, the cloud has the
additional ability to deviate from the proposed protocols or
return erroneous results.
Before the cloud can help perform any computations, the
user first needs to upload some of its data to the cloud.
However, to prevent the cloud from learning private informa-
tion, the user’s data must be well protected. In particular, in
the user’s data, the non-zero elements’ values and positions
both carry private information. Besides, we usually need
to preserve the zeros in the user's data, which can enable more
efficient computations, e.g., sparse matrix computations or
parallel computations [35]. In the following, we adopt the
privacy notion of computational indistinguishability [36], and
introduce a few similar but different definitions.
First, in the computational problems described
in Section II, we observe that the numerical values in the
matrices and vectors are private. Specifically, in LSEs,
1410 VOLUME 4, 2016
www.redpel.com +917620593389
www.redpel.com +917620593389
6. S. Salinas et al.: Tutorial on Secure Outsourcing of Large-scale Computations for Big Data
the values of the elements in the coefficient matrix A
and constant vector b are private. Similarly, in LPs,
the cost function c, and the constraint set defined by A and b
should be kept secret from the cloud. In the case of QPs,
the objective coefficients Q and b, as well as the constraint
set defined by A and c are private. In all the LSE, LP, and
QP problems, the solution x∗ is private. We have the follow-
ing two definitions of computational indistinguishability in
value.
Definition 1: Computational Indistinguishability in Value:
Let $R \in \mathbb{R}^{m \times n}$ be a random matrix with the entries in its
$j$th column sampled from a uniform distribution over the interval
$[-R_j, R_j]$ $\forall j \in [1, n]$. Matrices $R$ and $Q$ are computationally
indistinguishable in value if for any probabilistic polynomial-time
distinguisher $D(\cdot)$ there exists a negligible function $\mu(\cdot)$
such that [37]
$|\Pr[D(r_{i,j}) = 1] - \Pr[D(q_{i,j}) = 1]| \le \mu \quad \forall i, j$ (39)
where $r_{i,j}$ is the element in the $i$th row and $j$th column of $R$,
and $q_{i,j}$ is the element in the $i$th row and $j$th column of $Q$.
Distinguisher $D(\cdot)$ outputs 1 when it identifies the input as
drawn from a non-uniform distribution over the range $[-R_j, R_j]$, and zero
otherwise.
Definition 2: Computational Indistinguishability in Value
under a CPA: We say that a matrix transformation scheme has
indistinguishable transformations in value under a chosen-
plaintext attack (or is CPA-secure in value) if for all
probabilistic polynomial-time adversaries A there exists a
negligible function µ, such that the probability of distinguish-
ing two matrix transformations in value in a CPA indistin-
guishability experiment is less than 1/2 + µ [35].
Note that if a matrix transformation scheme results in a
matrix satisfying Definition 1, then it satisfies Definition 2
as well.
Moreover, the positions of the non-zero elements in a
matrix (i.e., the matrix’s structure) contain private informa-
tion that should also be hidden from the cloud. For example,
in power system state estimation, the system matrix contains
the topology of the network, which can be used to launch an
attack against the grid [22]. To protect a matrix’s structure,
we can permute its rows and columns in such a way that the
non-zero elements occupy positions that are indistinguishable
from those of the non-zero elements in another permuted
matrix. We give the definition of secure permutation below.
Definition 3: Indistinguishability in Structure under a
CPA: We say that a permutation scheme has indistinguish-
able permutations under a chosen-plaintext attack (or is
CPA-secure) if for all probabilistic polynomial-time adver-
saries A there exists a negligible function µ, such that the
probability of distinguishing two permutations in a CPA
indistinguishability experiment is less than 1/2 + µ [35].
IV. DATA PRIVACY PRESERVING APPROACHES
To delegate a computing task to the cloud, the user first needs
to perform some computations on its data. These computa-
tions should require a moderate effort from the user, hide the
data from the cloud, and allow the cloud to return a mean-
ingful result. To this end, researchers have proposed mainly
two kinds of approaches: cryptography-based techniques and
perturbation-based techniques. In this section, we describe
the state-of-the-art of privacy-preserving approaches.
A. CRYPTOGRAPHY-BASED APPROACHES
In their seminal work, Gennaro et al. [38] develop a
verifiable computing scheme based on fully homomorphic
encryption (FHE), which allows users to securely outsource
arbitrary computations to the cloud.
However, it requires an expensive preprocessing phase at the
user and adds significant computing overhead at the cloud.
Although some efforts have been made to improve the prac-
ticality of FHE [39], it remains very expensive for big data
applications.
To overcome this challenge, a user can employ partially
homomorphic encryption (PHE) to conceal its data and allow
the cloud to securely carry out a specific computational task.
In other words, compared with FHE that supports arbitrary
computation on ciphertexts, PHE only allows homomorphic
computation of some operations on ciphertexts (e.g., addi-
tions, multiplications, quadratic functions, etc.) and hence
is usually less complex. For example, consider the Paillier
homomorphic cryptosystem, a frequently used PHE. Letting
$E(\cdot)$ denote the encryption function, we have that
$E(m_1 + m_2) = E(m_1) \cdot E(m_2)$
and
$E(m_1 \cdot c) = E(m_1)^c$
where $m_1$ and $m_2$ are plaintexts, and $c$ is a constant.
Other PHE cryptosystems include ElGamal [40] and
Goldwasser-Micali [41].
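To see these two homomorphic properties in action, the following toy Paillier implementation (ours; the textbook construction with tiny primes, for illustration only and not secure) checks both identities:

```python
import math

# Toy Paillier cryptosystem (textbook construction, tiny primes, Python 3.9+).
# For illustration only: real deployments use ~2048-bit moduli and a
# vetted library, not hand-rolled code like this.
p, q = 293, 433
n = p * q
n2 = n * n
g = n + 1                                   # standard generator choice
lam = math.lcm(p - 1, q - 1)                # Carmichael's lambda(n)

def L(u):
    return (u - 1) // n

mu = pow(L(pow(g, lam, n2)), -1, n)         # precomputed decryption constant

def encrypt(m, r):                          # r must be coprime to n
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    return (L(pow(c, lam, n2)) * mu) % n

m1, m2, const = 41, 17, 3
c1, c2 = encrypt(m1, 5), encrypt(m2, 7)     # fixed randomness for reproducibility
print(decrypt((c1 * c2) % n2) == m1 + m2)   # E(m1)*E(m2) decrypts to m1+m2
print(decrypt(pow(c1, const, n2)) == m1 * const)  # E(m1)^c decrypts to m1*c
```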
Notice that although FHE and PHE schemes satisfy
Definition 2, they may still not be applicable for big data
applications due to the expensive encryptions/decryptions as
well as the complex operations on ciphertexts.
B. PERTURBATION-BASED APPROACHES
Perturbation-based approaches conceal a user’s sensitive
data by performing linear algebra operations between its
private matrices and carefully designed random matrices.
In this section, we introduce three main perturbation-
based approaches employed to protect the user’s privacy:
matrix addition, matrix multiplication, and row and column
permutations.
1) MATRIX ADDITION
A user can hide the values of the elements in its private
matrix $H \in \mathbb{R}^{m \times n}$ by performing the following matrix
addition [37]:
$\bar{H} = H + Z$ (40)
where $Z \in \mathbb{R}^{m \times n}$ is a random matrix, and $\bar{h}_{i,j} = h_{i,j} + z_{i,j}$
($\forall i \in [1, m], j \in [1, n]$).
To reduce the user's computational complexity, the random
matrix $Z$ is formed by a vector outer product, i.e.,
$Z = uv^{\top}$ (41)
where $u \in \mathbb{R}^{m \times 1}$ is a vector of uniformly distributed random
variables and $v \in \mathbb{R}^{n \times 1}$ is a vector of arbitrary
positive constants. Note that element $z_{i,j} = u_i v_j$ ($\forall i \in [1, m]$,
$j \in [1, n]$) is the product of a random variable and a positive
constant, and hence also a random variable.
It has been proven that the above matrix addition scheme
is computationally indistinguishable in value as defined
in Definition 1 [37]. Note that this scheme does not pre-
serve the zeros in matrix H during the matrix transformation
process.
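A minimal NumPy sketch of this additive masking (ours) is given below; note that building and removing the rank-one mask costs only O(mn) operations:

```python
import numpy as np

# Sketch of the additive mask (40)-(41): H_bar = H + Z with Z = u v^T.
rng = np.random.default_rng(5)
m, n = 4, 3
H = rng.standard_normal((m, n))           # user's private matrix

u = rng.uniform(-10.0, 10.0, size=m)      # uniformly distributed random vector
v = np.abs(rng.standard_normal(n)) + 1.0  # arbitrary positive constants
Z = np.outer(u, v)                        # rank-one random mask, O(mn) to build

H_bar = H + Z                             # what the cloud sees
H_rec = H_bar - np.outer(u, v)            # user removes the mask
print(np.allclose(H_rec, H))              # True
```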
2) MATRIX MULTIPLICATION
Consider a user's private matrix $H \in \mathbb{R}^{m \times n}$ with
non-zero elements $h_{i,j}$ for $(i, j) \in S_H$, where $S_H$ is the set
of the coordinates of non-zero elements in $H$. Then, the user
can hide $H$'s non-zero values by performing the following
multiplications [35]:
$\tilde{H} = DHF$ (42)
where $D \in \mathbb{R}^{m \times m}$ is a diagonal matrix with non-zero elements
generated by a pseudorandom function, and $F \in \mathbb{R}^{n \times n}$ is also
a diagonal matrix but with non-zero elements set to arbitrary
positive constants.
Salinas et al. [35] prove that this matrix multiplication
scheme is computationally indistinguishable in value under
a CPA as defined in Definition 2. Note that this scheme does
preserve the zeros in matrix H during the matrix transforma-
tion process.
3) MATRIX PERMUTATION
Although the matrix transformation in equation (42) hides the
values of the non-zero elements in $H$, it reveals their original
positions, i.e., $H$'s structure, which is also private.
By randomly permuting the rows and columns of $\tilde{H}$, the
user can conceal $H$'s structure. Specifically, the user performs
the following multiplications [35]:
$\hat{H} = E\tilde{H}U$ (43)
where $E \in \mathbb{R}^{m \times m}$ and $U \in \mathbb{R}^{n \times n}$ are pseudorandom
permutation matrices whose elements are defined by
$e_{i,j} = \delta_{\pi(i),j} \quad \forall i \in [1, m], j \in [1, m]$
$u_{i,j} = \delta_{\pi(i),j} \quad \forall i \in [1, n], j \in [1, n]$
where $i$ and $j$ are the row and column indexes, respectively,
and the function $\pi(\cdot)$ maps an original index to its
pseudorandomly permuted index (an independent permutation
is used for $E$ and for $U$). Besides, the Kronecker delta
function is given by
$\delta_{i,j} = \begin{cases} 1, & i = j \\ 0, & i \ne j. \end{cases}$
The user is able to recover the original structure by applying
the inverse permutations, i.e.,
$\tilde{H} = E^{\top}\hat{H}U^{\top}$
where $(\cdot)^{\top}$ denotes the matrix transpose operation. To reach
this result, we have used the orthogonality property of the
permutation matrices, i.e., $E^{\top}E = I$ and $UU^{\top} = I$, where $I$
is the identity matrix.
Note that [35] shows that this matrix permutation scheme
is computationally indistinguishable in structure under a CPA
as defined in Definition 3.
Therefore, taking advantage of both matrix multiplication
and matrix permutation, the user can hide the values of the
non-zero elements in $H$ as well as its structure as follows:
$\hat{H} = LHR$ (44)
where $L = ED$ and $R = FU$.
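The following NumPy sketch (ours) builds the combined transformation (44) for a small sparse matrix and checks that the user recovers H exactly and that the zeros are preserved:

```python
import numpy as np

# Sketch of (44): H_hat = L H R with L = E D and R = F U, hiding both the
# non-zero values of H (via D, F) and its structure (via E, U).
rng = np.random.default_rng(6)
m, n = 4, 5
H = rng.standard_normal((m, n)) * (rng.random((m, n)) < 0.4)  # sparse private matrix

D = np.diag(rng.uniform(1.0, 5.0, m))     # pseudorandom diagonal (values mask)
F = np.diag(np.arange(1.0, n + 1.0))      # arbitrary positive diagonal
E = np.eye(m)[rng.permutation(m)]         # pseudorandom row permutation
U = np.eye(n)[:, rng.permutation(n)]      # pseudorandom column permutation

L_mat, R_mat = E @ D, F @ U
H_hat = L_mat @ H @ R_mat                 # what is uploaded to the cloud

# Recovery: permutations are orthogonal and D, F are invertible diagonals.
H_rec = np.linalg.inv(L_mat) @ H_hat @ np.linalg.inv(R_mat)
print(np.allclose(H_rec, H))              # True
print(np.count_nonzero(H) == np.count_nonzero(H_hat))  # zeros are preserved
```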
V. SECURE OUTSOURCING OF LARGE-SCALE
COMPUTATIONS
In this section, we review how researchers have employed the
data privacy preserving techniques in Section IV to securely
outsource large-scale computations introduced in Section II.
A. LINEAR ALGEBRA
Lei et al. [42] construct a method for securely outsourcing
matrix inversion described in (2). In particular, the user first
conceals the values and structure of a private matrix A by
computing matrix multiplications and permutations, i.e.,
$\hat{A} = MAN$ (45)
where $M$ and $N$ are essentially both the product of
a random diagonal matrix and a permutation matrix, and
then uploads $\hat{A}$ to the cloud. After computing the inverse
matrix $\hat{A}^{-1}$, the cloud returns the result to the user, who
recovers $A^{-1}$ as follows:
$A^{-1} = N\hat{A}^{-1}M$.
Thus, this secure matrix inversion outsourcing by Lei et al.
can be used to solve LSEs of the form (1) as described
in Section II-A1. Besides, Chen et al. [43] employ a simi-
lar procedure to solve linear regression problems. Note that
neither of these schemes provides a formal proof of privacy
preservation.
Since it may still be a very challenging task for the cloud
to solve LSEs in (1) with the matrix inversion method,
researchers have proposed secure outsourcing algorithms for
large-scale LSEs based on iterative procedures, which are
introduced below.
Wang et al. [44] develop an algorithm based on the Jacobi
iteration and partial homomorphic encryption, where the user
and the cloud collaborate to solve an LSE. Specifically,
the user computes matrix T as described in Section II-A2,
encrypts its values using the Paillier cryptosystem described
in Section IV-A, and uploads it to the cloud. To protect the
solution vector's values, the user replaces $x$ with $\hat{x} = x + r$
in (1), where $r$ is a random $n \times 1$ vector.
The user and the cloud collaboratively carry out the Jacobi
iteration (3) as follows. At the iteration step k = 0, the user
computes the initial solution vector ˆx0 = x0 + r and sends it
to the cloud, which then computes
$E\big((T\hat{x}^0)_i\big) = \prod_{j=1}^{n} E(t_{i,j})^{\hat{x}^0_j}$
where $(T\hat{x}^0)_i$ is the $i$th element in $T\hat{x}^0$, and $t_{i,j}$ is the
element in the $i$th row and $j$th column of $T$. The user
downloads the $E\big((T\hat{x}^0)_i\big)$'s and finds $\hat{x}^1$ by computing the rest of
equation (3). At the end of the $k$th iteration, the user
transmits $\hat{x}^k$ to the cloud for the next iteration $k+1$. The iterations
continue until the sequence of solution vectors meets the
stopping criterion (4). Finally, to find the solution of (1), the
user computes $x^* = \hat{x}^* - r$.
Although encrypting the values of matrix T is a
one-time operation, it is a computationally expensive task,
especially in big data applications, and common users may
lack resources to compute it. To overcome this problem, the
user may employ a private cloud, e.g., a trusted computing
infrastructure within its organization, to carry out the encryp-
tion. Besides, there is no formal proof for privacy preservation
in [44] either.
To relax the assumption of a trusted private cloud and
further reduce computational complexity, Salinas et al. [37]
construct a secure outsourcing algorithm for large-scale LSEs
that only employs perturbation-based approaches. In particu-
lar, the user generates a masked coefficient matrix ˆA = A+Z
where Z is an n × n random matrix constructed as described
in Section IV-B1, and sends it with the initial solution vector
x0 = q to the cloud, where q is a random n × 1 vector. Then,
the cloud computes the following secure version of (12):
$h_0 = \hat{A}x_0 = (A + Z)x_0$.
Upon receiving $h_0$, the user computes the residual vector as
follows:
$r_0 = Ax_0 - b = h_0 - u(v^{\top}x_0) - b$.
At the end of the initialization step, the user sets the conjugate
vector p0 = −r0, and transmits it with r0 to the cloud.
To protect the user’s privacy, the cloud carries out computa-
tions with the transformed matrix ˆA, instead of A, as follows.
To compute αk, the user and the cloud carry out equation (15)
in two steps. First, based on the pk received from the user, the
cloud computes an intermediate scalar
$t_k = p_k^{\top}\hat{A}p_k = p_k^{\top}(A + Z)p_k$. (46)
Second, the user finds $\alpha_k$ using $t_k$ as follows:
$\alpha_k = \frac{r_k^{\top}r_k}{t_k - (p_k^{\top}u)(v^{\top}p_k)}$.
Similarly, the user exploits the cloud’s resources to
find $r_{k+1}$. The cloud first calculates an intermediate
vector:
$f_k = \hat{A}p_k = (A + Z)p_k$ (47)
which allows the user to compute $r_{k+1}$ as follows:
$r_{k+1} = r_k + \alpha_k\big(f_k - u(v^{\top}p_k)\big)$.
Since equations (17)-(19) only require vector-vector oper-
ations, they all can be computed efficiently by the user
itself. At the end of the kth iteration, the user transmits
pk+1 to the cloud for the next iteration k + 1. When the
condition (11) is met, the algorithm converges and the user
recovers the solution vector $x^*$. Notably, [37] proves that
the proposed scheme is computationally indistinguishable in
value as defined in Definition 1.
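The following NumPy sketch (ours; variable names are ours) simulates both parties of this protocol in a single process; the lines commented "cloud" touch only the masked matrix and the vectors the user sends:

```python
import numpy as np

# Single-process simulation of the masked CGM protocol: lines marked "cloud"
# use only A_hat = A + u v^T and vectors the user sends; the user unmasks.
rng = np.random.default_rng(7)
n = 30
M = rng.standard_normal((n, n))
A = M.T @ M + n * np.eye(n)                # private SPD coefficient matrix
b = rng.standard_normal(n)

u = rng.uniform(-1.0, 1.0, n)              # user's secret mask vectors
v = np.abs(rng.standard_normal(n)) + 1.0
A_hat = A + np.outer(u, v)                 # the only matrix the cloud sees

x = rng.standard_normal(n)                 # random initial solution x_0 = q
h0 = A_hat @ x                             # cloud: masked version of (12)
r = h0 - u * (v @ x) - b                   # user: unmasked residual r_0
p = -r
for _ in range(n):
    if r @ r <= 1e-20:
        break
    f = A_hat @ p                          # cloud: (47), masked A p_k
    t = p @ f                              # cloud: (46), t_k = p_k^T A_hat p_k
    alpha = (r @ r) / (t - (p @ u) * (v @ p))     # user: unmasked alpha_k
    x = x + alpha * p                      # user: (17)
    r_new = r + alpha * (f - u * (v @ p))  # user: unmasked (16)
    p = -r_new + ((r_new @ r_new) / (r @ r)) * p  # user: (18)-(19)
    r = r_new
print(np.linalg.norm(A @ x - b) < 1e-6)    # True: the masked protocol solves Ax = b
```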
B. OPTIMIZATION
1) LINEAR PROGRAMS
To securely outsource an LP of the form in (20),
Wang et al. [45] propose the following equivalent LP, i.e.,
$\min_{\hat{x}}\ \hat{z} = \hat{c}^{\top}\hat{x}$ (48a)
subject to $\hat{A}\hat{x} = \hat{b}$ (48b)
$\hat{B}\hat{x} \ge 0$ (48c)
where $\hat{x} = N^{-1}(x + r)$ and $r$ is a random $n \times 1$ vector,
$\hat{c} = \gamma N^{\top}c$ ($\gamma$ is a random number), $\hat{A} = MAN$ with $M$
and $N$ being random dense matrices, and $\hat{b} = M(b + Ar)$
(with $b + Ar \ne 0$). Note that the bound constraint is $Bx \ge 0$
instead of $x \ge 0$ in (20c). The authors propose to find a
matrix $\lambda$ such that $\lambda\hat{b} = Br$, and set $\hat{B} = (B - \lambda MA)N$ so that
$|\hat{B}| \ne 0$.
The user can outsource the transformed problem (48) to
the cloud. The cloud then solves (48) and its Lagrange dual
problem, and sends the results to the user, who computes
$x^* = N\hat{x}^* - r$. To verify the correctness of the cloud's
computations, the user can check whether the objective value of (48)
equals that of its Lagrange dual problem. No formal proof
of privacy preservation is provided. Besides, Wang et al. [46]
and Nie et al. [47] propose similar techniques to securely
outsource LPs.
2) QUADRATIC PROGRAMS
To securely outsource a QP of the form in (21), Zhou and Li
derive an equivalent QP as follows:
$\min_{\hat{x}}\ \hat{f}(\hat{x}) = \frac{1}{2}\hat{x}^{\top}\hat{Q}\hat{x} - \hat{b}^{\top}\hat{x}$ (49a)
subject to $\hat{A}\hat{x} \le \hat{c}$ (49b)
where $\hat{x} = N^{-1}(x + r)$ is the secure optimization variable,
$r$ is a random $n \times 1$ vector, $\hat{Q} = N^{\top}QN$, $\hat{b} = (r^{\top}QN + b^{\top}N)^{\top}$, $\hat{A} = MAN$, and $\hat{c} = M(c + Ar)$ (with $c + Ar \ne 0$).
Matrices M and N are random matrices that can be con-
structed as in equation (44). After receiving the transformed
problem (49), the cloud solves it and returns the solution ˆx∗ to
the user. The user can find the solution to (21) by computing
$x^* = N\hat{x}^* - r$. In addition, the user can verify the correctness
of $x^*$ by checking whether the Karush-Kuhn-Tucker conditions
hold. Note that this scheme does not provide a formal privacy
proof.
Salinas et al. [35] design a secure outsourcing algorithm
by first finding the dual problem of (21), and then conceal-
ing the dual problem’s matrices via the perturbation-based
approaches in Section IV-B. Specifically, in collaboration
with the cloud, the user first computes P and r in (33) and
then conceals them as follows:
$\hat{P} = LR_pPL^{-1}$ (50)
$\hat{r} = LR_pr$ (51)
$\hat{\lambda} = L\lambda_0$ (52)
where $L$ is the same as that in (44), $R_p > 0$ is a diagonal
matrix whose elements are generated by a pseudorandom
function, and $\lambda_0$ is a non-negative vector satisfying the
constraint in (33b). After the cloud receives the above
matrices and vectors, it carries out the Gauss-Seidel iteration
in (36) as follows:
$\hat{\lambda}_k(t+1) = \max\left\{0, \hat{\lambda}_k(t) - \frac{1}{\hat{p}_{k,k}}\big(\hat{r}_k + \hat{p}_k\hat{\lambda}(t)\big)\right\}$
where $\hat{\lambda}_k$ and $\hat{r}_k$ are the $k$th elements of $\hat{\lambda}$ and $\hat{r}$, respectively,
$\hat{p}_k$ is the $k$th row of $\hat{P}$, and the iteration step is denoted by $t$.
To allow the cloud to enforce the non-negativity constraints
in (37), the user also shares the signs of the elements in $L$.
The iterations continue until the stopping criterion in (38) is
satisfied.
The user can find the solution to the original QP problem
by first computing $\lambda^* = L^{-1}\hat{\lambda}^*$ and then $x^* = Q^{-1}(b - A^{\top}\lambda^*)$, where $Q^{-1}$ can be efficiently calculated with the
help of the cloud. Furthermore, the authors have developed
a parallel computing algorithm that enables the cloud to
complete the Gauss-Seidel iterations in a parallel manner.
Finally, the user verifies the correctness of the computation
by checking whether the Karush-Kuhn-Tucker optimality
conditions hold.
Moreover, the authors formally prove that the proposed
scheme is CPA-secure both in value and in structure as
defined in Definition 2 and Definition 3, respectively.
C. SECURITY VALIDATION
In addition to constructing secure outsourcing algorithms,
previous works describe the security properties of their
schemes with respect to the threat models presented
in Section III-B. Specifically, in works that are based on
homomorphic encryption such as [44], the authors base the
security properties of their algorithms on the employed cryp-
tosystem. To show the security of outsourcing schemes based
on perturbation techniques, the works in [42], [45], and [48]
briefly describe how an attacker would need an infeasible
amount of computing resources and time to extract informa-
tion from the user’s outsourced matrices and vectors, but no
formal proof for privacy is given. The works in [35] and [37]
state theorems about the security of their outsourcing algo-
rithms and formally show that they are computationally
indistinguishable.
D. COMPLEXITY ANALYSIS
To preserve the computational gains offered by cloud comput-
ing, secure outsourcing schemes require the user to perform
computing tasks that are less expensive than solving the
original computations. In this section, we review the compu-
tational complexity of solving linear algebra and optimization
problems, and report the user’s computational complexity in
the secure outsourcing algorithms.
We define the computational complexity as the number
of floating-point operations (flops: additions, subtractions,
multiplications, and divisions), bitwise operations, and
encryptions that a party needs to perform. We note that an
encryption takes many flops, and a flop takes several bitwise
operations.
Matrix inversion has complexity $O(n^{\rho})$ for $2 < \rho < 3$,
given an $n \times n$ matrix. In the secure outsourcing scheme for
matrix inversion by Lei et al. [42], the user computes sparse
matrix multiplications, which have complexity $O(n^2)$.
In general, solving an LSE takes complexity $O(n^3)$. In the
scheme by Wang et al. [44], the user only performs vector
computations with complexity $O(n)$ and outsources $O(n^2)$
Paillier encryptions to a private cloud. The secure outsourcing
algorithm by Salinas et al. [37] requires the user to perform
operations with complexity $O(n^2)$, and does not need the help
of a private cloud.
Solving LPs and QPs requires $O(n^3)$ computing operations.
The algorithm for secure outsourcing of linear programs
presented by Wang et al. [45] requires the user to
perform operations with complexity $O(n^{\rho})$ for $2 < \rho < 3$.
The user's complexity when securely outsourcing QPs
using the algorithms proposed by Zhou et al. [48] and
Salinas et al. [35] is $O(n^2)$.
VI. CONCLUSIONS
Scientists and engineers have the opportunity to revolutionize
their fields of study by analyzing the massive data collected
by today’s society. To analyze such large-scale data sets,
cloud computing has been proposed as a cost-effective and
practical computing paradigm. However, since the data car-
ries private information, it should be kept secret from the
cloud and external attackers for ethical, security, or legal rea-
sons. Moreover, we notice that many large-scale data analysis
techniques are based on two of the most fundamental computational
problems, i.e., linear algebra and optimization. In this
tutorial, we have described the most frequently employed linear
algebra and optimization computations in large-scale data
analysis. We have presented an extensive tutorial of privacy-
preserving techniques, and how they are used to securely
outsource large-scale computations.
REFERENCES
[1] J. F. Gantz, D. Reinsel, S. Minton, and V. Turner. (2014). The digital
universe of opportunities: Rich data and the increasing value of the Internet
of Things. IDC. [Online]. Available: http://idcdocserv.com/1678
[2] National Research Council, Frontiers in Massive Data Analysis.
Washington, DC, USA: The National Academies Press, 2013.
[3] Q. Hardy. (2015). IBM, G.E. and Others Create Big Data Alliance.
[Online]. Available: http://bits.blogs.nytimes.com/2015/02/17/ibm-g-e-
and-others-create-big-data-alliance/
[4] Big Data: Seizing Opportunities, Preserving Values. [Online]. Available:
https://www.whitehouse.gov/sites/default/files/docs/big_data_privacy_
report_may_1_2014.pdf
[5] A. McAfee and E. Brynjolfsson, Big Data: The Management Revolution.
Boston, MA, USA: Harvard Business Review, Oct. 2012.
[6] G. Adomavicius and A. Tuzhilin, ‘‘Toward the next generation of rec-
ommender systems: A survey of the state-of-the-art and possible exten-
sions,’’ IEEE Trans. Knowl. Data Eng., vol. 17, no. 6, pp. 734–749,
Jun. 2005.
[7] A. Ipakchi and F. Albuyeh, ‘‘Grid of the future,’’ IEEE Power Energy Mag.,
vol. 7, no. 2, pp. 52–62, Mar./Apr. 2009.
[8] President’s Council of Advisors on Science and Technology. (May 2014).
Big Data and Privacy: A Technological Perspective. [Online]. Available:
http://www.whitehouse.gov/sites/default/files/microsites/ostp/PCAST/
pcast_big_data_and_privacy_-_may_2014.pdf
[9] T. Kraska, ‘‘Finding the needle in the big data systems haystack,’’ IEEE
Internet Comput., vol. 17, no. 1, pp. 84–86, Jan./Feb. 2013.
[10] Y. Simmhan et al., ‘‘Cloud-based software platform for big data ana-
lytics in smart grids,’’ Computing Sci. Eng., vol. 15, no. 4, pp. 38–47,
Jul./Aug. 2013.
[11] E. E. Schadt, M. D. Linderman, J. Sorenson, L. Lee, and G. P. Nolan,
‘‘Computational solutions to large-scale data management and analysis,’’
Nature Rev. Genet., vol. 11, no. 9, pp. 647–657, Sep. 2010.
[12] H. Demirkan and D. Delen, ‘‘Leveraging the capabilities of service-
oriented decision support systems: Putting analytics and big data in cloud,’’
Decision Support Syst., vol. 55, no. 1, pp. 412–421, 2013.
[13] E. Kohlwey, A. Sussman, J. Trost, and A. Maurer, ‘‘Leveraging the
cloud for big data biometrics: Meeting the performance requirements of
the next generation biometric systems,’’ in Proc. IEEE World Congr.
Services (SERVICES), Washington, DC, USA, Jul. 2011,
pp. 597–601.
[14] U. Kang, D. H. Chau, and C. Faloutsos, ‘‘Pegasus: Mining billion-
scale graphs in the cloud,’’ in Proc. IEEE Int. Conf. Acoust.,
Speech Signal Process. (ICASSP), Kyoto, Japan, Mar. 2012,
pp. 5341–5344.
[15] S. Sakr, A. Liu, D. M. Batista, and M. Alomari, ‘‘A survey of
large scale data management approaches in cloud environments,’’
IEEE Commun. Surveys Tuts., vol. 13, no. 3, pp. 311–336,
Mar. 2011.
[16] Amazon Web Services. (2015). Netflix Case Study. [Online]. Available:
https://aws.amazon.com/solutions/case-studies/netflix/
[17] Cloud Security Alliance. (2011). Security Guidance for Critical
Areas of Focus in Cloud Computing V3.0. [Online]. Available:
https://cloudsecurityalliance.org/guidance/csaguide.v3.0.pdf
[18] A. Gumbus and F. Grodzinsky, ‘‘Era of big data: Danger of
discrimination,'' SIGCAS Comput. Soc., vol. 45, no. 3, pp. 118–125,
Sep. 2015.
[19] H. K. Patil and R. Seshadri, ‘‘Big data security and privacy issues in
healthcare,’’ in Proc. IEEE Int. Congr. Big Data, Anchorage, AK, USA,
Jun./Jul. 2014, pp. 762–765.
[20] D. Eckhoff and C. Sommer, ‘‘Driving for big data? Privacy concerns in
vehicular networking,’’ IEEE Security Privacy, vol. 12, no. 1, pp. 77–79,
Jan. 2014.
[21] A. Narayanan and V. Shmatikov, ‘‘Robust de-anonymization of large
sparse datasets,’’ in Proc. IEEE Symp. Secur. Privacy, Oakland, CA, USA,
May 2008, pp. 111–125.
[22] Y. Liu, P. Ning, and M. K. Reiter, ‘‘False data injection attacks against state
estimation in electric power grids,’’ in Proc. ACM Conf. Comput. Commun.
Secur., Chicago, IL, USA, 2009, pp. 21–32.
[23] T. Ristenpart, E. Tromer, H. Shacham, and S. Savage, ‘‘Hey, you, get
off of my cloud: Exploring information leakage in third-party compute
clouds,’’ in Proc. ACM Conf. Comput. Commun. Secur., Chicago, IL, USA,
Nov. 2009, pp. 199–212.
[24] S. Subashini and V. Kavitha, ‘‘A survey on security issues in service
delivery models of cloud computing,’’ J. Netw. Comput. Appl., vol. 34,
no. 1, pp. 1–11, 2011.
[25] D. Perez-Botero, J. Szefer, and R. B. Lee, ‘‘Characterizing hypervisor
vulnerabilities in cloud computing servers,’’ in Proc. Int. Workshop Secur.
Cloud Comput., Hangzhou, China, 2013, pp. 3–10.
[26] R. Miao, R. Potharaju, M. Yu, and N. Jain, ‘‘The dark menace: Character-
izing network-based attacks in the cloud,’’ in Proc. Conf. Internet Meas.,
Tokyo, Japan, 2015, pp. 169–182.
[27] A. Duncan, S. Creese, M. Goldsmith, and J. S. Quinton, ‘‘Cloud com-
puting: Insider attacks on virtual machines during migration,’’ in Proc.
12th IEEE Int. Conf. Trust, Secur. Privacy Comput. Commun. (TrustCom),
Melbourne, VIC, Australia, Jul. 2013, pp. 493–500.
[28] S. F. F. Gibson and B. Mirtich, ‘‘A survey of deformable modeling in
computer graphics,’’ Mitsubishi Electr. Res. Lab., Cambridge, MA, USA,
Tech. Rep. TR-97-19, 1997.
[29] D. P. Bertsekas and J. N. Tsitsiklis, Parallel and Distributed Compu-
tation: Numerical Methods. Englewood Cliffs, NJ, USA: Prentice-Hall,
1989.
[30] J. Nocedal and S. J. Wright, Numerical Optimization. New York, NY, USA:
Springer, 2006.
[31] N. Cristianini and J. Shawe-Taylor, An Introduction to Support Vector
Machines and Other Kernel-based Learning Methods. Cambridge, U.K.:
Cambridge Univ. Press, 2000.
[32] A. Monticelli, State Estimation in Electric Power Systems: A Generalized
Approach. Norwell, MA, USA: Kluwer, 1999.
[33] R. J. Vanderbei, Linear Programming: Foundations and Extensions. New
York, NY, USA: Springer, 2007.
[34] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge, U.K.:
Cambridge Univ. Press, 2004.
[35] S. Salinas, C. Luo, W. Liao, and P. Li, ‘‘Efficient secure outsourcing
of large-scale quadratic programs,’’ in Proc. ACM Asia Conf. Comput.
Commun. Secur., Xi’an, China, 2016.
[36] J. Katz and Y. Lindell, Introduction to Modern Cryptography. London,
U.K.: Chapman & Hall, 2008.
[37] S. Salinas, C. Luo, X. Chen, and P. Li, ‘‘Efficient secure outsourcing
of large-scale linear systems of equations,’’ in Proc. IEEE Conf. Comput.
Commun. (INFOCOM), Hong Kong, Apr./May 2015, pp. 1035–1043.
[38] R. Gennaro, C. Gentry, and B. Parno, ‘‘Non-interactive verifiable com-
puting: Outsourcing computation to untrusted workers,’’ in Proc. 30th
Annu. Conf. Adv. Cryptol. (CRYPTO), Santa Barbara, CA, USA, 2010,
pp. 465–482.
[39] K.-M. Chung, Y. Kalai, and S. Vadhan, ‘‘Improved delegation of computa-
tion using fully homomorphic encryption,’’ in Proc. 30th Annu. Conf. Adv.
Cryptol., Santa Barbara, CA, USA, 2010, pp. 483–501.
[40] T. ElGamal, ‘‘A public key cryptosystem and a signature scheme based
on discrete logarithms,’’ IEEE Trans. Inf. Theory, vol. 31, no. 4,
pp. 469–472, Jul. 1985.
[41] S. Goldwasser and S. Micali, ‘‘Probabilistic encryption & how to
play mental poker keeping secret all partial information,’’ in Proc.
ACM Symp. Theory Comput., San Francisco, CA, USA, 1982,
pp. 365–377.
[42] X. Lei, X. Liao, T. Huang, H. Li, and C. Hu, ‘‘Outsourcing large matrix
inversion computation to a public cloud,’’ IEEE Trans. Cloud Comput.,
vol. 1, no. 1, p. 1, Jan./Jun. 2013.
[43] F. Chen, T. Xiang, X. Lei, and J. Chen, ‘‘Highly efficient linear regres-
sion outsourcing to a cloud,’’ IEEE Trans. Cloud Comput., vol. 2, no. 4,
pp. 499–508, Oct. 2014.
[44] C. Wang, Q. Wang, K. Ren, and J. Wang, ‘‘Harnessing the cloud for
securely outsourcing large-scale systems of linear equations,’’ IEEE Trans.
Parallel Distrib. Syst., vol. 24, no. 6, pp. 1172–1181, Jun. 2013.
[45] C. Wang, K. Ren, and J. Wang, ‘‘Secure optimization computation out-
sourcing in cloud computing: A case study of linear programming,’’ IEEE
Trans. Comput., vol. 65, no. 1, pp. 216–229, Jan. 2016.
[46] C. Wang, B. Zhang, K. Ren, and J. M. Roveda, ‘‘Privacy-assured outsourc-
ing of image reconstruction service in cloud,’’ IEEE Trans. Emerg. Topics
Comput., vol. 1, no. 1, pp. 166–177, Jun. 2013.
[47] H. Nie, X. Chen, J. Li, J. Liu, and W. Lou, ‘‘Efficient and verifiable
algorithm for secure outsourcing of large-scale linear programming,’’ in
Proc. IEEE 28th Int. Conf. Adv. Inf. Netw. Appl. (AINA), Victoria, BC,
Canada, May 2014, pp. 591–596.
[48] L. Zhou and C. Li, ‘‘Outsourcing large-scale quadratic program-
ming to a public cloud,’’ IEEE Access, vol. 3, pp. 2581–2589,
Dec. 2015.
SERGIO SALINAS (M’09) received the B.S. degree
in telecommunications engineering from Jackson
State University, Jackson, in 2010, and the
Ph.D. degree in electrical and computer engineer-
ing from Mississippi State University, Starkville,
MS, in 2015. He is currently an Assistant Professor
with the Department of Electrical Engineering
and Computer Science, Wichita State University,
Wichita, KS. His research interests include secu-
rity and privacy in cyberphysical systems, cloud
computing, and big data.
XUHUI CHEN (S’14) received the B.E. degree in
information engineering from Xidian University,
China, in 2012, and the M.E. degree in electrical
and computer engineering from Mississippi State
University, Starkville, in 2015. She is currently
pursuing the Ph.D. degree with the Department
of Electrical Engineering and Computer Science,
Case Western Reserve University. Her research
interests include big data, and mobile cloud
computing.
JINLONG JI (S’16) received the B.E. and
M.E. degrees from Xidian University, China, in
2010 and 2013, respectively. He is currently pursu-
ing the Ph.D. degree with the Department of Elec-
trical Engineering and Computer Science, Case
Western Reserve University. His research inter-
ests include cybersecurity, big data computing, and
smart health systems.
PAN LI (S’06–M’09) received the B.E. degree in
electrical engineering from the Huazhong Univer-
sity of Science and Technology, Wuhan, China,
in 2005, and the Ph.D. degree in electrical and
computer engineering from the University of
Florida, Gainesville, in 2009. Since Fall 2015,
he has been with the Department of Electrical
Engineering and Computer Science, Case Western
Reserve University. He was an Assistant Professor
with the Department of Electrical and Computer
Engineering, Mississippi State University, from 2009 to 2015. His research
interests include network science and economics, energy systems, security
and privacy, and big data. He has served as an Editor of the IEEE JOURNAL
ON SELECTED AREAS IN COMMUNICATIONS–COGNITIVE RADIO SERIES and the
IEEE COMMUNICATIONS SURVEYS AND TUTORIALS, a Feature Editor of the IEEE
WIRELESS COMMUNICATIONS, and a Technical Program Committee Co-Chair
for the Ad-hoc, Mesh, Machine-to-Machine and Sensor Networks Track, IEEE
VTC 2014, the Physical Layer Track, Wireless Communications Symposium,
WTS 2014, the Wireless Networking Symposium, and the IEEE ICC 2013.
He received the NSF CAREER Award in 2012 and is a member of ACM.