
Internet of Things (IoT) in the Fog


Data Processing Challenges Presented by IoT Data
in Distributed Computing

Published in: Data & Analytics


Data Processing Challenges Presented by IoT Data in Distributed Computing

Tom Donoghue
School of Computing, National College of Ireland, Dublin, Ireland
Email: x16103491@student.ncirl.ie

Abstract: The growth of the Internet of Things (IoT) is predicted to accelerate over the next few years. IoT devices and sensors are heterogeneous and unconstrained, and are capable of emitting erratic and unrelenting amounts of data. The challenges of processing this data are similar to those of earlier Big Data and Cloud computing, but the scale is magnified. Streaming masses of disparate data to cloud hubs may well stifle and overwhelm infrastructure and service capability. IoT devices excel at generating data but possess limited resources for processing it. We conduct a literature review with an interest in how distributed computing may assist in overcoming some of the challenges of processing IoT data.

I. INTRODUCTION

As the internet continues to evolve, each cycle of growth appears to absorb yet another group of entities capable of generating more data than the previous set [1]. One such group is the Internet of Things (IoT). There is an abundance of descriptions of what the IoT is and, without an accepted definition, clarification depends on the setting in which the IoT is used [2],[3]. For our purposes, we adopt the description from [4], who suggest that IoT objects share an internet-connected relationship which enables them to converse by transmitting data about the context of their local environment.

In this paper we review a sample of the literature covering some of the challenges associated with processing IoT data and how distributed computing may relieve certain pinch points. Distributed computing (referring to compute and storage services) offered through cloud implementations goes some way towards closing the gaps encountered in processing IoT data [5].
The literature refers to the estimated extent of IoT growth as supplied by industrial and vendor research [6],[7]. For example, a current prediction is in the order of 30 billion connected IoT devices by 2020, with a data footprint of 180 zettabytes emitted annually by 2025 (see footnote 1).

Footnote 1: https://www.ibm.com/blogs/internet-of-things/enabling-iot-business-outcomes/ [Accessed 9 March 2017] [8]

II. IOT DATA PROCESSING CHALLENGES

The IoT is still a relatively new area and may appear redolent of the advance of cloud computing. The sheer increase in IoT deployment and uptake creates distinct data processing challenges. The figures indicating data emissions may warrant closer attention from those implementing IoT, and from their end users, than the headline figures presented by vendors and commercial research [9]. The IoT spans a vast area, much of which is beyond our scope (e.g. data security, privacy and connectivity concerns, which are relevant to IoT, have been excluded). We focus on the data processing challenges identified in this literature review, which are summarised below.

A. Volume of Data Generated

Each new deployment of an IoT device adds to the amount of data emitted [4],[9]. As [7] observe, the approach to systems architecture will need to adapt to the demands of processing IoT data. This data multiplier effect will impact systems which are not sized for the data profiles they expect to serve (including built-in burst capability to cater for future data volumes). Such systems risk becoming a point of failure should ingestion gateways become flooded with data.

B. Uneven Frequency of Emission

Devices may be event driven, always on, or a mixture of both, so the data generated is not necessarily uniform, as [1],[10] recognise. [1] further elaborate on possible approaches to managing the irregular profile of IoT data.
Another issue relates to cost: infrastructure may sit in place awaiting data from event-driven devices which are infrequently triggered or, as [11] suggest, devices may emit redundant data to the cloud when it is superfluous to requirements.

C. Speed of Arrival

Devices producing real-time or near-real-time streams present an additional processing challenge, as [6],[9],[12] note. Poor throughput and processing blockages will prevent timely reporting of sensitive information emitted from event-driven sensors. Information which arrives too late may be of little value [3].

D. Heterogeneity

A plethora of IoT devices producing differing output will add to the data processing burden, as [7],[9],[12],[13] suggest. Connecting to a cloud infrastructure may relieve some of the connectivity concerns caused by device heterogeneity through the concept of the Cloud of Things [5],[11].
However, customised pre-processing may be required to contend with the variety of flavours and formats of data emitted [1],[6],[12],[14].

E. Quality

Data quality may be eroded for several reasons. As [6] mention, these include missing values, duplication, unknown meaning and sparse data. The level of impact will vary depending on the domain. Missing data may be due to erratic connectivity, device malfunction or failures [6],[9]. Devices may also generate data that is rated as poor because it is diluted with superfluous values which are not part of the end user requirements [14].

F. Data Locality

Devices are generally incapable of processing their own data, hence data is transmitted to the cloud for processing [4],[11],[12]. It is optimal to retain data locality, that is, to keep the compute close to the data. For IoT this will require pushing compute services towards the network edges. Without the ability to identify, isolate and pre-process the data, the value it may contain could escape unnoticed [14].

In a recent review of the IoT literature, [6] suggest that it tends to give a broad synopsis of the areas of concern without addressing the key item impacting IoT, namely how data is handled. We find literature that focuses on IoT architecture [15] or IoT middleware [3]; in general, however, the data element is recognised, but perhaps not with the rigour desired.

III. DISTRIBUTED PROCESSING OF IOT DATA

The Cloud of Things (CoT), mentioned earlier, is the confluence of IoT and the Cloud [4],[5],[11],[12]. IoT implementations gain an advantage from necessary shared services, one of which, according to [5] and relevant to our focus, is the provision of a substantial processing resource. However, based on the challenges mentioned above, a central cloud resource alone is unlikely to meet the demands placed on it by a disparate, heterogeneous IoT population.
What follows are some of the ways in which distributed computing can assist, as seen in our review of the literature.

A. Fog Computing

Fog computing, as [11],[16] concur, places cloud services primarily, but not only, at the outer reaches of the network, where they touch many disparate devices. Bringing compute closer to the device through the Fog may alleviate many of the IoT data challenges [11]. Proximal Fog endpoints, [11] suggest, enable data locality, initial inspection, selective filtering of data and processing of high-priority real-time data at the edge. Lower-priority data (and, where necessary, Fog-processed data) is handled on a store-and-forward basis prior to transmission to the central cloud facilities for downstream processing. Bringing multiple Fog endpoints into play across a wide array of IoT deployments, together with selective partitioning of data by priority, leads to data appearing only where it is needed.

It is not apparent how an extensive Fog cloud would be implemented. [4] consider that the Fog lacks the resources required to conduct lengthy or convoluted data processing, and advocate the Lambda architectural design [17], which provides processing and machine learning capabilities from device data. The provision of storage and processing working in unison at these Fog edges, as [18] recognise, creates the setting for parallelisation through federated mini clouds configured to suit local IoT needs, with data coalescing to central cloud facilities. However, the dispersal of services to the edge gives rise to issues of command and control of the diaspora. The eradication of such issues was one of the attractions of moving to the cloud in the first place [18].

B. Contextualising Data

Adding context to IoT data enables a better understanding of that data, a need which, as [3],[4] suggest, has been highlighted by the expansion of the IoT.
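As an aside on the Fog endpoint behaviour described in Section III-A, the following minimal Python sketch illustrates priority-based partitioning with store and forward. It is not taken from [11] or any other cited work: the names `Reading`, `FogNode`, `high_priority` and `batch_size` are hypothetical, and the "cloud" is represented only by a list of forwarded batches.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List


@dataclass
class Reading:
    """One device reading; the high_priority flag is an assumed, pre-assigned label."""
    device_id: str
    value: float
    high_priority: bool


class FogNode:
    """Hypothetical Fog endpoint: handles urgent readings locally, batches the rest."""

    def __init__(self, batch_size: int = 3) -> None:
        self.batch_size = batch_size
        self.buffer: Deque[Reading] = deque()            # store-and-forward queue
        self.processed_locally: List[Reading] = []       # stands in for edge processing
        self.forwarded_batches: List[List[Reading]] = [] # stands in for the central cloud

    def ingest(self, reading: Reading) -> None:
        if reading.high_priority:
            # High-priority data is dealt with at the edge, without a cloud round trip.
            self.processed_locally.append(reading)
        else:
            # Lower-priority data is stored and forwarded once a batch is full.
            self.buffer.append(reading)
            if len(self.buffer) >= self.batch_size:
                self.flush()

    def flush(self) -> None:
        """Forward whatever is buffered to the (simulated) cloud as one batch."""
        if self.buffer:
            self.forwarded_batches.append(list(self.buffer))
            self.buffer.clear()
```

A real deployment would also need to decide how priority is assigned in the first place and what happens when the buffer outgrows the Fog node's limited storage; both are left out of this sketch.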
Addition of such context could be achieved by early, simple machine learning in the Fog, provided the resources are in place. Conducting a cluster analysis of the IoT data in real time, as [1] suggest, would be one such method of bringing meaning to data, and one which could be distributed across compute. [12] point out that Map Reduce is not suitable for real-time IoT streams and suggest that a new design pattern providing parallel processing of such streams is long overdue.

IV. CONCLUSION

The IoT presents many challenges which are typical of those witnessed during the surge of Big Data, although the IoT data footprint appears to be larger by orders of magnitude. We reviewed literature on the challenges of IoT data processing and on what distributed computing might contribute to alleviating them. The current approach of a centralised cloud may not be capable of fully keeping in step with the demands of IoT data processing. A Fog computing implementation could be designed to overcome many of the challenges mentioned, with data locality being a main attraction that in turn assists the introduction of parallelisation. Obtaining a degree of meaning about the data through its contextualisation could enable better management of data volume through partitioning. Both these areas present opportunities for further research, in particular the devolution of machine learning on real-time IoT data to the Fog.

REFERENCES

[1] D. Puschmann, P. Barnaghi, and R. Tafazolli, “Adaptive clustering for dynamic iot data streams,” IEEE Internet of Things Journal, vol. 4, no. 1, pp. 64–74, 2017.
[2] L. Atzori, A. Iera, and G. Morabito, “A Survey of the Internet of Things,” Proceedings of the 1st International Conference on E-Business Intelligence (ICEBI2010), vol. 54, pp. 358–366, 2010.
[3] M. A. Razzaque, M. Milojevic-Jevric, A. Palade, and S. Cla, “Middleware for internet of things: A survey,” IEEE Internet of Things Journal, vol. 3, no. 1, pp. 70–95, 2016.
[4] M. Díaz, C. Martín, and B. Rubio, “State-of-the-art, challenges, and open issues in the integration of internet of things and cloud computing,” Journal of Network and Computer Applications, vol. 67, pp. 99–117, 2016.
[5] A. Botta, W. De Donato, V. Persico, and A. Pescapé, “Integration of Cloud computing and Internet of Things: A survey,” Future Generation Computer Systems, vol. 56, pp. 684–700, 2016.
[6] Y. Qin, Q. Z. Sheng, N. J. G. Falkner, S. Dustdar, H. Wang, and A. V. Vasilakos, “When things matter: A survey on data-centric internet of things,” Journal of Network and Computer Applications, vol. 64, pp. 137–153, 2016.
[7] T. Chun-Wei, L. Chin-Feng, C. Ming-Chao, and Y. Laurence, “Data Mining for Internet of Things,” IEEE Communications Surveys & Tutorials, vol. 16, no. 1, pp. 77–97, 2014.
[8] IBM, “Enabling IoT Platforms to Deliver Business Outcomes.” [Online]. Available: https://www.ibm.com/blogs/internet-of-things/enabling-iot-business-outcomes
[9] A. Sheth, “Internet of Things to Smart IoT Through Semantic, Cognitive, and Perceptual Computing,” IEEE Intelligent Systems, vol. 31, no. 2, pp. 108–112, 2016.
[10] C. Perera, A. Zaslavsky, P. Christen, and D. Georgakopoulos, “Context Aware Computing for The Internet of Things,” IEEE Communications Surveys & Tutorials, vol. 16, no. 1, pp. 414–454, 2014.
[11] M. Aazam and E. N. Huh, “Fog computing and smart gateway based communication for cloud of things,” Proceedings - 2014 International Conference on Future Internet of Things and Cloud, FiCloud 2014, pp. 464–470, 2014.
[12] H. Cai, B. Xu, L. Jiang, and A. V. Vasilakos, “IoT-based Big Data Storage Systems in Cloud Computing: Perspectives and Challenges,” IEEE Internet of Things Journal, vol. PP, no. 99, p. 1, 2016.
[13] L. Jiang, L. D. Xu, H. Cai, Z. Jiang, F. Bu, and B. Xu, “An IoT-Oriented Data Storage Framework in Cloud Computing Platform,” IEEE Transactions on Industrial Informatics, vol. 10, no. 2, pp. 1443–1451, 2014.
[14] F. Chen, P. Deng, J. Wan, D. Zhang, A. V. Vasilakos, and X. Rong, “Data mining for the internet of things: Literature review and challenges,” International Journal of Distributed Sensor Networks, vol. 11, no. 8, p. 431047, 2015.
[15] S. Li, L. D. Xu, and S. Zhao, “The internet of things: a survey,” Information Systems Frontiers, vol. 17, no. 2, pp. 243–259, 2015.
[16] F. Bonomi, R. Milito, J. Zhu, and S. Addepalli, “Fog Computing and Its Role in the Internet of Things,” Proceedings of the first edition of the MCC workshop on Mobile cloud computing, pp. 13–16, 2012.
[17] N. Marz and J. Warren, Big data: principles and best practices of scalable realtime data systems. London; Greenwich, Conn.: Manning, 2013.
[18] X. Masip-Bruin, E. Marín-Tordera, G. Tashakor, A. Jukan, and G. J. Ren, “Foggy clouds and cloudy fogs: a real need for coordinated management of fog-to-cloud computing systems,” IEEE Wireless Communications, vol. 23, no. 5, pp. 120–128, October 2016.
