Access Security on Cloud Computing Implemented in Hadoop System


2011 Fifth International Conference on Genetic and Evolutionary Computing

Access Security on Cloud Computing Implemented in Hadoop System

Bao Rong Chang (Dept. of Computer Science and Information Engineering, National University of Kaohsiung, Taiwan); Hsiu Fen Tsai (Dept. of Marketing Management, Shu-Te University, Taiwan); Zih-Yao Lin (Dept. of Computer Science and Information Engineering, National University of Kaohsiung, Taiwan); Chi-Ming Chen (Dept. of Computer Science and Information Engineering, National University of Kaohsiung, Taiwan)

978-0-7695-4449-6/11 $26.00 © 2011 IEEE. DOI: 10.1109/ICGEC.2011.27

Abstract—In this paper, we introduce Hadoop cloud computing together with access security based on fingerprint identification and face recognition. The Hadoop cloud serves many mobile devices or thin clients over wired or wireless networks. In practice, a controller (master) may be linked to several nodes (slaves) to form a Hadoop cloud, and the cloud initiates services such as SaaS, PaaS, and/or IaaS. This study employs a low-capacity Linux embedded platform linked to Hadoop via Ethernet, WiFi, or 3G. At the client side, the JamVM virtual machine is utilized to form the J2ME environment, and GNU Classpath serves as the Java class libraries. To verify the effectiveness and efficiency of the cloud system in access security, rapid fingerprint and face identification in Hadoop has been carried out successfully within 2.2 seconds, exactly cross-examining the subject's identity.

Keywords—Hadoop cloud computing; access security; fingerprint identification; face recognition; Linux embedded platform; JamVM virtual machine.

I. INTRODUCTION

Cloud computing is an emerging and increasingly popular computing paradigm that provides users with massive computing, storage, and software resources on demand. Programming efficient distributed parallel applications is complex and difficult, and dispatching large-scale tasks for execution in cloud computing environments is challenging as well. This is because applications running in a cloud computing environment must cope with problems such as limited network bandwidth, fault tolerance, and heterogeneity.

Different solutions for programming and tasking in cloud computing environments have been proposed by several independent software vendors (ISVs), each with its own strengths and weaknesses. Cloud computing service providers also offer their own programming models and APIs. Here we briefly introduce the different cloud computing perspectives and present our approach in this study. Cloud computing development currently covers three areas: massive computing, connectivity, and smart terminals. Cloud computing aims at low cost (saving money), green energy (energy efficiency), and ubiquity (accessing any service at any time, in any place, from any device).

II. MOTIVATION AND BACKGROUND

Nowadays cloud computing has become a popular high-tech term. In fact, these technologies are not entirely new; they are largely inherited from "distributed computing" and "grid computing". That is, a large job that is beyond the capability of a single computer is divided into small pieces, these pieces are carried out by a number of computers, and their results are then compiled to complete the work. In addition, much effort has been devoted to connecting computers of different platforms, architectures, and levels through the network so that all of them cooperate, letting computers deliver services far and wide in cyberspace. The difference is that "cloud computing" emphasizes using the Internet to access remote computing resources even when local resources are limited.

Cloud computing is divided into two categories, namely "cloud services" and "cloud technology". "Cloud services" are delivered through network connections to remote services; such services let users install and use a variety of operating systems, for example Amazon Web Services (EC2 and S3). This type of cloud computing corresponds to the concepts "Infrastructure as a Service" (IaaS) and "Storage as a Service" (StaaS), respectively. Both are derived from the concept of "Software as a Service" (SaaS), the biggest area of demand for cloud services, while "Platform as a Service" (PaaS) is an alternative concept for cloud computing services. Using these services, users can rely on just a cell phone or thin client to do many things that in the past could only be done on a personal computer, which means that cloud computing is universal. "Cloud technology" is aimed at using virtualization and automation technologies to create and distribute a variety of computing resources. This type can be considered an extension of the traditional data center; it does not require external resources provided by third parties and can be utilized throughout a company's internal systems, indicating that cloud computing also covers this specific expertise. Currently the most popular cloud computing services on the market are divided into public clouds, private clouds, community/open clouds, and hybrid clouds: Google App Engine [1], Amazon Web Services [2], and Microsoft Azure [3] are public clouds; IBM Blue Cloud [4] is a private cloud; OpenNebula [5], Eucalyptus [6], Yahoo Hadoop [7], and the NCDM Sector/Sphere [8] are open clouds; and IBM Blue Cloud [4] also serves as a hybrid cloud.

From the green-energy point of view, cloud computing implies four characteristics: large amounts of data, low cost, efficiency, and reliability.
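The divide-process-compile workflow described above can be illustrated with a short local sketch, using plain Java threads to stand in for the computers of a cluster. This is only an analogy of the MapReduce idea, not the paper's actual Hadoop code; all class and method names here are illustrative:

```java
import java.util.*;
import java.util.concurrent.*;

public class MiniMapReduce {
    // "Map" step: count the words in one small piece of the input.
    static Map<String, Integer> countChunk(String chunk) {
        Map<String, Integer> counts = new HashMap<>();
        for (String w : chunk.split("\\s+")) {
            if (!w.isEmpty()) counts.merge(w, 1, Integer::sum);
        }
        return counts;
    }

    // Divide the work into pieces, run each piece on its own worker,
    // then compile the partial results ("Reduce" step).
    public static Map<String, Integer> wordCount(List<String> chunks) throws Exception {
        ExecutorService workers = Executors.newFixedThreadPool(4);
        List<Future<Map<String, Integer>>> futures = new ArrayList<>();
        for (String chunk : chunks) {
            futures.add(workers.submit(() -> countChunk(chunk)));
        }
        Map<String, Integer> total = new HashMap<>();
        for (Future<Map<String, Integer>> f : futures) {
            for (Map.Entry<String, Integer> e : f.get().entrySet()) {
                total.merge(e.getKey(), e.getValue(), Integer::sum);
            }
        }
        workers.shutdown();
        return total;
    }

    public static void main(String[] args) throws Exception {
        Map<String, Integer> r = wordCount(Arrays.asList("cloud computing", "cloud services"));
        System.out.println(r.get("cloud")); // 2
    }
}
```

In Hadoop proper, the pieces are HDFS blocks and the workers are task slots on the slave nodes, but the shape of the computation is the same.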
The main purpose of this study is to build a Hadoop cloud computing system [9]. To verify the effectiveness and efficiency of the cloud system in access security, fingerprint identification and face recognition by rapid identification have been achieved in this study.

In many applications, embedded devices require huge computing power and storage space, just the opposite of what embedded hardware offers. Thus the only way to achieve this goal is to structure the system as "cloud computing" operated through "cloud services". The idea is how to use the limited resources of embedded devices to reach the cloud: in addition to wired Ethernet connections, wireless connections such as IEEE 802.11b/g or 3G can be used, as shown in Fig. 1.

First, we use the standard J2ME [10] environment for embedded devices, where the JamVM [11] virtual machine is employed to provide the J2ME environment and GNU Classpath [12] is used as the Java class libraries. To reduce the amount of data transmission, light preprocessing of the acquired information is done at the front-end embedded devices, and the processed data are then uploaded through the network to the back-end private small cloud. After the processing at the back-end is completed, the results are sent back to the front-end embedded devices. As shown in Fig. 2, the open-source Hadoop packages are utilized to establish the private cloud easily; in this way we can focus on installing the back-end cluster controllers and the cloud controller to build a private small cloud.

An embedded platform in conjunction with the cloud computing environment is applied to test fingerprint identification and facial recognition as the access security system. The basic structure of Hadoop cloud computing has been developed and deployed as well. We then test the performance of the embedded platform operating against the cloud to check whether it achieves an immediate and effective response for the required functions. Meanwhile, we continuously monitor the online operation and evaluate system performance statistics such as the number of files, file sizes, total megabytes processed, the number of tasks on each node, and throughput. In a cluster implementation of cloud computing, the statistical assessment by the size of each node is listed; according to the analysis of the results, we adjust the system functions if changes are required.

III. ACCESS SECURITY IN HADOOP

A. Deploying Hadoop Cloud Computing

Once a Hadoop cloud computing server has been established at the server site, we have to test the functionality of cloud computing in the Hadoop system. As shown in Fig. 3, we first test the Hadoop Administration Interface at http://hadoop1:50030. Next, the Task Tracker status at http://hadoop1:50060 is examined, as shown in Fig. 4. After that we check the Hadoop Distributed File System (HDFS) status at http://hadoop1:50070, as shown in Fig. 5. To set up a programming environment for Python or Java, an Eclipse IDE [13] is applied to develop the application program (AP) at the local site, as shown in Fig. 6. Note that the Java JDK [14] must be installed before setting up the Eclipse IDE locally. Once the AP is finished and waiting for dispatch on the Hadoop cloud computing server, we deploy it via LAN or WiFi. Finally, as shown in Fig. 7, we check HBase on the Hadoop server to make sure that the cloud is ready for the task.

B. Establishing Thin Client

For the thin client, JamVM is treated as the programming development framework; however, the JamVM virtual machine cannot perform drawing through its core directly, and thus it must call another graphics library to achieve drawing. Several options are available, for example GTK+DirectFB, GTK+X11, and QT/Embedded, as shown in Figs. 8, 9, and 10. The problem we encountered is that GTK needs several packages working together, which requires many installation steps; compiling the different packages into a build is also difficult, and the time-consuming integration of several packages gives no guarantee of completing the work. Therefore, this study chose the QT/Embedded framework instead of the GTK series to realize the GUI functions. As shown in Fig. 11, both SWT and AWT in JamVM apply the Java Native Interface (JNI) to communicate with the C-written graphics library; QT/Embedded then goes through the kernel driver to achieve the graphic functions, as shown in Fig. 12. Stringing the structures of Fig. 11 and Fig. 12 together yields the node device structure shown in Fig. 10. This part adopts a low-cost, low-power embedded platform to realize the thin client.

C. Installing Access Security System

When the Hadoop cloud computing is established, we test the cloud by employing an embedded platform in the cloud environment to perform fingerprint identification [15] and face recognition [16], fulfilling the access security system [17]. Meanwhile, the development of the basic structure and deployment of the Hadoop cloud computing are validated, and we further test the service performance of the embedded platform collaborating with the cloud, checking for an immediate and effective response to the client. The access security system is shown in Fig. 13.

IV. EXPERIMENTAL RESULTS AND DISCUSSIONS

To verify the effectiveness and efficiency of the cloud system in access security, the experiment on fingerprint identification and face recognition using rapid identification in Hadoop cloud computing was completed successfully within 2.2 seconds of average access identification time, exactly cross-examining the subject's identity. As a result, the proposed Hadoop cloud computing performs very well when deployed in a local area.
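One of the client-side operations reported in this experiment is binarizing the captured face image before upload (Fig. 14). A minimal sketch of such a threshold binarization follows; the actual system relies on OpenCV [17] and the VeriLook SDK [16], and the fixed threshold of 128 here is an assumption for illustration only:

```java
public class Binarize {
    // Threshold a grayscale image (pixel values 0-255) into a binary
    // image of 0s and 1s, reducing the data sent to the cloud.
    static int[][] binarize(int[][] gray, int threshold) {
        int[][] out = new int[gray.length][];
        for (int y = 0; y < gray.length; y++) {
            out[y] = new int[gray[y].length];
            for (int x = 0; x < gray[y].length; x++) {
                out[y][x] = gray[y][x] >= threshold ? 1 : 0;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[][] img = { {10, 200}, {130, 40} };   // tiny 2x2 example
        int[][] bin = binarize(img, 128);
        System.out.println(bin[0][1] + " " + bin[1][0]); // 1 1
    }
}
```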
The steps are as follows: (a) for face recognition, first open the video camera and then press the capture button; the program executes binarization automatically, as shown in Fig. 14; (b) for rapid fingerprint identification, first turn on the device, then press the deal button for feature extraction, which reduces the amount of information, as shown in Fig. 15; (c) the terminal device first tests whether the Internet connection is working properly; we then press the identify button, the information is sent to the cloud, and finally the cloud returns the identification results, as shown in Fig. 16.

V. CONCLUSIONS

In this study, Hadoop cloud computing together with access security applying rapid fingerprint and face identification has been realized, so that the cloud computing initiates services such as SaaS, PaaS, and/or IaaS. The connection between client and server employs a low-capacity Linux embedded platform linked to Hadoop via Ethernet, WiFi, or 3G. At the client side, the JamVM virtual machine is utilized to form the J2ME environment, and GNU Classpath serves as the Java class libraries. To verify the effectiveness and efficiency of the cloud system in access security, rapid fingerprint and face identification in Hadoop has been completed successfully within 2.2 seconds, exactly cross-examining the subject's identity.

ACKNOWLEDGMENT

This work is fully supported by the National Science Council, Taiwan, Republic of China, under grant number NSC99-2218-E-390-002.

REFERENCES

[1] Google App Engine, 2010.
[2] Amazon Web Services (AWS), 2010.
[3] Windows Azure, A Microsoft Solution to Cloud, 2010.
[4] IBM Cloud Computing, 2010.
[5] OpenNebula, 2010.
[6] Eucalyptus, 2010.
[7] Welcome to Apache Hadoop, 2010.
[8] Sector/Sphere, National Center for Data Mining, 2009.
[9] Welcome to Apache Hadoop, 2010.
[10] Java 2 Platform, Micro Edition (J2ME), 2010.
[11] JamVM, A Compact Java Virtual Machine, 2010.
[12] GNU Classpath, Essential Libraries for Java, 2010.
[13] Eclipse Summit, 2011.
[14] Java JDK, 2011.
[15] VeriFinger SDK, Neuro Technology, 2010.
[16] VeriLook SDK, Neuro Technology, 2010.
[17] OpenCV (open source), 2010.

Figure 1. Cloud computing server connected with mobile device, PC, and notebook.
Figure 2. Hadoop server linked to mobile devices over WiFi.
Figure 3. Testing the Hadoop Administration Interface at http://hadoop1:50030.
Figure 4. Testing Task Tracker status at http://hadoop1:50060.
Figure 5. Testing HDFS status at http://hadoop1:50070.
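The three web status interfaces tested in Figs. 3-5 can be probed with a short sketch like the one below. The hostname `hadoop1` and the ports 50030/50060/50070 come from the text; whether each page actually answers depends on the deployment, so this is an illustrative probe, not part of the paper's system:

```java
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Arrays;
import java.util.List;

public class HadoopStatusCheck {
    // The three web interfaces tested in Section III-A (Figs. 3-5).
    static List<String> statusUrls(String host) {
        return Arrays.asList(
            "http://" + host + ":50030/",   // Hadoop Administration (JobTracker)
            "http://" + host + ":50060/",   // Task Tracker status
            "http://" + host + ":50070/");  // HDFS (NameNode) status
    }

    // Returns true if the interface answers with HTTP 200.
    static boolean isUp(String url) {
        try {
            HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
            conn.setConnectTimeout(2000);
            return conn.getResponseCode() == 200;
        } catch (Exception e) {
            return false;   // unreachable host or non-HTTP answer
        }
    }

    public static void main(String[] args) {
        for (String u : statusUrls("hadoop1")) {
            System.out.println(u);   // probe each with isUp(u) on a live cluster
        }
    }
}
```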
Figure 6. A snapshot of the Eclipse IDE.
Figure 7. Keying in the command "hbase shell" under the Linux environment.
Figure 8. Terminal node with GTK + DirectFB.
Figure 9. Terminal node with GTK + X11.
Figure 10. Terminal node with QT/Embedded.
Figure 11. Communication between SWT/AWT and QT/Embedded.
Figure 12. QT/Embedded communicates with the Linux framebuffer.
Figure 13. System architecture.
Figure 14. Binarization processing running automatically in the program.
Figure 15. Processing fingerprint features to reduce the amount of information.
Figure 16. Information sent to the cloud; the cloud returns the recognition results to the console.