Big data analytics and its impact on internet users


Big Data analytic tools are promising techniques for predicting the future in many aspects of our lives, and the need for such predictive techniques has been growing rapidly. Although many challenges and risks still concern researchers and decision makers, the outcome of using these techniques will considerably change our world and usher in a new era of technology.

Published in: Technology, Business

Transcript

  • 1. Big Data Analytics and Its Impact on Internet Users
Salaheddin Khiri M. Beskri¹, Sharafaldeen Mohamed Ashoury²
Universiti Utara Malaysia
Email: ¹ Eng.salaheddin@Gmail.com, ² shraf82@Gmail.com

I. INTRODUCTION
In the new era of globalization, the amount of digital data has grown tremendously, and analyzing it has consequently become a serious challenge for productivity growth, innovation and consumer surplus, all of which rely on data analysis to make the right decisions. Big data has become a 21st-century challenge; it consists of varieties of imperfect, complex and unstructured data. Finding a solution to this matter is a concern of development communities and policymakers. Big Data is not only about how much data is stored. It is about predicting current and future issues in all aspects of life, to answer questions that used to be considered out of reach. Big Data combines three components (the three V's), as shown in Fig. 1:
1) Volume: the amount of incoming data grows exponentially, reaching the scale of petabytes (see footnote 1). Smart meters and heavy industrial equipment such as machine sensors generate data volumes similar to those of social media and the many millions of traditional databases.
2) Variety: data is generated in different formats from different sources (e.g. images, audio, video and text), which reflects one of the characteristics of Big Data.
3) Velocity: the speed at which data arrives. Different sources deliver data at different speeds, so data grows bigger faster and becomes harder to analyse.
Fig. 1 Big Data characteristics
Footnote 1: 1 petabyte = 1,000,000 gigabytes (10^9 megabytes).

II. TYPES OF BIG DATA
Data is generated by, and flows from, various sources through different types of channels every second.
Weblogs, social media, email, sensors, photographs and transactional data are some examples of big data types, which are classified as follows [5]:
Traditional enterprise data: ERP databases and traditional information systems.
Machine-generated/sensor data: surveillance cameras, industrial systems, weblogs.
Social data: consumers' comments, Twitter, and social media platforms such as Facebook, Space, Instagram.
Multimedia content, moreover, has played a main role in the extreme growth of data. Each second of high-definition video, for instance, adds 2,000 times as many bytes as a page of text [4]. In addition, on social media (e.g. Facebook), where the rate of new users reached 100,000 per month in 2012 [3], many varied pieces of data have been uploaded to the Internet and shared between users globally. Medical images, maps, video files and data stored in data warehouses are other types of Big Data. In other words, big data can take any form, structured or unstructured, such as pictures, audio and video files, text, log files and many more.

III. BENEFITS OF BIG DATA
Our world encounters new issues in the digital domain every day, and numerous techniques are invented to overcome them. Big Data, as a solution, will bring numerous opportunities in different aspects of life. Basically, advanced analytics enables you to create and develop models that can be used to predict answers to many critical questions. For instance, a statistical model that combines consumer buying behaviour with consumer profiles can be used to predict the future behaviour of consumers. Companies that struggle every day to maintain their competitive advantage in the market are the ones that will benefit from Big Data in many ways [6], such as:
  • 2. A. Detect, prevent and remediate financial fraud
Many organizations use Big Data to reveal attempted or actual criminal fraud aimed at their network systems and servers, and also to predict the most likely future attacks against their systems. For this reason, organizations have embraced the most powerful analytics techniques.
B. Maintain customer lifetime value
Companies mainly use marketing campaigns to gain customer value and avoid losses. Financial services companies run campaigns aimed at billions of potential customers. The amount of information is therefore increasing tremendously, and attempts to process it with traditional tools have failed. That failure limits companies' business growth, since customer lifetime value is short and limited, so business managers have turned to high-performance analytics. Institutions have achieved tremendous gains by developing and compressing their analytic models and by further validating the variables in those models to obtain greater reliability [6].
C. Improve delinquent collections
Prepaid phone services are widespread among consumers around the world, whereas in the US mobile telecom market post-pay phone services are dominant. This forces telecom companies to conduct exhaustive searches to trace consumers' debts. High-performance analytics has, again, changed the way in which US mobile telecom companies calculate how much credit consumers owe them and how much credit is left [6].

IV. CHALLENGES IN BIG DATA ANALYTICS
In the past, the amount of data was not that large, and variety, velocity and volume were never a serious issue in data processing and analysis. Nowadays, new data types are being created and used in our systems. Moreover, the amount of incoming data has been increasing tremendously, resulting in huge repositories that need to be processed.
Decision makers have also been looking for techniques to process and analyse their data, which has led to the use of Big Data analytics. Before deploying Big Data analytics techniques, organizations should take the major challenges into account. Organizations could encounter many challenges in dealing with big data; the main ones are highlighted and explored below:
A. Internet users' privacy
Privacy is the most sensitive issue for organizations as well as for people around the world. “Because privacy is a pillar of democracy, we must remain alert to the possibility that it might be compromised by the rise of new technologies, and put in place all necessary safeguards.” [3]. Social networks and mobile phones are widely used to exchange information, which makes it likely that others will misuse that information, whether accidentally or intentionally.
B. Data acquisition and sharing
Backing up data on magnetic tapes and keeping it in secure storage makes accessing the data slow and laborious. “An Indonesian mobile carrier estimated that it would take up to half a day of work to extract one day’s worth of backup data currently stored on magnetic tapes.” [3]. As a result, data stored this way is effectively out of reach for timely access or transfer. In addition, in pursuit of business success and competitive advantage, companies always look for the right partners to help them compete against their rivals. Partnership requires private data to be shared and exchanged between the partners, and two things have to be secured during the engagement period to fulfil that promise: reliable access to data streams, and access to backup data for retrospective analysis and data training purposes.
Inter-comparability of data and interoperability of systems are further technical issues encountered in accessing and sharing data, but they are less problematic than obtaining access, or a licence to access, the data held by partners [3].
C. Data analysis
The type of analysis implemented and the type of decision it is meant to inform are the main variables that determine the relevance and severity of the analytical challenges. In scientific research, scientists base their decisions on data collected from different sources. Similarly, policymakers may ask a question such as “What is the data telling us?”, and the answer tells them whether or not to change their policies. The human analyst’s input is critical whether it is fabricated or real. In many cases, decisions have gone wrong because of a mismatch in data analysis. A good example is Google Flu Trends, whose ability to “detect influenza epidemics in areas with a large population of web search users” was put to the test. A group of medical experts compared Google Flu Trends data from 2003 to 2008 with data from two different networks (see footnote 2) and found that Google Flu Trends did not predict actual flu very well even though “they did a very good job at predicting nonspecific respiratory illnesses (bad colds and other infections like SARS) that seem like the flu. The mismatch was due to the presence of infections causing symptoms that resemble those of influenza, and the fact that influenza is not always associated with influenza-like symptoms” [3].
Footnote 2: The CDC's influenza-like-illness surveillance network and the CDC's virologic surveillance system.
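The kind of check described in the Google Flu Trends example, comparing one predicted series against two different reference series, can be sketched in a few lines of Python. This is only an illustrative sketch and not the method used by the experts cited in [3]: the function names are hypothetical, the Pearson correlation is an assumed choice of comparison measure, and the weekly values are invented for illustration rather than taken from any real surveillance data.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def compare_against_references(estimate, references):
    """Correlate one estimated series with each named reference series."""
    return {name: pearson(estimate, series) for name, series in references.items()}


# Hypothetical weekly values, invented for illustration only.
search_estimate = [10, 12, 15, 20, 26, 30, 27, 21, 16, 12]  # search-based estimate
ili_reports = [11, 13, 16, 19, 25, 31, 28, 20, 15, 13]      # influenza-like illness
confirmed_flu = [2, 2, 3, 9, 4, 5, 10, 3, 8, 2]             # lab-confirmed influenza

if __name__ == "__main__":
    scores = compare_against_references(
        search_estimate,
        {"influenza-like illness": ili_reports, "confirmed influenza": confirmed_flu},
    )
    for name, r in scores.items():
        print(f"{name}: r = {r:.2f}")
```

With these made-up numbers the estimate tracks the influenza-like-illness series far more closely than the confirmed-influenza series, which is the shape of mismatch the example describes: a model can agree well with one reference signal and still be a poor predictor of the quantity the decision maker actually cares about.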
  • 3. V. BIG DATA ANALYTICS RISKS
A. Data privacy
The data acquired from different sources for analysis belongs to Internet users. Knowingly or unknowingly, institutions use Internet users' public and private data to make decisions, good or bad.
B. Making false decisions
The results of Big Data analytics are predictions. Any failure in analysing the data, or any mistake in the course of a Big Data analysis, will result in false decisions, and institutions then take false actions based on those decisions. Even when results have been achieved and critical questions answered, verifying the results remains a critical step in the whole process. Decision makers do not want to risk their businesses by relying on unverified results. Data scientists therefore review the whole knowledge discovery phase by retracing the methods used, understanding the results and critically subjecting the analysis to quality tests.
C. Over-dependence on data
Big Data analytics results are just predictions. Not treating them as certainties may therefore help institutions avoid many losses. Big Data analytics processes require talent, a lot of work and validation. In other words, it is an iterative process, and other analytic techniques are needed alongside it.

VI. BIG DATA ANALYTIC PROCESS
Since Big Data consists of a tremendous amount of data that traditional processors cannot handle, the Big Data analytic process must be carried out in two phases [2], as shown in Fig. 2.
A. Knowledge discovery
In this phase, data has to undergo certain preprocessing to make it meaningful and coherent. There are five steps an organization should go through before proceeding to the second phase:
1) Acquisition: The first step is acquiring the data to be analysed from different repositories. Acquiring the data requires access to the information as well as methods for gathering it.
Such methods include tracking websites, reading machine sensors and system or application log files, and writing queries to a search engine. In some cases, data from sources external to the institution might be required as well.
2) Pre-processing: To achieve trustworthy and useful results, data has to be organized and classified based on its format.
3) Integration: In this step, data is completely retrieved and organized. Redundant and clustered data are eliminated, and the data becomes a smaller representative sample.
4) Analysis: In the analysis step, data is analysed by describing and predicting broad trends. In addition, researchers start searching for relationships among the data, looking for answers to their questions. Answers could be factors such as customers' tendencies in buying mobile phones.
5) Interpretation
Fig. 2 Big Data Analytic Processes
B. Application implementation
In this phase, after the data has gone through a series of processes with the help of certain algorithms, it is fed into an application owned by the institution to determine what the institution should do and how. For instance, the application may predict customer behaviour when buying the institution's products, which shops or markets customers prefer, and what they may buy. Locations can also be predicted while customers are travelling or driving. Based on these predictions, companies decide on the next appropriate actions to increase customer lifetime value and draw customers' attention to their products. In short, the institution reaps the benefits in this phase.

VII. TECHNIQUES AND APPROACHES TO OVERCOME THE CHALLENGES
Consumers are the ones mainly affected by all the issues discussed above, since business architectures contain private information about their stakeholders, and privacy concerns are growing as the value of big data becomes more apparent. On the other hand, organizations that would like to deliver the value of big data have to adopt a flexible, multidisciplinary approach.
Thus, a variety of techniques and approaches have been developed by academics and companies to analyse, manage, gather and visually represent data. Below are some of the techniques and solutions deployed by [1] for the challenges and issues discussed:
  • 4. A. NoSQL databases
NoSQL databases are distinct from relational databases (RDBMS), which use the Structured Query Language (SQL). SQL complements relational databases and is considered the domain-specific language for ad hoc queries, whereas non-relational databases can use whatever query mechanism they want: SQL is not included, but it can be added if needed [1]. Relational databases cannot maintain their performance when faced with a tremendous amount of data and a large number of transactions in a very short time. NoSQL databases have therefore given rise to a split set of solutions consisting of [1]:
1) Not Only SQL (NoSQL) solutions
2) SQL solutions
NoSQL and SQL solutions have been combined to achieve the highest performance in processing transactions, whereby institutions can maintain privacy. “Oracle corporation's solutions, for instance, implemented these techniques in its solutions for enterprises and successfully met all the challenges” [1], as shown in Fig. 3.
Fig. 3 Oracle’s Big Data Solutions
B. Cloud computing
Cloud computing has established a new era of less expensive and easy-to-use technology. It is a pool of computing resources that can be used on demand. Consequently, it has become a target for analysing and predicting users’ insights, and a significant home for Big Data analytics tools [4]. As customers have subscribed to cloud applications, cloud application providers have benefitted from these external data sources by analysing them together with their operational systems [4].
C. Crowdsourcing
Data is gained from numerous sources. Based on criteria that institutions have predetermined, labour is needed to collect data and carry out a series of processes to prepare it for analysis. Using formal focus groups or trend research to collect data adds further costs for institutions, whereas using the Internet to elicit customer feedback from various active communities reduces the time required and consequently reduces staffing costs and research expenses.
The techniques listed here are only some of the numerous techniques that can be applied to big data.

VIII. BIG DATA ANALYTICS AND ITS SERIOUS IMPACTS ON INTERNET USERS
Data is man-made. Every day billions of new pieces of data enter the Internet from different sources, beginning with social media, the most used source, and ending with machine sensors (e.g. surveillance cameras). Big Data analytics therefore definitely affects Internet users, who are the bottom line for companies, for instance. Internet users can be influenced by this technique in many ways:
• In marketing: Companies have been looking for the best ways to sell their products through advertising, and the Internet has become the most targeted place for advertising products and services. Big Data analytics has given companies another reason to use the Internet as the main medium for reaching their bottom lines. A pregnant woman walking down a street past a baby-goods shop might receive a message on her phone, located via GPS, from the shop advertising its products, since she is considered a potential customer [3]. Moreover, consumer tendencies can be predicted using Big Data analytics.
• In politics: Internet users’ leanings towards candidates in a presidential election can be predicted using Big Data analytics. “Before the votes were cast, New York Times blogger Nate Silver predicted, with 90%+ confidence, that Obama would win the election” [7]. The blogger used a Big Data analytic tool to predict the result.
• In smart healthcare: The lack of follow-up with patients after they leave hospital, and the failure to provide them with the necessary information on leaving, have increased readmission rates. From the moment a patient is admitted and fills in an admission form, through the stay, to discharge, plenty of information is collected about the patient that can help predict the likelihood of readmission. Using Big Data analytics, hospitals have managed to reduce the readmissions that might occur in the 30 days after a patient’s discharge [2].
• In education: Big Data analytics has had an intelligent impact on how much students learn. It has shown teachers which students need more attention, exercises and learning materials, as well as any changes needed in classes. In addition, Big Data analytics has brought enhancements to the education system by predicting student admission rates and dropout rates [2].
  • 5. • In network security: The Internet is a network of networks on which billions of users conduct different transactions every day, whether harmful or useful. Hence the need for network security has been rising along with the risks. Big Data analytics has improved network security, protecting networks from malfunctions, attacks and suspicious activities. Moreover, Big Data analytics predicts future attacks or threats likely to harm a network system [2]. Big Data analytics gathers the log files of a network's systems and processes them in a few steps, as shown in Fig. 2. Since log files record all attempts to access a server, to download or upload a file, or to access specific files, as well as system logins and email transmissions, potential threats can be estimated and network administrators can take precautionary measures against them.

IX. BIG DATA ANALYTICS LIMITATIONS
Big Data analytics has brought a new era of prediction techniques, saving decision makers a lot of time and money in predicting the potential future of any aspect of our lives. Big Data still has limitations, namely:
1) Specific context is critical. Big Data might save time and money, but without the specific context it is useless [8]. If an institution is trying to answer a question using Big Data analytics and the answer is not in the acquired data, Big Data is useless in that situation.
2) The three V’s of data should be considered. The three V's of Big Data should be considered before thinking of using its technologies. If an institution’s data is merely large in volume, Big Data analytic results are unreliable, since velocity and variety are absent from the data [8].
3) Traditional analytics cannot be replaced. Traditional analytic tools such as Oracle, MySQL, MS SQL and others cannot be replaced by Big Data analytics [8]. Traditional applications and systems are still widely used to answer the questions decision makers may ask.
Most institutions run their own applications and systems and have assigned robust databases to hold tremendous amounts of data. Answers can therefore be found easily in those systems, rather than wasting time on meeting the requirements of Big Data analytics and implementing its complex processes. However, large-scale institutions may still need both traditional analytics and Big Data analytics; the two are thus considered complementary.

CONCLUSION
Big Data analytics is a promising predictive technology for many aspects of life such as marketing, politics, healthcare systems, network security, education and many more. Big Data analytics will benefit many institutions that have unanswered or incompletely answered questions. In spite of its advantages, companies should take its risks and challenges into account before the adoption phase. Privacy violations, false decisions and over-dependence on Big Data analytics results are repercussions that unaware institutions might encounter.

REFERENCES
[1]. Dijcks, J.-P. (2012). Big Data for the Enterprise. Oracle and its Affiliates, 1–14.
[2]. Hunton and Williams LLP. (2013). Big Data and Analytics: Seeking Foundations for Effective Privacy Guidance. Centre for Information Policy Leadership, (February), 1–16.
[3]. Letouze, E. (2012). Big Data for Development: Challenges & Opportunities.
[4]. Manyika, J., Chui, M., Brown, B., Bughin, J., Dobbs, R., Roxburgh, C., & Hung Byers, A. (2011). Big data: The next frontier for innovation, competition, and productivity. McKinsey & Company, (June), 1–143.
[5]. Oracle. (2012). Oracle Information Architecture: An Architect’s Guide to Big Data. An Oracle White Paper, (August).
[6]. Spakes, G. (2012, 4 16). Four ways big data can benefit your business. Retrieved from SAS: http://www.sas.com/news/feature/big-data-benefits.html
[7]. Jim, R. (2013). Obama Wins and a big data lesson for the customer experience. Customer Relationship Metrics.
Retrieved October 12, 2013, from http://metrics.net/blog/2012/11/obamawins-big-data-lesson-customer-experience/
[8]. Jean, Y. (2013). Big Data, Bigger Opportunities: collaborate in the era of big data. Retrieved from http://www.meritalk.com/pdfs/bdx/bdx-whitepaper-090413.pdf
