Issues, Opportunities and Challenges
in Big Data
To Be Presented
By
Zaharaddeen Karami Lawal
Department Of Computer Science And Engineering.
Jodhpur National University.
Content
Introduction
Characteristics of Big Data.
Hadoop and HDFS.
MapReduce and Its Components.
Issues in Big Data.
Opportunities with Big Data.
Tackling Big Data Challenges.
Conclusion.
References.
Introduction
The concept of big data has been endemic within computer science since the
earliest days of computing. “Big Data” originally meant a volume of data
that could not be processed efficiently by traditional database methods
and tools.
In broad terms, big data can be described as data sets that are so large
or complex that they cannot be handled by traditional data processing
applications, especially when the data is unstructured or semi-structured.
• Big data size: ranges from terabytes (10^12 bytes) to petabytes (10^15 bytes).
• Around 2.5 quintillion (2.5 × 10^18) bytes of new data are created every day.
• 90% of the world’s data has been created in just the last 2 to 3 years.
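To put the figures above on a common scale, the following is a minimal sketch that converts the quoted daily volume into larger units and into an average rate per second. It assumes decimal (SI) units, as the slide does; the constant and function names are illustrative only.

```python
# Decimal (SI) byte units, matching the powers of ten quoted on the slide.
TERABYTE = 10**12
PETABYTE = 10**15
EXABYTE = 10**18

# Figure quoted on the slide: about 2.5 quintillion bytes of new data per day.
BYTES_PER_DAY = 25 * 10**17

SECONDS_PER_DAY = 24 * 60 * 60


def per_second(bytes_per_day):
    """Average creation rate in bytes per second for a given daily volume."""
    return bytes_per_day / SECONDS_PER_DAY


if __name__ == "__main__":
    print(f"{BYTES_PER_DAY / EXABYTE} exabytes per day")
    print(f"{BYTES_PER_DAY // PETABYTE} petabytes per day")
    print(f"{per_second(BYTES_PER_DAY) / TERABYTE:.1f} terabytes per second")
```

Under these assumptions the daily figure corresponds to 2.5 exabytes, or 2,500 petabytes, per day.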
Characteristics
Big data can be described by the following characteristics:
1. Volume - The quantity of data that is generated.
2. Variety - The category to which the data belongs (structured,
semi-structured or unstructured), which data analysts need to know.
3. Velocity - The speed at which data is generated and must be processed
to meet demand.
4. Veracity - The quality of the captured data, which can vary greatly.
The accuracy of an analysis depends on the veracity of the source data,
i.e. on how uncertain the data is.
Techniques for Big Data Mining
Hadoop and HDFS
Hadoop is a scalable, open-source, fault-tolerant virtual grid operating
system architecture for data storage and processing. It runs on commodity
hardware and uses HDFS, a fault-tolerant, high-bandwidth clustered
storage architecture. It runs MapReduce for distributed data processing and
works with both structured and unstructured data.
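One way to picture how clustered storage can tolerate node failures is to cut a file into blocks and keep several copies of each block on different machines. The sketch below is a toy model of that idea, not HDFS code: the round-robin placement, the node names, the block size and the replication factor are all assumptions chosen for illustration, and real HDFS placement is more involved.

```python
def split_into_blocks(file_size, block_size):
    """Return the size in bytes of each block a file is cut into."""
    full, rest = divmod(file_size, block_size)
    return [block_size] * full + ([rest] if rest else [])


def place_replicas(num_blocks, nodes, replication):
    """Assign every block to `replication` distinct nodes (toy round-robin)."""
    if replication > len(nodes):
        raise ValueError("replication factor exceeds number of nodes")
    return {
        block: [nodes[(block + r) % len(nodes)] for r in range(replication)]
        for block in range(num_blocks)
    }


def readable_after_failure(placement, failed_nodes):
    """True if every block still has at least one replica on a live node."""
    return all(
        any(node not in failed_nodes for node in replicas)
        for replicas in placement.values()
    )


if __name__ == "__main__":
    MB = 1024 * 1024
    blocks = split_into_blocks(300 * MB, 128 * MB)       # illustrative sizes
    nodes = ["node1", "node2", "node3", "node4"]         # hypothetical cluster
    placement = place_replicas(len(blocks), nodes, replication=3)
    print(placement)
    print(readable_after_failure(placement, {"node1", "node2"}))
```

With three copies of each block, the file in this toy cluster stays readable when any two nodes fail, which is the property the term "fault-tolerant" refers to.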
MapReduce
MapReduce is a programming model for processing large-scale datasets
in computer clusters. The MapReduce programming model consists of two
functions, map() and reduce().
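The two functions can be illustrated with the usual word-count example. The sketch below runs in a single Python process and only mimics the programming model: the names map_fn, reduce_fn and run_mapreduce are hypothetical, and a real cluster would run the map and reduce calls in parallel on many machines.

```python
from collections import defaultdict


def map_fn(_key, line):
    """map(): emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def reduce_fn(word, counts):
    """reduce(): combine all values emitted for one key into a single result."""
    return word, sum(counts)


def run_mapreduce(records, mapper, reducer):
    """Run mapper over all records, group by key, then run reducer per key."""
    groups = defaultdict(list)
    for key, value in records:
        for out_key, out_value in mapper(key, value):
            groups[out_key].append(out_value)      # the "shuffle" step
    return dict(reducer(k, v) for k, v in sorted(groups.items()))


if __name__ == "__main__":
    lines = [(0, "big data is big"), (1, "data is everywhere")]
    print(run_mapreduce(lines, map_fn, reduce_fn))
```

The grouping step between the two phases is what lets each reduce() call see every value for its key, regardless of which map() call produced it.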
Hadoop Ecosystem
Issues in Big Data
1. Storage and Transport Issues
The quantity of data has exploded each time we have invented a new storage
medium. What is different about the most recent explosion – due largely to social
media – is that there has been no new storage medium. Moreover, data is being
created by everyone and everything (e.g., Facebook, Twitter, WhatsApp).
2. Management Issues
Management will, perhaps, be the most difficult problem to address with big data.
This problem first surfaced a decade ago in the UK eScience initiatives where data
was distributed geographically and “owned” and “managed” by multiple entities.
3. Processing Issues
Assume that an exabyte of data needs to be processed in its entirety. For
simplicity, assume the data is chunked into blocks of 8 words, and recall that
1 exabyte = 1K petabytes. Assuming a processor expends 100 instructions on one
block at 5 gigahertz, the time required to process one block end to end would be
20 nanoseconds. Processing the full 1K petabytes would then require a total
end-to-end processing time of roughly 635 years.
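The arithmetic in the processing example can be checked with a short calculation. The sketch below makes several assumptions that the slide does not state: one instruction per clock cycle, a single processor, a 365.25-day year, and one block per byte. The last assumption is used as the default because it is the reading under which the total comes out near the quoted 635 years; the slide itself does not give a word size.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def seconds_per_block(instructions, clock_hz):
    """Time to process one block, assuming one instruction per clock cycle."""
    return instructions / clock_hz


def total_years(total_bytes, block_bytes, instructions, clock_hz):
    """Sequential time to process all blocks on a single processor, in years."""
    blocks = total_bytes / block_bytes
    return blocks * seconds_per_block(instructions, clock_hz) / SECONDS_PER_YEAR


if __name__ == "__main__":
    EXABYTE = 10**18
    print(seconds_per_block(100, 5e9))                   # seconds per block
    # Assumed reading: one block per byte, i.e. 10**18 blocks.
    print(total_years(EXABYTE, 1, 100, 5e9))
    # Alternative assumption: 64-byte blocks (8 words of 8 bytes each).
    print(total_years(EXABYTE, 64, 100, 5e9))
```

Larger blocks shorten the total considerably, so the headline figure depends strongly on what a "block" is taken to be.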
Big Data Opportunities
Manageability - When data can grow in a single file system namespace, the
manageability of the system increases significantly: a single data
administrator can manage a petabyte or more of storage, versus 50 or
100 terabytes on a scale-up system.
Elimination of stovepipes - Since these systems scale linearly and do not
have the bottlenecks that scale-up systems create, all data is kept in a single
file system in a single grid, eliminating the stovepipes introduced by the
multiple arrays and file systems otherwise required.
Just-in-time scalability - As storage needs grow, an appropriate number of
nodes can be added to meet those needs at the time they arise.
Increased utilization rates - Since the data servers in these scale-out
systems can address the entire pool of storage, there is no stranded capacity.
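The idea of stranded capacity can be made concrete with a small numerical sketch. It compares separate storage arrays, where each workload can only use its own array, with a single pool that every workload can address. The capacities and demands are made-up numbers, and the function names are hypothetical.

```python
def siloed(capacities, demands):
    """Each workload may only use its own array.

    Returns (data stored, stranded capacity), where stranded capacity is free
    space on some arrays that cannot serve unmet demand on other arrays.
    """
    stored = [min(c, d) for c, d in zip(capacities, demands)]
    unmet = sum(d - s for d, s in zip(demands, stored))
    free = sum(c - s for c, s in zip(capacities, stored))
    return sum(stored), min(unmet, free)


def pooled(capacities, demands):
    """All workloads share one pool, so no free space is ever out of reach."""
    return min(sum(capacities), sum(demands)), 0


if __name__ == "__main__":
    capacities = [100, 100, 100]   # TB per array (illustrative)
    demands = [150, 40, 60]        # TB per workload (illustrative)
    print(siloed(capacities, demands))
    print(pooled(capacities, demands))
```

In this example the siloed layout leaves demand unserved while other arrays still have free space, whereas the pooled layout stores everything on the same hardware.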
Big Data Challenges
• Heterogeneity and Incompleteness
The difficulties of big data analysis derive from its large scale as well as
from the presence of mixed data based on different patterns or rules
(heterogeneous mixture data) in the collected and stored data. In
complicated heterogeneous mixture data, the data follows several patterns and
rules, and the properties of those patterns vary greatly. Data can be both
structured and unstructured. Nowadays, 80% of the data generated by
organizations is unstructured.
• Scale and complexity
Managing large and rapidly increasing volumes of data is a challenging
issue. Traditional software tools are not sufficient for managing these
volumes. Data analysis, organization, retrieval and modeling are
also challenging because of the scale and complexity of the data that needs to be
analyzed.
• Timeliness
As the size of the data sets to be processed increases, analysis takes more
time. In some situations, the results of the analysis are required
immediately. For example, if a fraudulent credit card transaction is
suspected, it should ideally be flagged before the transaction is completed,
preventing the transaction from taking place at all. Likewise, any delay in a
stock exchange can cause huge losses.
• Data Ownership
Data ownership presents a critical and ongoing challenge, particularly in
the social media arena. While petabytes of social media data reside on the
servers of Facebook, MySpace, and Twitter, the data is not really owned by them
(although they may argue that it is, because of residency). Certainly, the
“owners” of the pages or accounts believe they own the data. This
dichotomy will have to be resolved in court.
Other challenges are:
• Availability
• Inconsistency
• Performance
• Privacy
• Security
• Infrastructure faults
• Extreme Data Distribution
• Dynamic Design Challenges
Tackling Big Data Challenges
Hadoop
Apache Hadoop and HDFS are widely used for storing and managing big data.
Analyzing big data is a challenging task, as it involves large distributed file systems,
which should be fault tolerant, flexible and scalable.
Spark
Spark's ability to handle advanced data processing tasks, such as real-time stream
processing and machine learning, is well ahead of that of Hadoop.
NoSQL Databases
• They do not rely on the relational model and do not use the SQL language;
• they tend to run on cluster architectures;
• they do not have a fixed schema, allowing any data to be stored in any record.
Presto
• An efficient big data system developed by data engineers at the popular social
networking site Facebook.
• An open-source distributed SQL query engine for running interactive analytical
queries against data sources of all sizes, ranging from gigabytes to petabytes.
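The absence of a fixed schema can be illustrated with a tiny in-memory key-value store in which each record is a free-form dictionary. This is only a sketch of the idea and not the API of any particular NoSQL product; the class name DocumentStore, its methods and the sample records are hypothetical.

```python
class DocumentStore:
    """Toy schemaless store: every record may have a different set of fields."""

    def __init__(self):
        self._records = {}

    def put(self, key, record):
        """Store a record under a key; no schema is checked or enforced."""
        self._records[key] = dict(record)

    def get(self, key):
        """Return the record stored under a key."""
        return self._records[key]

    def find(self, field, value):
        """Return keys of records that have `field` equal to `value`."""
        return [k for k, r in self._records.items() if r.get(field) == value]


if __name__ == "__main__":
    store = DocumentStore()
    # Two records with different fields live side by side in the same store.
    store.put("u1", {"name": "Asha", "city": "Jodhpur"})
    store.put("u2", {"name": "Ben", "followers": 120, "tags": ["hadoop"]})
    print(store.find("city", "Jodhpur"))
    print(store.get("u2"))
```

Records that lack a field are simply skipped by find(), which is the flexibility, and also the responsibility, that a schemaless design hands to the application.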
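An interactive analytical query is typically a short SQL aggregate over a large table. The sketch below shows the shape of such a query. It does not use Presto: Python's built-in sqlite3 module stands in as a local, single-machine engine so that the example is runnable, and the table name, columns and rows are made up for illustration.

```python
import sqlite3

# A typical analytical aggregate: group rows and summarize each group.
QUERY = """
SELECT region, COUNT(*) AS order_count, SUM(amount) AS revenue
FROM orders
GROUP BY region
ORDER BY region
"""


def run_demo():
    """Load a few made-up rows into an in-memory table and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [("north", 10.0), ("south", 5.0), ("north", 2.5)],
    )
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in run_demo():
        print(row)
```

A distributed engine would split the same kind of query across many workers, but the user still writes ordinary SQL.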
Presto Architecture
Conclusion
The concept of big data and the HDFS component were introduced.
The issues regarding the storage, processing and management of big data
were discussed.
The opportunities and challenges facing big data from data storage
and analytics perspectives were also discussed.
The measures that can be taken to overcome those challenges were also
discussed.
References
[1] S. Kaisler, F. Armour and J. A. Espinosa, "Big Data: Issues and Challenges Moving Forward," in 46th
Hawaii International Conference on System Sciences, Hawaii, 2013.
[2] www.studymafia.org. [Online].
[3] D. J. S. Kiran, M. Sravanthi, K. Preethi and M. Anusha, "Recent Issues and Challenges on Big Data in
Cloud Computing," International Journal of Computer Science and Technology (IJCST), vol. 6, no. 2,
April - June 2015.
[4] American Institute of Physics (AIP), College Park, MD, 2010. [Online]. Available:
http://www.aip.org/fyi/2010/
[5] D. Borthakur, The Hadoop Distributed File System Architecture and Design, 2007.
[6] A. Johnson, H. P.H, V. Paul and M. S. P.N, "Big Data Processing Using Hadoop MapReduce,"
International Journal of Computer Science and Information Technologies (IJCSIT), vol. 6 (1), pp.
127-132, 2015.
[7] "Hadoop," [Online]. Available: http://wiki.apache.org/hadoop/PoweredBy.
[8] C. Ordonez, Algorithms and Optimizations for Big Data Analytics, University of Houston, USA:
Cubes, Tech Talks.

  • 1. Issues, Opportunities and Challenges in Big Data. To be presented by Zaharaddeen Karami Lawal, Department of Computer Science and Engineering, Jodhpur National University.
  • 2. Content Introduction Characteristics of Big Data. Hadoop and HDFS. Map Reduced and Its Component. Issues in Big Data. Opportunities with Big Data. Tackling Big Data Challenges. Conclusion. References.
  • 3. Introduction. The concept of big data has been endemic within computer science since the earliest days of computing. “Big Data” originally meant a volume of data that could not be processed efficiently by traditional database methods and tools. Broadly, big data can be described as data sets so large or complex that traditional data processing applications cannot handle them, especially when the data is unstructured or semi-structured. Big Data sizes range from terabytes (10^12 bytes) to petabytes (10^15 bytes). Around 2.5 quintillion (2.5 × 10^18) bytes of new data are created every day. 90% of the world’s data has been created in just the last 2 to 3 years.
  • 4. Characteristics. Big data can be described by the following characteristics: 1. Volume: the quantity of data that is generated. 2. Variety: the category to which the data belongs, an essential fact that data analysts need to know. 3. Velocity: the speed at which data is generated and processed to meet the demands and challenges of growth and development. 4. Veracity: the quality of the captured data can vary greatly, and the accuracy of any analysis depends on the veracity of the source data, i.e. on how uncertain the data is.
  • 5. Techniques for Big Data Mining. Hadoop and HDFS: Hadoop is a scalable, open-source, fault-tolerant virtual grid operating system architecture for data storage and processing. It runs on commodity hardware and uses HDFS, a fault-tolerant, high-bandwidth clustered storage architecture. It runs MapReduce for distributed data processing and works with both structured and unstructured data. MapReduce: MapReduce is a programming model for processing large-scale datasets on computer clusters. The model consists of two functions, map() and reduce().
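The map() and reduce() pairing described on this slide can be illustrated with a word count. What follows is a minimal single-process sketch in plain Python, not Hadoop's actual API: the names `map_fn`, `shuffle`, `reduce_fn` and `run_job` are hypothetical, and the `shuffle` helper only stands in for the grouping step that a real cluster framework performs between the two phases.

```python
from collections import defaultdict


def map_fn(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key (a stand-in for the framework's shuffle)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_fn(key, values):
    """Reduce step: combine all values that were emitted for one key."""
    return key, sum(values)


def run_job(lines):
    """Run map, shuffle and reduce in sequence inside a single process."""
    mapped = [pair for line in lines for pair in map_fn(line)]
    grouped = shuffle(mapped)
    return dict(reduce_fn(key, values) for key, values in grouped.items())


counts = run_job(["big data is big", "data is everywhere"])
```

On a cluster, many `map_fn` calls would run in parallel on different blocks of the input, and each `reduce_fn` call would receive the values for one key from all of them; the sketch keeps only the data flow.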
  • 7. Issues in Big Data. 1. Storage and Transport Issues: The quantity of data has exploded each time we have invented a new storage medium. What is different about the most recent explosion, due largely to social media, is that there has been no new storage medium. Moreover, data is being created by everyone and everything (e.g., Facebook, Twitter, WhatsApp). 2. Management Issues: Management will perhaps be the most difficult problem to address with big data. This problem first surfaced a decade ago in the UK eScience initiatives, where data was distributed geographically and “owned” and “managed” by multiple entities. 3. Processing Issues: Assume that an exabyte of data needs to be processed in its entirety. For simplicity, assume the data is chunked into blocks of 8 words, and note that 1 exabyte = 1K petabytes. Assuming a processor expends 100 instructions on one block at 5 gigahertz, processing one block end to end would take 20 nanoseconds. Processing the full 1K petabytes would then require a total end-to-end processing time of roughly 635 years.
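The processing estimate on this slide can be checked with a few lines of arithmetic. The sketch below assumes one instruction per clock cycle and a 365-day year, neither of which the slide states. The slide also leaves the word size open, so the block size is a parameter: counting the exabyte as 10^18 blocks (one byte per block) gives a figure close to the quoted 635 years, while the 64-byte block (8 words of 8 bytes each) is purely an illustrative assumption.

```python
INSTRUCTIONS_PER_BLOCK = 100        # from the slide
CLOCK_HZ = 5e9                      # 5 gigahertz, from the slide
EXABYTE_BYTES = 10**18              # 1 exabyte = 1K petabytes
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: 365-day year


def seconds_per_block():
    """Time to process one block, assuming one instruction per cycle."""
    return INSTRUCTIONS_PER_BLOCK / CLOCK_HZ


def years_to_process(total_bytes, bytes_per_block):
    """Total sequential processing time in years for the given block size."""
    blocks = total_bytes / bytes_per_block
    return blocks * seconds_per_block() / SECONDS_PER_YEAR


# 10^18 blocks: the reading that roughly reproduces the quoted figure.
years_one_byte_blocks = years_to_process(EXABYTE_BYTES, 1)

# Hypothetical 64-byte blocks (8 words of 8 bytes each).
years_64_byte_blocks = years_to_process(EXABYTE_BYTES, 64)

print(f"per block: {seconds_per_block() * 1e9:.0f} ns")
print(f"1-byte blocks: {years_one_byte_blocks:.0f} years")
print(f"64-byte blocks: {years_64_byte_blocks:.1f} years")
```

Whichever block size is assumed, the estimate describes a single processor working sequentially, which is the case for distributing the work across a cluster.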
  • 8. Big Data Opportunities. Manageability: when data can grow in a single file system namespace, the manageability of the system increases significantly, and a single data administrator can manage a petabyte or more of storage, versus 50 or 100 terabytes on a scale-up system. Elimination of stovepipes: since these systems scale linearly and do not have the bottlenecks that scale-up systems create, all data is kept in a single file system in a single grid, eliminating the stovepipes introduced by the multiple arrays and file systems otherwise required. Just-in-time scalability: as storage needs grow, an appropriate number of nodes can be added at the time they are needed. Increased utilization rates: since the data servers in these scale-out systems can address the entire pool of storage, there is no stranded capacity.
  • 9. Big Data Challenges. Heterogeneity and incompleteness: The difficulties of big data analysis derive from its large scale as well as from the presence, in the collected and stored data, of mixed data based on different patterns or rules (heterogeneous mixture data). In complicated heterogeneous mixture data, the data has several patterns and rules, and the properties of the patterns vary greatly. Data can be both structured and unstructured. Nowadays, 80% of the data generated by organizations is unstructured. Scale and complexity: Managing large and rapidly increasing volumes of data is a challenging issue, and traditional software tools are not sufficient for it. Data analysis, organization, retrieval and modeling are also challenging because of the scale and complexity of the data that needs to be analyzed.
  • 10. Big Data Challenges (continued). Timeliness: As the size of the data sets to be processed increases, analysis takes more time. In some situations, the results of the analysis are required immediately. For example, if a fraudulent credit card transaction is suspected, it should ideally be flagged before the transaction is completed, preventing it from taking place at all. Any delay in a stock exchange can cause huge losses. Data ownership: Data ownership presents a critical and ongoing challenge, particularly in the social media arena. While petabytes of social media data reside on the servers of Facebook, MySpace, and Twitter, the data is not really owned by them (although they may argue that it is, because of residency). Certainly, the “owners” of the pages or accounts believe they own the data. This dichotomy will have to be resolved in court.
  • 11. Big Data Challenges (continued). Other challenges are: availability; inconsistency; performance; privacy; security; infrastructure faults; extreme data distribution; dynamic design challenges.
  • 12. Tackling Big Data Challenges. Hadoop: Apache Hadoop and HDFS are widely used for storing and managing Big Data. Analyzing Big Data is a challenging task, as it involves large distributed file systems that should be fault tolerant, flexible and scalable. Spark: its ability to handle advanced data processing tasks, such as real-time stream processing and machine learning, is well ahead of Hadoop's. NoSQL databases: they do not rely on the relational model and do not use the SQL language; they tend to run on cluster architectures; they do not have a fixed schema, so any record can hold any data. Presto: an efficient Big Data system developed by data engineers at Facebook, the popular social networking site; an open-source distributed SQL query engine for running interactive analytical queries against data sources of all sizes, ranging from gigabytes to petabytes.
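The "no fixed schema" property listed for NoSQL databases on this slide can be sketched with a toy in-memory document collection. This is an illustration only, not the API of any real NoSQL product: `collection`, `insert` and `find` are hypothetical names, and the records are stored as JSON strings simply to mimic a document store.

```python
import json

# Hypothetical in-memory "collection": each record is stored as a JSON document.
collection = []


def insert(record):
    """Store a record without checking it against any schema."""
    collection.append(json.dumps(record))


def find(field, value):
    """Return every document whose given field equals the given value."""
    documents = (json.loads(raw) for raw in collection)
    return [doc for doc in documents if doc.get(field) == value]


# Two records with different fields live side by side in one collection.
insert({"user": "alice", "followers": 120})
insert({"user": "bob", "city": "Jodhpur", "tags": ["hadoop", "spark"]})

matches = find("city", "Jodhpur")
```

A relational table would require both rows to share the same columns; here a query on `city` simply skips documents that lack the field.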
  • 14. Conclusion. The concept of Big Data and the HDFS component were introduced. The issues regarding the storage, processing and management of Big Data were discussed. The opportunities and challenges facing Big Data from data storage and analytics perspectives were also discussed, as were the measures that can be taken to overcome those challenges.
  • 15. References. [1] S. Kaisler, F. Armour and J. A. Espinosa, "Big Data: Issues and Challenges Moving Forward," in 46th Hawaii International Conference on System Sciences, Hawaii, 2013. [2] "www.studymafia.org," [Online]. [3] D. J. S. Kiran, M. Sravanthi, K. Preethi and M. Anusha, "Recent Issues and Challenges on Big Data in Cloud Computing," International Journal of Computer Science and Technology (IJCST), vol. 6, no. 2, April-June 2015. [4] "http://www.aip.org/fyi/2010/," American Institute of Physics (AIP), College Park, MD, 2010. [Online]. [5] D. Borthakur, The Hadoop Distributed File System: Architecture and Design, 2007. [6] A. Johnson, H. P.H, V. Paul and M. S. P.N, "Big Data Processing Using Hadoop MapReduce," International Journal of Computer Science and Information Technologies (IJCSIT), vol. 6, no. 1, pp. 127-132, 2015. [7] "Hadoop," [Online]. Available: http://wiki.apache.org/hadoop/PoweredBy. [8] C. Ordonez, Algorithms and Optimizations for Big Data Analytics: Cubes, Tech Talks, University of Houston, USA.