Tutorial 9
a)
NewSQL is a class of relational database management systems that seek to provide the scalability of
NoSQL systems for online transaction processing (OLTP) workloads while maintaining the ACID
guarantees of a traditional database system. ... NewSQL systems attempt to reconcile the conflicts.
b)
There are three defining properties that can help break down the term. Dubbed the three Vs
(volume, velocity, and variety), these are key to understanding how we can measure big data and just
how different ‘big data’ is from old-fashioned data.
Volume
The most obvious one is where we’ll start. Big data is about volume: volumes of data that can reach
unprecedented heights. It’s estimated that 2.5 quintillion bytes of data is created each day,
with projections of 40 zettabytes of data created by 2020, an increase of
300 times from 2005. As a result, it is now not uncommon for large companies to have terabytes,
and even petabytes, of data in storage devices and on servers. This data helps to shape the future
of a company and its actions, all while tracking progress.
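
The figures quoted for volume can be put on a common scale with a little arithmetic. The sketch
below is only illustrative: it assumes decimal (SI) units, where 1 exabyte is 10^18 bytes and
1 zettabyte is 10^21 bytes, and the constant and function names are made up for this example.

```python
# Unit arithmetic for the quoted volume figures (decimal SI units assumed).
EXABYTE = 10**18    # bytes
ZETTABYTE = 10**21  # bytes

DAILY_BYTES = 25 * 10**17          # 2.5 quintillion bytes per day
TOTAL_2020_BYTES = 40 * ZETTABYTE  # 40 zettabytes by 2020
GROWTH_FACTOR = 300                # "300 times" relative to 2005


def to_exabytes(n_bytes: int) -> float:
    """Convert a byte count to exabytes."""
    return n_bytes / EXABYTE


daily_eb = to_exabytes(DAILY_BYTES)                 # exabytes created per day
yearly_zb = DAILY_BYTES * 365 / ZETTABYTE           # zettabytes per year at that rate
implied_2005_eb = to_exabytes(TOTAL_2020_BYTES) / GROWTH_FACTOR  # 2005 baseline implied by "300 times"

print(f"Daily volume: {daily_eb} EB")
print(f"Yearly volume at that rate: {yearly_zb:.2f} ZB")
print(f"Implied 2005 total: {implied_2005_eb:.0f} EB")
```

Expressed this way, 2.5 quintillion bytes is 2.5 exabytes per day, or a little under one zettabyte
per year at a constant rate.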
Velocity
The growth of data, and its resulting importance, has changed the way we see data. There once
was a time when we didn’t see the importance of data in the corporate world, but with the change
in how we gather it, we’ve come to rely on it day to day. Velocity essentially measures how fast the
data is coming in. Some data will come in in real time, whereas other data will come in fits and starts,
sent to us in batches. And as not all platforms will experience the incoming data at the same pace,
it’s important not to generalise, discount, or jump to conclusions without having all the facts and
figures.
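
One way to make velocity concrete is to measure the arrival rate of records. The sketch below is a
minimal illustration, not a standard metric: the timestamps and the `arrival_rate` helper are
hypothetical, and it simply contrasts a steady real-time feed with data delivered in batches.

```python
from typing import Sequence


def arrival_rate(timestamps: Sequence[float]) -> float:
    """Average records per second between the first and last arrival."""
    if len(timestamps) < 2:
        raise ValueError("need at least two timestamps")
    span = timestamps[-1] - timestamps[0]
    if span <= 0:
        raise ValueError("timestamps must cover a positive time span")
    return (len(timestamps) - 1) / span


# Hypothetical arrival times in seconds.
real_time_feed = [0.0, 1.0, 2.0, 3.0, 4.0]      # one record every second
batch_feed = [0.0, 0.1, 0.2, 60.0, 60.1, 60.2]  # two bursts a minute apart

print(arrival_rate(real_time_feed))  # steady stream
print(arrival_rate(batch_feed))      # bursty batches, much lower average rate
```

The average alone hides the bursts in the second feed, which is one reason the pace of each source
should be looked at separately before drawing conclusions.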
Variety
Data was once collected from one place and delivered in one format. Once taking the shape of
database files, such as Excel, CSV and Access, it is now being presented in non-traditional forms, like
video, text, PDF, and graphics on social media, as well as via tech such as wearable devices. Although
this data is extremely useful to us, it does create more work and require more analytical skills to
decipher this incoming data, make it manageable and allow it to work.
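
The extra work that variety creates can be seen in a small sketch. It assumes two hypothetical
inputs carrying the same kind of reading, one as CSV and one as nested JSON, and shows each being
converted into a common record shape. The sample data, field names and function names are invented
for illustration.

```python
import csv
import io
import json

# Hypothetical sample inputs: the same kind of reading in two formats.
CSV_TEXT = "device,heart_rate\nwatch-1,72\nwatch-2,65\n"
JSON_TEXT = '[{"device": "band-7", "vitals": {"heart_rate": 80}}]'


def records_from_csv(text: str) -> list[dict]:
    """Structured input: every row has the same named columns."""
    reader = csv.DictReader(io.StringIO(text))
    return [{"device": row["device"], "heart_rate": int(row["heart_rate"])}
            for row in reader]


def records_from_json(text: str) -> list[dict]:
    """Semi-structured input: fields are nested and may be missing."""
    records = []
    for item in json.loads(text):
        rate = item.get("vitals", {}).get("heart_rate")
        records.append({"device": item.get("device"), "heart_rate": rate})
    return records


combined = records_from_csv(CSV_TEXT) + records_from_json(JSON_TEXT)
print(combined)
```

Each new format needs its own small adapter like these before the data can be analyzed together.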
1) Set a big data strategy
At a high level, a big data strategy is a plan designed to help you oversee and improve the way you
acquire, store, manage, share and use data within and outside of your organization. A big data
strategy sets the stage for business success amid an abundance of data. When developing a strategy,
it’s important to consider existing and future business and technology goals and initiatives. This
calls for treating big data like any other valuable business asset rather than just a byproduct of
applications.
2) Know the sources of big data
Streaming data comes from the Internet of Things (IoT) and other connected devices that flow into
IT systems from wearables, smart cars, medical devices, industrial equipment and more. You can
analyze this big data as it arrives, deciding which data to keep or not keep, and which needs further
analysis.
Social media data stems from interactions on Facebook, YouTube, Instagram, etc. This includes vast
amounts of big data in the form of images, videos, voice, text and sound, useful for marketing, sales
and support functions. This data is often in unstructured or semistructured forms, so it poses a
unique challenge for consumption and analysis.
Publicly available data comes from massive amounts of open data sources like the US government’s
data.gov, the CIA World Factbook or the European Union Open Data Portal.
Other big data may come from data lakes, cloud data sources, suppliers and customers.
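
The idea of analyzing streaming data as it arrives, deciding what to keep, what to drop and what
needs further analysis, can be sketched as a simple filter. This is a toy illustration only: the
sensor, its thresholds and the `triage` function are assumptions made for the example, not part of
any particular streaming product.

```python
from typing import Iterable, Iterator

# Hypothetical thresholds for a temperature sensor, in degrees Celsius.
VALID_RANGE = (-40.0, 125.0)
ALERT_ABOVE = 90.0


def triage(readings: Iterable[float]) -> Iterator[tuple[str, float]]:
    """Label each reading as it arrives, without storing the whole stream."""
    low, high = VALID_RANGE
    for value in readings:
        if not low <= value <= high:
            yield ("discard", value)   # outside the sensor's range: likely noise
        elif value > ALERT_ABOVE:
            yield ("analyze", value)   # unusual: flag for further analysis
        else:
            yield ("keep", value)      # ordinary reading: store as-is


stream = [21.5, 22.0, 300.0, 95.2, -50.0]
decisions = list(triage(stream))
print(decisions)
```

Because `triage` is a generator, each reading is labeled as soon as it is seen, which matches the
"as it arrives" style of processing described for streaming sources.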
3) Access, manage and store big data
Modern computing systems provide the speed, power and flexibility needed to quickly access
massive amounts and types of big data. Along with reliable access, companies also need methods for
integrating the data, ensuring data quality, providing data governance and storage, and preparing
the data for analytics. Some data may be stored on-premises in a traditional data warehouse, but
there are also flexible, low-cost options for storing and handling big data via cloud solutions, data
lakes and Hadoop.
4) Analyze big data
With high-performance technologies like grid computing or in-memory analytics, organizations can
choose to use all their big data for analyses. Another approach is to determine upfront which data is
relevant before analyzing it. Either way, big data analytics is how companies gain value and insights
from data. Increasingly, big data feeds today’s advanced analytics endeavors such as artificial
intelligence.
5) Make intelligent, data-driven decisions
Well-managed, trusted data leads to trusted analytics and trusted decisions. To stay competitive,
businesses need to seize the full value of big data and operate in a data-driven way, making
decisions based on the evidence presented by big data rather than gut instinct. The benefits of being
data-driven are clear. Data-driven organizations perform better, are operationally more predictable
and are more profitable.
c)
HDFS Assumptions and Goals
I. Hardware failure
Hardware failure is no longer the exception; it has become the norm. An HDFS instance may consist of
hundreds or thousands of server machines, each of which stores part of the file system’s data.
There is a huge number of components, each of which is susceptible to hardware failure. This means
that some components are always non-functional. So a core architectural goal of
HDFS is quick and automatic fault detection and recovery.
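
Why failure becomes routine at this scale can be shown with a short calculation. The sketch below
assumes that nodes fail independently and that each one is down with the same probability; both
the 0.1% figure and the `prob_any_failure` helper are hypothetical values chosen for illustration,
not measurements from a real cluster.

```python
def prob_any_failure(p_node: float, n_nodes: int) -> float:
    """Probability that at least one of n independent nodes is down."""
    if not 0.0 <= p_node <= 1.0:
        raise ValueError("p_node must be a probability")
    # All nodes are up with probability (1 - p)^n; take the complement.
    return 1.0 - (1.0 - p_node) ** n_nodes


# Hypothetical assumption: each machine is down 0.1% of the time.
for n in (1, 100, 1000, 5000):
    print(n, round(prob_any_failure(0.001, n), 3))
```

Under these assumptions the chance that something is broken grows quickly with cluster size, which
is why detection and recovery have to be automatic rather than manual.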
II. Streaming data access
HDFS applications need streaming access to their data sets. Hadoop HDFS is mainly designed for
batch processing rather than interactive use by users. The emphasis is on high throughput of data access
rather than low latency of data access. The focus is on retrieving data at the highest possible
rate, for example when analyzing logs.
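
The trade-off between throughput and latency can be illustrated with a simple cost model: the time
to read a file is a fixed startup latency plus the transfer time. The model and the figures below
(10 ms latency, 100 MB/s bandwidth) are assumptions for illustration, not HDFS measurements.

```python
def read_time_seconds(size_bytes: float, latency_s: float,
                      bandwidth_bytes_per_s: float) -> float:
    """Time to read one file: fixed startup latency plus transfer time."""
    return latency_s + size_bytes / bandwidth_bytes_per_s


# Hypothetical figures: 10 ms startup latency, 100 MB/s sustained bandwidth.
LATENCY = 0.010
BANDWIDTH = 100 * 10**6

small = read_time_seconds(10 * 10**3, LATENCY, BANDWIDTH)  # 10 KB file
large = read_time_seconds(10 * 10**9, LATENCY, BANDWIDTH)  # 10 GB file

# Share of the total read time that is spent on startup latency.
latency_share_small = LATENCY / small
latency_share_large = LATENCY / large

print(f"10 KB file: {latency_share_small:.1%} of the time is latency")
print(f"10 GB file: {latency_share_large:.4%} of the time is latency")
```

For the large sequential reads typical of batch jobs, almost all of the time goes to transfer, so
sustained bandwidth matters far more than how quickly the first byte arrives.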
III. Large datasets
HDFS works with large data sets. In standard practice, a file in HDFS ranges in size from gigabytes
to terabytes. The architecture of HDFS should be designed so that it is well suited to
storing and retrieving huge amounts of data. HDFS should provide high aggregate data bandwidth
and should be able to scale to hundreds of nodes in a single cluster. It should also be able
to handle tens of millions of files in a single instance.
IV. Simple coherency model
HDFS follows a write-once-read-many access model for files. Once a file is created, written,
and closed, it should not be changed. This resolves data coherency issues and enables high
throughput of data access. A MapReduce-based application or a web crawler application fits
this model perfectly. As per the Apache notes, there was a plan to support appending writes to files
in the future; later HDFS releases do support appends.
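
The write-once-read-many rule can be sketched with a toy in-memory class. This is not the HDFS
client API; `WriteOnceFile` and the file name are hypothetical and only model the rule that a file
may be written until it is closed and is read-only afterwards.

```python
class WriteOnceFile:
    """Toy in-memory file that rejects changes after it is closed."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._chunks: list[bytes] = []
        self._closed = False

    def write(self, data: bytes) -> None:
        if self._closed:
            raise PermissionError(f"{self.name} is closed and cannot be changed")
        self._chunks.append(data)

    def close(self) -> None:
        self._closed = True

    def read(self) -> bytes:
        # After close, the content is fixed, so every reader sees the same bytes.
        return b"".join(self._chunks)


log = WriteOnceFile("crawl-0001.log")
log.write(b"page A\n")
log.write(b"page B\n")
log.close()
print(log.read())

try:
    log.write(b"page C\n")
except PermissionError as err:
    print("rejected:", err)
```

Because closed files never change, readers do not have to coordinate with writers, which is the
coherency simplification this model is after.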
V. Moving computation is cheaper than moving data
If an application performs its computation near the data it operates on, it is much more efficient than
when the computation runs far away from the data. This effect becomes stronger with large data sets.
The main advantage is that it increases the overall throughput of the system. It also minimizes
network congestion. The assumption is that it is better to move computation closer to the data
instead of moving the data to the computation.
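
A scheduler that prefers to run work where the data already lives can be sketched in a few lines.
This is a toy model under stated assumptions: the block-to-node map, the node names and the
`pick_node` function are all hypothetical and do not reflect Hadoop's actual scheduler.

```python
# Hypothetical cluster layout: which nodes hold a copy of each block.
BLOCK_LOCATIONS = {
    "block-1": {"node-a", "node-b"},
    "block-2": {"node-c"},
}


def pick_node(block_id: str, free_nodes: list[str]) -> tuple[str, bool]:
    """Return (node, is_local): prefer a free node that already holds the block."""
    if not free_nodes:
        raise ValueError("no free nodes available")
    holders = BLOCK_LOCATIONS.get(block_id, set())
    for node in free_nodes:
        if node in holders:
            return node, True    # computation moves to the data
    return free_nodes[0], False  # fallback: the data must cross the network


print(pick_node("block-1", ["node-c", "node-b"]))
print(pick_node("block-2", ["node-a", "node-b"]))
```

In the first call a free node already holds the block, so only the (small) task has to move; in
the second call no holder is free, so the (large) block would have to be sent over the network.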
VI. Portability across heterogeneous hardware and software platforms
HDFS is designed to be easily portable from one platform to
another. This facilitates the widespread adoption of HDFS as a platform of choice for working with
large sets of data.
