SlideShare a Scribd company logo
1 of 13
Big Data vs. IT STKI
Summit
2013
IT at the crossroads:
Lead, follow or get out of the way
Pini Cohen Srouec: http://machinationsintomadness.com/good-vs-evil-how-about-good-vs-good/
Big Data Definition – 4 V’s (or more…)
• Volume – tens of TBs and more (15-20TB+)
• Velocity – the speed in which data is added – 10M items per hour and more.
And the speed in which the data needs to be processed
• Variety – different types of data – structured & unstructured. In many cases
deals with internet of things, social media, but also with voice, video, etc.
• Variability - able to cope with new attributes and changing data types –
without interrupting the analytical process (without “import-export”)
• Other optional V’s - validity, volatility, viscosity (resistance to flow), etc. source:
http://www.computerweekly.com/blogs/cwdn/2011/11/datas-main-drivers-volume-velocity-variety-and-variability.html
Pini Cohen's work Copyright@2013
Do not remove source or attribution
from any slide, graph or portion of
graph
The origins of the 3V’s:
• 2002 research by Doug Laney from META Group (now Gartner):
Pini Cohen's work Copyright@2013
Do not remove source or attribution
from any slide, graph or portion of
graph
“Big Data” theme main current usage:
•“Big Data" is just marketing jargon. -Doug Laney,
Gartner source: http://www.computerweekly.com/blogs/cwdn/2011/11/datas-main-drivers-volume-velocity-variety-and-variability.html
Source:http://winnbadisa.com/wp-content/uploads/2011/12/marketing-career-cloud.jpg
Pini Cohen's work Copyright@2013
Do not remove source or attribution
from any slide, graph or portion of
graph
Big Data at work:
• Orbitz Worldwide has collected 750 terabytes of unstructured
data on their consumers’ behavior – detailed information from
customer online visits and browsing sessions. Using Hadoop,
models have been developed intended to improve search results
and tailor the user experience based on everything from
location, interest in family travel versus solo travel, and even the
kind of device being used to explore travel options.
• The result? To date, a 7% increase in interaction rate, 37%
growth in stickiness of sessions and a net 2.6% in booking path
engagement.
Source: http://www.deloitte.com/assets/Dcom-UnitedStates/Local%20Assets/Documents/us_cons_techtrends2012_013112.pdf Pini Cohen's work Copyright@2013
Do not remove source or attribution
from any slide, graph or portion of
graph
DoD R&D prioritizes 'Big Data'
• The Pentagon has invested billions in a new generation of
electronic systems that gather and store vast quantities of
imagery and other data from the battlefield, and the digital
deluge is so vast that sifting through it manually to generate
actionable information is not a sustainable option, officials said.
• The Pentagon joined four other federal departments and
agencies at a White House event in late March to announce
$200 million in governmentwide big data research efforts.
Source: http://www.federalnewsradio.com/885/2824044/DoD-RD-prioritizes-Big-Data
Pini Cohen's work Copyright@2013
Do not remove source or attribution
from any slide, graph or portion of
graph
Technology: Elements  Concepts
• Storing data for analytics (mainly):
• HDFS – Hadoop File System
• Map Reduce- Programming method mainly for analytics
• Other “Add-on”: Pig, , Hive, JAQL (IBM)
• Storing and retrieving data - DBMS:
• NoSQL – DBMS (not only SQL):
• Cassandra
• MongoDB
• CouchDB
• Hbase
• Redis
• Neo4j
• Riak
Pini Cohen's work Copyright@2013
Do not remove source or attribution
from any slide, graph or portion of
graph
Big Data technologies (Hadoop etc.) vs. traditional IT - Infrastructure
Big DataTraditional IT
Local storageCentralized Storage
Cheap HW  White BoxesBrand redundant Servers
Is standardization needed?! (in the HW level). No
server virtualization.
Standard Infrastructure and virtual servers.
Why do I need backup? How do I tackle DRP (compute
clusters that are stretched over locations)
Well established backup and DRP procedures
Open Source solutionsTraditional vendors
In a new patch for specific issues sometimes it is
written “not implemented yet”
Mature products and procedures
Different kind of programming (map-reduce) , no JoinsTraditional programming, SQL
Will Big Data infrastructure be part of existing infrastructure or will be developed as new domain?
Pini Cohen's work Copyright@2013
Do not remove source or attribution
from any slide, graph or portion of
graph
Big Data technologies (Hadoop etc.) vs. traditional IT - BI
• Data Scientist vs. BI
• Data science incorporates varying elements and
builds on techniques and theories from many
fields, including math, statistics, data
engineering, pattern recognition and learning,
advanced computing, visualization, uncertainty
modeling, data warehousing, and high
performance computing with the goal of
extracting meaning from data and creating data
products. (WIKI)
Pini Cohen's work Copyright@2013
Do not remove source or attribution
from any slide, graph or portion of
graph
What is the business value of big data analytics?
• Big data is now a technology looking for a business need
• It can mean doing the same thing but better / faster (better
segmentation, more accurate analysis model)
• Doing things Cheaper
• Or it can mean doing completely new things (telematics,
sentiment analysis, recommendation engine, matching
competition’s pricing in real time, being able to analyze data we
haven’t been able to analyze in the past)
Pini Cohen's work Copyright@2013
Do not remove source or attribution
from any slide, graph or portion of
graph
Decision making – old school vs. new school (big data)
• Old School:
• Phase 1 : Analyze existing data and prepare general model
• Phase 2: Apply the general model to specific client
• This means applying the same model for many clients when they arrive
• Issues with Old School decision making:
• Time gap between preparing and applying the model
• # of combinations might be too big for general model (example:
recommendation based in interest)
• The general model generated is biased towards “main stream” population
• New School (Big Data):
• Phase 1: Prepare specific model for the client and apply the model – instantly
Pini Cohen's work Copyright@2013
Do not remove source or attribution
from any slide, graph or portion of
graph
Pini Cohen's work Copyright@2013
Do not remove source or attribution
from any slide, graph or portion of
graph
You can’t run and you can’t hide – HadoopBig Data is coming
DBMS
Pini Cohen's work Copyright@2013
Do not remove source or attribution
from any slide, graph or portion of
graph
13

More Related Content

What's hot

Data-centric market status, case studies and outlook
Data-centric market status, case studies and outlookData-centric market status, case studies and outlook
Data-centric market status, case studies and outlookAlan Morrison
 
Introduction to big data
Introduction to big dataIntroduction to big data
Introduction to big dataRichard Vidgen
 
Enabling the data driven enterprise
Enabling the data driven enterpriseEnabling the data driven enterprise
Enabling the data driven enterprisermikkilineni
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data ScienceSrishti44
 
Introduction on Data Science
Introduction on Data ScienceIntroduction on Data Science
Introduction on Data ScienceEdureka!
 
Stanford DeepDive Framework
Stanford DeepDive FrameworkStanford DeepDive Framework
Stanford DeepDive FrameworkRan Zhang
 
Usama Fayyad talk at IIT Madras on March 27, 2015: BigData, AllData, Old Dat...
Usama Fayyad talk at IIT Madras on March 27, 2015:  BigData, AllData, Old Dat...Usama Fayyad talk at IIT Madras on March 27, 2015:  BigData, AllData, Old Dat...
Usama Fayyad talk at IIT Madras on March 27, 2015: BigData, AllData, Old Dat...Usama Fayyad
 
Big data introduction - Big Data from a Consulting perspective - Sogeti
Big data introduction - Big Data from a Consulting perspective - SogetiBig data introduction - Big Data from a Consulting perspective - Sogeti
Big data introduction - Big Data from a Consulting perspective - SogetiEdzo Botjes
 
Data-centric design and the knowledge graph
Data-centric design and the knowledge graphData-centric design and the knowledge graph
Data-centric design and the knowledge graphAlan Morrison
 
Big Data and the Art of Data Science
Big Data and the Art of Data ScienceBig Data and the Art of Data Science
Big Data and the Art of Data ScienceAndrew Gardner
 
Web search-metrics-tutorial-www2010-section-1of7-introduction
Web search-metrics-tutorial-www2010-section-1of7-introductionWeb search-metrics-tutorial-www2010-section-1of7-introduction
Web search-metrics-tutorial-www2010-section-1of7-introductionAli Dasdan
 
Data centric business and knowledge graph trends
Data centric business and knowledge graph trendsData centric business and knowledge graph trends
Data centric business and knowledge graph trendsAlan Morrison
 
Data science applications and usecases
Data science applications and usecasesData science applications and usecases
Data science applications and usecasesSreenatha Reddy K R
 
Crowdsourcing Approaches for Smart City Open Data Management
Crowdsourcing Approaches for Smart City Open Data ManagementCrowdsourcing Approaches for Smart City Open Data Management
Crowdsourcing Approaches for Smart City Open Data ManagementEdward Curry
 
BlueBrain Nexus Technical Introduction
BlueBrain Nexus Technical IntroductionBlueBrain Nexus Technical Introduction
BlueBrain Nexus Technical IntroductionBogdan Roman
 
State of Florida Neo4J Graph Briefing - Keynote
State of Florida Neo4J Graph Briefing - KeynoteState of Florida Neo4J Graph Briefing - Keynote
State of Florida Neo4J Graph Briefing - KeynoteNeo4j
 
Paths to more personal and collaborative knowledge graphs
Paths to more personal and collaborative knowledge graphsPaths to more personal and collaborative knowledge graphs
Paths to more personal and collaborative knowledge graphsAlan Morrison
 
Intro to Data Science for Non-Data Scientists
Intro to Data Science for Non-Data ScientistsIntro to Data Science for Non-Data Scientists
Intro to Data Science for Non-Data ScientistsSri Ambati
 

What's hot (20)

Data-centric market status, case studies and outlook
Data-centric market status, case studies and outlookData-centric market status, case studies and outlook
Data-centric market status, case studies and outlook
 
Introduction to big data
Introduction to big dataIntroduction to big data
Introduction to big data
 
Enabling the data driven enterprise
Enabling the data driven enterpriseEnabling the data driven enterprise
Enabling the data driven enterprise
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Introduction on Data Science
Introduction on Data ScienceIntroduction on Data Science
Introduction on Data Science
 
Stanford DeepDive Framework
Stanford DeepDive FrameworkStanford DeepDive Framework
Stanford DeepDive Framework
 
Usama Fayyad talk at IIT Madras on March 27, 2015: BigData, AllData, Old Dat...
Usama Fayyad talk at IIT Madras on March 27, 2015:  BigData, AllData, Old Dat...Usama Fayyad talk at IIT Madras on March 27, 2015:  BigData, AllData, Old Dat...
Usama Fayyad talk at IIT Madras on March 27, 2015: BigData, AllData, Old Dat...
 
Big data introduction - Big Data from a Consulting perspective - Sogeti
Big data introduction - Big Data from a Consulting perspective - SogetiBig data introduction - Big Data from a Consulting perspective - Sogeti
Big data introduction - Big Data from a Consulting perspective - Sogeti
 
Data-centric design and the knowledge graph
Data-centric design and the knowledge graphData-centric design and the knowledge graph
Data-centric design and the knowledge graph
 
Big Data and the Art of Data Science
Big Data and the Art of Data ScienceBig Data and the Art of Data Science
Big Data and the Art of Data Science
 
Web search-metrics-tutorial-www2010-section-1of7-introduction
Web search-metrics-tutorial-www2010-section-1of7-introductionWeb search-metrics-tutorial-www2010-section-1of7-introduction
Web search-metrics-tutorial-www2010-section-1of7-introduction
 
Data centric business and knowledge graph trends
Data centric business and knowledge graph trendsData centric business and knowledge graph trends
Data centric business and knowledge graph trends
 
Data science applications and usecases
Data science applications and usecasesData science applications and usecases
Data science applications and usecases
 
Hadoop Overview
Hadoop OverviewHadoop Overview
Hadoop Overview
 
Crowdsourcing Approaches for Smart City Open Data Management
Crowdsourcing Approaches for Smart City Open Data ManagementCrowdsourcing Approaches for Smart City Open Data Management
Crowdsourcing Approaches for Smart City Open Data Management
 
BlueBrain Nexus Technical Introduction
BlueBrain Nexus Technical IntroductionBlueBrain Nexus Technical Introduction
BlueBrain Nexus Technical Introduction
 
State of Florida Neo4J Graph Briefing - Keynote
State of Florida Neo4J Graph Briefing - KeynoteState of Florida Neo4J Graph Briefing - Keynote
State of Florida Neo4J Graph Briefing - Keynote
 
Paths to more personal and collaborative knowledge graphs
Paths to more personal and collaborative knowledge graphsPaths to more personal and collaborative knowledge graphs
Paths to more personal and collaborative knowledge graphs
 
Intro to Data Science for Non-Data Scientists
Intro to Data Science for Non-Data ScientistsIntro to Data Science for Non-Data Scientists
Intro to Data Science for Non-Data Scientists
 
Machine Learning in Big Data
Machine Learning in Big DataMachine Learning in Big Data
Machine Learning in Big Data
 

Viewers also liked

Big Data Analytics: Facts and Feelings
Big Data Analytics: Facts and FeelingsBig Data Analytics: Facts and Feelings
Big Data Analytics: Facts and FeelingsSeth Grimes
 
Big Data - Fact or Fiction ?
Big Data -  Fact or Fiction ?Big Data -  Fact or Fiction ?
Big Data - Fact or Fiction ?Tyrone Systems
 
Is Your Staff Big Data Ready? 5 Things to Know About What It Will Take to Suc...
Is Your Staff Big Data Ready? 5 Things to Know About What It Will Take to Suc...Is Your Staff Big Data Ready? 5 Things to Know About What It Will Take to Suc...
Is Your Staff Big Data Ready? 5 Things to Know About What It Will Take to Suc...CompTIA
 
The 5 key V's of Big Data
The 5 key V's of Big DataThe 5 key V's of Big Data
The 5 key V's of Big DataAnric Blatt
 
5 facts everyone should know about big data presentation
5 facts everyone should know about big data presentation5 facts everyone should know about big data presentation
5 facts everyone should know about big data presentationInfobrandz
 
Big Data - The 5 Vs Everyone Must Know
Big Data - The 5 Vs Everyone Must KnowBig Data - The 5 Vs Everyone Must Know
Big Data - The 5 Vs Everyone Must KnowBernard Marr
 
A Brief History of Big Data
A Brief History of Big DataA Brief History of Big Data
A Brief History of Big DataBernard Marr
 
Big Data in Retail - Examples in Action
Big Data in Retail - Examples in ActionBig Data in Retail - Examples in Action
Big Data in Retail - Examples in ActionDavid Pittman
 
Big Data - 25 Amazing Facts Everyone Should Know
Big Data - 25 Amazing Facts Everyone Should KnowBig Data - 25 Amazing Facts Everyone Should Know
Big Data - 25 Amazing Facts Everyone Should KnowBernard Marr
 

Viewers also liked (11)

Big Data Analytics: Facts and Feelings
Big Data Analytics: Facts and FeelingsBig Data Analytics: Facts and Feelings
Big Data Analytics: Facts and Feelings
 
Big Data - Fact or Fiction ?
Big Data -  Fact or Fiction ?Big Data -  Fact or Fiction ?
Big Data - Fact or Fiction ?
 
Is Your Staff Big Data Ready? 5 Things to Know About What It Will Take to Suc...
Is Your Staff Big Data Ready? 5 Things to Know About What It Will Take to Suc...Is Your Staff Big Data Ready? 5 Things to Know About What It Will Take to Suc...
Is Your Staff Big Data Ready? 5 Things to Know About What It Will Take to Suc...
 
The 5 key V's of Big Data
The 5 key V's of Big DataThe 5 key V's of Big Data
The 5 key V's of Big Data
 
5 facts everyone should know about big data presentation
5 facts everyone should know about big data presentation5 facts everyone should know about big data presentation
5 facts everyone should know about big data presentation
 
Big Data - The 5 Vs Everyone Must Know
Big Data - The 5 Vs Everyone Must KnowBig Data - The 5 Vs Everyone Must Know
Big Data - The 5 Vs Everyone Must Know
 
A Brief History of Big Data
A Brief History of Big DataA Brief History of Big Data
A Brief History of Big Data
 
Big Data in Retail - Examples in Action
Big Data in Retail - Examples in ActionBig Data in Retail - Examples in Action
Big Data in Retail - Examples in Action
 
What is Big Data?
What is Big Data?What is Big Data?
What is Big Data?
 
Big data ppt
Big  data pptBig  data ppt
Big data ppt
 
Big Data - 25 Amazing Facts Everyone Should Know
Big Data - 25 Amazing Facts Everyone Should KnowBig Data - 25 Amazing Facts Everyone Should Know
Big Data - 25 Amazing Facts Everyone Should Know
 

Similar to 5. big data vs it stki - pini cohen

Big-Data-Seminar-6-Aug-2014-Koenig
Big-Data-Seminar-6-Aug-2014-KoenigBig-Data-Seminar-6-Aug-2014-Koenig
Big-Data-Seminar-6-Aug-2014-KoenigManish Chopra
 
02 a holistic approach to big data
02 a holistic approach to big data02 a holistic approach to big data
02 a holistic approach to big dataRaul Chong
 
What_BigData_means_to_your_organization
What_BigData_means_to_your_organizationWhat_BigData_means_to_your_organization
What_BigData_means_to_your_organizationAttila Barta
 
IRJET- Youtube Data Sensitivity and Analysis using Hadoop Framework
IRJET-  	  Youtube Data Sensitivity and Analysis using Hadoop FrameworkIRJET-  	  Youtube Data Sensitivity and Analysis using Hadoop Framework
IRJET- Youtube Data Sensitivity and Analysis using Hadoop FrameworkIRJET Journal
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big DataSpringPeople
 
Unified approach to analytics
Unified approach to analyticsUnified approach to analytics
Unified approach to analyticsMadhumita Mantri
 
Python's Role in the Future of Data Analysis
Python's Role in the Future of Data AnalysisPython's Role in the Future of Data Analysis
Python's Role in the Future of Data AnalysisPeter Wang
 
TOP Business Intelligence Predictions for 2015
TOP Business Intelligence Predictions for 2015TOP Business Intelligence Predictions for 2015
TOP Business Intelligence Predictions for 2015Panorama Software
 
Big data4businessusers
Big data4businessusersBig data4businessusers
Big data4businessusersBob Hardaway
 
¿En qué se parece el Gobierno del Dato a un parque de atracciones?
¿En qué se parece el Gobierno del Dato a un parque de atracciones?¿En qué se parece el Gobierno del Dato a un parque de atracciones?
¿En qué se parece el Gobierno del Dato a un parque de atracciones?Denodo
 
Driving Business Value Through Agile Data Assets
Driving Business Value Through Agile Data AssetsDriving Business Value Through Agile Data Assets
Driving Business Value Through Agile Data AssetsEmbarcadero Technologies
 
Top Business Intelligence Trends for 2016 by Panorama Software
Top Business Intelligence Trends for 2016 by Panorama SoftwareTop Business Intelligence Trends for 2016 by Panorama Software
Top Business Intelligence Trends for 2016 by Panorama SoftwarePanorama Software
 
Data Mesh in Azure using Cloud Scale Analytics (WAF)
Data Mesh in Azure using Cloud Scale Analytics (WAF)Data Mesh in Azure using Cloud Scale Analytics (WAF)
Data Mesh in Azure using Cloud Scale Analytics (WAF)Nathan Bijnens
 
An Encyclopedic Overview Of Big Data Analytics
An Encyclopedic Overview Of Big Data AnalyticsAn Encyclopedic Overview Of Big Data Analytics
An Encyclopedic Overview Of Big Data AnalyticsAudrey Britton
 

Similar to 5. big data vs it stki - pini cohen (20)

Research paper on big data and hadoop
Research paper on big data and hadoopResearch paper on big data and hadoop
Research paper on big data and hadoop
 
Big-Data-Seminar-6-Aug-2014-Koenig
Big-Data-Seminar-6-Aug-2014-KoenigBig-Data-Seminar-6-Aug-2014-Koenig
Big-Data-Seminar-6-Aug-2014-Koenig
 
02 a holistic approach to big data
02 a holistic approach to big data02 a holistic approach to big data
02 a holistic approach to big data
 
What_BigData_means_to_your_organization
What_BigData_means_to_your_organizationWhat_BigData_means_to_your_organization
What_BigData_means_to_your_organization
 
IRJET- Youtube Data Sensitivity and Analysis using Hadoop Framework
IRJET-  	  Youtube Data Sensitivity and Analysis using Hadoop FrameworkIRJET-  	  Youtube Data Sensitivity and Analysis using Hadoop Framework
IRJET- Youtube Data Sensitivity and Analysis using Hadoop Framework
 
Identifying the new frontier of big data as an enabler for T&T industries: Re...
Identifying the new frontier of big data as an enabler for T&T industries: Re...Identifying the new frontier of big data as an enabler for T&T industries: Re...
Identifying the new frontier of big data as an enabler for T&T industries: Re...
 
Big data Analytics
Big data AnalyticsBig data Analytics
Big data Analytics
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big Data
 
Unified approach to analytics
Unified approach to analyticsUnified approach to analytics
Unified approach to analytics
 
R180305120123
R180305120123R180305120123
R180305120123
 
Python's Role in the Future of Data Analysis
Python's Role in the Future of Data AnalysisPython's Role in the Future of Data Analysis
Python's Role in the Future of Data Analysis
 
TOP Business Intelligence Predictions for 2015
TOP Business Intelligence Predictions for 2015TOP Business Intelligence Predictions for 2015
TOP Business Intelligence Predictions for 2015
 
Big data4businessusers
Big data4businessusersBig data4businessusers
Big data4businessusers
 
¿En qué se parece el Gobierno del Dato a un parque de atracciones?
¿En qué se parece el Gobierno del Dato a un parque de atracciones?¿En qué se parece el Gobierno del Dato a un parque de atracciones?
¿En qué se parece el Gobierno del Dato a un parque de atracciones?
 
Applying Big Data
Applying Big DataApplying Big Data
Applying Big Data
 
Driving Business Value Through Agile Data Assets
Driving Business Value Through Agile Data AssetsDriving Business Value Through Agile Data Assets
Driving Business Value Through Agile Data Assets
 
Complete-SRS.doc
Complete-SRS.docComplete-SRS.doc
Complete-SRS.doc
 
Top Business Intelligence Trends for 2016 by Panorama Software
Top Business Intelligence Trends for 2016 by Panorama SoftwareTop Business Intelligence Trends for 2016 by Panorama Software
Top Business Intelligence Trends for 2016 by Panorama Software
 
Data Mesh in Azure using Cloud Scale Analytics (WAF)
Data Mesh in Azure using Cloud Scale Analytics (WAF)Data Mesh in Azure using Cloud Scale Analytics (WAF)
Data Mesh in Azure using Cloud Scale Analytics (WAF)
 
An Encyclopedic Overview Of Big Data Analytics
An Encyclopedic Overview Of Big Data AnalyticsAn Encyclopedic Overview Of Big Data Analytics
An Encyclopedic Overview Of Big Data Analytics
 

More from Taldor Group

7. emc isilon hdfs enterprise storage for hadoop
7. emc isilon hdfs   enterprise storage for hadoop7. emc isilon hdfs   enterprise storage for hadoop
7. emc isilon hdfs enterprise storage for hadoopTaldor Group
 
4. hadoop גיא לבנברג
4. hadoop  גיא לבנברג4. hadoop  גיא לבנברג
4. hadoop גיא לבנברגTaldor Group
 
3. ami big data hadoop on ucs seminar may 2013
3. ami big data hadoop on ucs seminar may 20133. ami big data hadoop on ucs seminar may 2013
3. ami big data hadoop on ucs seminar may 2013Taldor Group
 
A new platform for a new era emc
A new platform for a new era   emcA new platform for a new era   emc
A new platform for a new era emcTaldor Group
 
Yossi cohen 3 base
Yossi cohen   3 baseYossi cohen   3 base
Yossi cohen 3 baseTaldor Group
 
פיני מנדל תובנות עסקיות מיישומי Hadoop
פיני מנדל   תובנות עסקיות מיישומי Hadoopפיני מנדל   תובנות עסקיות מיישומי Hadoop
פיני מנדל תובנות עסקיות מיישומי HadoopTaldor Group
 
נתן פרידחי הקדמה לכנס Hadoop
נתן פרידחי   הקדמה לכנס Hadoopנתן פרידחי   הקדמה לכנס Hadoop
נתן פרידחי הקדמה לכנס HadoopTaldor Group
 
הערך העסקי שבאיכות הנתונים קוסטין מרזאה
הערך העסקי שבאיכות הנתונים   קוסטין מרזאההערך העסקי שבאיכות הנתונים   קוסטין מרזאה
הערך העסקי שבאיכות הנתונים קוסטין מרזאהTaldor Group
 
Dcl צביקה מנלה - סיפורי לקוחות
Dcl   צביקה מנלה - סיפורי לקוחותDcl   צביקה מנלה - סיפורי לקוחות
Dcl צביקה מנלה - סיפורי לקוחותTaldor Group
 
Taldor data quality einat shimoni - stki
Taldor data quality   einat shimoni - stkiTaldor data quality   einat shimoni - stki
Taldor data quality einat shimoni - stkiTaldor Group
 
2013 04 irm mdmdg - jon asprey 4 most asked dg questions v 1 3
2013 04 irm mdmdg - jon asprey 4 most asked dg questions v 1 32013 04 irm mdmdg - jon asprey 4 most asked dg questions v 1 3
2013 04 irm mdmdg - jon asprey 4 most asked dg questions v 1 3Taldor Group
 
Loshin operationalizingdatagovernance
Loshin operationalizingdatagovernanceLoshin operationalizingdatagovernance
Loshin operationalizingdatagovernanceTaldor Group
 

More from Taldor Group (12)

7. emc isilon hdfs enterprise storage for hadoop
7. emc isilon hdfs   enterprise storage for hadoop7. emc isilon hdfs   enterprise storage for hadoop
7. emc isilon hdfs enterprise storage for hadoop
 
4. hadoop גיא לבנברג
4. hadoop  גיא לבנברג4. hadoop  גיא לבנברג
4. hadoop גיא לבנברג
 
3. ami big data hadoop on ucs seminar may 2013
3. ami big data hadoop on ucs seminar may 20133. ami big data hadoop on ucs seminar may 2013
3. ami big data hadoop on ucs seminar may 2013
 
A new platform for a new era emc
A new platform for a new era   emcA new platform for a new era   emc
A new platform for a new era emc
 
Yossi cohen 3 base
Yossi cohen   3 baseYossi cohen   3 base
Yossi cohen 3 base
 
פיני מנדל תובנות עסקיות מיישומי Hadoop
פיני מנדל   תובנות עסקיות מיישומי Hadoopפיני מנדל   תובנות עסקיות מיישומי Hadoop
פיני מנדל תובנות עסקיות מיישומי Hadoop
 
נתן פרידחי הקדמה לכנס Hadoop
נתן פרידחי   הקדמה לכנס Hadoopנתן פרידחי   הקדמה לכנס Hadoop
נתן פרידחי הקדמה לכנס Hadoop
 
הערך העסקי שבאיכות הנתונים קוסטין מרזאה
הערך העסקי שבאיכות הנתונים   קוסטין מרזאההערך העסקי שבאיכות הנתונים   קוסטין מרזאה
הערך העסקי שבאיכות הנתונים קוסטין מרזאה
 
Dcl צביקה מנלה - סיפורי לקוחות
Dcl   צביקה מנלה - סיפורי לקוחותDcl   צביקה מנלה - סיפורי לקוחות
Dcl צביקה מנלה - סיפורי לקוחות
 
Taldor data quality einat shimoni - stki
Taldor data quality   einat shimoni - stkiTaldor data quality   einat shimoni - stki
Taldor data quality einat shimoni - stki
 
2013 04 irm mdmdg - jon asprey 4 most asked dg questions v 1 3
2013 04 irm mdmdg - jon asprey 4 most asked dg questions v 1 32013 04 irm mdmdg - jon asprey 4 most asked dg questions v 1 3
2013 04 irm mdmdg - jon asprey 4 most asked dg questions v 1 3
 
Loshin operationalizingdatagovernance
Loshin operationalizingdatagovernanceLoshin operationalizingdatagovernance
Loshin operationalizingdatagovernance
 

Recently uploaded

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native ApplicationsWSO2
 
JohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptxJohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptxJohnPollard37
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityWSO2
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
Introduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDMIntroduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDMKumar Satyam
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024The Digital Insurer
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxRustici Software
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Zilliz
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfOrbitshub
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2
 

Recently uploaded (20)

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
JohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptxJohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptx
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Introduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDMIntroduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDM
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering Developers
 

5. big data vs it stki - pini cohen

  • 1. Big Data vs. IT STKI Summit 2013 IT at the crossroads: Lead, follow or get out of the way Pini Cohen Srouec: http://machinationsintomadness.com/good-vs-evil-how-about-good-vs-good/
  • 2. Big Data Definition – 4 V’s (or more…) • Volume – tens of TBs and more (15-20TB+) • Velocity – the speed in which data is added – 10M items per hour and more. And the speed in which the data needs to be processed • Variety – different types of data – structured & unstructured. In many cases deals with internet of things, social media, but also with voice, video, etc. • Variability - able to cope with new attributes and changing data types – without interrupting the analytical process (without “import-export”) • Other optional V’s - validity, volatility, viscosity (resistance to flow), etc. source: http://www.computerweekly.com/blogs/cwdn/2011/11/datas-main-drivers-volume-velocity-variety-and-variability.html Pini Cohen's work Copyright@2013 Do not remove source or attribution from any slide, graph or portion of graph
  • 3. The origins of the 3V’s: • 2002 research by Doug Laney from META Group (now Gartner): Pini Cohen's work Copyright@2013 Do not remove source or attribution from any slide, graph or portion of graph
  • 4. “Big Data” theme main current usage: •“Big Data" is just marketing jargon. -Doug Laney, Gartner source: http://www.computerweekly.com/blogs/cwdn/2011/11/datas-main-drivers-volume-velocity-variety-and-variability.html Source:http://winnbadisa.com/wp-content/uploads/2011/12/marketing-career-cloud.jpg Pini Cohen's work Copyright@2013 Do not remove source or attribution from any slide, graph or portion of graph
  • 5. Big Data at work: • Orbitz Worldwide has collected 750 terabytes of unstructured data on their consumers’ behavior – detailed information from customer online visits and browsing sessions. Using Hadoop, models have been developed intended to improve search results and tailor the user experience based on everything from location, interest in family travel versus solo travel, and even the kind of device being used to explore travel options. • The result? To date, a 7% increase in interaction rate, 37% growth in stickiness of sessions and a net 2.6% in booking path engagement. Source: http://www.deloitte.com/assets/Dcom-UnitedStates/Local%20Assets/Documents/us_cons_techtrends2012_013112.pdf Pini Cohen's work Copyright@2013 Do not remove source or attribution from any slide, graph or portion of graph
  • 6. DoD R&D prioritizes 'Big Data' • The Pentagon has invested billions in a new generation of electronic systems that gather and store vast quantities of imagery and other data from the battlefield, and the digital deluge is so vast that sifting through it manually to generate actionable information is not a sustainable option, officials said. • The Pentagon joined four other federal departments and agencies at a White House event in late March to announce $200 million in governmentwide big data research efforts. Source: http://www.federalnewsradio.com/885/2824044/DoD-RD-prioritizes-Big-Data Pini Cohen's work Copyright@2013 Do not remove source or attribution from any slide, graph or portion of graph
  • 7. Technology: Elements Concepts • Storing data for analytics (mainly): • HDFS – Hadoop File System • Map Reduce- Programming method mainly for analytics • Other “Add-on”: Pig, , Hive, JAQL (IBM) • Storing and retrieving data - DBMS: • NoSQL – DBMS (not only SQL): • Cassandra • MongoDB • CouchDB • Hbase • Redis • Neo4j • Riak Pini Cohen's work Copyright@2013 Do not remove source or attribution from any slide, graph or portion of graph
  • 8. Big Data technologies (Hadoop etc.) vs. traditional IT - Infrastructure Big DataTraditional IT Local storageCentralized Storage Cheap HW White BoxesBrand redundant Servers Is standardization needed?! (in the HW level). No server virtualization. Standard Infrastructure and virtual servers. Why do I need backup? How do I tackle DRP (compute clusters that are stretched over locations) Well established backup and DRP procedures Open Source solutionsTraditional vendors In a new patch for specific issues sometimes it is written “not implemented yet” Mature products and procedures Different kind of programming (map-reduce) , no JoinsTraditional programming, SQL Will Big Data infrastructure be part of existing infrastructure or will be developed as new domain? Pini Cohen's work Copyright@2013 Do not remove source or attribution from any slide, graph or portion of graph
  • 9. Big Data technologies (Hadoop etc.) vs. traditional IT - BI • Data Scientist vs. BI • Data science incorporates varying elements and builds on techniques and theories from many fields, including math, statistics, data engineering, pattern recognition and learning, advanced computing, visualization, uncertainty modeling, data warehousing, and high performance computing with the goal of extracting meaning from data and creating data products. (WIKI) Pini Cohen's work Copyright@2013 Do not remove source or attribution from any slide, graph or portion of graph
  • 10. What is the business value of big data analytics? • Big data is now a technology looking for a business need • It can mean doing the same thing but better / faster (better segmentation, more accurate analysis model) • Doing things Cheaper • Or it can mean doing completely new things (telematics, sentiment analysis, recommendation engine, matching competition’s pricing in real time, being able to analyze data we haven’t been able to analyze in the past) Pini Cohen's work Copyright@2013 Do not remove source or attribution from any slide, graph or portion of graph
  • 11. Decision making – old school vs. new school (big data) • Old School: • Phase 1 : Analyze existing data and prepare general model • Phase 2: Apply the general model to specific client • This means applying the same model for many clients when they arrive • Issues with Old School decision making: • Time gap between preparing and applying the model • # of combinations might be too big for general model (example: recommendation based in interest) • The general model generated is biased towards “main stream” population • New School (Big Data): • Phase 1: Prepare specific model for the client and apply the model – instantly Pini Cohen's work Copyright@2013 Do not remove source or attribution from any slide, graph or portion of graph
  • 12. Pini Cohen's work Copyright@2013 Do not remove source or attribution from any slide, graph or portion of graph You can’t run and you can’t hide – HadoopBig Data is coming DBMS
  • 13. Pini Cohen's work Copyright@2013 Do not remove source or attribution from any slide, graph or portion of graph 13