Zohar Elkayam
CTO, Brillix
Zohar@Brillix.co.il
Big Data For CIOs
Who am I?
• Zohar Elkayam, CTO at Brillix
• DBA, team leader, and a senior consultant for over 17 years
• Oracle ACE Associate
• Involved with Big Data projects since 2011
• Blogger – www.realdbamagic.com
http://brillix.co.il
About Brillix
• Brillix is a leading company that specializes in Data
Management
• We provide professional services and consulting for
Databases, Security and Big Data solutions
Agenda: Big Data
• Big Data
• Why
• What
• Where
• Who and How
• A Big Data Solution: Hadoop
• NoSQL vs. RDBMS
What is Big Data?
"Big Data"??
Different definitions
“Big data exceeds the reach of commonly used hardware environments
and software tools to capture, manage, and process it within a tolerable
elapsed time for its user population.” - Teradata Magazine article, 2011
“Big data refers to data sets whose size is beyond the ability of typical
database software tools to capture, store, manage and analyze.”
- The McKinsey Global Institute, 2012
“Big data is a collection of data sets so large and complex that it
becomes difficult to process using on-hand database management
tools.” - Wikipedia, 2014
Success Stories
More success stories
MORE stories..
• Crime Prevention in Los Angeles
• Diagnosis and treatment of genetic diseases
• Investments in the financial sector
• Generation of personalized advertising
• Astronomical discoveries
Examples of Big Data Use Cases Today
MEDIA/
ENTERTAINMENT
Viewers / advertising
effectiveness
COMMUNICATIONS
Location-based advertising
EDUCATION &
RESEARCH
Experiment sensor
analysis
CONSUMER PACKAGED
GOODS
Sentiment analysis of what’s
hot, problems
HEALTH CARE
Patient sensors,
monitoring, EHRs
Quality of care
LIFE SCIENCES
Clinical trials
Genomics
HIGH TECHNOLOGY /
INDUSTRIAL MFG.
Mfg quality
Warranty analysis
OIL & GAS
Drilling exploration
sensor analysis
FINANCIAL
SERVICES
Risk & portfolio analysis
New products
AUTOMOTIVE
Auto sensors reporting
location, problems
RETAIL
Consumer sentiment
Optimized marketing
LAW ENFORCEMENT
& DEFENSE
Threat analysis - social
media monitoring, photo
analysis
TRAVEL &
TRANSPORTATION
Sensor analysis for optimal
traffic flows
Customer sentiment
UTILITIES
Smart Meter
analysis for
network capacity,
ON-LINE SERVICES /
SOCIAL MEDIA
People & career
matching
Web-site
optimization
Most Requested Uses of Big Data
• Log Analytics & Storage
• Smart Grid / Smarter Utilities
• RFID Tracking & Analytics
• Fraud / Risk Management & Modeling
• 360° View of the Customer
• Warehouse Extension
• Email / Call Center Transcript Analysis
• Call Detail Record Analysis
The Challenge
The Big Data Challenge
Volume
• Big data comes in one size: Big.
• Size is measured in Terabytes (10^12 bytes), Petabytes (10^15), Exabytes (10^18),
Zettabytes (10^21)
• The storing and handling of the data becomes an issue
• Producing value out of the data in a reasonable time is an issue
Some numbers
• How much data is there in the world?
• 800 Terabytes, 2000
• 160 Exabytes, 2006 (1 EB = 10^18 B)
• 4.5 Zettabytes, 2012 (1 ZB = 10^21 B)
• 44 Zettabytes by 2020
• How much is a zettabyte?
• 1,000,000,000,000,000,000,000 bytes
• A stack of 1TB hard disks that is 25,400 km high
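The stack-of-disks figure can be checked with a few lines of arithmetic. This is only a sketch: the slide does not say how thick a disk is, so the 25.4 mm (one inch) thickness below is an assumption, and all units are decimal (SI).

```python
# Back-of-the-envelope check of the zettabyte figures (decimal SI units).
TERABYTE = 10**12   # bytes
ZETTABYTE = 10**21  # bytes

# Assumption (not stated on the slide): one 1 TB disk is 25.4 mm (one inch) thick.
DISK_THICKNESS_UM = 25_400   # micrometres
UM_PER_KM = 10**9            # micrometres in one kilometre

disks_per_zettabyte = ZETTABYTE // TERABYTE
stack_height_km = disks_per_zettabyte * DISK_THICKNESS_UM // UM_PER_KM

print(f"{disks_per_zettabyte:,} disks, stacked {stack_height_km:,} km high")
```

Under that assumption, one zettabyte is a billion 1 TB disks, which matches the 25,400 km quoted above.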
Growth Rate
• How much data is generated in a day?
• 7 TB, Twitter
• 10 TB, Facebook
Data grows fast!
Variety
• Big Data extends beyond structured data to include
semi-structured and unstructured information: logs,
text, audio and video.
• Wide variety of rapidly evolving data types requires
highly flexible stores and handling.
Structured & Un-Structured
Un-Structured         Structured
Objects               Tables
Flexible              Columns and Rows
Structure Unknown     Predefined Structure
Textual and Binary    Mostly Textual
Big Data is ANY data
• Some has fixed structure
• Some is “bring your own structure”
• We want to find value in all of it
Unstructured, Semi-Structured and Structured
Data Types by Industry
Velocity
• The speed at which the data is being generated and collected
• Streaming data and large volume data movement
• High velocity of data capture – requires rapid ingestion
• Might cause the backlog problem
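The backlog problem is simple arithmetic: whenever data arrives faster than it can be ingested, the difference piles up. The sketch below assumes steady rates, and the rates themselves are hypothetical numbers chosen only for illustration.

```python
def backlog_after(hours, arrival_gb_per_hour, ingest_gb_per_hour):
    """Return the un-ingested data (GB) after `hours` of steady load."""
    surplus = arrival_gb_per_hour - ingest_gb_per_hour
    return max(0, surplus) * hours

# Hypothetical rates, chosen only for illustration.
print(backlog_after(24, arrival_gb_per_hour=500, ingest_gb_per_hour=450))  # 1200
print(backlog_after(24, arrival_gb_per_hour=400, ingest_gb_per_hour=450))  # 0
```

A shortfall of only 10% in ingestion capacity leaves more than a terabyte waiting after a single day in this example.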
Global Internet Device Forecast
Internet of Things
Veracity
• Quality of the data can vary greatly
• Data sources might be messy or corrupted
So, What Defines Big Data?
• When we think that we can produce value from that data and
want to handle it
• When the data is too big or moves too fast to handle in a
sensible amount of time
• When the data doesn’t fit conventional database structure
• When the solution becomes part of the problem
Why Big Data Now?
• Because we have data:
• Data is born already in digital form
• 40% of data growth per year
• Because we can:
• $500 for a drive in which to store all the music of the world
• 40 years of Moore's Law = large computational resources
• 64% of organizations have invested in big data in 2013
• $34 billion invested in big data in 2013
“Because we reached dead end with logic”
How to do Big Data
Big Data in Practice
• Big data is big: technological infrastructure solutions needed
• Big data is messy: data sources must be cleaned before use
• Big data is complicated: need developers and system admins
to manage intake of data
Big Data in Practice (cont.)
• Data must be broken out of silos in order to be mined, analyzed
and transformed into value
• The organization must learn how to communicate and interpret
the results of analysis
Infrastructure Challenges
• Infrastructure that is built for:
• Large-scale
• Distributed
• Data-intensive jobs that spread the problem across clusters of server
nodes
Infrastructure Challenges (cont.)
• Storage:
• Efficient and cost-effective enough to capture and store terabytes, if
not petabytes, of data
• With intelligent capabilities to reduce your data footprint such as:
• Data compression
• Automatic data tiering
• Data deduplication
Infrastructure Challenges (cont.)
• Network infrastructure that can quickly import large data sets
and then replicate it to various nodes for processing
• Security capabilities that protect highly-distributed
infrastructure and data
Goals of Analytics
Positions in Big Data management
• DevOps handle the infrastructure – sys admins and
cluster managers
• Data scientists are in charge of producing value from the data
Data Scientist
Hadoop
Apache Hadoop
• Open source project run by Apache (2006)
• Hadoop brings the ability to cheaply process large amounts of
data, regardless of its structure
• It has been the driving force behind the growth of the big
data industry
• Get the public release from:
• http://hadoop.apache.org/core/
Hadoop Creation History
Key points
• An open-source framework that uses a simple programming model to
enable distributed processing of large data sets on clusters of
computers.
• The complete technology stack includes
• common utilities
• a distributed file system
• analytics and data storage platforms
• an application layer that manages distributed processing, parallel
computation, workflow, and configuration management
• More cost-effective for handling large unstructured data sets than
conventional approaches, and it offers massive scalability and speed
Why use Hadoop?
• Cost: Leverages commodity HW & open source SW
• Flexibility: Versatility with data, analytics & operation
• Scalability: Near linear performance up to 1000s of nodes
What Hadoop Is Not?
• Hadoop does not replace DW or relational databases
• Hadoop is not for OLTP or real-time systems
• Very good for large amounts of data, not so much for smaller sets
• Designed for clusters – there is no Hadoop monster server (single
server)
Hadoop Cluster in Yahoo
Cluster of machines running Hadoop at Yahoo! (credit: Yahoo!)
Hadoop under the Hood
Hadoop Main Components
• HDFS: Hadoop Distributed File System – distributed file
system that runs in a clustered environment.
• MapReduce – programming paradigm for running processes
over a clustered environment.
HDFS is...
• A distributed file system
• Redundant storage
• Designed to reliably store data using commodity hardware
• Designed to expect hardware failures
• Intended for large files
• Designed for batch inserts
• The Hadoop Distributed File System
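A rough sketch of what "redundant storage for large files" means in practice: a file is cut into fixed-size blocks and each block is copied to several nodes. The 128 MiB block size and replication factor of 3 are HDFS defaults in recent Hadoop releases; the node names and the round-robin placement are illustrative assumptions, not HDFS's actual placement policy.

```python
import math

BLOCK_SIZE = 128 * 1024**2   # 128 MiB, the default block size in recent Hadoop releases
REPLICATION = 3              # HDFS's default replication factor

def place_blocks(file_size, nodes, block_size=BLOCK_SIZE, replication=REPLICATION):
    """Split a file into blocks and assign each block to `replication` distinct nodes.

    Round-robin placement is an illustrative simplification, not HDFS's real policy.
    """
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    block_count = math.ceil(file_size / block_size)
    placement = {}
    for block in range(block_count):
        placement[block] = [nodes[(block + r) % len(nodes)] for r in range(replication)]
    return placement

nodes = ["node1", "node2", "node3", "node4"]   # hypothetical DataNodes
layout = place_blocks(1024**3, nodes)          # a 1 GiB file
print(len(layout))   # 8
print(layout[0])     # ['node1', 'node2', 'node3']
```

Because every block lives on three different nodes, losing any single machine leaves at least two copies of each block available.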
MapReduce is...
• A programming model for expressing distributed
computations at a massive scale
• An execution framework for organizing and performing such
computations
• An open-source implementation called Hadoop
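The programming model can be illustrated with the classic word count, written here in plain Python on a single machine. This is a sketch of the three phases (map, shuffle, reduce), not Hadoop's Java API; the function names are made up for the example.

```python
from collections import defaultdict

def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Shuffle: group all values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)

lines = ["big data is big", "data is messy"]
mapped = [pair for line in lines for pair in map_phase(line)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(counts)  # {'big': 2, 'data': 2, 'is': 2, 'messy': 1}
```

In a real cluster each call to the map function could run on a different node against a different block of the input, which is why the model scales out so easily.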
MapReduce is good for...
• Embarrassingly parallel algorithms
• Summing, grouping, filtering, joining
• Off-line batch jobs on massive data sets
• Analyzing an entire large dataset
MapReduce is OK for...
• Iterative jobs (e.g., graph algorithms)
• Each iteration must read/write data to disk
• IO and latency cost of an iteration is high
MapReduce is NOT good for...
• Jobs that need shared state/coordination
• Tasks are shared-nothing
• Shared-state requires scalable state store
• Low-latency jobs
• Jobs on small datasets
• Finding individual records
Spark
• Fast and general MapReduce-like engine for large-scale data
processing
• Fast
• In-memory data storage for very fast interactive queries
• Up to 100 times faster than Hadoop
• General
• Unified platform that can combine: SQL, Machine Learning, Streaming,
Graph & Complex analytics
• Ease of use
• Can be developed in Java, Scala or Python
• Integrated with Hadoop
• Can read from HDFS, HBase, Cassandra, and any Hadoop data source.
Key Concepts
Resilient Distributed Datasets
• Collections of objects spread
across a cluster, stored in RAM
or on Disk
• Built through parallel
transformations
• Automatically rebuilt on failure
Operations
• Transformations
(e.g. map, filter, groupBy)
• Actions
(e.g. count, collect, save)
Write programs in terms of transformations on
distributed datasets
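The difference between transformations and actions can be sketched without Spark itself. The toy class below is a single-machine stand-in, not the Spark API: `LazyDataset` is a hypothetical name, and it only mimics the idea that transformations record a lineage while actions trigger the work.

```python
class LazyDataset:
    """Toy stand-in for an RDD: records a lineage of transformations, runs them on demand."""

    def __init__(self, source, lineage=()):
        self._source = source      # the original input records
        self._lineage = lineage    # tuple of (kind, function) steps

    # Transformations: return a new dataset, compute nothing yet.
    def map(self, fn):
        return LazyDataset(self._source, self._lineage + (("map", fn),))

    def filter(self, fn):
        return LazyDataset(self._source, self._lineage + (("filter", fn),))

    def _compute(self):
        # Replaying the lineage from the source is also how lost data could be rebuilt.
        for record in self._source:
            keep = True
            for kind, fn in self._lineage:
                if kind == "map":
                    record = fn(record)
                elif not fn(record):
                    keep = False
                    break
            if keep:
                yield record

    # Actions: trigger the computation and return a result.
    def collect(self):
        return list(self._compute())

    def count(self):
        return sum(1 for _ in self._compute())

squares = LazyDataset(range(10)).map(lambda x: x * x).filter(lambda x: x % 2 == 0)
print(squares.collect())  # [0, 4, 16, 36, 64]
print(squares.count())    # 5
```

Keeping the lineage rather than the intermediate results is what makes "automatically rebuilt on failure" possible: a lost partition can be recomputed from its source by replaying the same steps.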
Unified Platform
• Continued innovation bringing new functionality, e.g.:
• Java 8 (Closures, Lambda Expressions)
• Spark SQL (SQL on Spark, not just Hive)
• BlinkDB (Approximate Queries)
• SparkR (R wrapper for Spark)
Big Data and NoSQL
The Challenge
• We want scalable, durable, high volume, high velocity,
distributed data storage that can handle non-structured data
and that will fit our specific need
• RDBMS is too generic and doesn’t cut it any more – it can do
the job but it is not cost-effective for our uses
The Solution: NoSQL
• Let’s take some parts of the standard RDBMS out and
design the solution for our specific uses
• NoSQL databases have been around for ages under different
names/solutions
Example Comparison: RDBMS vs. Hadoop
                      Typical Traditional RDBMS    Hadoop
Data Size             Gigabytes                    Petabytes
Access                Interactive and Batch        Batch – NOT Interactive
Updates               Read / Write many times      Write once, Read many times
Structure             Static Schema                Dynamic Schema
Scaling               Nonlinear                    Linear
Query Response Time   Can be near immediate        Has latency (due to batch processing)
Hadoop And Relational Database
Hadoop
Best Used For:
• Structured or Not (Flexibility)
• Scalability of Storage/Compute
• Complex Data Processing
• Cheaper compared to RDBMS
Relational Database
Best Used For:
• Interactive OLAP Analytics (<1sec)
• Multistep Transactions
• 100% SQL Compliance
Best when used together
The NOSQL Movement
• NOSQL is not a technology – it’s a concept
• We need high performance, scale out abilities or agile structure
• We are willing to sacrifice our sacred database cows:
consistency, transactions, durability
• Over 150 different brands and solutions
(http://nosql-database.org/).
Is NoSQL an RDBMS Replacement?
NO
Well... sometimes it is…
NoSQL Taxonomy
Type Examples
Key-Value Store
Document Store
Column Store
Graph Store
Key Value Store
• Distributed hash tables
• Very fast to get a single value
• Examples:
• Amazon DynamoDB
• Berkeley DB
• Redis
• Riak
• Cassandra
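A minimal sketch of the distributed hash table idea: a stable hash of the key decides which node holds the value, so a single lookup goes straight to one node. The class and key names are hypothetical, and the "nodes" are just in-memory dictionaries.

```python
import hashlib

class TinyKeyValueStore:
    """Toy key-value store that spreads keys over several in-memory 'nodes'."""

    def __init__(self, node_count):
        self._nodes = [{} for _ in range(node_count)]

    def _node_for(self, key):
        # A stable hash of the key picks the node, so any client can find it again.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return self._nodes[int.from_bytes(digest[:8], "big") % len(self._nodes)]

    def put(self, key, value):
        self._node_for(key)[key] = value

    def get(self, key, default=None):
        return self._node_for(key).get(key, default)

store = TinyKeyValueStore(node_count=4)
store.put("user:42", "Zohar")
print(store.get("user:42"))   # Zohar
print(store.get("user:99"))   # None
```

Plain modulo hashing is used here for brevity; production systems typically use schemes such as consistent hashing so that adding a node does not move most of the keys.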
Document Store
• Similar to Key/Value, but value is a document
• JSON or something similar, flexible schema
• Agile technology
• Examples:
• MongoDB
• CouchDB
• CouchBase
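The flexible schema can be shown with plain JSON-style documents. This is a sketch only: the collection, field names and `find` helper are hypothetical and do not correspond to any particular product's query API.

```python
import json

# Two documents in the same hypothetical collection; the fields differ per document.
customers = {
    "c1": {"name": "Dana", "email": "dana@example.com"},
    "c2": {"name": "Avi", "phones": ["555-0100"], "address": {"city": "Tel Aviv"}},
}

def find(collection, field, value):
    """Return the ids of documents whose top-level `field` equals `value`."""
    return [doc_id for doc_id, doc in collection.items() if doc.get(field) == value]

print(find(customers, "name", "Avi"))   # ['c2']
print(json.dumps(customers["c2"], sort_keys=True))
```

Neither document had to declare its fields in advance, which is what makes this model convenient when the data's shape changes often.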
What is a Column Store Database?
• Column Store databases are management systems that store and
manage data in a columnar format for better analysis of
single-column data (e.g., aggregation). Data is saved
and handled as columns instead of rows.
• Examples:
• HP Vertica
• Pivotal (EMC) GreenPlum
• Hadoop HBase
• Amazon’s SimpleDB
• Cassandra
Query Data
• When we query data, records are read in the
order they are organized in the physical structure
• Even when we query a single
column, we still need to read the
entire table and extract the column
[Figure: a table of Row 1 to Row 4 by Col 1 to Col 4, read by "Select Col2 From MyTable" and "Select * From MyTable"]
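The cost difference between the two layouts can be sketched by counting how many values each one has to touch to aggregate a single column. The table contents are made up, and the "values touched" counters are a simplified model of I/O rather than a measurement of any real engine.

```python
# The same hypothetical table stored two ways.
rows = [
    {"col1": 1, "col2": 10, "col3": "a", "col4": True},
    {"col1": 2, "col2": 20, "col3": "b", "col4": False},
    {"col1": 3, "col2": 30, "col3": "c", "col4": True},
]

# Column layout: one contiguous list per column.
columns = {name: [row[name] for row in rows] for name in rows[0]}

# SELECT SUM(col2): the row layout touches every field of every row ...
values_touched_row_store = sum(len(row) for row in rows)
total_from_rows = sum(row["col2"] for row in rows)

# ... while the column layout reads only the col2 list.
values_touched_column_store = len(columns["col2"])
total_from_columns = sum(columns["col2"])

print(total_from_rows, total_from_columns)                     # 60 60
print(values_touched_row_store, values_touched_column_store)   # 12 3
```

Both layouts give the same answer; the column layout simply reads a quarter of the values in this four-column example, and the gap widens as tables get wider.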
How Do Column Stores Keep Data
Organization in row store Organization in column store
Select Col2
From MyTable
Row Format vs. Column Format
Graph Store
• Inspired by graph theory
• Data model: nodes, relationships, properties on both sides
• Relational databases have a hard time representing a graph in
the database
• Examples:
• Neo4j
• InfiniteGraph
• RDF
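The data model of nodes, relationships and properties on both can be sketched with ordinary Python structures. The people, the KNOWS relationship and the `reachable` helper are hypothetical; a graph database would express the same traversal in its own query language.

```python
from collections import deque

# Nodes and relationships both carry properties, as in a property-graph model.
nodes = {
    "alice": {"label": "Person"},
    "bob": {"label": "Person"},
    "carol": {"label": "Person"},
}
relationships = [
    ("alice", "KNOWS", "bob", {"since": 2011}),
    ("bob", "KNOWS", "carol", {"since": 2013}),
]

def reachable(start, rel_type):
    """Breadth-first traversal along outgoing relationships of one type."""
    seen, queue = {start}, deque([start])
    while queue:
        current = queue.popleft()
        for source, kind, target, _props in relationships:
            if source == current and kind == rel_type and target not in seen:
                seen.add(target)
                queue.append(target)
    return seen - {start}

print(sorted(reachable("alice", "KNOWS")))  # ['bob', 'carol']
```

In a relational database the same "friends of friends" question needs one self-join per hop, which is the difficulty the slide refers to.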
Graph Example
Conclusion
• We do Big Data to gain Value. Without value, there is no Big Data
• Handling Big Data is a challenge – we talked about who uses it, when
and where
• Hadoop is a solution for Big Data usages but it’s not a magical solution
• NoSQL, NewSQL and RDBMS are all solutions we can integrate for
different usages
• New organizational positions: cluster devops and data scientist.
Q&A
Thank You
Zohar Elkayam
twitter: @realmgic
Zohar@Brillix.co.il
www.realdbamagic.com

More Related Content

What's hot

Introduction to Oracle Data Guard Broker
Introduction to Oracle Data Guard BrokerIntroduction to Oracle Data Guard Broker
Introduction to Oracle Data Guard BrokerZohar Elkayam
 
Oracle Database In-Memory Option for ILOUG
Oracle Database In-Memory Option for ILOUGOracle Database In-Memory Option for ILOUG
Oracle Database In-Memory Option for ILOUGZohar Elkayam
 
Docker Concepts for Oracle/MySQL DBAs and DevOps
Docker Concepts for Oracle/MySQL DBAs and DevOpsDocker Concepts for Oracle/MySQL DBAs and DevOps
Docker Concepts for Oracle/MySQL DBAs and DevOpsZohar Elkayam
 
SQLcl the next generation of SQLPlus?
SQLcl the next generation of SQLPlus?SQLcl the next generation of SQLPlus?
SQLcl the next generation of SQLPlus?Zohar Elkayam
 
Fast, Flexible Application Development with Oracle Database Cloud Service
Fast, Flexible Application Development with Oracle Database Cloud ServiceFast, Flexible Application Development with Oracle Database Cloud Service
Fast, Flexible Application Development with Oracle Database Cloud ServiceGustavo Rene Antunez
 
2019 - GUOB Tech Day / Groundbreakers LAD Tour - Database Migration Methods t...
2019 - GUOB Tech Day / Groundbreakers LAD Tour - Database Migration Methods t...2019 - GUOB Tech Day / Groundbreakers LAD Tour - Database Migration Methods t...
2019 - GUOB Tech Day / Groundbreakers LAD Tour - Database Migration Methods t...Marcus Vinicius Miguel Pedro
 
Introduction of MariaDB AX / TX
Introduction of MariaDB AX / TXIntroduction of MariaDB AX / TX
Introduction of MariaDB AX / TXGOTO Satoru
 
Winning performance challenges in oracle multitenant
Winning performance challenges in oracle multitenantWinning performance challenges in oracle multitenant
Winning performance challenges in oracle multitenantPini Dibask
 
SQL Server 2019 CTP2.4
SQL Server 2019 CTP2.4SQL Server 2019 CTP2.4
SQL Server 2019 CTP2.4Gianluca Hotz
 
MySQL JSON Document Store - A Document Store with all the benefits of a Trans...
MySQL JSON Document Store - A Document Store with all the benefits of a Trans...MySQL JSON Document Store - A Document Store with all the benefits of a Trans...
MySQL JSON Document Store - A Document Store with all the benefits of a Trans...Olivier DASINI
 
Research on vector spatial data storage scheme based
Research on vector spatial data storage scheme basedResearch on vector spatial data storage scheme based
Research on vector spatial data storage scheme basedAnant Kumar
 
Oracle Goldengate training by Vipin Mishra
Oracle Goldengate training by Vipin Mishra Oracle Goldengate training by Vipin Mishra
Oracle Goldengate training by Vipin Mishra Vipin Mishra
 
Enable GoldenGate Monitoring with OEM 12c/JAgent
Enable GoldenGate Monitoring with OEM 12c/JAgentEnable GoldenGate Monitoring with OEM 12c/JAgent
Enable GoldenGate Monitoring with OEM 12c/JAgentBobby Curtis
 
Oracle Exadata Cloud Services guide from practical experience - OOW19
Oracle Exadata Cloud Services guide from practical experience - OOW19Oracle Exadata Cloud Services guide from practical experience - OOW19
Oracle Exadata Cloud Services guide from practical experience - OOW19Nelson Calero
 
GoldenGate and ODI - A Perfect Match for Real-Time Data Warehousing
GoldenGate and ODI - A Perfect Match for Real-Time Data WarehousingGoldenGate and ODI - A Perfect Match for Real-Time Data Warehousing
GoldenGate and ODI - A Perfect Match for Real-Time Data WarehousingMichael Rainey
 
Collaborate 17 - Database consolidation using the oracle multitenant architec...
Collaborate 17 - Database consolidation using the oracle multitenant architec...Collaborate 17 - Database consolidation using the oracle multitenant architec...
Collaborate 17 - Database consolidation using the oracle multitenant architec...Pini Dibask
 
Welcome to databases in the Cloud
Welcome to databases in the CloudWelcome to databases in the Cloud
Welcome to databases in the CloudNelson Calero
 
OEM12c, DB12c and You! - RMOUG TD2014 Edition
OEM12c, DB12c and You! - RMOUG TD2014 EditionOEM12c, DB12c and You! - RMOUG TD2014 Edition
OEM12c, DB12c and You! - RMOUG TD2014 EditionBobby Curtis
 
Introduction of MariaDB 2017 09
Introduction of MariaDB 2017 09Introduction of MariaDB 2017 09
Introduction of MariaDB 2017 09GOTO Satoru
 

What's hot (20)

Introduction to Oracle Data Guard Broker
Introduction to Oracle Data Guard BrokerIntroduction to Oracle Data Guard Broker
Introduction to Oracle Data Guard Broker
 
Oracle Database In-Memory Option for ILOUG
Oracle Database In-Memory Option for ILOUGOracle Database In-Memory Option for ILOUG
Oracle Database In-Memory Option for ILOUG
 
Docker Concepts for Oracle/MySQL DBAs and DevOps
Docker Concepts for Oracle/MySQL DBAs and DevOpsDocker Concepts for Oracle/MySQL DBAs and DevOps
Docker Concepts for Oracle/MySQL DBAs and DevOps
 
SQLcl the next generation of SQLPlus?
SQLcl the next generation of SQLPlus?SQLcl the next generation of SQLPlus?
SQLcl the next generation of SQLPlus?
 
Oracle OpenWo2014 review part 03 three_paa_s_database
Oracle OpenWo2014 review part 03 three_paa_s_databaseOracle OpenWo2014 review part 03 three_paa_s_database
Oracle OpenWo2014 review part 03 three_paa_s_database
 
Fast, Flexible Application Development with Oracle Database Cloud Service
Fast, Flexible Application Development with Oracle Database Cloud ServiceFast, Flexible Application Development with Oracle Database Cloud Service
Fast, Flexible Application Development with Oracle Database Cloud Service
 
2019 - GUOB Tech Day / Groundbreakers LAD Tour - Database Migration Methods t...
2019 - GUOB Tech Day / Groundbreakers LAD Tour - Database Migration Methods t...2019 - GUOB Tech Day / Groundbreakers LAD Tour - Database Migration Methods t...
2019 - GUOB Tech Day / Groundbreakers LAD Tour - Database Migration Methods t...
 
Introduction of MariaDB AX / TX
Introduction of MariaDB AX / TXIntroduction of MariaDB AX / TX
Introduction of MariaDB AX / TX
 
Winning performance challenges in oracle multitenant
Winning performance challenges in oracle multitenantWinning performance challenges in oracle multitenant
Winning performance challenges in oracle multitenant
 
SQL Server 2019 CTP2.4
SQL Server 2019 CTP2.4SQL Server 2019 CTP2.4
SQL Server 2019 CTP2.4
 
MySQL JSON Document Store - A Document Store with all the benefits of a Trans...
MySQL JSON Document Store - A Document Store with all the benefits of a Trans...MySQL JSON Document Store - A Document Store with all the benefits of a Trans...
MySQL JSON Document Store - A Document Store with all the benefits of a Trans...
 
Research on vector spatial data storage scheme based
Research on vector spatial data storage scheme basedResearch on vector spatial data storage scheme based
Research on vector spatial data storage scheme based
 
Oracle Goldengate training by Vipin Mishra
Oracle Goldengate training by Vipin Mishra Oracle Goldengate training by Vipin Mishra
Oracle Goldengate training by Vipin Mishra
 
Enable GoldenGate Monitoring with OEM 12c/JAgent
Enable GoldenGate Monitoring with OEM 12c/JAgentEnable GoldenGate Monitoring with OEM 12c/JAgent
Enable GoldenGate Monitoring with OEM 12c/JAgent
 
Oracle Exadata Cloud Services guide from practical experience - OOW19
Oracle Exadata Cloud Services guide from practical experience - OOW19Oracle Exadata Cloud Services guide from practical experience - OOW19
Oracle Exadata Cloud Services guide from practical experience - OOW19
 
GoldenGate and ODI - A Perfect Match for Real-Time Data Warehousing
GoldenGate and ODI - A Perfect Match for Real-Time Data WarehousingGoldenGate and ODI - A Perfect Match for Real-Time Data Warehousing
GoldenGate and ODI - A Perfect Match for Real-Time Data Warehousing
 
Collaborate 17 - Database consolidation using the oracle multitenant architec...
Collaborate 17 - Database consolidation using the oracle multitenant architec...Collaborate 17 - Database consolidation using the oracle multitenant architec...
Collaborate 17 - Database consolidation using the oracle multitenant architec...
 
Welcome to databases in the Cloud
Welcome to databases in the CloudWelcome to databases in the Cloud
Welcome to databases in the Cloud
 
OEM12c, DB12c and You! - RMOUG TD2014 Edition
OEM12c, DB12c and You! - RMOUG TD2014 EditionOEM12c, DB12c and You! - RMOUG TD2014 Edition
OEM12c, DB12c and You! - RMOUG TD2014 Edition
 
Introduction of MariaDB 2017 09
Introduction of MariaDB 2017 09Introduction of MariaDB 2017 09
Introduction of MariaDB 2017 09
 

Viewers also liked

The Loss of HMAS SYDNEY 2: Medical Aspects- Westphalen
The Loss of HMAS SYDNEY 2: Medical Aspects- WestphalenThe Loss of HMAS SYDNEY 2: Medical Aspects- Westphalen
The Loss of HMAS SYDNEY 2: Medical Aspects- WestphalenLeishman Associates
 
Reference Architecture: EMC Hybrid Cloud with VMware
Reference Architecture: EMC Hybrid Cloud with VMwareReference Architecture: EMC Hybrid Cloud with VMware
Reference Architecture: EMC Hybrid Cloud with VMwareEMC
 
Channel partners: Get ready for future trends in client solutions
Channel partners: Get ready for future trends in client solutionsChannel partners: Get ready for future trends in client solutions
Channel partners: Get ready for future trends in client solutionsDell World
 
Digital transformation - DevOps Day - 02/02/2017
Digital transformation - DevOps Day - 02/02/2017Digital transformation - DevOps Day - 02/02/2017
Digital transformation - DevOps Day - 02/02/2017Clara Feuillet
 
SPSNL17 - Securing Office 365 and Microsoft Azure like a rock star (or groupi...
SPSNL17 - Securing Office 365 and Microsoft Azure like a rock star (or groupi...SPSNL17 - Securing Office 365 and Microsoft Azure like a rock star (or groupi...
SPSNL17 - Securing Office 365 and Microsoft Azure like a rock star (or groupi...DIWUG
 
Brown Bag Lunch sur Hazelcast
Brown Bag Lunch sur HazelcastBrown Bag Lunch sur Hazelcast
Brown Bag Lunch sur HazelcastSylvain Wallez
 
'Living Lab' for HCI - presentation made at HCI International 2009
'Living Lab' for HCI - presentation made at HCI International 2009'Living Lab' for HCI - presentation made at HCI International 2009
'Living Lab' for HCI - presentation made at HCI International 2009Ed Chi
 
Julie Van den Steen en Maarten Verhulst richten firma op
Julie Van den Steen en Maarten Verhulst richten firma opJulie Van den Steen en Maarten Verhulst richten firma op
Julie Van den Steen en Maarten Verhulst richten firma opThierry Debels
 
C1 keynote creating_your_enterprise_cloud_strategy
C1 keynote creating_your_enterprise_cloud_strategyC1 keynote creating_your_enterprise_cloud_strategy
C1 keynote creating_your_enterprise_cloud_strategyDr. Wilfred Lin (Ph.D.)
 
Understanding Camouflage
Understanding CamouflageUnderstanding Camouflage
Understanding CamouflageEmily Kissner
 
A1 keynote oracle_infrastructure_as_a_service_move_any_workload_to_the_cloud
A1 keynote oracle_infrastructure_as_a_service_move_any_workload_to_the_cloudA1 keynote oracle_infrastructure_as_a_service_move_any_workload_to_the_cloud
A1 keynote oracle_infrastructure_as_a_service_move_any_workload_to_the_cloudDr. Wilfred Lin (Ph.D.)
 
5733 a deep dive into IBM Watson Foundation for CSP (WFC)
5733   a deep dive into IBM Watson Foundation for CSP (WFC)5733   a deep dive into IBM Watson Foundation for CSP (WFC)
5733 a deep dive into IBM Watson Foundation for CSP (WFC)Arvind Sathi
 
Business model cavans nl-sep-2014
Business model cavans nl-sep-2014Business model cavans nl-sep-2014
Business model cavans nl-sep-2014RolandSyntens
 
Love Cloud: 28 June 2017
Love Cloud: 28 June 2017 Love Cloud: 28 June 2017
Love Cloud: 28 June 2017 Chloe Mustafa
 
AI = SE , giip system manage automation with A.I
AI = SE , giip system manage automation with A.IAI = SE , giip system manage automation with A.I
AI = SE , giip system manage automation with A.ILowy Shin
 
Grade 3 text structure assessment teaching guide
Grade 3 text structure assessment teaching guideGrade 3 text structure assessment teaching guide
Grade 3 text structure assessment teaching guideEmily Kissner
 
Evolving your automation with hybrid workers
Evolving your automation with hybrid workersEvolving your automation with hybrid workers
Evolving your automation with hybrid workerskieranjacobsen
 
Emerging Technologies: Heroku for ISVs (October 13, 2014)
Emerging Technologies: Heroku for ISVs (October 13, 2014)Emerging Technologies: Heroku for ISVs (October 13, 2014)
Emerging Technologies: Heroku for ISVs (October 13, 2014)Salesforce Partners
 

Viewers also liked (20)

The Loss of HMAS SYDNEY 2: Medical Aspects- Westphalen
The Loss of HMAS SYDNEY 2: Medical Aspects- WestphalenThe Loss of HMAS SYDNEY 2: Medical Aspects- Westphalen
The Loss of HMAS SYDNEY 2: Medical Aspects- Westphalen
 
Reference Architecture: EMC Hybrid Cloud with VMware
Reference Architecture: EMC Hybrid Cloud with VMwareReference Architecture: EMC Hybrid Cloud with VMware
Reference Architecture: EMC Hybrid Cloud with VMware
 
Channel partners: Get ready for future trends in client solutions
Channel partners: Get ready for future trends in client solutionsChannel partners: Get ready for future trends in client solutions
Channel partners: Get ready for future trends in client solutions
 
Unc plus delta
Unc plus deltaUnc plus delta
Unc plus delta
 
Digital transformation - DevOps Day - 02/02/2017
Digital transformation - DevOps Day - 02/02/2017Digital transformation - DevOps Day - 02/02/2017
Digital transformation - DevOps Day - 02/02/2017
 
SPSNL17 - Securing Office 365 and Microsoft Azure like a rock star (or groupi...
SPSNL17 - Securing Office 365 and Microsoft Azure like a rock star (or groupi...SPSNL17 - Securing Office 365 and Microsoft Azure like a rock star (or groupi...
SPSNL17 - Securing Office 365 and Microsoft Azure like a rock star (or groupi...
 
Brown Bag Lunch sur Hazelcast
Brown Bag Lunch sur HazelcastBrown Bag Lunch sur Hazelcast
Brown Bag Lunch sur Hazelcast
 
'Living Lab' for HCI - presentation made at HCI International 2009
'Living Lab' for HCI - presentation made at HCI International 2009'Living Lab' for HCI - presentation made at HCI International 2009
'Living Lab' for HCI - presentation made at HCI International 2009
 
Julie Van den Steen en Maarten Verhulst richten firma op
Julie Van den Steen en Maarten Verhulst richten firma opJulie Van den Steen en Maarten Verhulst richten firma op
Julie Van den Steen en Maarten Verhulst richten firma op
 
C1 keynote creating_your_enterprise_cloud_strategy
C1 keynote creating_your_enterprise_cloud_strategyC1 keynote creating_your_enterprise_cloud_strategy
C1 keynote creating_your_enterprise_cloud_strategy
 
Understanding Camouflage
Understanding CamouflageUnderstanding Camouflage
Understanding Camouflage
 
A1 keynote oracle_infrastructure_as_a_service_move_any_workload_to_the_cloud
A1 keynote oracle_infrastructure_as_a_service_move_any_workload_to_the_cloudA1 keynote oracle_infrastructure_as_a_service_move_any_workload_to_the_cloud
A1 keynote oracle_infrastructure_as_a_service_move_any_workload_to_the_cloud
 
5733 a deep dive into IBM Watson Foundation for CSP (WFC)
5733   a deep dive into IBM Watson Foundation for CSP (WFC)5733   a deep dive into IBM Watson Foundation for CSP (WFC)
5733 a deep dive into IBM Watson Foundation for CSP (WFC)
 
Business model cavans nl-sep-2014
Business model cavans nl-sep-2014Business model cavans nl-sep-2014
Business model cavans nl-sep-2014
 
Intel and Big Data
Intel and Big DataIntel and Big Data
Intel and Big Data
 
Love Cloud: 28 June 2017
Love Cloud: 28 June 2017 Love Cloud: 28 June 2017
Love Cloud: 28 June 2017
 
AI = SE , giip system manage automation with A.I
AI = SE , giip system manage automation with A.IAI = SE , giip system manage automation with A.I
AI = SE , giip system manage automation with A.I
 
Grade 3 text structure assessment teaching guide
Grade 3 text structure assessment teaching guideGrade 3 text structure assessment teaching guide
Grade 3 text structure assessment teaching guide
 
Evolving your automation with hybrid workers
Evolving your automation with hybrid workersEvolving your automation with hybrid workers
Evolving your automation with hybrid workers
 
Emerging Technologies: Heroku for ISVs (October 13, 2014)
Emerging Technologies: Heroku for ISVs (October 13, 2014)Emerging Technologies: Heroku for ISVs (October 13, 2014)
Emerging Technologies: Heroku for ISVs (October 13, 2014)
 

Big data for cio 2015

  • 2. Who am I?
      • Zohar Elkayam, CTO at Brillix
      • DBA, team leader, and senior consultant for over 17 years
      • Oracle ACE Associate
      • Involved with Big Data projects since 2011
      • Blogger: www.realdbamagic.com
  • 3. About Brillix
      • Brillix is a leading company that specializes in data management
      • We provide professional services and consulting for databases, security, and Big Data solutions
  • 4. Agenda: Big Data
      • Big Data
      • Why
      • What
      • Where
      • Who and How
      • A Big Data Solution: Hadoop
      • NoSQL vs. RDBMS
  • 5. What is Big Data?
  • 10. More stories
      • Crime prevention in Los Angeles
      • Diagnosis and treatment of genetic diseases
      • Investments in the financial sector
      • Generation of personalized advertising
      • Astronomical discoveries
  • 11. Examples of Big Data Use Cases Today
      • Media / Entertainment: viewers / advertising effectiveness
      • Communications: location-based advertising
      • Education & Research: experiment sensor analysis
      • Consumer Packaged Goods: sentiment analysis of what’s hot, problems
      • Health Care: patient sensors, monitoring, EHRs; quality of care
      • Life Sciences: clinical trials; genomics
      • High Technology / Industrial Mfg.: manufacturing quality; warranty analysis
      • Oil & Gas: drilling exploration sensor analysis
      • Financial Services: risk & portfolio analysis; new products
      • Automotive: auto sensors reporting location, problems
      • Retail: consumer sentiment; optimized marketing
      • Law Enforcement & Defense: threat analysis (social media monitoring, photo analysis)
      • Travel & Transportation: sensor analysis for optimal traffic flows; customer sentiment
      • Utilities: Smart Meter analysis for network capacity,
      • On-line Services / Social Media: people & career matching; web-site optimization
  • 12. Most Requested Uses of Big Data
      • Log analytics & storage
      • Smart grid / smarter utilities
      • RFID tracking & analytics
      • Fraud / risk management & modeling
      • 360° view of the customer
      • Warehouse extension
      • Email / call center transcript analysis
      • Call detail record analysis
  • 14. The Big Data Challenge
  • 15. Volume
      • Big data comes in one size: big.
      • Size is measured in terabytes (10^12 bytes), petabytes (10^15), exabytes (10^18), and zettabytes (10^21)
      • Storing and handling the data becomes an issue
      • Producing value out of the data in a reasonable time is an issue
  • 16. Some numbers
      • How much data is there in the world?
          • 800 terabytes, 2000
          • 160 exabytes, 2006 (1 EB = 10^18 B)
          • 4.5 zettabytes, 2012 (1 ZB = 10^21 B)
          • 44 zettabytes by 2020
      • How much is a zettabyte?
          • 1,000,000,000,000,000,000,000 bytes
          • A stack of 1 TB hard disks that is 25,400 km high
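The figures on this slide can be sanity-checked with a few lines of arithmetic. The sketch below assumes decimal (SI) units, as the slide does, and simply works backwards from the quoted numbers; the variable names are illustrative and not from the deck.

```python
# Decimal (SI) byte units, as used on the slide.
TB = 10**12
ZB = 10**21

# How many 1 TB disks are needed to hold one zettabyte?
disks_per_zb = ZB // TB  # one billion disks

# The slide quotes a 25,400 km stack; dividing by the disk count gives
# the per-disk thickness that figure implies.
stack_height_mm = 25_400 * 1_000 * 1_000  # km -> m -> mm
disk_thickness_mm = stack_height_mm / disks_per_zb

# Compound annual growth implied by 4.5 ZB (2012) -> 44 ZB (2020).
years = 2020 - 2012
implied_growth = (44 / 4.5) ** (1 / years) - 1

print(f"1 TB disks per ZB: {disks_per_zb:,}")
print(f"Implied disk thickness: {disk_thickness_mm} mm")
print(f"Implied annual growth 2012-2020: {implied_growth:.1%}")
```

The stack height corresponds to disks about one inch (25.4 mm) thick, and the 2012 and 2020 figures imply growth of roughly a third per year.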
  • 17. Growth Rate
      • How much data is generated in a day?
          • 7 TB, Twitter
          • 10 TB, Facebook
  • 19. Variety
      • Big Data extends beyond structured data to include semi-structured and unstructured information: logs, text, audio, and video.
      • The wide variety of rapidly evolving data types requires highly flexible stores and handling.
  • 20. Structured & Unstructured
          Unstructured           Structured
          Objects                Tables
          Flexible               Columns and rows
          Structure unknown      Predefined structure
          Textual and binary     Mostly textual
  • 21. Big Data is ANY data
      • Some has fixed structure
      • Some is “bring your own structure”
      • We want to find value in all of it
      • Unstructured, semi-structured, and structured
  • 22. Data Types by Industry
  • 23. Velocity
      • The speed at which the data is generated and collected
      • Streaming data and large-volume data movement
      • High velocity of data capture requires rapid ingestion
      • Can cause a backlog problem
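The backlog problem mentioned above can be illustrated with a toy queue model. This is a minimal sketch under simplifying assumptions (constant arrival and processing rates, discrete time steps); the function name and parameters are hypothetical and not from the deck.

```python
def backlog_over_time(arrival_rate, processing_rate, steps):
    """Return the number of unprocessed records after each time step.

    Assumes constant rates: `arrival_rate` records arrive per step and at
    most `processing_rate` records are processed per step.
    """
    backlog = 0
    history = []
    for _ in range(steps):
        backlog += arrival_rate                   # new data arrives
        backlog -= min(backlog, processing_rate)  # process what we can
        history.append(backlog)
    return history


# Ingestion faster than processing: the backlog grows without bound.
print(backlog_over_time(arrival_rate=120, processing_rate=100, steps=5))
# Processing keeps up: the backlog stays at zero.
print(backlog_over_time(arrival_rate=80, processing_rate=100, steps=3))
```

Whenever data arrives faster than it can be processed, the queue grows linearly with time, which is why high-velocity sources call for rapid ingestion.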
  • 24. Global Internet Device Forecast
  • 26. Veracity
      • Quality of the data can vary greatly
      • Data sources might be messy or corrupted
  • 27. So, What Defines Big Data?
      • When we think that we can produce value from that data and want to handle it
      • When the data is too big or moves too fast to handle in a sensible amount of time
      • When the data doesn’t fit a conventional database structure
      • When the solution becomes part of the problem
  • 29. Why Big Data Now?
      • Because we have data:
          • Data is born already in digital form
          • Data grows by 40% per year
      • Because we can:
          • $500 buys a drive that can store all the music in the world
          • 40 years of Moore's Law = large computational resources
          • 64% of organizations invested in big data in 2013
          • $34 billion was invested in big data in 2013
      • “Because we reached dead end with logic”
  • 30. How to do Big Data
  • 32. Big Data in Practice
      • Big data is big: technological infrastructure solutions are needed
      • Big data is messy: data sources must be cleaned before use
      • Big data is complicated: developers and system admins are needed to manage the intake of data
  • 33. Big Data in Practice (cont.)
      • Data must be broken out of silos in order to be mined, analyzed, and transformed into value
      • The organization must learn how to communicate and interpret the results of analysis
  • 34. Infrastructure Challenges
      • Infrastructure that is built for:
          • Large-scale
          • Distributed
          • Data-intensive jobs that spread the problem across clusters of server nodes
  • 35. Infrastructure Challenges (cont.)
      • Storage:
          • Efficient and cost-effective enough to capture and store terabytes, if not petabytes, of data
          • With intelligent capabilities to reduce your data footprint, such as:
              • Data compression
              • Automatic data tiering
              • Data deduplication
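Two of the footprint-reduction techniques listed above, compression and deduplication, can be sketched together as a content-addressed store. This is a toy illustration only: the class `DedupStore` and its methods are hypothetical, and real storage systems typically deduplicate at block level with far more machinery.

```python
import hashlib
import zlib


class DedupStore:
    """Toy content-addressed store: identical chunks are kept once, compressed."""

    def __init__(self):
        self._chunks = {}  # SHA-256 hex digest -> compressed bytes

    def put(self, data: bytes) -> str:
        """Store a chunk and return its key; duplicates add no new storage."""
        digest = hashlib.sha256(data).hexdigest()
        if digest not in self._chunks:
            self._chunks[digest] = zlib.compress(data)
        return digest

    def get(self, digest: str) -> bytes:
        """Return the original, uncompressed chunk for a key."""
        return zlib.decompress(self._chunks[digest])

    def stored_bytes(self) -> int:
        """Total compressed bytes actually held by the store."""
        return sum(len(chunk) for chunk in self._chunks.values())


store = DedupStore()
record = b"2015-01-01 INFO request served\n" * 1000
keys = [store.put(record) for _ in range(3)]  # same chunk written three times

print(len(set(keys)), "unique chunk stored")
print(len(record) * 3, "logical bytes vs", store.stored_bytes(), "stored bytes")
```

Writing the same chunk three times stores it once, and repetitive data such as log lines compresses well, so the physical footprint is much smaller than the logical size.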
  • 36. Infrastructure Challenges (cont.)
      • Network infrastructure that can quickly import large data sets and then replicate them to various nodes for processing
      • Security capabilities that protect highly distributed infrastructure and data
  • 38. Positions in Big Data management
      • DevOps staff (system administrators and cluster managers) handle the infrastructure
      • Data scientists are in charge of producing value from the data
  • 41. Apache Hadoop
      • Open source project run by Apache (2006)
      • Hadoop brings the ability to cheaply process large amounts of data, regardless of its structure
      • It has been the driving force behind the growth of the big data industry
      • Get the public release from:
          • http://hadoop.apache.org/core/
  • 43. Key points
      • An open-source framework that uses a simple programming model to enable distributed processing of large data sets on clusters of computers.
      • The complete technology stack includes
          • common utilities
          • a distributed file system
          • analytics and data storage platforms
          • an application layer that manages distributed processing, parallel computation, workflow, and configuration management
      • More cost-effective than conventional approaches for handling large unstructured data sets, and it offers massive scalability and speed
  • 44. Why use Hadoop?
      • Cost: leverages commodity HW & open source SW
      • Flexibility: versatility with data, analytics & operation
      • Scalability: near-linear performance up to 1000s of nodes
  • 45. What Hadoop Is Not
      • Hadoop does not replace DW or relational databases
      • Hadoop is not for OLTP or real-time systems
      • Very good for large amounts of data, not so much for smaller sets
      • Designed for clusters: there is no Hadoop “monster server” (single server)
  • 46. Hadoop Cluster at Yahoo
      • Cluster of machines running Hadoop at Yahoo! (credit: Yahoo!)
  • 47. Hadoop under the Hood
  • 48. Hadoop Main Components
      • HDFS: Hadoop Distributed File System, a distributed file system that runs in a clustered environment.
      • MapReduce: a programming paradigm for running processes over a clustered environment.
  • 49. HDFS is...
      • A distributed file system
      • Redundant storage
      • Designed to reliably store data using commodity hardware
      • Designed to expect hardware failures
      • Intended for large files
      • Designed for batch inserts
      • The Hadoop Distributed File System
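The ideas of redundant storage for large files on failure-prone hardware can be sketched as block splitting plus replica placement. The code below is a simplified illustration, not HDFS's actual placement policy: the function `plan_blocks`, the round-robin assignment, and the node names are assumptions, and the 128 MB block size and three replicas are illustrative values.

```python
import math


def plan_blocks(file_size, block_size, nodes, replication=3):
    """Split a file into fixed-size blocks and place replicas on distinct nodes.

    Placement here is plain round-robin, chosen for simplicity; it is not
    the rack-aware strategy a real cluster would use.
    """
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    n_blocks = math.ceil(file_size / block_size)
    plan = []
    for block_id in range(n_blocks):
        replicas = [nodes[(block_id + r) % len(nodes)] for r in range(replication)]
        plan.append((block_id, replicas))
    return plan


MB = 1024 * 1024
plan = plan_blocks(
    file_size=300 * MB,
    block_size=128 * MB,
    nodes=["node1", "node2", "node3", "node4"],
)
for block_id, replicas in plan:
    print(block_id, replicas)
```

Because every block lives on several distinct nodes, losing any single node still leaves at least one copy of each block available.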
  • 50. MapReduce is...
      • A programming model for expressing distributed computations at a massive scale
      • An execution framework for organizing and performing such computations
      • An open-source implementation called Hadoop
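The programming model can be shown in miniature with the classic word-count example. This is a single-process sketch in plain Python, not Hadoop's API: the function names are illustrative, and a real framework would run the map and reduce phases in parallel across a cluster.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


counts = word_count(["big data is big", "data is everywhere"])
print(counts)
```

Each map call sees only its own record and each reduce call sees only one key, which is what lets a framework distribute the work without shared state.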
  • 51. MapReduce is good for... • Embarrassingly parallel algorithms • Summing, grouping, filtering, joining • Off-line batch jobs on massive data sets • Analyzing an entire large dataset 51 http://brillix.co.il
  • 52. MapReduce is OK for... • Iterative jobs (i.e., graph algorithms) • Each iteration must read/write data to disk • IO and latency cost of an iteration is high 52 http://brillix.co.il
  • 53. MapReduce is NOT good for... • Jobs that need shared state/coordination • Tasks are shared-nothing • Shared-state requires scalable state store • Low-latency jobs • Jobs on small datasets • Finding individual records 53 http://brillix.co.il
  • 54. Spark • Fast and general MapReduce-like engine for large-scale data processing • Fast • In-memory data storage for very fast interactive queries • Up to 100 times faster than Hadoop • General • Unified platform that can combine: SQL, Machine Learning, Streaming, Graph & Complex analytics • Ease of use • Can be developed in Java, Scala or Python • Integrated with Hadoop • Can read from HDFS, HBase, Cassandra, and any Hadoop data source
  • 55. Key Concepts • Write programs in terms of transformations on distributed datasets • Resilient Distributed Datasets • Collections of objects spread across a cluster, stored in RAM or on disk • Built through parallel transformations • Automatically rebuilt on failure • Operations • Transformations (e.g. map, filter, groupBy) • Actions (e.g. count, collect, save)
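The split between transformations and actions can be illustrated with a small single-machine sketch. `MiniRDD` below is a hypothetical class written for this example and is not the Spark API: it assumes that a dataset is stored as a recipe for producing its items, so transformations only extend the recipe and actions are what actually run it.

```python
from typing import Callable, Iterable


class MiniRDD:
    """Hypothetical single-machine stand-in for an RDD (not the Spark API)."""

    def __init__(self, source: Callable[[], Iterable]):
        # Keep a recipe that produces the data, not the data itself.
        self._source = source

    @classmethod
    def from_list(cls, items: list) -> "MiniRDD":
        return cls(lambda: iter(items))

    # Transformations: return a new MiniRDD and compute nothing yet.
    def map(self, fn: Callable) -> "MiniRDD":
        return MiniRDD(lambda: (fn(x) for x in self._source()))

    def filter(self, pred: Callable) -> "MiniRDD":
        return MiniRDD(lambda: (x for x in self._source() if pred(x)))

    # Actions: run the recorded recipe and return a result.
    def collect(self) -> list:
        return list(self._source())

    def count(self) -> int:
        return sum(1 for _ in self._source())


even_squares = (
    MiniRDD.from_list([1, 2, 3, 4])
    .map(lambda x: x * x)
    .filter(lambda x: x % 2 == 0)
)
```

Nothing is computed until `even_squares.collect()` or `even_squares.count()` is called. Because the recipe can be re-run from its source at any time, the sketch also hints at how a lost partition could be rebuilt from its lineage instead of from a stored copy.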
  • 56. Unified Platform • Continued innovation bringing new functionality, e.g.: • Java 8 (Closures, Lambda Expressions) • Spark SQL (SQL on Spark, not just Hive) • BlinkDB (Approximate Queries) • SparkR (R wrapper for Spark)
  • 57. Big Data and NoSQL
  • 58. The Challenge • We want scalable, durable, high volume, high velocity, distributed data storage that can handle non-structured data and that will fit our specific needs • RDBMS is too generic and doesn’t cut it any more – it can do the job but it is not cost-effective for our uses
  • 59. The Solution: NoSQL • Let’s take some parts of the standard RDBMS out and design the solution around our specific uses • NoSQL databases have been around for ages under different names/solutions
  • 60. Example Comparison: Typical Traditional RDBMS vs. Hadoop • Data Size – RDBMS: Gigabytes; Hadoop: Petabytes • Access – RDBMS: Interactive and Batch; Hadoop: Batch, NOT Interactive • Updates – RDBMS: Read / Write many times; Hadoop: Write once, Read many times • Structure – RDBMS: Static Schema; Hadoop: Dynamic Schema • Scaling – RDBMS: Nonlinear; Hadoop: Linear • Query Response Time – RDBMS: Can be near immediate; Hadoop: Has latency (due to batch processing)
  • 61. Hadoop and Relational Database • Hadoop is best used for: • Structured or not (flexibility) • Scalability of storage/compute • Complex data processing • Cheaper compared to RDBMS • Relational database is best used for: • Interactive OLAP analytics (<1 sec) • Multistep transactions • 100% SQL compliance • Best when used together
  • 62. The NoSQL Movement • NoSQL is not a technology – it’s a concept • We need high performance, scale-out abilities or agile structure • We are willing to sacrifice our sacred database cows: consistency, transactions, durability • Over 150 different brands and solutions (http://nosql-database.org/)
  • 63. Is NoSQL an RDBMS Replacement? • No • Well... sometimes it is
  • 64. NoSQL Taxonomy (Type / Examples) • Key-Value Store • Document Store • Column Store • Graph Store
  • 65. Key Value Store • Distributed hash tables • Very fast to get a single value • Examples: • Amazon DynamoDB • Berkeley DB • Redis • Riak • Cassandra
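The "distributed hash table" idea can be sketched as follows. This is an illustrative in-memory model only: `TinyKVCluster` and its methods are hypothetical names, each "node" is just a Python dict, and keys are placed with a simple hash-modulo rule, whereas production systems typically use more elaborate schemes such as consistent hashing.

```python
import hashlib


class TinyKVCluster:
    """Hypothetical in-memory key-value store partitioned across nodes."""

    def __init__(self, node_count: int):
        # Each dict stands in for the storage of one node in the cluster.
        self.nodes = [dict() for _ in range(node_count)]

    def _node_for(self, key: str) -> dict:
        # A stable hash of the key decides which node owns it.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.nodes)
        return self.nodes[index]

    def put(self, key: str, value) -> None:
        self._node_for(key)[key] = value

    def get(self, key: str, default=None):
        # A lookup touches exactly one node, which keeps single reads fast.
        return self._node_for(key).get(key, default)


cluster = TinyKVCluster(node_count=3)
cluster.put("user:42", {"name": "Dana"})
```

A call such as `cluster.get("user:42")` hashes the key, goes straight to the owning node and returns the stored value without scanning the other nodes.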
  • 66. Document Store • Similar to Key/Value, but value is a document • JSON or something similar, flexible schema • Agile technology • Examples: • MongoDB • CouchDB • CouchBase
  • 67. What is a Column Store Database? • Column store databases are database management systems that keep data in a columnar format for better analysis of single-column data (e.g. aggregation). Data is saved and handled as columns instead of rows. • Examples: • HP Vertica • Pivotal (EMC) GreenPlum • Hadoop HBase • Amazon’s SimpleDB • Cassandra
  • 68. Query Data • When we query data, records are read in the order they are organized in the physical structure • Even when we query a single column, we still need to read the entire table and extract the column • [Diagram: a table of Row 1 to Row 4 by Col 1 to Col 4, comparing "Select Col2 From MyTable" with "Select * From MyTable"]
  • 69. How Do Column Stores Keep Data? • Organization in row store • Organization in column store • Select Col2 From MyTable
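The difference between the two organizations can be sketched with plain Python structures. The table contents and the helper names (`to_columns`, `select_from_row_store`, `select_from_column_store`) are made up for illustration; real engines work on disk blocks rather than lists, so this only shows which values each layout has to touch.

```python
# Row store: each record is kept together, one row after another.
ROW_STORE = [
    {"col1": 1, "col2": 10, "col3": "a", "col4": True},
    {"col1": 2, "col2": 20, "col3": "b", "col4": False},
    {"col1": 3, "col2": 30, "col3": "c", "col4": True},
    {"col1": 4, "col2": 40, "col3": "d", "col4": False},
]


def to_columns(rows: list[dict]) -> dict[str, list]:
    """Reorganize the same data so each column's values are kept together."""
    columns: dict[str, list] = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


COLUMN_STORE = to_columns(ROW_STORE)


def select_from_row_store(rows: list[dict], name: str) -> list:
    # Every whole row is visited even though only one field is needed.
    return [row[name] for row in rows]


def select_from_column_store(columns: dict[str, list], name: str) -> list:
    # Only the requested column is read; the other columns are never touched.
    return columns[name]
```

Both functions answer `Select Col2 From MyTable` with the same values, but the column layout reads a single contiguous list, which is why aggregations over one column tend to favor it.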
  • 70. Row Format vs. Column Format
  • 71. Graph Store • Inspired by graph theory • Data model: nodes, relationships, and properties on both • Relational databases have a hard time representing a graph • Examples: • Neo4j • InfiniteGraph • RDF
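The data model on this slide (nodes, relationships, and properties on both) can be sketched in a few lines. `Node`, `Relationship` and `TinyGraph` are hypothetical names for this example and do not correspond to the API of Neo4j or any other product.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    start: str
    end: str
    rel_type: str
    properties: dict = field(default_factory=dict)


class TinyGraph:
    """Hypothetical in-memory property graph."""

    def __init__(self) -> None:
        self.nodes: dict[str, Node] = {}
        # Outgoing relationships are stored per start node for fast traversal.
        self.outgoing: dict[str, list[Relationship]] = {}

    def add_node(self, node_id: str, **properties) -> None:
        self.nodes[node_id] = Node(node_id, properties)
        self.outgoing.setdefault(node_id, [])

    def relate(self, start: str, end: str, rel_type: str, **properties) -> None:
        self.outgoing[start].append(Relationship(start, end, rel_type, properties))

    def neighbors(self, node_id: str, rel_type: str) -> list[str]:
        # Follow only relationships of the requested type.
        return [r.end for r in self.outgoing[node_id] if r.rel_type == rel_type]


graph = TinyGraph()
graph.add_node("alice", role="DBA")
graph.add_node("bob", role="developer")
graph.add_node("hadoop", kind="technology")
graph.relate("alice", "bob", "KNOWS", since=2011)
graph.relate("alice", "hadoop", "USES")
```

Here `graph.neighbors("alice", "KNOWS")` follows stored relationships directly, whereas a relational schema would usually express the same traversal as a join on a link table.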
  • 73. Conclusion • We do Big Data to gain value. Without value, there is no Big Data • Handling Big Data is a challenge – we talked about who uses it, when and where • Hadoop is a solution for Big Data uses but it’s not a magical solution • NoSQL, NewSQL and RDBMS are all solutions we can integrate for different uses • New organizational positions: cluster devops and data scientist
  • 75. Thank You • Zohar Elkayam • twitter: @realmgic • Zohar@Brillix.co.il • www.realdbamagic.com • http://brillix.co.il