hbasecon.com
HBASEATBLOOMBERG//
THE EVOLUTION OF BLOOMBERG DATA SYSTEMS
MEDIUM DATA NEEDS FOR THE FINANCIAL INDUSTRY
MAY // 07 // 2015
BLOOMBERG
Leading Data and Analytics provider to the financial industry
DATA IS OUR BUSINESS
DATA SCIENCE FOR SOCIAL GOOD: GOVERNMENT INNOVATION, PUBLIC HEALTH, ENVIRONMENT, EDUCATION
September 28: Full Workshop at Bloomberg
September 30: Showcase at Strata Hadoop
Call for papers at: bloomberglabs.com/data-science
DATA MANAGEMENT TODAY
• We have a “medium data” problem
• Speed and availability are paramount
• Hundreds of thousands of users with expensive requests
We’ve built many systems to address
DATA MANAGEMENT CHALLENGES
• Single security analytics on Big Iron
• Replication of Systems and Data
• Complexity kills
Top 500 Supercomputer list, 2013
>96% Linux. 100% of top 40.
DATA MANAGEMENT TOMORROW
• Simplicity and performance
• Benefit from external developments
• Retain our independence
• Details matter
THE PREMISE
• We can apply big data techniques to our medium data problem by addressing gaps in existing open systems
• HBase is a good bet
• Part of a broader whole
• The biggest community wins
CHALLENGES
Our requirements from HBase are:
• Read performance – fast with low variability
• High availability
• Operational simplicity
• Efficient use of good hardware
• Expressive power
Bloomberg has been investing in all of these aspects of HBase
WE’VE MADE THAT BET
WE’RE NOT THE ONLY ONES
Google Cloud Bigtable
AIMING HIGHER
We can make things better by working together
Let’s be the gold standard
CALL TO ACTION
FURTHER BOLSTER RELIABILITY
Great strides such as HBASE-10070, but more to do
• Improved reconciliation of state between Master, META and ZK
• More determinism in Admin/Master operations
BENEFIT FROM MODERN HARDWARE
• 32 cores, 256 GB RAM, SSD: untapped potential
• CPU load maxes out at 20%, with inadequate throughput
• Multi-RS is administratively painful
• Much better story with memory
IMPROVE MULTI-TENANCY
• Mixed workloads are challenging
– interactive vs batch
– read vs write
– different read access patterns
• Many solutions in progress
• Administrative simplicity is key
SPARK INTEGRATION
• Analytical frameworks need a distributed database
• Columnar file format != column database
• Integrate with HBase to move towards the universal database
ANALYTICS: EFFICIENCY
• Choice of row and columnar storage engines
• Expose primitives for efficiency:
– Column pruning
– Predicate pushdowns
– Data locality
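
To make two of the primitives above concrete, here is a minimal Python sketch of column pruning and predicate pushdown. It is an illustration only: `ToyStore`, its `scan` method, and the sample rows are hypothetical and are not HBase or Spark APIs. The store counts how many cells it ships to the client, so the saving is visible. Data locality is not modelled.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Optional

Row = Dict[str, object]


@dataclass
class ToyStore:
    """Hypothetical in-memory stand-in for a storage engine (not an HBase API)."""
    rows: List[Row]
    cells_returned: int = field(default=0)

    def scan(
        self,
        columns: Optional[Iterable[str]] = None,
        predicate: Optional[Callable[[Row], bool]] = None,
    ) -> List[Row]:
        """Filter and project on the storage side, then 'ship' the result."""
        wanted = set(columns) if columns is not None else None
        out: List[Row] = []
        for row in self.rows:
            # Predicate pushdown: rows are rejected before leaving the store.
            if predicate is not None and not predicate(row):
                continue
            # Column pruning: only the requested columns are copied out.
            if wanted is None:
                pruned = row
            else:
                pruned = {k: v for k, v in row.items() if k in wanted}
            self.cells_returned += len(pruned)
            out.append(pruned)
        return out


def make_rows() -> List[Row]:
    # Made-up sample data, for illustration only.
    return [
        {"ticker": "AAA", "price": 10.0, "volume": 100, "venue": "X"},
        {"ticker": "BBB", "price": 25.0, "volume": 300, "venue": "Y"},
        {"ticker": "CCC", "price": 40.0, "volume": 200, "venue": "X"},
    ]


# Naive client: fetch everything, then filter and project locally.
naive = ToyStore(make_rows())
fetched = naive.scan()
naive_result = [{"ticker": r["ticker"]} for r in fetched if r["price"] > 20.0]

# Pushdown client: the store applies the filter and the projection.
pushed = ToyStore(make_rows())
pushed_result = pushed.scan(columns=["ticker"], predicate=lambda r: r["price"] > 20.0)

print(naive_result == pushed_result)                # same answer either way
print(naive.cells_returned, pushed.cells_returned)  # cells shipped: 12 vs 2
```

Both clients get the same answer, but the pushdown client receives 2 cells instead of 12. That gap is the kind of efficiency an analytics framework can only exploit if the database exposes these primitives.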
THE FUTURE IS BRIGHT
• The state of the “Hadoop Database” union is strong
– Increasing adoption
– Strong foundation
– Great community
• Prominent role in the data & analytics platform of the future
• Let’s go create the future
THANK YOU

HBaseCon 2015 General Session: The Evolution of HBase @ Bloomberg

Editor's Notes

  • #2 Welcome everyone! Today is going to be fantastic and we have quite the agenda for you. In the interest of time, let's dive right in.