State of HBase: Meet the Release Managers

HBase release managers Lars Hofhansl, Andrew Purtell, Enis Soztutar, Michael Stack, and Liyin Tang jointly present highlights from their releases, and take your questions throughout.

State of HBase
Invasion of the Release Managers

Release Managers
• 0.94 Lars Hofhansl
• 0.96 Michael Stack
• 0.98 Andrew Purtell
• 1.0 Enis Söztutar

0.94 Attributes
• Frequent bug fix releases (monthly)
• Still sees minor features
• Support for Hadoop 1, 2.0.x, Java 6 and 7
• Old (0.92) DNA, no protobufs, old AM

0.94 State
• Current release 0.94.19
• Will have a few more releases
• Many large production installs out there
• Super stable and battle hardened
• EOL? Downtime for upgrade to 0.96+

0.96 Attributes
• The “Singularity”
o Released 10/19/2013
o 18 months in the making
o 2k issues fixed/1500 in 0.96 only
• Big Themes
o Stability
o Operability
o Scaling
https://www.flickr.com/photos/sysli/3026288256/sizes/q/in/photostream/

0.96 State
• Currently 0.96.2
• Maybe 0.96.3, but EOL’ing => 0.98.x!
• In CDH 5.0.x (0.96.1.1)/HDP 2.0.x

0.98 Attributes
• Major themes
o Security
o Evolution
o Performance improvements
o API cleanups/deprecations on the road to HBase 1.0
• Monthly release schedule
• Support for Hadoop 1 and 2, but focus is on Hadoop 2; Java 6 and 7

0.98 State
• Current release 0.98.2
• Field testing for 1.0
o Expect incremental additive feature evolution
o HFile V3 and dependent features experimental until 1.0
• Seamless upgrade from 0.96
• CDH 5.1.x (not out yet)/HDP 2.1.x
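
The upgrade notes on the State slides (downtime when moving from 0.94 to 0.96+, a seamless upgrade from 0.96 to 0.98) can be written down as a small lookup. This is only an illustrative sketch: the names `release_line`, `DOWNTIME_REQUIRED`, and `needs_downtime` are hypothetical, not part of HBase, and the table records nothing beyond what the slides state.

```python
from typing import Optional


def release_line(version: str) -> str:
    """Return the 'major.minor' release line, e.g. '0.94.19' -> '0.94'."""
    major, minor = version.split(".")[:2]
    return f"{major}.{minor}"


# Hypothetical table: only the upgrade paths mentioned on the slides.
DOWNTIME_REQUIRED = {
    ("0.94", "0.96"): True,   # "Downtime for upgrade to 0.96+"
    ("0.94", "0.98"): True,   # 0.98 falls under "0.96+"
    ("0.96", "0.98"): False,  # "Seamless upgrade from 0.96"
}


def needs_downtime(current: str, target: str) -> Optional[bool]:
    """True/False if the slides state it, None if they do not say."""
    key = (release_line(current), release_line(target))
    return DOWNTIME_REQUIRED.get(key)
```

Returning `None` for any path the slides do not mention keeps the sketch from guessing; for example, `needs_downtime("0.98.2", "1.0.0")` yields `None` rather than an answer.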

1.0 Attributes
• Stability of 0.96 / 0.98 line
• API cleanup
o Table / Connection
o Annotation of what is public
o Replication / Coprocessor APIs
• Semantic improvements
o Security / ACLs
o SeqId

1.0 Attributes
• Masters become region servers
o (Optional) only system tables are hosted in active master
• Cell level ACL / HFile v3 completion
• Dist log replay enabled by default
• Perf improvements

1.0 State
• Planned a couple of 0.99.x releases
o Developer releases, which won’t be supported
o Summer timeframe
o 0.99.x will become 1.0.0
• Use semantic versioning afterwards
o Major, minor, and patch releases
o More frequent major releases
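
As a rough illustration of the major, minor, and patch scheme named above, the sketch below parses a three-part version string and reports which kind of release a version bump represents. The names `Version`, `parse_version`, and `release_kind` are hypothetical helpers for this example, not HBase code, and the sketch assumes plain numeric `major.minor.patch` strings.

```python
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    patch: int


def parse_version(text: str) -> Version:
    """Parse 'major.minor.patch' into a Version; reject other shapes."""
    parts = text.split(".")
    if len(parts) != 3:
        raise ValueError(f"expected major.minor.patch, got {text!r}")
    return Version(*(int(p) for p in parts))


def release_kind(old: Version, new: Version) -> str:
    """Classify the step from old to new as 'major', 'minor', or 'patch'."""
    if new <= old:
        raise ValueError("new version must be greater than old version")
    if new.major != old.major:
        return "major"
    if new.minor != old.minor:
        return "minor"
    return "patch"
```

Because `Version` is a tuple, versions compare field by field, so `parse_version("1.0.1") > parse_version("1.0.0")` holds without extra code.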


What's up with HTTP?

Mark Nottingham

Get the Facts: Oracle's Unbreakable Enterprise Kernel

Terry Wang

1) Oracle introduced the Unbreakable Enterprise Kernel for Oracle Linux, which is optimized for Oracle software and provides significant performance gains over the Red Hat compatible kernel. 2) The Unbreakable Enterprise Kernel includes many new features like improved power management, data integrity, and diagnostic tools. 3) Oracle recommends customers use the Unbreakable Enterprise Kernel for all Oracle software on Linux, though it will continue to support the Red Hat compatible kernel.

Distributions from the view a package

Colin Charles

Having spent more than the last decade being the main point of contact for distributions shipping MySQL, then MariaDB Server, it's clear that working with distributions have many challenges. Licensing changes (when MySQL moved the client libraries from LGPL to GPL with a FOSS Exception), ABI changes, speed (or lack thereof) of distribution releases/freezes, supporting the software throughout the lifespan of the distribution, specific bugs due to platforms, and a lot more will be discussed in this talk. Let's not forget the politics. How do we decide "tiers" of importance for distributions? As a bonus, there will be a focus on how much effort it took to "replace" MySQL with MariaDB. Benefits: if you're making a distribution, this is the point of view of the upstream package makers. Why are distribution statistics important to us? Do we monitor your bugs system or do you have a better escalation to us? How do we test to make sure things are going well before release. This and more will be spoken about. As an upstream project (package), we love nothing more than being available everywhere. But time and energy goes into making this is so as there are quirks in every distribution.

Hadoop Robot from eBay at China Hadoop Summit 2015

polo li

Chef for OpenStack: Grizzly Roadmap

Matt Ray

The document summarizes a meeting that was held to plan the roadmap for the Chef for OpenStack community for the Grizzly release. Key points discussed included the attendees, resources being used like GitHub repos and IRC channels, licensing, cookbook goals, the initial osops release focusing on Ubuntu and KVM/Nova network, and plans to expand support for additional operating systems, databases, hypervisors, OpenStack services and configurations. The roadmap also covered continued development of knife-openstack and providing a status update at the Fall 2013 OpenStack Summit.

Storm worker redesign

Roshan Naik

The document discusses proposed enhancements to improve the performance of Apache Storm 2.0. It analyzes the current messaging architecture and identifies bottlenecks. Preliminary testing shows the redesigned messaging architecture improves latency by 116x and increases throughput by 50% over Storm 1.0 and the 2.0 master branch. Further optimizations to grouping, tuple implementation, and the acking mechanism could potentially yield even higher throughput of 15 million tuples per second. A new threading and execution model is also proposed to improve performance.

Status of Embedded Linux

LinuxCon ContainerCon CloudOpen China

This document provides a summary of the status of embedded Linux. It discusses recent Linux kernel versions from 4.7 to the upcoming 4.12, highlighting new features. It also covers technology areas like boot time, device tree, graphics, file systems, and security. Several ongoing Linux Foundation projects are mentioned like Long Term Support Initiative, Fuego test framework, and the eLinux wiki. Finally, it lists upcoming conferences and trade associations working on embedded Linux. The document aims to give a quick overview of the current state of embedded Linux topics and projects.

The MySQL Server ecosystem in 2016

sys army

The document summarizes the history and current state of the MySQL database server ecosystem. It discusses the origins and development of MySQL, MariaDB, Percona Server, and other related projects. It also describes some of the key features and innovations in recent versions of these database servers. The ecosystem is very active with contributions from many organizations and the future remains promising with ongoing work.

OpenStack - JobShop @Iași, 2016

Alexandru Coman

This document provides an overview of OpenStack, an open-source cloud computing platform. It discusses OpenStack's components, development process using Gerrit, and integration testing. It also covers using Windows as a guest operating system in OpenStack with Hyper-V, including Windows Cloud-Init tools and supported Windows versions. Key OpenStack components that support Windows like Nova, Neutron, Cinder, and Manila are summarized.

Meet Apache HBase - 2.0

DataWorks Summit

HBase 2.0 is the next stable major release for Apache HBase scheduled for early 2017. It is the biggest and most exciting milestone release from the Apache community after 1.0. HBase-2.0 contains a large number of features that is long time in the development, some of which include rewritten region assignment, perf improvements (RPC, rewritten write pipeline, etc), async clients, C++ client, offheaping memstore and other buffers, Spark integration, shading of dependencies as well as a lot of other fixes and stability improvements. We will go into technical details on some of the most important improvements in the release, as well as what are the implications for the users in terms of API and upgrade paths. Speaker Ankit Singhal, Member of Technical Staff, Hortonworks

Similar to State of HBase: Meet the Release Managers (20)

Hadoop Versioning

Using LuaJIT in mid-load web-projects

How static analysis supports quality over 50 million lines of C++ code

Into The Box 2020 Keynote Day 1

Tuenti Release Workflow

Apache HBase: State of the Union

HBase state of the union

Tuenti Release Workflow v1.1

HBaseCon 2015 General Session: State of HBase

InfluxData Internals by Ryan Betts

What's up with HTTP?

Get the Facts: Oracle's Unbreakable Enterprise Kernel

Distributions from the view a package

Hadoop Robot from eBay at China Hadoop Summit 2015

Chef for OpenStack: Grizzly Roadmap

Storm worker redesign

Status of Embedded Linux

The MySQL Server ecosystem in 2016

OpenStack - JobShop @Iași, 2016

Meet Apache HBase - 2.0

State of HBase: Meet the Release Managers

1. State of HBase Invasion of the Release Managers

2. Release Managers • 0.94 Lars Hofhansl • 0.96 Michael Stack • 0.98 Andrew Purtell • 1.0 Enis Söztutar

3. Outline • State of each branch • Q&A

4. 0.94 Attributes • Frequent bug fix releases (monthly) • Still sees minor features • Support for Hadoop 1, 2.0.x, Java 6 and 7 • Old (0.92) DNA, no protobufs, old AM

5. 0.94 State • Current release 0.94.19 • Will have a few more releases • Many large production installs out there • Super stable and battle hardened • EOL? Downtime for upgrade to 0.96+

6. 0.96 Attributes • The “Singularity” o Released 10/19/2013 o 18 months in the making o 2k issues fixed/1500 in 0.96 only • Big Themes o Stability o Operability o Scaling https://www.flickr.com/photos/sysli/3026288256/sizes/q/in/photostream/

7. 0.96 State • Currently 0.96.2 • Maybe 0.96.3, but EOL’ing => 0.98.x! • In CDH 5.0.x (0.96.1.1)/HDP 2.0.x

8. 0.98 Attributes • Major themes o Security o Evolution o Performance improvements o API cleanups/deprecations on the road to HBase 1.0 • Monthly release schedule • Support for Hadoop 1 and 2, but focus is on Hadoop 2; Java 6 and 7

9. 0.98 State • Current release 0.98.2 • Field testing for 1.0 o Expect incremental additive feature evolution o HFile V3 and dependent features experimental until 1.0 • Seamless upgrade from 0.96 • CDH 5.1.x (not out yet)/HDP 2.1.x

10. 1.0 Attributes • Stability of 0.96 / 0.98 line • API cleanup o Table / Connection o Annotation of what is public o Replication / Coprocessor APIs • Semantic improvements o Security / ACLs o SeqId

11. 1.0 Attributes • Masters become region servers o (Optional) only system tables are hosted in active master • Cell level ACL / HFile v3 completion • Dist log replay enabled by default • Perf improvements

12. 1.0 State • Planned a couple of 0.99.x releases o Developer releases which won’t be supported o Summer timeframe o 0.99.x will become 1.0.0 • Use semantic versioning afterwards o Major, minor, and patch releases o More frequent major releases

13. Q&A
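
The semantic versioning plan on slide 12 (major, minor, and patch releases after 1.0.0) can be sketched roughly as follows. This is an illustrative sketch only, not HBase code: the `Version` type, `parse_version`, and `change_level` are hypothetical names, and the post-1.0 version numbers in the usage note are made up for the example. It assumes plain three-part `major.minor.patch` strings, so four-part versions such as 0.96.1.1 are out of scope.

```python
from typing import NamedTuple


class Version(NamedTuple):
    """A three-part version; tuples compare field by field, left to right."""
    major: int
    minor: int
    patch: int


def parse_version(text: str) -> Version:
    """Parse a 'major.minor.patch' string; raises ValueError on other shapes."""
    major, minor, patch = (int(part) for part in text.split("."))
    return Version(major, minor, patch)


def change_level(old: Version, new: Version) -> str:
    """Name the most significant component that differs between two versions."""
    if new.major != old.major:
        return "major"
    if new.minor != old.minor:
        return "minor"
    if new.patch != old.patch:
        return "patch"
    return "none"
```

For example, `change_level(parse_version("1.0.0"), parse_version("1.0.1"))` reports a patch-level change, while going from a hypothetical 1.1.3 to 2.0.0 reports a major one.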

Editor's Notes

Current live branches; 0.90 and 0.92 are EOL’d.
MTTR, protobufs everywhere, integration test suite, metrics revamp
Suspect not wide deploy; folks going to 0.98
+ Where do you see HBase being in a year?
