With HBase hitting the 1.0 mark and adoption/production use cases continuing to grow, it's been an exciting year since last we met at HBaseCon 2014. What is the state of HBase today, and where does it go from here?
Speakers: Eric Czech and Alec Zopf (Next Big Sound)
Managing the evolution of data within HBase over time is not easy: Data resulting from Hadoop processing pipelines or otherwise placed in HBase is subject to the same kinds of oversights, bugs, and faulty assumptions inherent to the software that creates it. While the development of this software is often effectively managed through revision control systems, data itself is rarely modeled in a way that affords the same flexibility. In this session, we'll talk about how to build a versioned, time-series data store using HBase that can provide significantly greater adaptability and performance than similar systems.
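One way to picture a versioned time-series layout is as a composite row key. The sketch below is an illustration only, not the speakers' schema: the key layout, the `make_row_key` and `parse_row_key` helpers, and the use of a numeric dataset version are all assumptions. It relies on the fact that HBase sorts rows by raw bytes, so fixed-width big-endian fields sort in numeric order.

```python
import struct

MAX_TS = 2**63 - 1  # used to invert timestamps so newer points sort first


def make_row_key(metric, version, ts_millis):
    """Build a key that sorts by metric, then version, then newest-first time.

    Hypothetical layout: <metric bytes> NUL <4-byte version> <8-byte reversed ts>.
    """
    reversed_ts = MAX_TS - ts_millis
    # Big-endian, fixed-width integers keep byte order equal to numeric order.
    return metric.encode("utf-8") + b"\x00" + struct.pack(">IQ", version, reversed_ts)


def parse_row_key(key):
    """Invert make_row_key; assumes the metric name contains no NUL byte."""
    metric, _, tail = key.partition(b"\x00")
    version, reversed_ts = struct.unpack(">IQ", tail)
    return metric.decode("utf-8"), version, MAX_TS - reversed_ts


if __name__ == "__main__":
    older = make_row_key("plays", 2, 1000)
    newer = make_row_key("plays", 2, 2000)
    print(newer < older)          # newer point sorts ahead of the older one
    print(parse_row_key(newer))
```

Under this assumed layout, a pipeline that is re-run after a bug fix could write its output under a new version number, leaving the earlier version readable until consumers switch over.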
HBase Read High Availability Using Timeline-Consistent Region Replicas (HBaseCon)
Speakers: Enis Soztutar and Devaraj Das (Hortonworks)
HBase has ACID semantics within a row that make it a perfect candidate for a lot of real-time serving workloads. However, single homing a region to a server implies some periods of unavailability for the regions after a server crash. Although the mean time to recovery has improved a lot recently, for some use cases, it is still preferable to do possibly stale reads while the region is recovering. In this talk, you will get an overview of our design and implementation of region replicas in HBase, which provide timeline-consistent reads even when the primary region is unavailable or busy.
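The read behavior described above can be sketched as a toy in-memory model. This is a simplified illustration, not the HBase implementation: the `Region` and `Replica` classes and the `replicate` step are hypothetical, and the real client also falls back to secondaries when the primary is merely slow. In the HBase client API the corresponding switches are `Consistency.TIMELINE` on the request and `Result.isStale()` on the response.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Replica:
    """One copy of a region; secondaries may lag behind the primary."""
    data: Dict[str, str] = field(default_factory=dict)
    available: bool = True


class Region:
    def __init__(self, num_secondaries=1):
        self.primary = Replica()
        self.secondaries: List[Replica] = [Replica() for _ in range(num_secondaries)]

    def put(self, row, value):
        # Writes are accepted only by the primary copy.
        if not self.primary.available:
            raise RuntimeError("primary unavailable: write rejected")
        self.primary.data[row] = value

    def replicate(self):
        # Stand-in for asynchronous propagation of edits to the secondaries.
        for replica in self.secondaries:
            replica.data = dict(self.primary.data)

    def get(self, row, timeline=False):
        """Return (value, stale). Strong reads are served only by the primary."""
        if self.primary.available:
            return self.primary.data.get(row), False
        if not timeline:
            raise RuntimeError("primary unavailable and strong consistency requested")
        for replica in self.secondaries:
            if replica.available:
                return replica.data.get(row), True  # possibly stale
        raise RuntimeError("no replica available")


if __name__ == "__main__":
    region = Region()
    region.put("row1", "v1")
    region.replicate()
    region.put("row1", "v2")          # not yet replicated
    region.primary.available = False  # simulate a server crash
    print(region.get("row1", timeline=True))
```

The stale flag is the important part of the design: the caller opts in to possibly old data and can tell when it received it.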
HBaseCon 2015: Graph Processing of Stock Market Order Flow in HBase on AWS (HBaseCon)
In this session, we will briefly cover the FINRA use case and then dive into our approach, with a particular focus on how we leverage HBase on AWS. Among the topics covered will be our use of HBase bulk loading and ExportSnapshot for backup. We will also cover some lessons learned from running a persistent HBase cluster on AWS.
HBaseCon 2012 | Mignify: A Big Data Refinery Built on HBase - Internet Memory... (Cloudera, Inc.)
Mignify is a platform for collecting, storing, and analyzing Big Data harvested from the web. It aims to provide easy access to focused and structured information extracted from Web data flows. It consists of a distributed crawler, a resource-oriented storage based on HDFS and HBase, and an extraction framework that produces filtered, enriched, and aggregated data from large document collections, including the temporal aspect. The whole system is deployed in an innovative hardware architecture comprising a large number of small (low-consumption) nodes. This talk will tackle the decisions made during the design and development of the platform, from both a technical and a functional perspective. It will introduce the cloud infrastructure, the ETL-like ingestion of the crawler output into HBase/HDFS, and the triggering mechanism of analytics based on a declarative filter/extraction specification. The design choices will be illustrated with a pilot application targeting Daily Web Monitoring in the context of a national domain.
Moderated by Lars Hofhansl (Salesforce), with Matteo Bertozzi (Cloudera), John Leach (Splice Machine), Maxim Lukiyanov (Microsoft), Matt Mullins (Facebook), and Carter Page (Google)
The future of HBase, via a variety of viewpoints.
In this session, learn how to build an Apache Spark or Spark Streaming application that can interact with HBase. In addition, you'll walk through how to implement common, real-world batch design patterns to optimize for performance and scale.
HBaseCon 2012 | Building a Large Search Platform on a Shoestring Budget (Cloudera, Inc.)
YapMap is a new kind of search platform that does multi-quanta search to better understand threaded discussions. This talk will cover how HBase made it possible for two self-funded guys to build a new kind of search platform. We will discuss our data model and how we use row based atomicity to manage parallel data integration problems. We’ll also talk about where we don’t use HBase and instead use a traditional SQL based infrastructure. We’ll cover the benefits of using MapReduce and HBase for index generation. Then we’ll cover our migration of some tasks from a message based queue to the Coprocessor framework as well as our future Coprocessor use cases. Finally, we’ll talk briefly about our operational experience with HBase, our hardware choices and challenges we’ve had.
HBaseCon 2012 | HBase for the World's Libraries - OCLC (Cloudera, Inc.)
WorldCat is the world’s largest network of library content and services. Over 25,000 libraries in 170 countries have cooperated for 40 years to build WorldCat. OCLC is currently in the process of transitioning WorldCat from Oracle to Apache HBase. This session will discuss our data design for representing the constantly changing ownership information for thousands of libraries (billions of data points, millions of daily updates) and our plans for managing HBase in an environment that is equal parts end-user-facing and batch.
Harmonizing Multi-tenant HBase Clusters for Managing Workload Diversity (HBaseCon)
Speakers: Dheeraj Kapur, Rajiv Chittajallu & Anish Mathew (Yahoo!)
In early 2013, Yahoo! introduced multi-tenancy to HBase to offer it as a platform service for all Hadoop users. A certain degree of customization per tenant (a user or a project) was achieved through RegionServer groups, namespaces, and customized configs for each tenant. This talk covers how to accommodate the diverse needs of individual tenants on the cluster, as well as operational tips and techniques that allow Yahoo! to automate the management of multi-tenant clusters at petabyte scale without errors.
HBase 2.0 is the next stable major release for Apache HBase, scheduled for early 2017. It is the biggest and most exciting milestone release from the Apache community after 1.0. HBase 2.0 contains a large number of features that have been a long time in development, including rewritten region assignment, performance improvements (RPC, a rewritten write pipeline, etc.), async clients, a C++ client, off-heaping of the memstore and other buffers, Spark integration, and shading of dependencies, as well as many other fixes and stability improvements. We will go into technical details on some of the most important improvements in the release, as well as the implications for users in terms of API and upgrade paths. Existing users of HBase/Phoenix and operators managing HBase clusters will benefit the most, as they can learn about the new release and its long list of features. We will also briefly cover earlier 1.x release lines, compatibility, and upgrade paths for existing users, and conclude with an outlook on the next level of initiatives for the project.
Speaker: Varun Sharma (Pinterest)
Over the past year, HBase has become an integral component of Pinterest's storage stack. HBase has enabled us to quickly launch and iterate on new products and create amazing pinner experiences. This talk briefly describes some of these applications, the underlying schema, and how our HBase setup stays highly available and performant despite billions of requests every week. It will also include some performance tips for running on SSDs. Finally, we will talk about a homegrown serving technology we built from a mashup of HBase components that has gained wide adoption across Pinterest.
Speakers: Jesse Yates (Salesforce.com), Demai Ni, Richard Ding & Jing Chen He (IBM)
This talk provides an overview of enterprise-scale backup strategies for HBase: Jesse Yates will describe how Salesforce.com runs backup and recovery on its multi-tenant, enterprise-scale HBase deployments; Demai Ni, Songqinq Ding, and Jing Chen of the IBM InfoSphere BigInsights development team will then follow with a description of IBM's recently open-sourced disaster/recovery solution based on HBase snapshots and replication.
Apache HBase - Introduction & Use Cases (Data Con LA)
Apache HBase™ is the Hadoop database, a distributed, scalable, big data store. Use Apache HBase when you need random, realtime read/write access to your Big Data. This project's goal is the hosting of very large tables -- billions of rows X millions of columns -- atop clusters of commodity hardware. Apache HBase is an open-source, distributed, versioned, non-relational database modeled after Google's Bigtable.
This talk will introduce Apache HBase and give you an overview of columnar databases. We will also talk about how Facebook currently uses HBase, and about HBase security, Apache Phoenix, and Apache Slider.
HBase at Bloomberg: High Availability Needs for the Financial Industry (HBaseCon)
Speakers: Sudarshan Kadambi and Matthew Hunt (Bloomberg LP)
Bloomberg is a financial data and analytics provider, so data management is core to what we do. There's tremendous diversity in the type of data we manage, and HBase is a natural fit for many of these datasets - from the perspective of the data model as well as in terms of a scalable, distributed database. This talk covers data and analytics use cases at Bloomberg and operational challenges around HA. We'll explore the work currently being done under HBASE-10070, further extensions to it, and how this solution is qualitatively different to how failover is handled by Apache Cassandra.
HBaseCon 2015: Apache Phoenix - The Evolution of a Relational Database Layer ... (HBaseCon)
Phoenix has evolved to become a full-fledged relational database layer over HBase data. We'll discuss the fundamental principles of how Phoenix pushes the computation to the server and why this leads to performance that enables direct support of low-latency applications, along with some major new features. Next, we'll outline our approach for transaction support in Phoenix, a work in progress, and discuss the pros and cons of the various approaches. Lastly, we'll examine the current means of integrating Phoenix with the rest of the Hadoop ecosystem.
HBaseCon 2012 | You’ve got HBase! How AOL Mail Handles Big Data (Cloudera, Inc.)
The AOL Mail Team will discuss our implementation of HBase for two large scale applications: an anti-abuse mechanism and a user-visible API. We will provide an overview of how and why HBase and Hadoop were incorporated into the massive and diverse technology stack that is the nearly 20-year-old AOL Mail system and the history of how we took our HBase/Hadoop apps through our traditional process of design, to development, through QA, and into production. We will explain how our practical approach to HBase has evolved over time, and we will discuss our lessons learned and some of our techniques and tools developed via our iterative dev/qa and operational processes. We will explain the pain-points we have experienced with erratic usage and edge-cases, and how we address problems when we run across them.
Optimizing Apache HBase for Cloud Storage in Microsoft Azure HDInsight (HBaseCon)
Speakers: Nitin Verma, Pravin Mittal, and Maxim Lukiyanov (Microsoft)
This session presents our success story of enabling a big internal customer on Microsoft Azure’s HBase service along with the methodology and tools used to meet high-throughput goals. We will also present how new features in HBase (like BucketCache and MultiWAL) are helping our customers in the medium-latency/high-bandwidth cloud-storage scenario.
HBaseCon 2015 General Session: The Evolution of HBase @ Bloomberg (HBaseCon)
Learn the evolution and consolidation of Bloomberg's core infrastructure around fewer, faster, and simpler systems, and the role HBase plays within that effort. You'll also hear about HBase modifications to accommodate the "medium data" use case and get a preview of what's to come.
HBase 1.0 is the new stable major release, and the start of "semantic versioned" releases. We will cover new features, changes in behavior and requirements, source/binary and wire compatibility details, and upgrading. We'll also dive deep into the new standardized client API in 1.0, which establishes a separation of concerns, encapsulates what is needed from how it's delivered, and guarantees future compatibility while freeing the implementation to evolve.
At Salesforce, we have deployed many thousands of HBase/HDFS servers and learned a lot about tuning during this process. This talk will walk you through the many relevant HBase, HDFS, Apache ZooKeeper, Java/GC, and operating system configuration options, and provide guidelines about which options to use in which situation and how they relate to each other.
HBaseCon 2015: Solving HBase Performance Problems with Apache HTrace (HBaseCon)
HTrace is a new Apache incubator project which makes it much easier to diagnose and detect performance problems in HBase. It provides a unified view of the performance of requests, following them from their origin in the HBase client, through the HBase region servers, and finally into HDFS. System administrators can use a central web interface to query and view aggregate performance information for the whole cluster. This talk will cover the motivations for creating HTrace, its design, and some examples of how HTrace can help diagnose real-world HBase problems.
Breaking the Sound Barrier with Persistent Memory (HBaseCon)
Speakers: Liqi Yi and Shylaja Kokoori (Intel)
A fully optimized HBase cluster could easily hit the limit of the underlying storage device’s capability, which is beyond the reach of software optimization alone. To get around this constraint, we need a new design that brings data processing and data storage closer together. In this presentation, we will look at how persistent memory will change the way large datasets are stored. We will review the hardware characteristics of 3D XPoint™, a new persistent memory technology with low latency and high capacity. We will also discuss opportunities for further improvement within the HBase framework using persistent memory.
Speakers: Anoop Sam John and Ramkrishna Vasudevan (Intel)
HBase provides an LRU-based on-heap cache, but its size (and so the total amount of data that can be cached) is limited by Java’s max heap space. This talk highlights our work under HBASE-11425 to allow the HBase read path to work directly from the off-heap area.
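For readers unfamiliar with the policy, a size-bounded LRU cache can be sketched in a few lines. This is a generic illustration of least-recently-used eviction, not HBase's block cache or the off-heap work in HBASE-11425; the `TinyLruCache` class and its byte-based capacity are assumptions made for the example.

```python
from collections import OrderedDict


class TinyLruCache:
    """Least-recently-used cache bounded by the total size of cached blocks."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self._blocks = OrderedDict()  # oldest entry first, most recent last

    def get(self, key):
        block = self._blocks.get(key)
        if block is not None:
            # A hit makes this entry the most recently used.
            self._blocks.move_to_end(key)
        return block

    def put(self, key, block):
        old = self._blocks.pop(key, None)
        if old is not None:
            self.used_bytes -= len(old)
        self._blocks[key] = block
        self.used_bytes += len(block)
        # Evict from the least recently used end until the cache fits again.
        while self.used_bytes > self.capacity_bytes and self._blocks:
            _, evicted = self._blocks.popitem(last=False)
            self.used_bytes -= len(evicted)


if __name__ == "__main__":
    cache = TinyLruCache(capacity_bytes=10)
    cache.put("a", b"aaaa")
    cache.put("b", b"bbbb")
    cache.get("a")            # touch "a" so that "b" becomes the oldest entry
    cache.put("c", b"cccc")   # exceeds 10 bytes, so "b" is evicted
    print(cache.get("b"), cache.used_bytes)
```

The capacity here is an ordinary number; in a JVM the equivalent bound competes with everything else on the heap, which is the limitation the abstract points to.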
Apache HBase Improvements and Practices at Xiaomi (HBaseCon)
Speakers: Duo Zhang and Liangliang He (Xiaomi)
In this session, we’ll discuss the various practices around HBase in use at Xiaomi, including those relating to HA, tiered compaction, multi-tenancy, and failover across data centers.
HBaseCon 2015: Taming GC Pauses for Large Java Heap in HBase (HBaseCon)
In this presentation, we will introduce Hotspot's Garbage First collector (G1GC) as the most suitable collector for latency-sensitive applications running with large memory environments. We will first discuss G1GC internal operations and tuning opportunities, and also cover tuning flags that set desired GC pause targets, change adaptive GC thresholds, and adjust GC activities at runtime. We will provide several HBase case studies using Java heaps as large as 100GB that show how to best tune applications to remove unpredicted, protracted GC pauses.
HBaseCon 2015 General Session: Zen - A Graph Data Model on HBase (HBaseCon)
Zen is a storage service built at Pinterest that offers a graph data model on top of HBase and potentially other storage backends. In this talk, Zen's architects go over the design motivation for Zen and describe its internals, including the API, type system, and HBase backend.
Now that you've seen HBase 1.0, what's ahead in HBase 2.0 and beyond, and why? Find out from this panel of people who have designed and/or are working on 2.0 features.
HBaseCon 2012 | Leveraging HBase for the World’s Largest Curated Genomic Data... (Cloudera, Inc.)
NextBio relies on HBase to store the world’s largest collection of continuously curated genomic knowledge. The HBase cluster is leveraged to store billions of correlations as well as processed genomic information. In this talk, we will describe how we use HBase, why we migrated from a large MySQL deployment to HBase, and the challenges along the way.
HBaseCon 2012 | Building Mobile Infrastructure with HBase (Cloudera, Inc.)
In this session you will learn about common mistakes made when deploying a high-write environment for an analytics database in HBase, get tips on how to diagnose and debug performance bottlenecks, and see an overview of an open source monitoring utility developed at Urban Airship for finding HBase hotspots. This session will also present a case study on how Urban Airship moved a tag system from a highly sharded PostgreSQL cluster to HBase, the options explored for creating a high-throughput Boolean tag system, and how it was ultimately built on HBase.
HBaseCon 2013: Evolving a First-Generation Apache HBase Deployment to Second... (Cloudera, Inc.)
Explorys has been using HBase and Hadoop since HBase 0.20, and will walk through lessons learned over years of usage from their first HBase implementation through a series of upgrades and changes, including impacts to schema design, data loading, data indexing, data access and analytics, and operational processes.
HBaseCon 2012 | Relaxed Transactions for HBase - Francis Liu, Yahoo! (Cloudera, Inc.)
For Map/Reduce programmers used to HDFS, the mutability of HBase tables poses new challenges: data can change over the duration of a job, multiple jobs can write concurrently, writes are effective immediately, and it is not trivial to clean up partial writes. Revision Manager introduces atomic commits and point-in-time consistent snapshots over a table, guaranteeing repeatable reads and protection from partial writes. Revision Manager is optimized for a relatively small number of concurrent write jobs, which is typical within Hadoop clusters. This session will discuss the implementation of Revision Manager using ZooKeeper and coprocessors, with extra care taken to ensure security in multi-tenant clusters. Revision Manager is available as part of the HBase storage handler in HCatalog, but can easily be used stand-alone with little coding effort.
Scaling geospatial data is hard. State-of-the-art GIS technologies available to the general public are locked in the realm of relational databases, with PostGIS as the prominent leader. Though a number of location-based startups have walked this path before, few have marked their trail along the way. Act one provides a survey of the landscape, defining terms and highlighting pitfalls. Act two explores the world of open source, horizontally scalable GIS and outlines the problems they solve. Act three explores implementations backed by HBase. No previous GIS knowledge is required.
Apache Mesos and the new Open Source Architecture of the Modern Datacenter (Data Con LA)
Abstract
Apache Mesos has the ability to run on every private and cloud instance, anywhere. In this talk, Aaron will explain the momentum behind the “single computer” abstraction that has put Mesos at the center of one of the most exciting architecture shifts in recent information technology history. He will explain how Mesos is enabling application developers and devops to redefine their responsibilities and shorten the amount of time it takes to write and ship production code. He will outline how Mesos is empowering the new class of “datacenter developers” to program directly against datacenter resources, and draw correlations to how Linux revolutionized the server industry.
Bio
Aaron Williams is Head of Developer Advocacy at Mesosphere, where he enjoys helping to build the vibrant Mesos community and drive adoption of the Mesosphere Datacenter Operating System (DCOS). Prior to Mesosphere, he was a two-time startup CEO at SocialSamba and Picotent. Earlier in Aaron’s career he led the developer ecosystem at SAP and ran the Java Community Process at Sun Microsystems. Aaron has an MS in computer science from Case Western. When not immersing himself in technology, he plays basketball, runs a few half marathons a year, and is a television junkie (BTW: he is a 2-time Emmy Nominee). Going way back, he grew up in Columbus, Ohio and is a die-hard Ohio State Buckeye.
Meteor goes v1.0 and we had the first Meteor Meetup in Athens. A monthly get-together to share ideas, problems, and solutions around Meteor and to meet fellow Meteor enthusiasts.
Apache Ambari is used by thousands of Hadoop Operators to manage the deployment, lifecycle, and automation of DevOps for Hadoop ecosystem projects. The Ambari engineering team will talk about improvements being made to the automation, metrics, logging, upgrade, and other core frameworks within Ambari as the project is being re-imagined.
Starting out, Apache Ambari installed a handful of Apache Hadoop ecosystem projects, on a few operating systems, and helped with the most basic Hadoop operational tasks. Today, the product manages over 20 different services, runs on multiple major operating systems and versions, and automates many of the most challenging Hadoop operational tasks in the most secure customer environments.
As part of this talk, the engineering team will walk you through what we've learned, the challenges we've overcome, and how the Apache Ambari community has changed the product to handle them. The future is fast approaching, and with it comes new on-premise and cloud deployment architectures. See how Apache Ambari is being re-imagined to handle these new challenges.
Speaker
Paul Codding, Product Management Director, Hortonworks
Oliver Szabo, Senior Software Engineer, Hortonworks
Tips for Installing Cognos Analytics 11.2.1xSenturus
We walk through the installation and configuration steps for a Cognos 11.2.1 upgrade. The topics we cover include how the installer got smarter, upgraded hardware requirements, backing up and preserving files, upgrade strategy, and themes and extensions. See the recording and download this deck: https://senturus.com/resources/tips-for-installing-cognos-analytics-11-2-1/
Senturus offers a full spectrum of services for business analytics. Our Knowledge Center has hundreds of free live and recorded webinars, blog posts, demos and unbiased product reviews available on our website at: https://senturus.com/resources/
Learning plan
Skillsets required for Koha maintenance
Hardware requirements
Software requirements
Koha release schedules
Types of Koha implementation
Methods of Koha installation
How to keep up to date with changes in Koha.
Microservices on AWS: Divide & Conquer for Agility and ScalabilityAmazon Web Services
To tackle complexity and change, AWS customers are increasingly evolving their architectures from monoliths towards microservices, and benefiting from increased agility, simplified scalability, resiliency, and faster deployments. However, microservices also introduce new technical challenges. In this session, we'll provide an introduction and overview of the benefits and challenges of microservices, and share best practices for architecting and deploying microservices on AWS.
Conduct data discovery or rapid BI prototyping without becoming a Hadoop expert by analyzing big data with standard BI tools, including Cognos. View the webinar video recording and download this deck: http://www.senturus.com/resources/running-cognos-on-hadoop/.
See a cost effective, scalable solution that does not have the barriers to entry common with big data applications. The webinar explains: 1) use cases for Hadoop, 2) pros and cons of different visualization tools and their integration with Hadoop and 3) a demonstration of BigInsights, IBM’s solution.
Senturus, a business analytics consulting firm, has a resource library with hundreds of free recorded webinars, trainings, demos and unbiased product reviews. Take a look and share them with your colleagues and friends: http://www.senturus.com/resources/.
Enterprise summit – architecting microservices on aws final v2Amazon Web Services
To tackle complexity and change, AWS customers are increasingly evolving their architectures from monoliths towards microservices, and benefiting from increased agility, simplified scalability, resiliency, and faster deployments. However, microservices also introduce new technical challenges. In this session, we'll provide an introduction and overview of the benefits and challenges of microservices, and share best practices for architecting and deploying microservices on AWS.
Faster Computing has contacted Go2Linux and requested a brief propChereCheek752
Faster Computing has contacted Go2Linux and requested a brief proposal presentation for migrating its systems from Windows to Linux.
The company is specifically interested in seeing the following information:
(10.1.1: Identify the problem to be solved.)
· Based on your current understanding of Faster Computing's business, what are some potential benefits of Linux?
· The company is aware that many different Linux derivatives exist. Be very specific and choose only one version (e.g., Ubuntu, Mint, Zorin, Redhat, CentOS, Kali). Which would Go2Linux recommend, and why? Give specific reasons for your choice (e.g., security features, support, updates, user interface).
(10.1.2: Gather project requirements to meet stakeholder needs.)
· What steps will be required to migrate the systems from Windows to Linux?
· Are there graphical interfaces available for the Linux workstations that would provide similar functionality to Windows? Some users are concerned about working with a command-line interface.
(10.1.3: Define the specifications of required technologies.)
· What tools are available on Linux for the servers to provide file sharing, Linux services, and printing? (e.g., Apache/Nginx, Samba, CUPS, SSH/SCP). Ensure you identify what the functions/services are used for (e.g., Samba is used for file sharing).
(1.1.3: Present ideas in a clear, logical order appropriate to the task.)
The deliverable for this phase of the project is a three- to five-slide PowerPoint narrated presentation.
· An introductory slide
· A summary slide
· Voice narration on every slide
For each slide, you will embed your own audio recording as if you were presenting the content to the Faster Computing team. Faster Computing has not yet committed to the project, so this should be presented as a proposal. The presentation should be visually appealing; the inclusion of at least one image that supports the content and adds value to the proposal is required.
(1.3.3: Integrate appropriate credible sources to illustrate and validate ideas.)
You must cite at least two quality sources.
You used at least 2 references and your references were cited properly following an accepted style. Ask your instructor for clarification.
Use the Migration Proposal Presentation template to get started.
(2.3.1: State conclusions or solutions clearly and precisely.)
You should present your proposal as if you are selling to the company. Revisit all of these important reasons in the summary slide.
Migration Proposal Presentation
Linux
Linux, like Windows and Mac OS, is a fully accessible software. It is no longer only an operating system; it is now also a substrate for running workstations, servers, and integrated devices.
Since it is publicly available and portable, it has a wide range of installations and modifications. The kernel is an essential component of the Linux operating system.
Numerous characteristics of the Linux environment show that it is superior than o ...
Cloud Native Night, April 2018, Mainz: Workshop led by Jörg Schad (@joerg_schad, Technical Community Lead / Developer at Mesosphere)
Join our Meetup: https://www.meetup.com/de-DE/Cloud-Native-Night/
PLEASE NOTE:
During this workshop, Jörg showed many demos and the audience could participate on their laptops. Unfortunately, we can't provide these demos. Nevertheless, Jörg's slides give a deep dive into the topic.
DETAILS ABOUT THE WORKSHOP:
Kubernetes has been one of the topics in 2017 and will probably remain so in 2018. In this hands-on technical workshop you will learn how best to deploy, operate and scale Kubernetes clusters from one to hundreds of nodes using DC/OS. You will learn how to integrate and run Kubernetes alongside traditional applications and fast data services of your choice (e.g. Apache Cassandra, Apache Kafka, Apache Spark, TensorFlow and more) on any infrastructure.
This workshop best suits operators focussed on keeping their apps and services up and running in production and developers focussed on quickly delivering internal and customer facing apps into production.
The workshop covers:
- An introduction to Kubernetes and DC/OS (including the differences between the two)
- Deploying Kubernetes on DC/OS in a secure, highly available, and fault-tolerant manner
- Solving the operational challenges of running one large or multiple Kubernetes clusters
- One-click deployment of big data stateful and stateless services alongside a Kubernetes cluster
Similar to HBaseCon 2015 General Session: State of HBase (20)
hbaseconasia2017: Building online HBase cluster of Zhihu based on KubernetesHBaseCon
Zhiyong Bai
Zhihu uses HBase, a high-performance and scalable key-value database, as an online data store alongside MySQL and Redis. Zhihu's platform team had accumulated some experience with container technology, and this time, based on Kubernetes, we built a flexible platform for online HBase that quickly creates multiple logically isolated HBase clusters on a shared physical cluster and provides customized service for different business needs. Combined with Consul and a DNS server, we implement highly available access to HBase using a client written mainly in Python. This presentation shares the architecture of the online HBase platform at Zhihu and some practical experience from the production environment.
hbaseconasia2017 hbasecon hbase
Jingcheng Du
Apache Beam is an open source, unified programming model for defining batch and streaming jobs that run on many execution engines. HBase on Beam is a connector that allows Beam to use HBase as a bounded data source and as a target data store for both batch and streaming data sets. With this connector, HBase can work directly with many batch and streaming engines, for example Spark, Flink, and Google Cloud Dataflow. In this session, I will introduce Apache Beam, the current implementation of HBase on Beam, and the future plans for it.
hbaseconasia2017 hbasecon hbase
https://www.eventbrite.com/e/hbasecon-asia-2017-tickets-34935546159#
hbaseconasia2017: HBase Disaster Recovery Solution at HuaweiHBaseCon
Ashish Singhi
The HBase disaster recovery solution aims to maintain high availability of the HBase service if one HBase cluster suffers a disaster, with minimal user intervention. This session will introduce the HBase disaster recovery use cases and the solutions adopted at Huawei, covering:
a) Cluster Read-Write mode
b) DDL operations synchronization with standby cluster
c) Mutation and bulk loaded data replication
d) Further challenges and pending work
hbaseconasia2017 hbasecon hbase https://www.eventbrite.com/e/hbasecon-asia-2017-tickets-34935546159#
hbaseconasia2017: Removable singularity: a story of HBase upgrade in PinterestHBaseCon
Tianying Chang
HBase is used to serve online-facing traffic at Pinterest, which means no downtime is allowed. However, we were on HBase 0.94. To upgrade to the latest version, we needed to figure out a way to upgrade live while keeping the Pinterest site up. Recently, we successfully upgraded our 0.94 HBase cluster to 1.2 with no downtime. We made changes to both AsyncHBase and the HBase server side. We will talk about what we did and how we did it, as well as our findings from the configuration and performance tuning we did to achieve low latency.
hbaseconasia2017 hbasecon hbase https://www.eventbrite.com/e/hbasecon-asia-2017-tickets-34935546159#
Xinxin Fan and Hongxiang Jiang
First, we will give a brief introduction to the HBase service at NetEase, including basic cluster information and the key HBase services. Then we will share some tips from our HBase tuning practices. Last, we will introduce some improvements made in our internal HBase version.
hbaseconasia2017 hbasecon hbase https://www.eventbrite.com/e/hbasecon-asia-2017-tickets-34935546159#
hbaseconasia2017: Large scale data near-line loading method and architectureHBaseCon
Shuaifeng Zhou
When we load data into HBase in real time, we use the put/putlist interface. After receiving a put request, the regionserver writes the WAL, writes the data into the memory store, flushes the memory store to the disk store, and then compacts files again and again. That procedure consumes too many resources and degrades read/write performance. To solve the problem, we provide a near-line loading method and architecture that greatly increases loading bandwidth and reduces the impact on read operations.
hbaseconasia2017 hbasecon hbase https://www.eventbrite.com/e/hbasecon-asia-2017-tickets-34935546159#
hbaseconasia2017: Ecosystems with HBase and CloudTable service at HuaweiHBaseCon
Jieshan Bi and Yanhui Zhong
1. CTBase: A light-weight HBase client for structured data.
1). Schematized table, more friendly for structured data storage.
2). Global secondary index for HBase.
3). HBase Query DSL. JSON based light-weight API.
4). Cluster table. Pre-joining with keys, a better solution for cross-table join queries from HBase.
2. Tagram: Distributed bitmap index implementation with HBase.
1). Distributed bitmap index for accelerating AD-HOC queries with low cardinality columns.
2). Powerful and flexible query API.
3). Tagram offers millisecond-level query latency.
3. CloudTable Service Introduction: HBase on Huawei cloud.
hbaseconasia2017 hbasecon hbase https://www.eventbrite.com/e/hbasecon-asia-2017-tickets-34935546159#
hbaseconasia2017: HBase Practice At XiaoMiHBaseCon
Zheng Hu
We'll share some HBase experience at XiaoMi:
1. How we tuned G1GC for HBase clusters.
2. Development and performance of Async HBase Client.
hbaseconasia2017 hbasecon hbase xiaomi https://www.eventbrite.com/e/hbasecon-asia-2017-tickets-34935546159#
HBase-2.0.0 has been a couple of years in the making. It is chock-a-block full of a long list of new features and fixes. In this session, the 2.0.0 release manager will perform the impossible, describing the release content inside the session time bounds.
hbaseconasia2017 hbasecon hbase https://www.eventbrite.com/e/hbasecon-asia-2017-tickets-34935546159#
As HBase and Hadoop continue to become routine across enterprises, these enterprises inevitably shift priorities from effective deployments to cost-efficient operations. Consolidation of infrastructure, the sum of hardware, software, and system-administrator effort, is the most common strategy to reduce costs. As a company grows, the number of business organizations, development teams, and individuals accessing HBase grows commensurately, creating a not-so-simple requirement: HBase must effectively service many users, each with a variety of use cases. This problem is known as multi-tenancy. While multi-tenancy isn’t a new problem, it also isn’t a solved one, in HBase or otherwise. This talk will present a high-level view of the common issues organizations face when multiple users and teams share a single HBase instance and how certain HBase features were designed specifically to mitigate the issues created by the sharing of finite resources.
HBaseCon2017 Removable singularity: a story of HBase upgrade in PinterestHBaseCon
HBase is used to serve online-facing traffic at Pinterest, which means no downtime is allowed. However, we were on HBase 0.94. To upgrade to the latest version, we needed to figure out a way to upgrade live while keeping the Pinterest site up. Recently, we successfully upgraded our 0.94 HBase cluster to 1.2 with no downtime. We made changes to both AsyncHBase and the HBase server side. We will talk about what we did and how we did it, as well as our findings from the configuration and performance tuning we did to achieve low latency.
HBaseCon2017 Quanta: Quora's hierarchical counting system on HBaseHBaseCon
Hundreds of millions of people use Quora to find accurate, informative, and trustworthy answers to their questions. As it so happens, counting things at scale is both an important and a difficult problem to solve.
In this talk, we will be talking about Quanta, Quora's counting system built on top of HBase that powers our high-volume near-realtime analytics that serves many applications like ads, content views, and many dashboards. In addition to regular counting, Quanta supports count propagation along the edges of an arbitrary DAG. HBase is the underlying data store for both the counting data and the graph data.
We will describe the high-level architecture of Quanta and share our design goals, constraints, and choices that enabled us to build Quanta very quickly on top of our existing infrastructure systems.
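The abstract does not say how Quanta propagates counts, so the following is only an illustrative sketch of the general idea of count propagation along DAG edges. The class name `DagCounter`, the node labels, and the rule that each ancestor receives an increment exactly once per event are all assumptions made for the example, and the counts live in an in-memory dictionary rather than in HBase.

```python
from collections import defaultdict, deque


class DagCounter:
    """Toy in-memory counter that propagates increments to all ancestors.

    Hypothetical illustration only; not Quanta's actual design.
    """

    def __init__(self):
        self.parents = defaultdict(set)  # child node -> direct parent nodes
        self.counts = defaultdict(int)   # node -> accumulated count

    def add_edge(self, child, parent):
        """Declare that counts on `child` should also roll up into `parent`."""
        self.parents[child].add(parent)

    def increment(self, node, delta=1):
        """Add `delta` to `node` and to every ancestor, once per ancestor."""
        seen = {node}            # prevents double counting on diamond shapes
        queue = deque([node])
        while queue:
            current = queue.popleft()
            self.counts[current] += delta
            for parent in self.parents[current]:
                if parent not in seen:
                    seen.add(parent)
                    queue.append(parent)


# Example: an answer rolls up to its question and its author,
# and both of those roll up to the same topic (a diamond).
counter = DagCounter()
counter.add_edge("answer:1", "question:9")
counter.add_edge("answer:1", "user:7")
counter.add_edge("question:9", "topic:hbase")
counter.add_edge("user:7", "topic:hbase")

counter.increment("answer:1")        # one view of the answer
counter.increment("question:9", 2)   # two direct views of the question
```

Tracking visited nodes is the one design choice worth noting: without it, a node reachable through two paths (here the topic) would be incremented twice for a single event.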
In the age of NoSQL, big data storage engines such as HBase have given up ACID semantics of traditional relational databases, in exchange for high scalability and availability. However, it turns out that in practice, many applications require consistency guarantees to protect data from concurrent modification in a massively parallel environment. In the past few years, several transaction engines have been proposed as add-ons to HBase; three different engines, namely Omid, Tephra, and Trafodion were open-sourced in Apache alone. In this talk, we will introduce and compare the different approaches from various perspectives including scalability, efficiency, operability and portability, and make recommendations pertaining to different use cases.
In order to effectively predict and prevent online fraud in real time, Sift Science stores hundreds of terabytes of data in HBase—and needs it to be always available. This talk will cover how we used circuit-breaking, cluster failover, monitoring, and automated recovery procedures to improve our HBase uptime from 99.7% to 99.99% on top of unreliable cloud hardware and networks.
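To put the two uptime figures in perspective, the arithmetic below converts an availability percentage into expected downtime per year. It assumes a 365-day year and that "uptime" means the fraction of wall-clock time the service is available; the function name is made up for the example.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def downtime_hours_per_year(availability_percent: float) -> float:
    """Expected unavailable hours per year at the given availability."""
    return (1 - availability_percent / 100) * HOURS_PER_YEAR


before = downtime_hours_per_year(99.7)    # about 26 hours per year
after = downtime_hours_per_year(99.99)    # under one hour per year

print(f"99.7%  uptime: {before:.2f} hours of downtime per year")
print(f"99.99% uptime: {after * 60:.1f} minutes of downtime per year")
```

Under these assumptions, going from 99.7% to 99.99% cuts expected downtime from roughly 26 hours a year to a little under an hour, a reduction of about 30x.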
At DiDi Chuxing, China’s most popular ride-sharing company, we use HBase whenever we have a big data problem.
We run three clusters which serve different business needs. We backported the Region Grouping feature back to our internal HBase version so we could isolate the different use cases.
We built the Didi HBase Service platform which is popular amongst engineers at our company. It includes a workflow and project management function as well as a user monitoring view.
Internally, we recommend that users use Phoenix to simplify access. We also used row timestamps and a multidimensional table schema to solve multi-dimension query problems.
C++, Go, Python, and PHP clients get to HBase via thrift2 proxies and QueryServer.
We run many important business applications out of our HBase cluster, such as ETA, GPS, History Order, API metrics monitoring, and Traffic in the Cloud. If you are interested in any aspects listed above, please come to our talk. We would like to share our experiences with you.
HBaseCon2017 gohbase: Pure Go HBase ClientHBaseCon
gohbase is an implementation of an HBase client in pure Go: https://github.com/tsuna/gohbase. In this presentation, we'll talk about its architecture, compare its performance against the native Java HBase client and AsyncHBase (http://opentsdb.github.io/asynchbase/), and describe some nice characteristics of Go that resulted in a simpler implementation.
Prosigns: Transforming Business with Tailored Technology SolutionsProsigns
Unlocking Business Potential: Tailored Technology Solutions by Prosigns
Discover how Prosigns, a leading technology solutions provider, partners with businesses to drive innovation and success. Our presentation showcases our comprehensive range of services, including custom software development, web and mobile app development, AI & ML solutions, blockchain integration, DevOps services, and Microsoft Dynamics 365 support.
Custom Software Development: Prosigns specializes in creating bespoke software solutions that cater to your unique business needs. Our team of experts works closely with you to understand your requirements and deliver tailor-made software that enhances efficiency and drives growth.
Web and Mobile App Development: From responsive websites to intuitive mobile applications, Prosigns develops cutting-edge solutions that engage users and deliver seamless experiences across devices.
AI & ML Solutions: Harnessing the power of Artificial Intelligence and Machine Learning, Prosigns provides smart solutions that automate processes, provide valuable insights, and drive informed decision-making.
Blockchain Integration: Prosigns offers comprehensive blockchain solutions, including development, integration, and consulting services, enabling businesses to leverage blockchain technology for enhanced security, transparency, and efficiency.
DevOps Services: Prosigns' DevOps services streamline development and operations processes, ensuring faster and more reliable software delivery through automation and continuous integration.
Microsoft Dynamics 365 Support: Prosigns provides comprehensive support and maintenance services for Microsoft Dynamics 365, ensuring your system is always up-to-date, secure, and running smoothly.
Learn how our collaborative approach and dedication to excellence help businesses achieve their goals and stay ahead in today's digital landscape. From concept to deployment, Prosigns is your trusted partner for transforming ideas into reality and unlocking the full potential of your business.
Join us on a journey of innovation and growth. Let's partner for success with Prosigns.
How Recreation Management Software Can Streamline Your Operations.pptxwottaspaceseo
Recreation management software streamlines operations by automating key tasks such as scheduling, registration, and payment processing, reducing manual workload and errors. It provides centralized management of facilities, classes, and events, ensuring efficient resource allocation and facility usage. The software offers user-friendly online portals for easy access to bookings and program information, enhancing customer experience. Real-time reporting and data analytics deliver insights into attendance and preferences, aiding in strategic decision-making. Additionally, effective communication tools keep participants and staff informed with timely updates. Overall, recreation management software enhances efficiency, improves service delivery, and boosts customer satisfaction.
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...Shahin Sheidaei
Games are powerful teaching tools, fostering hands-on engagement and fun. But they require careful consideration to succeed. Join me to explore factors in running and selecting games, ensuring they serve as effective teaching tools. Learn to maintain focus on learning objectives while playing, and how to measure the ROI of gaming in education. Discover strategies for pitching gaming to leadership. This session offers insights, tips, and examples for coaches, team leads, and enterprise leaders seeking to teach from simple to complex concepts.
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoamtakuyayamamoto1800
In this slide, we show the simulation example and the way to compile this solver.
In this solver, the Helmholtz equation can be solved by helmholtzFoam. Also, the Helmholtz equation with uniformly dispersed bubbles can be simulated by helmholtzBubbleFoam.
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptxrickgrimesss22
Discover the essential features to incorporate in your Winzo clone app to boost business growth, enhance user engagement, and drive revenue. Learn how to create a compelling gaming experience that stands out in the competitive market.
An Enterprise Resource Planning system includes various modules that reduce a business's workload. It also organizes workflows, which helps enhance productivity. Here is a detailed explanation of the ERP modules. Going through the points will help you understand how the software is changing work dynamics.
For more details, see: https://blogs.nyggs.com/nyggs/enterprise-resource-planning-erp-system-modules/
Unleash Unlimited Potential with One-Time Purchase
BoxLang is more than just a language; it's a community. By choosing a Visionary License, you're not just investing in your success, you're actively contributing to the ongoing development and support of BoxLang.
Listen to the keynote address and hear about the latest developments from Rachana Ananthakrishnan and Ian Foster who review the updates to the Globus Platform and Service, and the relevance of Globus to the scientific community as an automation platform to accelerate scientific discovery.
We describe the deployment and use of Globus Compute for remote computation. This content is aimed at researchers who wish to compute on remote resources using a unified programming interface, as well as system administrators who will deploy and operate Globus Compute services on their research computing infrastructure.
Check out the webinar slides to learn more about how XfilesPro transforms Salesforce document management by leveraging its world-class applications. For more details, please connect with sales@xfilespro.com
If you want to watch the on-demand webinar, please click here: https://www.xfilespro.com/webinars/salesforce-document-management-2-0-smarter-faster-better/
AI Pilot Review: The World’s First Virtual Assistant Marketing SuiteGoogle
https://sumonreview.com/ai-pilot-review/
AI Pilot Review: Key Features
✅Deploy AI expert bots in Any Niche With Just A Click
✅With one keyword, generate complete funnels, websites, landing pages, and more.
✅More than 85 AI features are included in the AI pilot.
✅No setup or configuration; use your voice (like Siri) to do whatever you want.
✅You Can Use AI Pilot To Create your version of AI Pilot And Charge People For It…
✅ZERO Manual Work With AI Pilot. Never write, Design, Or Code Again.
✅ZERO Limits On Features Or Usages
✅Use Our AI-powered Traffic To Get Hundreds Of Customers
✅No Complicated Setup: Get Up And Running In 2 Minutes
✅99.99% Up-Time Guaranteed
✅30 Days Money-Back Guarantee
✅ZERO Upfront Cost
See My Other Reviews Article:
(1) TubeTrivia AI Review: https://sumonreview.com/tubetrivia-ai-review
(2) SocioWave Review: https://sumonreview.com/sociowave-review
(3) AI Partner & Profit Review: https://sumonreview.com/ai-partner-profit-review
(4) AI Ebook Suite Review: https://sumonreview.com/ai-ebook-suite-review
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...Mind IT Systems
Healthcare providers often struggle with the complexities of chronic conditions and remote patient monitoring, as each patient requires personalized care and ongoing monitoring. Off-the-shelf solutions may not meet these diverse needs, leading to inefficiencies and gaps in care. This is where custom healthcare software offers a tailored solution, ensuring improved care and effectiveness.
Globus Connect Server Deep Dive - GlobusWorld 2024Globus
We explore the Globus Connect Server (GCS) architecture and experiment with advanced configuration options and use cases. This content is targeted at system administrators who are familiar with GCS and currently operate—or are planning to operate—broader deployments at their institution.
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...Globus
Large Language Models (LLMs) are currently the center of attention in the tech world, particularly for their potential to advance research. In this presentation, we'll explore a straightforward and effective method for quickly initiating inference runs on supercomputers using the vLLM tool with Globus Compute, specifically on the Polaris system at ALCF. We'll begin by briefly discussing the popularity and applications of LLMs in various fields. Following this, we will introduce the vLLM tool, and explain how it integrates with Globus Compute to efficiently manage LLM operations on Polaris. Attendees will learn the practical aspects of setting up and remotely triggering LLMs from local machines, focusing on ease of use and efficiency. This talk is ideal for researchers and practitioners looking to leverage the power of LLMs in their work, offering a clear guide to harnessing supercomputing resources for quick and effective LLM inference.
Large Language Models and the End of ProgrammingMatt Welsh
Talk by Matt Welsh at Craft Conference 2024 on the impact that Large Language Models will have on the future of software development. In this talk, I discuss the ways in which LLMs will impact the software industry, from replacing human software developers with AI, to replacing conventional software with models that perform reasoning, computation, and problem-solving.
In the ever-evolving landscape of technology, enterprise software development is undergoing a significant transformation. Traditional coding methods are being challenged by innovative no-code solutions, which promise to streamline and democratize the software development process.
This shift is particularly impactful for enterprises, which require robust, scalable, and efficient software to manage their operations. In this article, we will explore the various facets of enterprise software development with no-code solutions, examining their benefits, challenges, and the future potential they hold.
2. The State of HBase
Andrew Purtell, Enis Söztutar, Michael Stack
3. About Us
Andrew Purtell
Salesforce
Release Manager for 0.98
@akpurtell
Enis Söztutar
Hortonworks
Release Manager for 1.0
@enissoz
2.0
Michael Stack
Cloudera
-
@saintstack
4. Outline
● State of the Project
● State of the Software
● State of the Ecosystem
5. Outline
● State of the Project
● State of the Software
● State of the Ecosystem
6. State of the Project
● Backing medium- and high-scale services
o Hundreds of enterprises
o Some of the largest Internet companies in the world
● Well established, mature codebase
o >100 contributors, 4.2M lines of code, 1200+ man-years of total effort*
● Runs on HDFS, MapR, Gluster, GPFS, etc.
● As a service: AWS EMR, HDInsight, etc.
*Source: OpenHub https://www.openhub.net/p/hbase
7. Project: Vision
Simple, steady, and powerful: “A first class high
performance horizontally scalable data storage
engine for Big Data, suitable as the store of
record for mission critical data.”
8. Project: Goals
● Availability: Always more, always faster
● Stability and Operability
o Continuous Improvement
● Scaling (up and down)
● Readying for NextGen ‘commodity’ hardware
● Multi-tenancy
● Diversifying our ecosystem
o Come talk to us if you’re building a Big Data product
10. Project: Committers
● Eight new committers
Zhang Duo (duozhang), Andrey Stepachev (octo47),
Liu Shaohui (liushaohui), Virag Kothari (virag),
Sean Busbey (busbey), Srikanth Srungarapu (ssrungarapu),
Jing Chen (Jerry) He (jerryjch),
Misty Stanley-Jones (misty)
● Now 43 committers from a diverse group of companies
including Cask, Cloudera, Facebook, Hortonworks,
IBM, Intel, Salesforce, Xiaomi, Yahoo!, and Google
11. Project: PMC
● First chair rotation in the project lifetime
Michael Stack (stack), outgoing
Andrew Purtell (apurtell), incoming
● Four new members
Sean Busbey (busbey)
Matteo Bertozzi (mbertozzi)
Nick Dimiduk (ndimiduk)
Jeffrey Zhong (jefferyz)
26. Software: Semantic Versioning
● Client / Server API cleanup continuing
● Dependency isolation / shading
● Goal is full semver compliance
● See the HBase-1.0 talk and HBase-2.0 panel for more
27. Software: Focus
● Smaller regions, more regions (scaling)
o Less write amplification
o 1M+ region clusters
● Stability
o Procedure Version2
o Assignment improvements/stability
● Scanners
o Chunking, Heartbeating, ‘Parking’, Streaming
28. Software: Focus
● Adaptation: Workloads
o HBase as Medium Object Store (MOB)
● Tunable Consistency
o TIMELINE Consistency
● Improving coprocessor API supportability
● Profile-driven optimization
● Improved GC-friendliness, use more RAM
o Offheaping
29. Software: Focus
● Multitenancy
o Table groups
o Quotas
o Priorities
● Using all of the machine
o RAM
o IOPS
o All of the CPUs
30. Outline
● State of the Project
● State of the Software
● State of the Ecosystem
32. Ecosystem: SQL
● Phoenix
o Phoenix 4.4.0RC for HBase 1.0.0
o SQL over raw HBase tables
● Trafodion
o Trafodion 1.1.0 announced
o Heading for Apache Incubator!
● & LeanXcale
33. Ecosystem: Dogfooding
● YARN-2928 Application Timeline Service
● HIVE-9452 HBase to store Hive metadata
● AMBARI-5707 Ambari Metrics System
34. Get Involved!
Follow us on Twitter
@HBase
Follow us on Facebook
Follow our Blog
https://blogs.apache.org/hbase/
Join our mailing lists
user-subscribe@hbase.apache.org
dev-subscribe@hbase.apache.org
Welcome, everyone! Today is going to be fantastic, and we have quite the agenda for you. In the interest of time, let's dive right in.
Deploys you will hear about today.
Not compete with C*. MTTR stuff. Our replication is better than theirs. Master-master, large sequential scans. We should not have to do these C* vs HBase fights anymore… No advertising, no coherent leadership, no PM… Doesn’t help in sales. No one talks about it.
New logo
We won’t leave you behind. 0.94 might be finished. We want you to move up to newer versions… Counts are since the last HBaseCon.
Add and delete column family while table is online.
See the ecosystem track and use cases for a sampling of what is going on in the HBase ecosystem these days. Be sure to attend the SQL SmackDown.
Ambari shipping