SQL and Machine Learning on Hadoop using HAWQ - pivotalny
It has become almost a truism to say:
“Many enterprises have adopted HDFS as the foundational layer for their data lakes. HDFS provides the flexibility to store any kind of data and, more importantly, it is infinitely scalable on commodity hardware.”
But the unsolved problem to date has been a low-latency query engine for HDFS.
At Pivotal, we cracked that problem, and the answer is HAWQ, which we intend to open source this year. During this event, we will present and demo HAWQ's architecture, its powerful ANSI SQL features, and its ability to transcend traditional BI in the form of in-database analytics (machine learning).
Pivotal is a trusted partner for IT innovation and transformation. From the technology, to the people, to the way people interact with technology, Pivotal is transforming how the world builds software.
At Strata NYC 2015, Pivotal announced that it will supercharge the Hadoop ecosystem by contributing the HAWQ advanced SQL-on-Hadoop analytics engine and the MADlib machine learning library to The Apache Software Foundation.
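The in-database machine learning mentioned above is provided by MADlib, which exposes its algorithms as ordinary SQL functions, so any PostgreSQL-compatible client can drive HAWQ. Below is a minimal sketch in Scala over JDBC; the host, credentials, and the patients table are hypothetical placeholders (the column layout follows MADlib's logistic regression example), so treat it as an illustration rather than a definitive recipe.

    import java.sql.DriverManager

    object MadlibSketch extends App {
      // HAWQ speaks the PostgreSQL wire protocol; host and credentials are hypothetical.
      val conn = DriverManager.getConnection(
        "jdbc:postgresql://hawq-master:5432/demo", "gpadmin", "secret")
      val st = conn.createStatement()

      // Train a logistic regression model in-database with MADlib, assuming a
      // hypothetical table patients(second_attack int, treatment int, trait_anxiety int).
      st.execute(
        """SELECT madlib.logregr_train(
          |  'patients',                           -- source table
          |  'patients_logregr',                   -- output model table
          |  'second_attack',                      -- dependent variable
          |  'ARRAY[1, treatment, trait_anxiety]'  -- independent variables
          |)""".stripMargin)

      // The fitted coefficients land in an ordinary table, queryable with plain SQL.
      val rs = st.executeQuery("SELECT coef FROM patients_logregr")
      while (rs.next()) println(rs.getString(1))
      conn.close()
    }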
HAWQ: A Massively Parallel Processing SQL Engine in Hadoop - BigData Research
HAWQ, developed at Pivotal, is a massively parallel processing SQL engine sitting on top of HDFS. As a hybrid of an MPP database and Hadoop, it inherits the merits of both. It adopts a layered architecture and relies on the distributed file system for data replication and fault tolerance. In addition, it is standard SQL compliant and, unlike other SQL engines on Hadoop, it is fully transactional. This paper presents the novel design of HAWQ, including query processing, the scalable software interconnect based on the UDP protocol, transaction management, fault tolerance, read-optimized storage, the extensible framework for supporting various popular Hadoop-based data stores and formats, and various optimization choices we considered to enhance query performance. An extensive performance study shows that HAWQ is about 40x faster than Stinger, which is reported to be 35x-45x faster than the original Hive.
HBase provides many features for multi-tenancy and isolation. However, operating these features requires integration into the broader operations of a cluster. This talk will cover some methods we use at Bloomberg for multi-tenancy and discuss some HBase-Oozie integration. Of particular interest is our work on an Oozie action for secure snapshot export -- this extends the HBase security model via Oozie, allowing self-service (non-hbase user) snapshot export on secure clusters.
Key topics:
* Bloomberg's Oozie HBase export snapshot action
* Oozie-coordinated, time-based major compactions
* How we use LDAP with HBase (and why to take care with HADOOP-12291)
* Some of our multi-tenancy setups around monitoring for SLAs
* Suggesting HBase stay the course of being "just" a datastore -- and that all projects follow the Unix philosophy (this has made things like our Oozie integration much easier!)
How to Use Apache Zeppelin with HWX HDB - Hortonworks
Part five in a five-part series, this webcast demonstrates the integration of Apache Zeppelin and Pivotal HDB. Apache Zeppelin is a web-based notebook that enables interactive data analytics. You can make beautiful data-driven, interactive, and collaborative documents with SQL, Scala, and more. This webinar will demonstrate the configuration of the psql interpreter and the basic operations of Apache Zeppelin when used in conjunction with Hortonworks HDB.
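Under the hood, HDB speaks the PostgreSQL wire protocol, so Zeppelin's psql interpreter is essentially a thin JDBC client. The same round trip can be reproduced from Scala; the host, database, and credentials below are hypothetical placeholders.

    import java.sql.DriverManager

    object HdbQuerySketch extends App {
      // HDB (Apache HAWQ) is reachable through the standard PostgreSQL JDBC driver.
      val conn = DriverManager.getConnection(
        "jdbc:postgresql://hdb-master:5432/postgres", "gpadmin", "secret")
      // The same statement a %psql paragraph would run in a Zeppelin notebook.
      val rs = conn.createStatement().executeQuery("SELECT version()")
      while (rs.next()) println(rs.getString(1))
      conn.close()
    }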
This guide was written in the January-February 2014 timeframe.
Using SAP HANA Smart Data Access (SDA), it is possible to access remote data without having to replicate the data to the SAP HANA database beforehand. The following sources are supported (as of 2013):
- Teradata database,
- SAP Sybase ASE,
- SAP Sybase IQ,
- Intel Distribution for Apache Hadoop,
- SAP HANA.
SAP HANA handles the data like local tables on the database. Automatic data type conversion makes it possible to map data types from databases connected via SAP HANA Smart Data Access to SAP HANA data types.
This guide explains a step-by-step approach to SAP HANA SDA for Hadoop data, including the following:
- Hadoop Installation
- Data Load in Hadoop system
- Activities on Unstructured Data in Hadoop system
- ODBC driver installation and configuration on the HANA server for Hadoop data access
- Smart Data Access in SAP HANA (through SAP HANA Studio), using Hadoop as a remote data source (a brief sketch of this step follows the setup details below)
Setup used for this guide:
1) Hadoop: HDP 1.3 for Windows (Hortonworks Data Platform), standalone, on a Dell laptop running Windows 7 64-bit with 8GB RAM
2) SAP HANA server: running on a VM – 24GB, standalone HANA 1.0 SPS 7 – SLES 11 SP1
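To make the SDA flow concrete, here is a hedged sketch of the two statements at the heart of the final step, issued over the SAP HANA JDBC driver from Scala. The DSN, credentials, and table names are hypothetical, and the four-part remote object identifier in CREATE VIRTUAL TABLE varies by adapter and HANA release, so take this as the shape of the pattern rather than exact syntax.

    import java.sql.DriverManager

    object HanaSdaSketch extends App {
      // Requires the SAP HANA JDBC driver (ngdbc.jar) on the classpath;
      // host, port, and credentials are hypothetical.
      val conn = DriverManager.getConnection(
        "jdbc:sap://hana-host:30015", "SYSTEM", "secret")
      val st = conn.createStatement()

      // Register the Hadoop system (exposed via a Hive ODBC DSN) as a remote source.
      st.execute(
        """CREATE REMOTE SOURCE "HADOOP_SRC" ADAPTER "hiveodbc"
          |CONFIGURATION 'DSN=HIVE_DSN'
          |WITH CREDENTIAL TYPE 'PASSWORD' USING 'user=hadoop;password=secret'""".stripMargin)

      // Expose a Hive table as a HANA virtual table; the remote identifier is adapter-specific.
      st.execute(
        "CREATE VIRTUAL TABLE \"SDA\".\"WEB_LOGS\" " +
        "AT \"HADOOP_SRC\".\"HIVE\".\"default\".\"web_logs\"")

      // The virtual table now behaves like a local HANA table.
      val rs = st.executeQuery("SELECT COUNT(*) FROM \"SDA\".\"WEB_LOGS\"")
      while (rs.next()) println(rs.getLong(1))
      conn.close()
    }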
Zeppelin has become a popular way to unlock the value of the data lake, thanks to its user interface and appeal to business users. These business users ask their IT departments for access to Zeppelin. Enterprise IT departments want to help their business users, but they have several enterprise concerns, such as security, integration with their corporate LDAP/AD, scalability in a multi-user environment, and integration with Ranger and Kerberos. This session will walk through these concerns and how each can be handled with Zeppelin.
Predictive Analytics and Machine Learning…with SAS and Apache Hadoop - Hortonworks
In this interactive webinar, we'll walk through use cases showing how you can use advanced analytics such as SAS Visual Statistics and SAS In-Memory Statistics with the Hortonworks Data Platform (HDP) to reveal insights in your big data and redefine how your organization solves complex problems.
Hortonworks Technical Workshop: What's New in HDP 2.3 - Hortonworks
The recently launched HDP 2.3 is a major advancement of Open Enterprise Hadoop. It represents the best of community-led development, with innovations spanning Apache Hadoop, Apache Ambari, Ranger, HBase, Spark, and Storm. In this session we will provide an in-depth overview of the new functionality and discuss its impact on new and ongoing big data initiatives.
Apache Spark—Apache HBase Connector: Feature Rich and Efficient Access to HBase - Spark Summit
Both Spark and HBase are widely used, but how to use them together with high performance and simplicity is a very challenging topic. Spark HBase Connector (SHC) provides feature-rich and efficient access to HBase through Spark SQL. It bridges the gap between the simple HBase key-value store and complex relational SQL queries, and enables users to perform complex data analytics on top of HBase using Spark. SHC implements the standard Spark data source APIs and leverages the Spark Catalyst engine for query optimization. To achieve high performance, SHC constructs the RDD from scratch instead of using the standard HadoopRDD. With the customized RDD, all critical techniques can be applied and fully implemented, such as partition pruning, column pruning, predicate pushdown, and data locality. The design makes maintenance easy, while achieving a good tradeoff between performance and simplicity. In addition to fully supporting all the Avro schemas natively, SHC has also integrated natively with Phoenix data types. With SHC, Spark can execute batch jobs to read/write data from/into Phoenix tables. Phoenix can also read/write data from/into HBase tables created by SHC. For example, users can run a complex SQL query on top of an HBase table created by Phoenix inside Spark, perform a table join against a DataFrame which reads the data from a Hive table, or integrate with Spark Streaming to implement a more complicated system. In this talk, apart from explaining why SHC is of great use, we will also demo how SHC works, how to use SHC in secure and non-secure clusters, how SHC works with multiple secure HBase clusters, etc. This talk will also benefit people who use Spark and other data sources (besides HBase), as it offers ideas for supporting high performance data source access at the Spark DataFrame level.
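The catalog-driven DataFrame access described above looks roughly like the following. This is a hedged sketch based on SHC's documented usage; the table name, column family, and columns are hypothetical.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.execution.datasources.hbase.HBaseTableCatalog

    object ShcSketch extends App {
      val spark = SparkSession.builder().appName("SHC sketch").getOrCreate()

      // Catalog mapping a hypothetical HBase table "table1" to a SQL schema:
      // the row key surfaces as col0, and cf1:col1 surfaces as col1.
      val catalog =
        """{
          |  "table": {"namespace": "default", "name": "table1"},
          |  "rowkey": "key",
          |  "columns": {
          |    "col0": {"cf": "rowkey", "col": "key",  "type": "string"},
          |    "col1": {"cf": "cf1",    "col": "col1", "type": "double"}
          |  }
          |}""".stripMargin

      val df = spark.read
        .options(Map(HBaseTableCatalog.tableCatalog -> catalog))
        .format("org.apache.spark.sql.execution.datasources.hbase")
        .load()

      // Filters and projections are pushed down to HBase where possible
      // (partition pruning, column pruning, predicate pushdown).
      df.filter(df("col1") > 1.0).select("col0").show()
    }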
GPORCA is a newly open-sourced advanced query optimizer that is a subproject of the Greenplum Database open source project. GPORCA is the query optimizer used in commercial distributions of both Greenplum and HAWQ. In these distributions, GPORCA has achieved 1000x performance improvements across TPC-DS queries by focusing on three distinct areas: Dynamic Partition Elimination, SubQuery Unnesting, and Common Table Expressions.
Now that GPORCA is open source, we are looking for collaborators to help us realize the ultimate dream for GPORCA - to work with any database.
The new breed of big data management systems has to process so much data that optimization mistakes are greatly magnified compared with traditional optimizers. Furthermore, coding and manually optimizing complex queries have proven to be hard.
In this session, Venkatesh will discuss:
- Overview of GPORCA
- How to add GPORCA to HAWQ with a build option
- How GPORCA could be made to work with any database
- Future vision for GPORCA and more immediate plans
- How to work with GPORCA, and how to contribute to GPORCA (a short usage sketch follows this list)
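In Greenplum-derived systems such as HAWQ, GPORCA can be toggled per session through the optimizer setting, which makes it easy to compare its plans against the legacy planner. A minimal sketch over JDBC; the connection details and the sales table are hypothetical.

    import java.sql.DriverManager

    object GporcaToggleSketch extends App {
      val conn = DriverManager.getConnection(
        "jdbc:postgresql://hawq-master:5432/demo", "gpadmin", "secret")
      val st = conn.createStatement()

      // Plan the query with GPORCA enabled for this session...
      st.execute("SET optimizer = on")
      var rs = st.executeQuery("EXPLAIN SELECT count(*) FROM sales GROUP BY region")
      while (rs.next()) println(rs.getString(1))

      // ...then with the legacy planner, to compare the two plans.
      st.execute("SET optimizer = off")
      rs = st.executeQuery("EXPLAIN SELECT count(*) FROM sales GROUP BY region")
      while (rs.next()) println(rs.getString(1))
      conn.close()
    }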
Apache Ambari: Managing Hadoop and YARN - Hortonworks
Part of the Hortonworks YARN Ready Webinar Series, this session covers the management of Apache Hadoop and YARN using Apache Ambari. This series targets developers, and we will feature a demo of Ambari.
Double Your Hadoop Hardware Performance with SmartSense - Hortonworks
Hortonworks SmartSense provides proactive recommendations that improve cluster performance, security, and operations. And since 30% of issues are configuration related, Hortonworks SmartSense makes an immediate impact on Hadoop system performance and availability, in some cases boosting hardware performance by 2x. Learn how SmartSense can help you increase the efficiency of your Hadoop hardware through customized cluster recommendations.
View the on-demand webinar: https://hortonworks.com/webinar/boosts-hadoop-hardware-performance-2x-smartsense/
Hortonworks Tech Workshop: In-Memory Processing with Spark - Hortonworks
Apache Spark offers unique in-memory capabilities and is well suited to a wide variety of data processing workloads, including machine learning and micro-batch processing. With HDP 2.2, Apache Spark is a fully supported component of the Hortonworks Data Platform. In this session we will cover the key fundamentals of Apache Spark and operational best practices for executing Spark jobs alongside the rest of your big data workloads. We will also provide a working example showcasing micro-batch and machine learning processing using Apache Spark.
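As a hedged sketch of the micro-batch model referenced above, here is the classic Spark Streaming word count in Scala; the socket source and five-second batch interval are arbitrary choices for illustration.

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object MicroBatchSketch extends App {
      val conf = new SparkConf().setAppName("micro-batch sketch").setMaster("local[2]")
      // Each 5-second window becomes one micro-batch RDD.
      val ssc = new StreamingContext(conf, Seconds(5))

      // Hypothetical text source: a socket on localhost:9999 (e.g. started with `nc -lk 9999`).
      val lines = ssc.socketTextStream("localhost", 9999)
      lines.flatMap(_.split("\\s+"))
        .map(word => (word, 1))
        .reduceByKey(_ + _)
        .print() // emit per-batch word counts

      ssc.start()
      ssc.awaitTermination()
    }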
Best Practices for Hadoop Data Analysis with Tableau and Hortonworks Data Platform - Hortonworks
This how-to guide is packed with tips and tricks to get started and then get the most out of Tableau and HDP, including unique features of the connector, performance best practices, and a few gotchas, covering:
-- Prerequisites and basic installation
-- Performing On-the-Fly ETL
-- Working with UDFs or MapReduce
-- Performance Techniques
-- Known Limitations
HBaseCon2017: Spark HBase Connector: Feature Rich and Efficient Access to HBase - HBaseCon
Both Spark and HBase are widely used, but how to use them together with high performance and simplicity is a very hard topic. Spark HBase Connector (SHC) provides feature-rich and efficient access to HBase through Spark SQL. It bridges the gap between the simple HBase key-value store and complex relational SQL queries, and enables users to perform complex data analytics on top of HBase using Spark.
SHC implements the standard Spark data source APIs, and leverages the Spark catalyst engine for query optimization. To achieve high performance, SHC constructs the RDD from scratch instead of using the standard HadoopRDD. With the customized RDD, all critical techniques can be applied and fully implemented, such as partition pruning, column pruning, predicate pushdown and data locality. The design makes the maintenance very easy, while achieving a good tradeoff between performance and simplicity.
Also, SHC supports Phoenix-encoded data as input to HBase, in addition to Avro data. Defaulting to a simple native binary encoding is susceptible to future changes and is a risk for users who write data from SHC into HBase, since backwards compatibility needs to be properly handled going forward. So, by default, SHC needs to support a more standard and well-tested format like Phoenix.
In this talk, we will demo how SHC works, how to use SHC in secure and non-secure clusters, how SHC works with multiple HBase clusters, etc. This talk will also benefit people who use Spark and other data sources (besides HBase), as it offers ideas for supporting high performance data source access at the Spark DataFrame level.
Zeppelin Interpreters
PSQL (to become JDBC in 0.6.x)
Geode
SpringXD
Apache Ambari
Zeppelin Service
Geode, HAWQ and Spring XD services
Webpage Embedder View
Apache HAWQ (incubating), along with its extension framework (PXF), provides a high-performance, massively parallel SQL processing framework over unmanaged data stores and formats in the Hadoop ecosystem. HCatalog glues the Hadoop ecosystem together by providing a relational abstraction over HDFS data. This presentation introduces the integration of HCatalog metadata into HAWQ's in-memory catalog, which provides a simple and seamless access paradigm for data managed by Hive.
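With this integration, Hive tables registered in HCatalog appear under a hcatalog namespace inside HAWQ and can be queried without first defining external tables. A minimal sketch over JDBC; the connection details and the Hive table default.web_logs are hypothetical.

    import java.sql.DriverManager

    object HcatalogQuerySketch extends App {
      val conn = DriverManager.getConnection(
        "jdbc:postgresql://hawq-master:5432/demo", "gpadmin", "secret")
      // A Hive table resolved through HCatalog metadata is addressable
      // as hcatalog.<hive_db>.<hive_table>.
      val rs = conn.createStatement().executeQuery(
        "SELECT * FROM hcatalog.default.web_logs LIMIT 10")
      while (rs.next()) println(rs.getString(1))
      conn.close()
    }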
Pivotal HAWQ is a high-performance SQL engine on top of Hadoop. It supports SQL-92 and multi-way joins, and has one of the best query processing engines on top of Hadoop. This presentation explains some of the design principles behind HAWQ HA and offers insight into how it works with Hadoop HA.
Pivotal HAWQ and Hortonworks Data Platform: Modern Data Architecture for IT Transformation - VMware Tanzu
Pivotal HAWQ, one of the world’s most advanced enterprise SQL-on-Hadoop technologies, coupled with the Hortonworks Data Platform, the only 100% open source Apache Hadoop data platform, can turbocharge your analytic efforts. The slides from this technical webinar present a deep dive on this powerful modern data architecture for analytics and data science.
Learn more here: http://pivotal.io/big-data/pivotal-hawq
Slides from a joint webinar. Pivotal HAWQ, one of the world’s most advanced enterprise SQL-on-Hadoop technologies, coupled with the Hortonworks Data Platform, the only 100% open source Apache Hadoop data platform, can turbocharge your data science efforts.
Together, Pivotal HAWQ and the Hortonworks Data Platform provide businesses with a Modern Data Architecture for IT transformation.
This is the presentation I delivered at the Hadoop User Group Ireland meetup in Dublin on Nov 28, 2015. It covers, at a glance, the architecture of GPDB and, most importantly, its features. Sorry for the colors - SlideShare is crappy with PDFs.
I gave this talk at the Highload++ 2015 conference in Moscow. The slides have been translated into English. They cover the Apache HAWQ components, its architecture, its query processing logic, and also competitive information.
This is the presentation I made at the Hadoop User Group Ireland meetup in Dublin. It covers the main ideas of MPP, Hadoop, and distributed systems in general, and also how to choose the best option for you.
Many of the world’s largest enterprises are replacing their traditional SAP server environments with SAP running in the AWS Cloud. As well as increasing business agility and scalability, our cloud platform significantly reduces SAP infrastructure and support costs, simplifies operations and contributes directly to the bottom line.
Apache Ambari is the only 100% open source management and provisioning tool for Apache Hadoop and the Hortonworks Data Platform (HDP). Recent innovations in Apache Ambari have focused on opening it into a pluggable management platform that can automate cluster provisioning, deploy third-party software, and provide custom operational and developer views to the end user. In this session Hortonworks will cover three key integration points of Apache Ambari (Stacks, Views, and Blueprints) and deliver working examples of each.
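Blueprints, one of the three integration points mentioned, capture a cluster layout as JSON that is registered through Ambari's REST API. A hedged sketch in Scala using the JDK 11 HTTP client; the host, credentials, blueprint name, and the deliberately tiny host-group layout are hypothetical, and a real blueprint would list many more components.

    import java.net.URI
    import java.net.http.{HttpClient, HttpRequest, HttpResponse}
    import java.util.Base64

    object AmbariBlueprintSketch extends App {
      // A minimal blueprint: a single master host group on an HDP 2.3 stack.
      val blueprint =
        """{
          |  "Blueprints": { "blueprint_name": "hdp-mini", "stack_name": "HDP", "stack_version": "2.3" },
          |  "host_groups": [
          |    { "name": "master",
          |      "components": [ { "name": "NAMENODE" }, { "name": "RESOURCEMANAGER" } ],
          |      "cardinality": "1" }
          |  ]
          |}""".stripMargin

      val auth = Base64.getEncoder.encodeToString("admin:admin".getBytes) // hypothetical credentials
      val request = HttpRequest.newBuilder(
          URI.create("http://ambari-server:8080/api/v1/blueprints/hdp-mini"))
        .header("Authorization", s"Basic $auth")
        .header("X-Requested-By", "ambari") // required by Ambari's CSRF protection
        .POST(HttpRequest.BodyPublishers.ofString(blueprint))
        .build()

      val response = HttpClient.newHttpClient()
        .send(request, HttpResponse.BodyHandlers.ofString())
      println(s"${response.statusCode()} ${response.body()}")
    }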
Comprehensive and Simplified Management for VMware vSphere Environments - Hitachi Vantara
Learn how to gain velocity and agility within your VMware vSphere environments while reducing costs and simplifying the management of your server, network, and storage infrastructure. You will also learn how to leverage a unified, converged infrastructure to more quickly deploy business-critical workloads within a private cloud environment. View this webcast and learn how to:
- Increase IT efficiency and gain business velocity by leveraging a unified and converged infrastructure solution from Hitachi
- Enable both physical and virtual infrastructure consolidation while supporting thousands of VMs across the data center
- Achieve cost reductions through automation and orchestration of your VMware vSphere environment across server, network, and storage tiers
For more information on Hitachi Solutions for VMware visit: http://www.hds.com/solutions/applications/vmware/?WT.ac=us_mg_sol_vmw
Pivotal, the big data platform from EMC, ships technologies for very high-performance in-memory SQL queries, and more besides.
Presentation by Alexandre Vasseur and Jérôme Campo of Pivotal.
AWS Webcast - Deploying SAP HANA Workloads on the Amazon Web Services Cloud - Amazon Web Services
More and more customers are running mission-critical SAP applications on the Amazon Web Services (AWS) Cloud. AWS is the only public cloud certified by SAP to run SAP HANA in production in both single- and multi-node configurations with up to 1.22TB of memory. AWS offers a variety of options to trial and deploy SAP HANA, including utilizing SAP HANA licenses you have already purchased (a bring-your-own-license scenario) on the low-cost, secure, and highly elastic AWS Cloud, without having to make long-term commitments or costly capital expenditures for the underlying infrastructure.
Review this content, designed for technical decision makers, architects, and IT administrators, to gain an overview of the technical capabilities and benefits of running SAP HANA on the AWS Cloud. We will also highlight relevant SAP Support Notes and available tools for HANA, such as the Implementation and Operations Guide and the new SAP HANA on AWS Quick Start Reference Deployment Guide, which includes instructions on how to automate the implementation of SAP HANA using AWS CloudFormation templates.
Reasons to Review:
• Gain an overview on the technical capabilities of running SAP HANA on AWS
• Learn more about how to design, deploy, and implement HANA on AWS
• Find out where to locate deployment tools, support notes, available test drives or trials, and the new SAP HANA on AWS Quick Start Reference Deployment Guide
• Hear about customer success stories
• Listen to AWS experts and get your questions answered
• Learn how to engage with AWS to get assistance or find the right AWS partner to help
Accelerate your Kubernetes clusters with Varnish Caching - Thijs Feryn
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
The Art of the Pitch: WordPress Relationships and Sales - Laura Byrne
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if something changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers, without pulling teeth or pulling your hair out. Practical tips and strategies for successful relationship building that leads to closing the deal.
Essentials of Automations: Optimizing FME Workflows with Parameters - Safe Software
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl... - DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
Key Trends Shaping the Future of Infrastructure - Cheryl Hung
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud, and open source: how these areas are likely to mature and develop over the short and long term, and how organisations can position themselves to adapt and thrive.
Securing your Kubernetes cluster: a step-by-step guide to success! - KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do... - UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova... - Ramesh Iyer
In today's fast-changing business world, companies that fail to adapt and embrace new ideas often struggle to keep up with the competition. However, fostering a culture of innovation takes real work. It takes vision, leadership, and a willingness to take risks in the right proportion. Sachin Dev Duggal, co-founder of Builder.ai, has perfected the art of this balance, creating a company culture where creativity and growth are nurtured at each stage.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024 - Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview, including the concepts of Customer Key and Double Key Encryption.
Managing Apache HAWQ with Apache AMBARI
1. Managing Apache HAWQ with Apache AMBARI
Apache Ambari Meetup - June 27, 2016
Alexander Denissov
Bhuvnesh Chaudhary
Mithun Mathew
Apache HAWQ (incubating)
Apache Ambari
2. Hadoop-native SQL query engine and advanced analytics
MPP database that offers:
1. interactive query execution
2. high performance
3. machine learning algorithms
4. tools for Data Analysts and Data Scientists
5. processing for large and complex data sets
APACHE HAWQ (incubating)
4. HAWQ - AMBARI INTEGRATION SCOPE
Installation and configuration
Topology and configuration recommendations and validations
Kerberos and High Availability support
HAWQ Master - HAWQ Standby failover
Service and Component Alerts
Visual Widgets
5. HAWQ - AMBARI INTEGRATION EFFORT
Driven by the team of engineers at Pivotal
Developed integrations from basic to more advanced
Invaluable support from the Ambari Community
Praises
- Ambari’s pluggable architecture makes integrations like this possible and easy
- Kerberos setup is fully metadata driven — major kudos!
Challenges
- HAWQ is not part of the HDP stack and is not available in Ambari out of the box
- Advanced features and wizards require JavaScript code modifications
THANK YOU!
14. HAWQ AMBARI FUTURE INTEGRATION
HAWQ Upgrade
- Support automated upgrade independent of the stack
- Ongoing related work: AMBARI-14854, AMBARI-12885
Dynamic Configuration Reload
- Ambari requires a service restart to push configuration changes. What if the service could reload configurations without a restart?
- Ongoing related work: AMBARI-17241
HAWQ View
- Display query history
- Manage resource queues
15-16. DYNAMIC CONFIGURATION RELOAD
Currently, Ambari does not support configuration changes without restarting the service.
Some parameters do NOT require a restart:
- HDFS: dfs.heartbeat.interval, dfs.namenode.heartbeat.recheck-interval
- HAWQ: default_hash_table_bucket_number, hawq_rm_memory_limit_perseg
Consequence of restarting the service: DOWNTIME!!!