Customers are preparing to analyze and manage an increasing quantity of structured and unstructured data. Business leaders introduce new analytical workloads faster than IT departments can handle, and legacy IT infrastructure needs to evolve to deliver operational improvements and cost containment while increasing flexibility to meet future requirements. By providing HDP on IBM Power Systems, Hortonworks and IBM give customers more choice in selecting the architectural platform that is right for them. In this webinar, we discuss some of the challenges of deploying big data platforms and how solutions built with HDP on IBM Power Systems offer tangible benefits and the flexibility to accommodate changing needs.
In 2017, more and more corporations are looking to reduce operational overhead in their enterprise data warehouse (EDW) installations. Hortonworks has just launched the industry's first turnkey EDW optimization solution together with our partners Syncsort and AtScale. Join Hortonworks CTO Scott Gnau to learn more about this solution and its three use cases.
View the recording:
http://hortonworks.com/webinar/accelerating-real-time-data-ingest-hadoop/
Hadoop didn’t disrupt the data center. The exploding amounts of data did. But, let’s face it, if you can’t move your data to Hadoop, then you can’t use it in Hadoop. The experts from Hortonworks, the #1 leader in Hadoop development, and Attunity, a leading data management software provider, cover:
- How to ingest your most valuable data into Hadoop using Attunity Replicate
- How customers are using Hortonworks DataFlow (HDF), powered by Apache NiFi
- How to combine real-time change data capture (CDC) technology with Connected Data Platforms from Hortonworks
We discuss how Attunity Replicate and Hortonworks DataFlow (HDF) work together to move data into Hadoop.
How to Use Apache Zeppelin with HWX HDB (Hortonworks)
Part five in a five-part series, this webcast demonstrates the integration of Apache Zeppelin and Pivotal HDB. Apache Zeppelin is a web-based notebook that enables interactive data analytics: you can make beautiful data-driven, interactive and collaborative documents with SQL, Scala and more. This webinar demonstrates the configuration of the psql interpreter and the basic operations of Apache Zeppelin when used in conjunction with Hortonworks HDB.
Double Your Hadoop Hardware Performance with SmartSense (Hortonworks)
Hortonworks SmartSense provides proactive recommendations that improve cluster performance, security and operations. Since 30% of issues are configuration related, Hortonworks SmartSense makes an immediate impact on Hadoop system performance and availability, in some cases boosting hardware performance by 2x. Learn how SmartSense can help you increase the efficiency of your Hadoop hardware through customized cluster recommendations.
View the on-demand webinar: https://hortonworks.com/webinar/boosts-hadoop-hardware-performance-2x-smartsense/
Hortonworks Data In Motion Webinar Series Pt. 2 (Hortonworks)
Learn how Hortonworks DataFlow (HDF), powered by Apache NiFi, MiNiFi, Kafka and Storm, and its associated HDF Certification Program make it easier and faster to integrate different systems. Includes highlights of the latest partner integrations from HPE, SAS, Attunity, Impetus Technologies, Kepware and Midfin Systems.
Watch the webinar on-demand: http://hortonworks.com/webinar/make-big-data-ecosystem-work-better/
HDF Partner certification program: http://hortonworks.com/partners/product-integration-certification/#hdf-integration
Scaling real time streaming architectures with HDF and Dell EMC Isilon (Hortonworks)
Streaming analytics is the new normal, and customers are exploring use cases that have quickly transitioned from batch to near real time. Hortonworks DataFlow (HDF) / Apache NiFi and Isilon provide a robust, scalable architecture for real-time streaming. Explore our use cases and a demo of how HDF and Isilon can empower your business for real-time success.
Webinar Series Part 5: New Features of HDF 5 (Hortonworks)
An overview of the newest features of Hortonworks DataFlow, highlighting the new processors, the new user interface, edge intelligence powered by Apache MiNiFi, new support for multi-tenancy, and the new zero-master clustering architecture.
A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ... (Hortonworks)
Companies in every industry look for ways to explore new data types and large data sets that were previously too big to capture, store and process. They need to unlock insights from data such as clickstream, geo-location, sensor, server log, social, text and video data. However, becoming a data-first enterprise comes with many challenges.
Join this webinar organized by three leaders in their respective fields and learn from our experts how you can accelerate the implementation of a scalable, cost-efficient and robust Big Data solution. Cisco, Hortonworks and Red Hat will explore how new data sets can enrich existing analytic applications with new perspectives and insights and how they can help you drive the creation of innovative new apps that provide new value to your business.
Unlocking a fully integrated Spark experience within your enterprise Hadoop environment that is manageable, secure and deployable anywhere.
Presented at the Spark Summit by Arun C Murthy (co-Founder, Hortonworks) on Monday, June 15, 2015.
This workshop is a hands-on session to quickly deploy Hadoop and Streaming on AWS / Azure / Google Cloud.
Cloudbreak simplifies the deployment of Hadoop in cloud environments. It enables the enterprise to quickly run big data workloads in the cloud while optimizing the use of cloud resources.
Objective
To provide a short, hands-on introduction to Hadoop on the cloud and to review the key benefits of cluster deployment automation.
This lab uses Cloudbreak to quickly and effortlessly stand up Hadoop and Streaming clusters in a cloud provider of your choice. It shows how Ambari blueprints act as declarative definitions of your Hadoop or Streaming clusters, and walks through the steps to dynamically change these blueprints and to use external databases and external authentication sources, in essence providing shared authentication, authorization and audit across ephemeral and long-lasting clusters. The lab is not limited to custom blueprints: it also shows how Cloudbreak's easy-to-use custom scripts, called recipes, can be executed before or after Ambari start, or after cluster installation.
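As a rough illustration of the declarative approach described above, an Ambari blueprint is a JSON document that names a stack version and declares host groups with their components. The blueprint name, cardinalities, and component lists below are hypothetical, not taken from the lab:

```python
import json

# A minimal, hypothetical Ambari blueprint: each host group declares which
# components land on that class of nodes; Cloudbreak submits JSON like this
# to Ambari when provisioning the cluster.
blueprint = {
    "Blueprints": {
        "blueprint_name": "hdp-streaming-demo",  # hypothetical name
        "stack_name": "HDP",
        "stack_version": "2.6",
    },
    "host_groups": [
        {
            "name": "master",
            "cardinality": "1",
            "components": [
                {"name": "NAMENODE"},
                {"name": "RESOURCEMANAGER"},
                {"name": "ZOOKEEPER_SERVER"},
            ],
        },
        {
            "name": "worker",
            "cardinality": "3+",
            "components": [{"name": "DATANODE"}, {"name": "NODEMANAGER"}],
        },
    ],
}

print(json.dumps(blueprint, indent=2))
```

Because the cluster shape lives in data rather than in manual steps, changing the topology (say, adding a Kafka broker component to a host group) is an edit to this document, not a re-install.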
Pre-requisites
Registrants must bring a laptop for the lab. These labs will be done in the cloud; please follow the steps below to set up an AWS or Azure account before the session starts.
a. Before launching Cloudbreak on AWS, you must meet the AWS prerequisites.
b. Before launching Cloudbreak on Azure, you must meet the Azure prerequisites.
Speaker: Santosh Gowda
Apache Hive has been continuously evolving to support a broad range of use cases, bringing it beyond its batch processing roots to its current support for interactive queries with sub-second response times using LLAP. However, the development of its execution internals is not sufficient to guarantee efficient performance, since poorly optimized queries can create a bottleneck in the system. Hence, each release of Hive has included new features for its optimizer aimed to generate better plans and deliver improvements to query execution. In this talk, we present the development of the optimizer since its initial release. We describe its current state and how Hive leverages the latest Apache Calcite features to generate the most efficient execution plans. We show numbers demonstrating the improvements brought to Hive performance, and we discuss future directions for the next-generation Hive optimizer, which include an enhanced cost model, materialized views support, and complex query decorrelation.
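To make the idea of cost-based planning concrete, here is a toy sketch (plain Python, not Hive or Calcite code; the table sizes and selectivity are invented) of how an optimizer compares join orders by estimating the size of intermediate results:

```python
from itertools import permutations

# Toy cost model: estimate the size of each intermediate join result and
# pick the join order with the lowest total cost. Real optimizers such as
# Apache Calcite use much richer statistics, but the principle is the same.
table_rows = {"orders": 1_000_000, "customers": 50_000, "regions": 10}
selectivity = 0.0001  # assumed fraction of the cross product each join keeps

def plan_cost(order):
    rows, cost = table_rows[order[0]], 0
    for t in order[1:]:
        rows = rows * table_rows[t] * selectivity  # estimated intermediate size
        cost += rows                               # pay for materializing it
    return cost

# Exhaustively compare all join orders; small tables first keeps
# intermediates small, so the large table ends up joined last.
best = min(permutations(table_rows), key=plan_cost)
print(best, plan_cost(best))
```

This is why poorly optimized queries bottleneck the system regardless of execution-engine speed: a bad join order can make intermediates orders of magnitude larger than the final result.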
Learn more: http://hortonworks.com/hdf/
Log data can be complex to capture; it is typically collected in limited amounts and is difficult to operationalize at scale. HDF expands the integration options for log analytics, enabling easy and secure edge analytics of log files in the following ways:
More efficient collection and movement of log data by prioritizing, enriching and/or transforming data at the edge to dynamically separate critical data. The relevant data is then delivered into log analytics systems in a real-time, prioritized and secure manner.
Cost-effective expansion of existing log analytics infrastructure by improving error detection and troubleshooting through more comprehensive data sets.
Intelligent edge analytics to support real-time content-based routing, prioritization, and simultaneous delivery of data into Connected Data Platforms, log analytics and reporting systems for comprehensive coverage and retention of Internet of Anything data.
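The prioritize-and-route pattern described above can be sketched in a few lines. This is an illustration of the logic, not HDF configuration; in HDF it would be expressed with NiFi processors rather than Python, and the priority markers here are invented:

```python
# Illustrative content-based routing for log records at the edge:
# critical records are forwarded to the log-analytics system immediately,
# while the rest are batched for cheaper bulk delivery.
CRITICAL_MARKERS = ("ERROR", "FATAL", "SECURITY")

def route(record: str) -> str:
    """Classify a log line for delivery: 'realtime' or 'batch'."""
    if any(marker in record for marker in CRITICAL_MARKERS):
        return "realtime"   # deliver immediately, prioritized
    return "batch"          # queue for periodic bulk transfer

logs = [
    "2017-03-01 INFO service started",
    "2017-03-01 ERROR disk quota exceeded",
    "2017-03-01 SECURITY failed login from 10.0.0.7",
]
routed = {line: route(line) for line in logs}
```

Separating critical from routine data at the edge is what lets the same pipeline feed real-time alerting and cost-effective bulk retention at once.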
Powering Big Data Success On-Prem and in the Cloud (Hortonworks)
How do you optimize Apache Spark workloads in the cloud? How do you tune your resources for maximum performance and efficiency? Find out how the new Hortonworks Flex Support Subscription enables IT agility and success in the cloud. We will cover:
* Options for running Data Science, Analytics and ETL workloads in the cloud
* Hortonworks support offerings including new Flex Support Subscription
* How to run Cloud workloads more efficiently with SmartSense
* Case study on the impact of SmartSense
https://hortonworks.com/webinar/powering-big-data-success-cloud/
Verizon Centralizes Data into a Data Lake in Real Time for Analytics (Hortonworks)
Verizon Global Technology Services (GTS) was challenged by a multi-tier, labor-intensive process when trying to migrate data from disparate sources into a data lake to create financial reports and business insights.
View the webinar on-demand here: https://hortonworks.com/webinar/verizon-centralizes-data-into-data-lake/
Hortonworks Data In Motion Series Part 3 - HDF Ambari (Hortonworks)
How To: Hortonworks DataFlow 2.0 with Ambari and Ranger for integrated installation, deployment and operations of Apache NiFi.
On demand webinar with demo: http://hortonworks.com/webinar/getting-goal-big-data-faster-enterprise-readiness-data-motion/
Deep Learning on YARN: Running Distributed TensorFlow etc. on a Hadoop Cluster (DataWorks Summit)
Deep learning is useful for enterprise tasks in the fields of speech recognition, image classification, AI chatbots, and machine translation, just to name a few.
To train deep learning and machine learning models, frameworks such as TensorFlow, MXNet, Caffe, and XGBoost can be leveraged, and sometimes these frameworks are used together to solve different problems.
To make distributed deep learning/machine learning applications easy to launch, manage, and monitor, we introduced YARN native services in Apache Hadoop 3.x, along with other improvements such as first-class GPU support, container DNS support, and scheduling improvements. These improvements make distributed deep learning/machine learning applications as simple to run on YARN as locally, letting machine learning engineers focus on algorithms instead of worrying about the underlying infrastructure. With these improvements, YARN can also better manage a shared cluster that runs deep learning/machine learning workloads alongside other services and ETL jobs.
In this session, we will take a closer look at these improvements and show, with demos, how to run these applications on YARN. Attendees can start running these applications on YARN after this talk.
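The YARN native services mentioned above are driven by a JSON service spec (a "Yarnfile"). As a rough sketch of what a distributed TensorFlow job might declare, with one parameter server and several workers: the service name, commands, and resource numbers below are hypothetical, and field names should be verified against your Hadoop 3.x version before use:

```python
import json

# Illustrative YARN native-services spec ("Yarnfile") for a distributed
# TensorFlow job. Each component gets its own container count, launch
# command, and resource request; YARN handles placement and monitoring.
service = {
    "name": "distributed-tf-demo",   # hypothetical service name
    "version": "1.0",
    "components": [
        {
            "name": "ps",            # parameter server
            "number_of_containers": 1,
            "launch_command": "python train.py --role=ps",
            "resource": {"cpus": 2, "memory": "4096"},
        },
        {
            "name": "worker",
            "number_of_containers": 4,
            "launch_command": "python train.py --role=worker",
            "resource": {"cpus": 4, "memory": "8192"},
        },
    ],
}

print(json.dumps(service, indent=2))
```

Container DNS support is what makes this practical: workers can address the parameter server by a stable hostname instead of discovering dynamically assigned container addresses themselves.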
Speakers
Wangda Tan, Staff Software Engineer, Hortonworks
Sunil Govindan, Staff Engineer, Hortonworks
Hortonworks Technical Workshop: What's New in HDP 2.3 (Hortonworks)
The recently launched HDP 2.3 is a major advancement of Open Enterprise Hadoop. It represents the best of community-led development, with innovations spanning Apache Hadoop, Apache Ambari, Ranger, HBase, Spark and Storm. In this session we will provide an in-depth overview of new functionality and discuss its impact on new and ongoing big data initiatives.
Apache Hadoop YARN is the modern Distributed Operating System. It enables the Hadoop compute layer to be a common resource-management platform that can host a wide variety of applications. Multiple organizations are able to leverage YARN in building their applications on top of Hadoop without themselves repeatedly worrying about resource management, isolation, multi-tenancy issues etc.
In this talk, we’ll first hit the ground with the current status of Apache Hadoop YARN – how it is faring today in deployments large and small. We will cover different types of YARN deployments, in different environments and scale.
We'll then move on to the exciting present and future of YARN: features that are further strengthening YARN as the first-class resource-management platform for datacenters running enterprise Hadoop. We'll discuss the current status as well as the future promise of features and initiatives such as 10x scheduler throughput improvements, Docker container support on YARN, native support for long-running services (alongside applications) without any changes, seamless application upgrades, fine-grained isolation for multi-tenancy using cgroups on disk and network resources, powerful scheduling features like application priorities and intra-queue preemption across applications, and operational enhancements including insights through Timeline Service V2, a new web UI and better queue management.
Speaker:
Sunil Govindan, Senior Software Engineer, Hortonworks
Rohith Sharma K S, Senior Software Engineer, Hortonworks
Analytics Modernization: Configuring SAS® Grid Manager for Hadoop (Hortonworks)
Improve the efficiency and accelerate job execution by moving traditional SAS workloads into Hadoop to modernize and optimize SAS analytics. How can we run traditional SAS® jobs, including SAS® Workspace Servers, on Hadoop worker nodes? The answer is SAS® Grid Manager for Hadoop, which is integrated with the Hadoop ecosystem to provide resource management, high availability and enterprise scheduling for SAS customers. By moving SAS workloads inside the Hadoop cluster, efficiency is improved and job execution is accelerated. We will also cover the role of Hadoop YARN, Hadoop Distributed File System (HDFS) storage, and Hadoop client services. We review SAS metadata definitions for SAS Grid Manager, SAS® Object Spawner, and SAS® Workspace Servers.
Audio broadcast: https://hortonworks.com/webinar/configuring-sas-grid-manager-hadoop/
How Universities Use Big Data to Transform Education (Hortonworks)
Student performance data is increasingly being captured as part of software-based and online classroom exercises and testing. This data can be augmented with behavioral data captured from sources such as social media, student-professor meeting notes, blogs, student surveys, and so forth to discover new insights to improve student learning. The results transcend traditional IT departments to focus on issues like retention, research, and the delivery of content and courses through new modalities.
Hortonworks is partnering with Microsoft to show you how the Hortonworks Data Platform (HDP) running on the Microsoft stack enables you to develop a “single view of a student”.
The path to a Modern Data Architecture in Financial Services (Hortonworks)
Delivering Data-Driven Applications at the Speed of Business: Global Banking AML use case.
Chief Data Officers in financial services have unique challenges: they need to establish an effective data ecosystem under strict governance and regulatory requirements. They need to build the data-driven applications that enable risk and compliance initiatives to run efficiently. In this webinar, we will discuss the case of a global banking leader and the anti-money laundering solution they built on the data lake. With a single platform to aggregate structured and unstructured information essential to determine and document AML case disposition, they reduced mean time for case resolution by 75%. They have a roadmap for building over 150 data-driven applications on the same search-based data discovery platform so they can mitigate risks and seize opportunities, at the speed of business.
Dynamic Column Masking and Row-Level Filtering in HDP (Hortonworks)
As enterprises around the world bring more of their sensitive data into Hadoop data lakes, balancing the need for democratization of access to data without sacrificing strong security principles becomes paramount. In this webinar, Srikanth Venkat, director of product management for security & governance will demonstrate two new data protection capabilities in Apache Ranger – dynamic column masking and row level filtering of data stored in Apache Hive. These features have been introduced as part of HDP 2.5 platform release.
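To illustrate what these two policies do, the sketch below shows their semantics in plain Python. This is not Ranger's API or Hive code, and the table, column names, and entitlements are invented; in HDP the policies are defined in Ranger and enforced transparently inside Hive:

```python
# Plain-Python illustration of the two Apache Ranger policy effects on a
# Hive table read: row-level filtering drops rows the user is not entitled
# to see, then column masking rewrites the sensitive column on the rest.
rows = [
    {"account": "4539876512340001", "region": "EU", "balance": 1200},
    {"account": "4716000011112222", "region": "US", "balance": 900},
]

def mask_account(value: str) -> str:
    # "show last 4" style masking, as in Ranger's built-in mask types
    return "x" * (len(value) - 4) + value[-4:]

def read_table(user_regions):
    """What a query sees: filtered rows with the account column masked."""
    return [
        {**r, "account": mask_account(r["account"])}
        for r in rows
        if r["region"] in user_regions
    ]

visible = read_table(user_regions={"EU"})
```

The key property is that both effects apply dynamically at query time, per user, without creating masked copies of the underlying data.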
Hortonworks Data in Motion Webinar Series Part 7: Apache Kafka NiFi Better Tog... (Hortonworks)
Apache NiFi, Storm and Kafka augment each other in modern enterprise architectures. NiFi provides a coding free solution to get many different formats and protocols in and out of Kafka and compliments Kafka with full audit trails and interactive command and control. Storm compliments NiFi with the capability to handle complex event processing.
Join us to learn how Apache NiFi, Storm, and Kafka can augment each other to create a new data plane connecting multiple systems within your enterprise with ease, speed, and increased productivity.
https://www.brighttalk.com/webcast/9573/224063
Pivotal - Advanced Analytics for Telecommunications Hortonworks
Innovative mobile operators need to mine the vast troves of unstructured data now available to them to help develop compelling customer experiences and uncover new revenue opportunities. In this webinar, you’ll learn how HDB’s in-database analytics enable advanced use cases in network operations, customer care, and marketing for better customer experience. Join us, and get started on your advanced analytics journey today!
Top 5 Strategies for Retail Data AnalyticsHortonworks
It’s an exciting time for retailers, as technology is driving a major disruption in the market. Whether you are just beginning to build a retail data analytics program or you have been gaining advanced insights from your data for quite some time, join Eric and Shish as we explore the trends, drivers, and hurdles in retail data analytics.
SAS - Hortonworks: Creating the Omnichannel Experience in Retail webinar marc...Hortonworks
Only 23% of businesses can integrate customer insights in real-time. Learn how to change that. Join us to hear from industry experts on how to transform your organization’s data into the best omnichannel customer experience. Through this webinar, participants will hear how one retailer, with over 5 million customers and 750 brands, developed precise customer lifetime models using trusted data and delivered personalized promotions at scale. Through a single customer view and customer analytics, the retailer was able to quickly learn what changes needed to be made to improve the customer buying journey, and make those changes rapidly and effectively.
Presenters : Dan Mitchell, Director of Global Retail and CPG Practice at SAS, Eric Thorsen, VP Retail at Hortonworks
Real time trade surveillance in financial marketsHortonworks
Who’s winning the deep forensic analysis ‘arms race’ for compliance? Real-time trade surveillance in global financial markets has created a data tsunami. With greater volumes of data comes greater compliance risk. CNBC reports U.S. Banks have been fined over $200B since the financial crisis. How are compliance teams fighting back to make more of the data and stay out of regulatory hot water? Rapid response to suspect trades means compliance teams need to access and visualize trade patterns, real time and historic data, to navigate the data in depth and flag possible violations. Join Hortonworks and Arcadia for this live webinar: we’ll cover the use case at a top 50 Global Bank who now has deep forensic analysis of trade activity. The result: interactive, ad hoc data visualization and access across multiple platforms – without limits on historic data – to detect irregularities as they happen. In-depth expert presentations by:
Shailesh Ambike, Executive Co-Chair of Compliance & Legal Section (CLS) Education Sub-Committee of the Investment Industry Regulatory Organization of Canada (IIROC)
Vamsi K Chemitiganti, GM – Financial Services at Hortonworks
Enabling the Real Time Analytical EnterpriseHortonworks
Combining IoT, customer experience, and real-time enterprise data within Hadoop. What if you could derive real-time insights using ALL of your data? Join us for this webinar and learn how companies are combining “new” real-time data sources (e.g., IoT, social, web logs) with continuously updated enterprise data from SAP and other enterprise transactional systems, providing deep and up-to-the-second analytical insights. This presentation includes a demonstration of how this can be achieved quickly, easily, and affordably using a joint solution from Attunity and Hortonworks.
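At its core, change data capture (CDC) delivers a stream of insert/update/delete events rather than repeated full-table snapshots, and the target keeps itself current by replaying that stream. A toy sketch of applying such a stream to an in-memory "target table" (the event shape is invented for illustration; Attunity Replicate's actual wire formats differ):

```python
def apply_changes(table, events):
    """Apply CDC events (op, key, data) to an in-memory target table."""
    for op, key, data in events:
        if op in ("insert", "update"):
            table[key] = data          # upsert the changed row
        elif op == "delete":
            table.pop(key, None)       # remove the deleted row
    return table

target = {1: {"balance": 100}}
events = [
    ("update", 1, {"balance": 250}),   # balance changed at the source
    ("insert", 2, {"balance": 75}),    # new row at the source
    ("delete", 1, None),               # row removed at the source
]
print(apply_changes(target, events))   # {2: {'balance': 75}}
```

Because only changes cross the wire, the target stays up-to-the-second without re-extracting the whole source table.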
Hortonworks Technical Workshop: Operations with AmbariHortonworks
Ambari continues on its journey of provisioning, monitoring, and managing enterprise Hadoop deployments. With 2.0, Apache Ambari brings a host of new capabilities, including updated metrics collection, Kerberos setup automation, and developer views for big data developers. In this Hortonworks Technical Workshop session we provide an in-depth look at Apache Ambari 2.0 and showcase security setup automation using Ambari 2.0. View the recording at https://www.brighttalk.com/webcast/9573/155575. View the GitHub demo work at https://github.com/abajwa-hw/ambari-workshops/blob/master/blueprints-demo-security.md. Recorded May 28, 2015.
The Apache Hadoop community is gearing up for the upcoming release of Apache Hadoop 0.23, the first major release since 0.20 in 2009. This release brings major enhancements to Hadoop, such as HDFS Federation for hyper-scale and a next-generation MapReduce framework. Arun, the Apache Hadoop release master for 0.23, will cover the highlights of the release and the efforts undertaken to test, stabilize, and release Hadoop.next. The talk covers the timelines for the release, plans for compatibility, and upgrade paths for existing users of Hadoop.
Presented at Bay Area Hadoop User Group at Yahoo on 8/25/2011.
see the recording: http://youtu.be/qdhF1sfef10
Ofer Mendelevitch, Director of Data Science at Hortonworks, and Michael Zeller, Founder and CEO of Zementis, present key lessons on what drives successful implementations of big data analytics projects. Their knowledge comes from working with dozens of companies, from small cloud-based start-ups to some of the largest companies in the world.
Think of big data as all data, no matter what the volume, velocity, or variety. The simple truth is that a traditional on-prem data warehouse will not handle big data. So what is Microsoft’s strategy for building a big data solution? And why is it best to have this solution in the cloud? That is what this presentation covers. Be prepared to discover the various Microsoft technologies and products for collecting data, transforming it, storing it, and visualizing it. My goal is to help you understand not only each product but also how they all fit together, so you can be the hero who builds your company’s big data solution.
PASS Summit Data Storytelling with R Power BI and AzureMLJen Stirrup
How can we use technology to help the organization make data-driven decision-making part of its organizational DNA, while retaining the context of the business as a whole? How can we imprint data in the culture of the organization and make it easily accessible to everyone? Microsoft directly empowers businesses to derive insights and value from little and big data, through its release of user-friendly analytics through Azure Machine Learning (ML) combined with its acquisition of Revolution Analytics. Power BI can be used to create compelling visual stories around the analysis so that the work is not left to the data consumer. Together, these technologies can be used to make data and analytics part of the organization's DNA.
There are no prerequisites, but attendees are welcome to follow along with the demo if they have an Azure ML and Power BI account and R installed. Files will be released before the session.
This is a 200-level run-through of the Microsoft Azure big data analytics cloud data platform, based on the Cortana Intelligence Suite offerings.
The first issue in the « Livrets de la France insoumise » collection details the emergency measures and broad policy directions on agriculture and food. It was prepared by a working group led by Laurent Levard, agro-economist, and Eve Saymard, agronomist.
https://avenirencommun.fr/livrets-thematiques/
Calista Redmond from IBM presented this deck at the Switzerland HPC Conference.
“The OpenPOWER Foundation was founded in 2013 as an open technical membership organization that will enable data centers to rethink their approach to technology. Today, nearly 200 member companies are enabled to customize POWER CPU processors and system platforms for optimization and innovation for their business needs. These innovations include custom systems for large or warehouse-scale data centers, workload acceleration through GPU, FPGA or advanced I/O, platform optimization for SW appliances, or advanced hardware technology exploitation. OpenPOWER members are actively pursuing all of these innovations and more, and welcome all parties to join in moving the state of the art of OpenPOWER systems design forward.”
Watch the video presentation: http://insidehpc.com/2016/03/openpower-foundation/
See more talks in the Swiss Conference Video Gallery: http://insidehpc.com/2016-swiss-hpc-conference/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
In this video from Moabcon 2013, Dick Bland and Jérôme Labat from HP present: The New Style of IT: HP Update for Moabcon 2013.
"Cloud, Mobility, Security, and Big Data are transforming what the business expects from IT resulting in a “New Style of IT.” The result of alternative thinking from a proven industry leader, HP Moonshot is the world’s first software defined server that will accelerate innovation while delivering breakthrough efficiency and scale."
While the first spin of Moonshot is not targeted at HPC, Bland said that HP will be able to spin up new modules for the platform that could include FPGAs and ARM-based nodes more suited to high performance computing.
Learn more at: http://www.adaptivecomputing.com/company/news-and-events/events/moabcon-2013/moabcon-2013-full-agenda/
You can watch the video of this talk at this URL: http://inside-cloud.com/2013/04/video-the-new-style-of-it-hp-moonshot-update-for-moabcon-2013/
Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: To...Precisely
So you built your Hadoop cluster. How do you get data from hundreds of database tables, streaming Kafka sources, and data shared by 20-year-old COBOL programs, all in there and working together quickly, efficiently and securely? With many customers asking this same question, Hortonworks recently expanded its partnership with Syncsort to provide optimized ETL onboarding for Hadoop. During this talk, we'll discuss how a next-generation ETL tool, built on contributions to the open source community and natively integrated in Hadoop, can drive lasting value for your organization. 1) Seamlessly onboard data from all your enterprise sources – batch and streaming -- into Hadoop for fast and easy analytics. 2) Stay agile and simplify your environment with a "design once, deploy anywhere" approach that minimizes disruption and risk in the face of a rapidly evolving big data ecosystem. 3) Secure, govern and manage your data with full integration with Apache Ambari, Apache Ranger, and more. These benefits come to life with real customer case studies. Learn how a national insurance company and global hotel chain are using Hortonworks HDP and Syncsort DMX-h to get bigger insights from their enterprise data, securely, efficiently, and cost-effectively, without spending hundreds of man-hours.
Red Hat Summit 2015: Red Hat Storage Breakfast sessionRed_Hat_Storage
See the presentation shared during a special breakfast session during Red Hat Summit 2015. Learn about our mission, what areas and communities are seeing strong growth, and much more.
2016 Sept 1st - IBM Consultants & System Integrators Interchange - Big Data -...Anand Haridass
An unprecedented increase in the use of digital devices is causing an explosion in the amount of data generated and captured by businesses. The need to extract economic value from all this "big data", which has the potential to transform businesses completely, is immense and drives a whole slew of new workloads. Organizations need to continuously align strategy, business processes, and infrastructure investments to derive these insights. This session will discuss how solutions based on POWER deliver this in a cost-effective, open, scalable, high-performing, and reliable manner.
A modern, flexible approach to Hadoop implementation incorporating innovation...DataWorks Summit
A modern, flexible approach to Hadoop implementation incorporating innovations from HP Haven
Jeff Veis
Vice President
HP Software Big Data
Gilles Noisette
Master Solution Architect
HP EMEA Big Data CoE
Red Hat Enterprise Linux 7.1 on IBM Power Systems helps you improve performance, increase reliability and reduce costs. See how Power Systems helps you put your data to work. Learn more: http://ibm.co/1JMO9MV
Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next LevelHortonworks
The HDF 3.3 release delivers several exciting enhancements and new features, the most noteworthy of which is the addition of support for Kafka 2.0 and Kafka Streams.
https://hortonworks.com/webinar/hortonworks-dataflow-hdf-3-3-taking-stream-processing-next-level/
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT StrategyHortonworks
Forrester forecasts* that direct spending on the Internet of Things (IoT) will exceed $400 Billion by 2023. From manufacturing and utilities, to oil & gas and transportation, IoT improves visibility, reduces downtime, and creates opportunities for entirely new business models.
But successful IoT implementations require far more than simply connecting sensors to a network. The data generated by these devices must be collected, aggregated, cleaned, processed, interpreted, understood, and used. Data-driven decisions and actions must be taken, without which an IoT implementation is bound to fail.
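The collect/clean/aggregate steps above can be sketched in a few lines. The sensor readings and the validity range below are made up for illustration; a real pipeline would do this continuously over streaming data:

```python
# Raw (sensor, temperature) readings as they might arrive from devices.
readings = [
    ("pump-1", 42.0), ("pump-1", 43.0), ("pump-1", 999.0),  # 999.0: sensor glitch
    ("pump-2", 36.0), ("pump-2", 37.0),
]

def clean(rows, low=-50.0, high=150.0):
    """Drop readings outside a plausible physical range."""
    return [(sensor, v) for sensor, v in rows if low <= v <= high]

def aggregate(rows):
    """Average reading per sensor."""
    totals = {}
    for sensor, v in rows:
        s, n = totals.get(sensor, (0.0, 0))
        totals[sensor] = (s + v, n + 1)
    return {sensor: s / n for sensor, (s, n) in totals.items()}

print(aggregate(clean(readings)))
# {'pump-1': 42.5, 'pump-2': 36.5}
```

Without the cleaning step, the single glitched reading would pull pump-1's average wildly off, which is exactly the kind of silent failure that sinks IoT implementations.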
https://hortonworks.com/webinar/iot-predictions-2019-beyond-data-heart-iot-strategy/
Getting the Most Out of Your Data in the Cloud with CloudbreakHortonworks
Cloudbreak, a part of Hortonworks Data Platform (HDP), simplifies provisioning and cluster management within any cloud environment, helping your business on its path to a hybrid cloud architecture.
https://hortonworks.com/webinar/getting-data-cloud-cloudbreak-live-demo/
Johns Hopkins - Using Hadoop to Secure Access Log EventsHortonworks
In this webinar, we talk with experts from Johns Hopkins as they share techniques and lessons learned in real-world Apache Hadoop implementation.
https://hortonworks.com/webinar/johns-hopkins-using-hadoop-securely-access-log-events/
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad GuysHortonworks
Cybersecurity today is a big data problem. There’s a ton of data landing on you faster than you can load it, let alone search it. To make sense of it, we need to act on data-in-motion, using both machine learning and the most advanced pattern-recognition system on the planet: your SOC analysts. Advanced visualization makes your analysts more efficient, helping them find the hidden gems, or bombs, in masses of logs and packets.
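As a toy illustration of the kind of pattern a SOC pipeline flags on data-in-motion, consider counting failed logins per source IP over a window and flagging bursts. The log format and threshold are invented for the sketch; real detection combines many such signals:

```python
from collections import Counter

log_lines = [
    "FAILED login user=admin src=10.0.0.5",
    "FAILED login user=root src=10.0.0.5",
    "OK login user=alice src=10.0.0.7",
    "FAILED login user=admin src=10.0.0.5",
]

def flag_bursts(lines, threshold=3):
    """Flag source IPs with at least `threshold` failed logins in the window."""
    fails = Counter(
        line.split("src=")[1] for line in lines if line.startswith("FAILED")
    )
    return [ip for ip, n in fails.items() if n >= threshold]

print(flag_bursts(log_lines))  # ['10.0.0.5']
```

A rule this simple drowns analysts in false positives on its own; the visualization layer described above is what lets a human quickly separate a brute-force attempt from a user who forgot a password.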
https://hortonworks.com/webinar/catch-hacker-real-time-live-visuals-bots-bad-guys/
We have introduced several new features as well as delivered some significant updates to keep the platform tightly integrated and compatible with HDP 3.0.
https://hortonworks.com/webinar/hortonworks-dataflow-hdf-3-2-release-raises-bar-operational-efficiency/
Curing Kafka Blindness with Hortonworks Streams Messaging ManagerHortonworks
With the growth of Apache Kafka adoption in all major streaming initiatives across large organizations, the operational and visibility challenges associated with Kafka are on the rise as well. Kafka users want better visibility in understanding what is going on in the clusters as well as within the stream flows across producers, topics, brokers, and consumers.
With no tools in the market that readily address the challenges of the Kafka Ops teams, the development teams, and the security/governance teams, Hortonworks Streams Messaging Manager is a game-changer.
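One of the core visibility metrics such a tool surfaces is consumer lag: how far a consumer group's committed offsets trail the newest offsets in each partition. The arithmetic itself is simple (the offset numbers below are invented; a real monitor reads them from the brokers):

```python
def consumer_lag(log_end_offsets, committed_offsets):
    """Per-partition lag = newest offset in the log minus committed offset."""
    return {
        p: log_end_offsets[p] - committed_offsets.get(p, 0)
        for p in log_end_offsets
    }

log_end = {0: 1500, 1: 1480}     # newest offset per partition (broker side)
committed = {0: 1500, 1: 1200}   # consumer group's committed offsets

print(consumer_lag(log_end, committed))  # {0: 0, 1: 280}
```

A steadily growing lag on one partition is exactly the "blindness" symptom: the consumer is falling behind, and without per-partition visibility the Ops team only sees the downstream effects.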
https://hortonworks.com/webinar/curing-kafka-blindness-hortonworks-streams-messaging-manager/
Interpretation Tool for Genomic Sequencing Data in Clinical EnvironmentsHortonworks
The healthcare industry—with its huge volumes of big data—is ripe for the application of analytics and machine learning. In this webinar, Hortonworks and Quanam present a tool that uses machine learning and natural language processing in the clinical classification of genomic variants to help identify mutations and determine clinical significance.
Watch the webinar: https://hortonworks.com/webinar/interpretation-tool-genomic-sequencing-data-clinical-environments/
IBM+Hortonworks = Transformation of the Big Data LandscapeHortonworks
Last year IBM and Hortonworks jointly announced a strategic and deep partnership. Join us as we take a close look at the partnership accomplishments and the conjoined road ahead with industry-leading analytics offers.
View the webinar here: https://hortonworks.com/webinar/ibmhortonworks-transformation-big-data-landscape/
In this exclusive Premier Inside Out, you will hear from Druid committer Slim Bouguerra, Staff Software Engineer and Product Manager Will Xu. These Hortonworkers will explain the vision of these components, review new features, share some best practices and answer your questions.
View the webinar here: https://hortonworks.com/webinar/hortonworks-premier-apache-druid/
Accelerating Data Science and Real Time Analytics at ScaleHortonworks
Gaining business advantages from big data is moving beyond just the efficient storage and deep analytics on diverse data sources to using AI methods and analytics on streaming data to catch insights and take action at the edge of the network.
https://hortonworks.com/webinar/accelerating-data-science-real-time-analytics-scale/
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATAHortonworks
Thanks to sensors and the Internet of Things, industrial processes now generate a sea of data. But are you plumbing its depths to find the insight it contains, or are you just drowning in it? Now, Hortonworks and Seeq team to bring advanced analytics and machine learning to time-series data from manufacturing and industrial processes.
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...Hortonworks
Trimble Transportation Enterprise is a leading provider of enterprise software to over 2,000 transportation and logistics companies. They have designed an architecture that leverages Hortonworks Big Data solutions and Machine Learning models to power up multiple Blockchains, which improves operational efficiency, cuts down costs and enables building strategic partnerships.
https://hortonworks.com/webinar/blockchain-with-machine-learning-powered-by-big-data-trimble-transportation-enterprise/
Delivering Real-Time Streaming Data for Healthcare Customers: ClearsenseHortonworks
For years, the healthcare industry has had problems with data scarcity and latency. Clearsense solved the problem by building an open-source Hortonworks Data Platform (HDP) solution while providing decades’ worth of clinical expertise. Clearsense delivers smart, real-time streaming data to its healthcare customers, enabling mission-critical data to feed clinical decisions.
https://hortonworks.com/webinar/delivering-smart-real-time-streaming-data-healthcare-customers-clearsense/
Making Enterprise Big Data Small with EaseHortonworks
Every division in an organization builds its own database to keep track of its business. As the organization grows, those individual databases grow as well, and the data in each becomes siloed, disconnected from the data in the other databases.
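Breaking down such silos usually starts with joining each division's records on a shared key, such as a customer ID. A minimal sketch with invented sales and support "databases":

```python
# Two siloed per-division stores, keyed by a shared customer ID.
sales = {"C001": {"orders": 12}, "C002": {"orders": 3}}
support = {"C001": {"tickets": 1}, "C003": {"tickets": 4}}

def join_silos(*silos):
    """Merge per-customer records from several siloed stores into one view."""
    merged = {}
    for silo in silos:
        for key, record in silo.items():
            merged.setdefault(key, {}).update(record)
    return merged

print(join_silos(sales, support))
# {'C001': {'orders': 12, 'tickets': 1}, 'C002': {'orders': 3}, 'C003': {'tickets': 4}}
```

In practice the hard part is not the join but agreeing on the shared key and reconciling conflicting records, which is why this is a data management problem as much as a technical one.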
https://hortonworks.com/webinar/making-enterprise-big-data-small-ease/
Driving Digital Transformation Through Global Data ManagementHortonworks
Using your data smarter and faster than your peers could be the difference between dominating your market and merely surviving. Organizations are investing in IoT, big data, and data science to drive better customer experience and create new products, yet these projects often stall in the ideation phase due to a lack of global data management processes and technologies. Your new data architecture may be taking shape around you, but your goal of globally managing, governing, and securing your data across a hybrid, multi-cloud landscape can remain elusive. Learn how industry leaders are developing their global data management strategies to drive innovation and ROI.
Presented at Gartner Data and Analytics Summit
Speaker:
Dinesh Chandrasekhar
Director of Product Marketing, Hortonworks
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming FeaturesHortonworks
Hortonworks DataFlow (HDF) is the complete solution that addresses the most complex streaming architectures of today’s enterprises. More than 20 billion IoT devices are active on the planet today, and thousands of use cases across IIoT, healthcare, and manufacturing warrant capturing data-in-motion and delivering actionable intelligence right NOW. “Data decay” happens in a matter of seconds in today’s digital enterprises.
To meet all the needs of such fast-moving businesses, we have made significant enhancements and new streaming features in HDF 3.1.
https://hortonworks.com/webinar/series-hdf-3-1-technical-deep-dive-new-streaming-features/
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...Hortonworks
Join the Hortonworks product team as they introduce HDF 3.1 and the core components for a modern data architecture to support stream processing and analytics.
You will learn about the three main themes that HDF addresses:
Developer productivity
Operational efficiency
Platform interoperability
https://hortonworks.com/webinar/series-hdf-3-1-redefining-data-motion-modern-data-architectures/
Unlock Value from Big Data with Apache NiFi and Streaming CDCHortonworks
Apache NiFi is an easy-to-use, powerful, and reliable system for processing and distributing data. It provides an end-to-end platform that can collect, curate, analyze, and act on data in real time, on-premises or in the cloud, with a drag-and-drop visual interface. It is being used across industries on large amounts of data that had been stored in isolation, which made collaboration and analysis difficult.
Join industry experts from Hortonworks and Attunity as they explain how Apache NiFi and streaming CDC technology provides a distributed, resilient platform for unlocking the value of data in new ways.
Accelerate your Kubernetes clusters with Varnish CachingThijs Feryn
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
GraphRAG is All You need? LLM & Knowledge GraphGuy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
"Impact of front-end architecture on development cost", Viktor TurskyiFwdays
I have heard many times that architecture is not important for the front-end. Also, many times I have seen how developers implement features on the front-end just following the standard rules for a framework and think that this is enough to successfully launch the project, and then the project fails. How to prevent this and what approach to choose? I have launched dozens of complex projects and during the talk we will analyze which approaches have worked for me and which have not.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. A constant focus on speed to market, combined with traditionally slow, manual security checks, has caused gaps in continuous security, an important piece of the software supply chain. Today, organizations feel more susceptible to external and internal cyber threats due to the vast attack surface of their application supply chains and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with a passion for technology and making things work, along with a knack for helping others understand how things work. He has around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations on CI/CD and application security integrated into the software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
The Art of the Pitch: WordPress Relationships and SalesLaura Byrne
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if something changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips, and strategies for successful relationship building that leads to closing the deal.
UiPath Test Automation using UiPath Test Suite series, part 3DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 3. In this session, we will cover desktop automation along with UI automation.
Topics covered:
UI automation introduction
UI automation sample
Desktop automation flow
Pradeep Chinnala, Senior Consultant Automation Developer @WonderBotz and UiPath MVP
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualityInflectra
In this insightful webinar, Inflectra explores how artificial intelligence (AI) is transforming software development and testing. Discover how AI-powered tools are revolutionizing every stage of the software development lifecycle (SDLC), from design and prototyping to testing, deployment, and monitoring.
Learn about:
• The Future of Testing: How AI is shifting testing towards verification, analysis, and higher-level skills, while reducing repetitive tasks.
• Test Automation: How AI-powered test case generation, optimization, and self-healing tests are making testing more efficient and effective.
• Visual Testing: Explore the emerging capabilities of AI in visual testing and how it's set to revolutionize UI verification.
• Inflectra's AI Solutions: See demonstrations of Inflectra's cutting-edge AI tools like the ChatGPT plugin and Azure Open AI platform, designed to streamline your testing process.
Whether you're a developer, tester, or QA professional, this webinar will give you valuable insights into how AI is shaping the future of software delivery.
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
Let's dive deeper into the world of ODC! Ricardo Alves (OutSystems) will join us to tell all about the new Data Fabric. After that, Sezen de Bruijn (OutSystems) will get into the details on how to best design a sturdy architecture within ODC.
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
DevOps and Testing slides at DASA ConnectKari Kakkonen
Slides by me and Rik Marselis from the DASA Connect conference on 30.5.2024. We discuss what testing is, then what agile testing is, and finally what testing in DevOps is. We closed with a lovely workshop in which the participants tried to find different ways to think about quality and testing in different parts of the DevOps infinity loop.
8. Apache Hadoop Committers
Hortonworks influences the Apache community:
- We employ the committers: one third of all committers to the Apache® Hadoop™ project, and a majority in other important projects.
- Our committers innovate and expand Open Enterprise Hadoop.
- We influence the Hadoop roadmap by communicating important requirements to the community through our leaders.
11. Client Value Proposition for HDP on Power
- 100% open source Hadoop running on the OpenPOWER hardware platform
  - OpenPOWER is an open-source hardware solution for open source software
  - Intel x86 is perceived as the default/commodity option, but x86 is not open
  - Open means no vendor lock-in, and flexibility
- Processor and server architecture optimized for big data processing
  - 2x per-core performance compared to Intel x86: fewer cores and servers needed (containing server sprawl) and improved hardware price/performance
  - Higher server and data reliability: designed to run the enterprise
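The 2x per-core performance claim translates directly into server-count arithmetic. A back-of-envelope sketch (all numbers are illustrative assumptions, not benchmarks; real sizing depends heavily on the workload):

```python
def servers_needed(total_work, perf_per_core, cores_per_server):
    """Cores, then servers, needed to cover a fixed amount of work."""
    cores = -(-total_work // perf_per_core)       # ceiling division
    return -(-cores // cores_per_server)          # ceiling division

work = 2000  # arbitrary units of total work

# Baseline: 1 unit of work per core, 20 cores per server.
x86_servers = servers_needed(work, perf_per_core=1, cores_per_server=20)
# Hypothetical 2x per-core performance at the same core count per server.
power_servers = servers_needed(work, perf_per_core=2, cores_per_server=20)

print(x86_servers, power_servers)  # 100 50
```

Doubling per-core throughput halves the core count, and therefore the server count, for a fixed workload, which is the "contain server sprawl" argument in the slide.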
12. IBM is investing in the Linux ecosystem and open innovation
- 1999: 5 IBMers contributing to Linux and Apache projects
- 2016: 50k+ IBMers contributing to 150+ open organizations, spanning blockchain (Hyperledger), open source databases, and more; 270+ OpenPOWER-based innovations under way
Source: https://developer.ibm.com/start/
13. OpenPOWER, a Catalyst for Open Innovation
OpenPOWER is an open development community, using the POWER Architecture to serve the evolving needs of customers.
Market shifts:
- Moore’s law no longer satisfies performance gains
- Numerous IT consumption models
- Growing workload demands
- Mature open software ecosystem
OpenPOWER strategy:
- Accelerated innovation through collaboration of partners
- Amplified capabilities driving industry performance leadership
- Vibrant ecosystem through open development
Target markets (industry adoption, open choice): cloud computing; hyperscale and large-scale datacenters; high performance computing and analytics; domestic IT agendas.
14. Fueling an Open Development Community
• Chip / SOC
• Boards
• Systems
• I/O / Storage / Acceleration
• System / Software Integration
• Implementation / HPC / Research
14
16. Introducing the IBM Power Systems LC Line
S822LC For High
Performance Computing
•Incorporates the new
POWER8 processor with
NVIDIA NVLink
•Delivers 2.8X the bandwidth to GPU accelerators
•Up to 4 integrated NVIDIA
“Pascal” GPUs
S822LC For Big Data
•Ideal for storage-centric and high data throughput workloads
•Brings 2 POWER8
sockets for Big Data
workloads
•Big data acceleration with CAPI and GPUs
S821LC
•Storage rich single
socket system for big
data applications
•Memory Intensive
workloads
S822LC
•2X memory bandwidth
of Intel x86 systems
•Memory Intensive
workloads
S812LC
•2 POWER8 sockets in a
1U form factor
•Ideal for environments
requiring dense
computing
NEW
NEW
NEW
Big Data
High Performance Computing
Compute Intensive
Announce 9/8, GA 9/26
Announce and GA 9/8
Announce and GA 9/8
OpenPOWER servers for cloud and cluster deployments that are different by design
17. Innovation Pervasive in the Design
Power Systems S822LC for
Big Data
NVIDIA:
Tesla K80 GPU Accelerator
Red Hat, Ubuntu, SUSE:
Linux OS
Mellanox: InfiniBand/Ethernet
Connectivity in and out of server
HGST: Optional NVMe Adapters
Alpha Data with Xilinx FPGA:
Optional CAPI Accelerator
Broadcom: Optional PCIe Adapters
QLogic: Optional Fiber Channel PCIe
Samsung: SSDs & NVMe
Hynix, Samsung, Micron: DDR4
IBM: POWER8 CPU
17
18. Leading Operational DBMSs Available & Optimized for Linux on Power
In-Memory, NoSQL
SAP HANA
MongoDB
Neo4j
EnterpriseDB
MariaDB
Open Source
IBM DB2 BLU
RedisLabs
Cassandra
PostgreSQL
18
19. IBM is your single, trusted vendor to support and help you manage your Linux infrastructure
1. Based on IBM internal data. 2. Original equipment manufacturer (OEM)
20. Today’s challenges demand innovation
[Chart: price/performance, 2000–2020. Gains from Moore’s Law and processor technology alone flatten (“you are here”); continued improvement requires full-system and full-stack open innovation across firmware/OS, accelerators, software, storage and network.]
[Chart: data growth, 2010–2020, reaching 44 zettabytes, dominated by unstructured over structured data. Data holds competitive value.]
20
25. YCSB running MongoDB on POWER8 delivers leadership performance and 2.2X
better price-performance than Intel Xeon E5-2690 v3 Haswell
                                     IBM Power S822LC      HP DL380 Gen9
                                     (16-core, 256GB)      (24-core, 256GB)
Server web price (3-year warranty)   $16,295               $24,615
System cost (server + RHEL OS +      $29,584               $37,904
MongoDB annual subscription)         ($16,295 + $1,299     ($24,615 + $1,299
                                     + $11,990)            + $11,990)
MongoDB YCSB
(total operations per second)        297.5 k ops           169.5 k ops
$ per 1,000 ops/sec                  100                   223

2.2X better price-performance
33% lower HW costs and maintenance
75% more performance per server
•Based on IBM internal testing of a single system and OS image running the Yahoo! Cloud Serving Benchmark (YCSB) 0.6.0, 1M record workload at 50/50 read/write factor. Conducted under laboratory conditions; individual results can vary based on workload size, use of storage subsystems & other conditions.
• IBM Power System S822LC; 16 cores (2 x 8c chips) / 128 threads, POWER8; 3.3 GHz, 256 GB memory, MongoDB 3.3.4, RHEL 7.2. Competitive stack: HP Proliant DL380 Gen9; 24 cores (2 x 12c chips) / 48 threads; Intel E5-2690 v3; 2.6 GHz; 256 GB memory, MongoDB 3.3, RHEL 7.2. Both servers priced with 2 x 1TB SATA 7.2K rpm HDD, 1 Gb 2-port, 2 x 16 Gbps FCA. Configurations represent the highest processor frequency for that specific processor, running the MongoDB server on one socket & the YCSB application workload on the second socket. RAM disk was used to focus testing on processor technology differences.
Pricing is based on web pricing for the S822LC http://www-03.ibm.com/systems/power/hardware/s822lc-commercial/buy.html and HP DL380 Gen9 https://h22174.www2.hp.com/SimplifiedConfig/Index MongoDB https://www.mongodb.com/compare/mongodb-oracle
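The slide’s three callouts can be re-derived from the quoted system costs and YCSB throughput alone; a quick sanity check (the “$ / Op per Sec” row appears to be dollars per 1,000 ops/sec):

```python
# Re-deriving the slide's price-performance callouts from the figures above.
power_cost, power_kops = 29_584, 297.5   # S822LC system cost, YCSB k-ops/sec
intel_cost, intel_kops = 37_904, 169.5   # DL380 Gen9 system cost, YCSB k-ops/sec

power_dollars_per_kops = power_cost / power_kops   # ~99.4  (slide: 100)
intel_dollars_per_kops = intel_cost / intel_kops   # ~223.6 (slide: 223)

price_perf = intel_dollars_per_kops / power_dollars_per_kops  # ~2.25 -> "2.2X"
perf_per_server = power_kops / intel_kops - 1                 # ~0.755 -> "75% more"
hw_cost_saving = 1 - 16_295 / 24_615                          # ~0.338 -> "33% lower"

print(round(price_perf, 2), round(perf_per_server, 2), round(hw_cost_saving, 2))
```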
27. Future Proof Your Hadoop Infrastructure
• Total Cost of Ownership benefits of a Linux on Power decision
– Less infrastructure means reduced costs in many areas:
o Energy, cooling, server administration, floor space, SW licensing
– Position for future growth, avoid hitting the data center wall with cluster sprawl
– As your workloads evolve, POWER8 gives you options:
o Scale up each node by exploiting the memory bandwidth and multi-threading
o Add new workload optimized servers to the cluster (such as GPUs with NVLink)
27
TALK TRACK
This presentation provides an overview of the Hortonworks and IBM Power partnership and how it will enable clients’ rapid access to open innovation and drive key business value on their analytics journey.
TALK TRACK
1st I will review the partnership announced at IBM Edge in Sept 2016
2nd I will look at an example of a client’s planned journey with HDP and Power Systems and the renovate to innovate philosophy
Next I will review how Hortonworks leads the open community innovation for Hadoop and supports their client with business success
Lastly I will overview how OpenPOWER innovation delivers outstanding time to insights for analytics
TALK TRACK
Overview the recently announced partnership between IBM and Hortonworks that will bring open Hadoop to OpenPOWER
Encourage clients to view the videos of Scott Gnau talking about the Hortonworks perspective and James Wade talking about how Florida Blue is leveraging open source HDP and OpenPOWER to power its analytics journey.
TALK TRACK
Refer to the linked blog for the full background: http://hortonworks.com/blog/delivering-flexible-infrastructure-analytics-hortonworks-hdp-ibm-power-systems/
Summary of Florida Blue’s three new business models (expanding from one to four businesses): started with a B2B insurance model, entering retail insurance, now handling 32% of US government Medicare claims and opening provider clinics in Florida
Florida Blue has seen outstanding 3-5x performance results of running MongoDB on IBM OpenPOWER servers to support their call center initiatives.
Note that they are using both Hortonworks and IBM Power today and are planning to bring these capabilities together as one of the 1st adopters of the HDP 2.5 for Linux on Power within the early client program.
TALK TRACK
At Hortonworks, we partner with our customers and guide them on their Journey to Actionable Intelligence.
You can start your journey anywhere you want.
You can renovate your IT architecture to reduce costs and boost functionality.
Or you can innovate modern data applications that you use at your own company or sell on the open market.
You can start with the most sophisticated use cases if your team is experienced, or you can build your expertise by beginning with less complex use cases that bring quick results.
As you build your team’s expertise and comfort with Hortonworks Data Platform and Hortonworks DataFlow, you can then tackle more challenging aspects of your road map.
We will help you plan the right path to meet your objectives.
Now I’d like to tell you about some Hortonworks customers and how they are transforming their industries with the actionable intelligence that comes from Connected Data Platforms.
[NEXT SLIDE]
DISCUSSION STRATEGY
[For Business Prospects]
Focus questions
What business problems can we help you solve?
Which use case would you like to tackle first?
What type of challenge is most important: data discovery, building a single view or creating predictive analytics?
Calls to action
Recommend the Jumpstart package: http://hortonworks.com/products/subscriptions/jumpstart/
Schedule a use case workshop and plan your journey across your most important use cases.
Give them an industry-specific White Paper to read.
[For IT Prospects]
Focus questions
Where do you face the most cost pressure to store and process data?
Which use case would you like to tackle first: active archive, ETL offload or data enrichment?
Calls to Action
Recommend that they download Hortonworks Sandbox: http://hortonworks.com/products/hortonworks-sandbox/
Schedule a use case workshop
Give them the EDW Optimization White Paper to read: http://hortonworks.com/solutions/edw-optimization/
TALK TRACK
When you choose a vendor, you choose a platform that will be with you for a long time. As the originators of both the Apache Hadoop and Apache NiFi technologies as well as the Open Enterprise Hadoop category, Hortonworks is uniquely positioned to help you transform your business with actionable intelligence.
We are the only publicly traded pure-play Hadoop company
Our momentum is accelerating.
Our customers come to us for all the reasons I’ve described: our superior technology that evolves at the pace of open innovation, our proven model for partnering with our customers and our dedication to helping them succeed.
I’d like to come back for a larger meeting and a use case workshop to start the journey together.
Would next week work for you?
[NEXT SLIDE]
SOURCE: http://hortonworks.com/about-us/quick-facts/
TALK TRACK
Hortonworks has always followed a 100-percent open approach to Hadoop and Hortonworks Data Platform is the only genuinely open Hadoop distribution.
Our open approach:
Eliminates the risk of becoming locked in to one proprietary vendor
It maximizes innovation by tapping into the ingenuity of the largest number of talented developers, with the broadest perspective on the capabilities and areas for improvement
It integrates seamlessly with other datacenter technologies. Like our customers, our technology partners also want to avoid getting locked in to a proprietary approach.
[NEXT SLIDE]
SUPPORTING DETAIL
Eliminates Risk
Why this matters to our customers: Their interests in Hadoop align with Hortonworks’ incentives to deliver world-class support and constantly improve Hadoop
Proof point: Spotify was running Cloudera without support. As the importance of Hadoop grew in their business, they needed support. They chose to migrate from CDH AND to begin paying Hortonworks for support, instead of staying on CDH unsupported
Citation: "[Hortonworks'] true open source approach and the work they have done to improve the Apache Hive data warehouse system aligns well with our needs," said Wouter de Bie, team lead for data infrastructure at Spotify, in a statement. "We use Hive extensively for ad-hoc queries and for the analysis of large data sets.” | http://www.informationweek.com/software/information-management/spotify-embraces-hortonworks-dumps-cloudera/d/d-id/1111563?
Maximizes Community Innovation
Why this matters to our customers: They benefit from the fastest possible innovation that comes from the real-world needs felt by our HDP subscribers
Proof point: The Stinger Initiative organized the efforts of 144 developers at 45 companies to write 390,000 lines of java code in 13 months. This combined effort made the most important Hive queries 100X faster.
Citation: “Combining the capabilities of HARMAN services and Hortonworks, automakers and their suppliers will have access to a scalable platform for, real-time insights, new innovative service creation and predictive analytics-based solutions that can minimize the risk of costly recalls and reduce warranty expenses,” said Sanjay Dhawan, President, HARMAN Services Division. “This is a very exciting step forward in the evolution of the connected car as we speed up the time to market with powerful functionalities that benefit the OEMs and their drivers.”
Integrates seamlessly
Why this matters to our customers: Rapid, easy adoption that reinforces your existing datacenter technologies
Proof point: More than 130 technology partners are HDP-certified, including: EMC, HP, Microsoft, Red Hat, SAP, Teradata, Cisco, SAS and Google Cloud Platform
Citation: “The new features in the Hortonworks Data Platform 2.3 allow us to empower our joint customers with access to more data through Apache Hadoop,” said Randy Guard, Vice President of Product Management, SAS. “This allows them to perform deeper analysis on their data with extensive security and data governance capabilities in place. We want our customers to know that we support the platforms and environments they need to handle their data-driven workloads in a flexible, secure and efficient way.” | http://hortonworks.com/press-releases/hortonworks-accelerates-big-data-transformations-with-hortonworks-data-platform-2-3/
TALK TRACK
Hortonworks employs the largest number of project committers to Apache Hadoop, Apache NiFi and their related Apache projects.
Our committers innovate with other leaders working through the Apache Software Foundation.
We have more than 90 Apache committers, and they interact with each other every day in meetings, in our hallways or over lunch.
And our leadership in the Apache community also helps us influence the roadmap, to meet the needs that we learn about from our hundreds of enterprise subscribers.
The Apache way is democratic, but we have the largest voting bloc.
[NEXT SLIDE]
SUPPORTING DETAIL
We Employ the Committers
Why this matters to our customers: No company understands Hadoop better than Hortonworks. This correlates directly to the quality of our enterprise support.
Proof point: Specific Apache Hadoop committer numbers as of January 2016 (# of committer seats / % of total committers):
Hadoop: Hortonworks = 30/31%, Cloudera = 17/17%
Customer quote: Paul Boal, “Whenever our developers need access to some knowledge, and we say ‘Could we talk to somebody about that?’, Hortonworks says, ‘Sure, here are a few of our committers. Why don’t you talk to them?’“ | https://youtu.be/iQ2V3FuqxHg @ 3:04
Our Committers Innovate
Why this matters: Choose the team that leads Hadoop innovation, so you can benefit from a rapid flow of new features that directly benefit your business.
Proof points:
YARN – YARN is the data operating system of Hadoop 2. Arun Murthy originally conceived of YARN in the JIRA ticket MapReduce 279. The founding architects of Hortonworks developed YARN until it was ready for release as part of Apache Hadoop 2.0, which was GA in October 2013. | http://hortonworks.com/blog/introducing-apache-hadoop-yarn/
The Stinger Initiative: In 2013, our customers told us that Hive was too slow and that it didn’t include all the semantics they needed. So we spearheaded the Stinger Initiative to improve the speed, semantics and scale of Hive queries. Over 13 months and three versions of Hive (0.11, 0.12 and 0.13), 144 developers from 45 companies wrote 390,000 lines of Java code. This improved Hive’s speed by more than 50x and added the missing SQL semantics. | http://hortonworks.com/blog/announcing-apache-hive-0-13-completion-stinger-initiative/
Partner quote: “Hortonworks has partnered with Red Hat since its early days. They have been a champion of the open source model and a key enabler of the communities needed to support the big data ecosystem,” said Greg Kleiman, director of big data strategy at Red Hat. “As a strategic partner, Hortonworks brought their flavor of open source innovation to build more agile and cost effective big data solutions with Red Hat.” | http://hortonworks.com/press-releases/hortonworks-recognized-as-crn-emerging-vendor/
We Influence the Roadmap
Why this matters to our customers: They can influence the roadmap and attract top technical talent to join their teams
Proof point: Hortonworks encouraged its subscribers Schlumberger, Aetna, Target and Merck to join the Data Governance Initiative to speak for their requirements and to engage their teams to solve for those requirements. This led directly to the creation of the Apache Atlas project, and each of these customers employs at least one committer to Atlas. | http://hortonworks.com/blog/apache-atlas-project-proposed-for-hadoop-governance/
Analyst Quote: “Applying old-school dictatorial-style governance to Hadoop would be a disaster for enterprises because it would tamp down the agility of Hadoop,” he said. “The Data Governance Initiative is intriguing because it brings the Hadoop community together with real enterprises. I hope they partner in a way that create ‘minimally viable governance.’ That would keep Hadoop flexible while giving enterprises the governance they need to prevent data pandemonium.” | http://www.bigdatatechcon.com/news/data-governance-initiative-expands-the-hadoop-ecosystem
TALK TRACK
We know that the community powers the ongoing success of our technology, and so we provide a platform for that community through Hortonworks Community Connection.
As a founding member of ODPi, we help influence the larger ecosystem that is standardizing around Apache Hadoop as they develop new Big Data applications.
And we engage our partner ecosystem community through the Hortonworks Partnerworks program.
[NEXT SLIDE]
TALK TRACK
Hortonworks Support Subscriptions come with Hortonworks SmartSense to help you proactively and predictively optimize your HDP cluster.
SmartSense analyzes the data on how you use your cluster, compares that to best practices and makes automated recommendations about how you can optimize your investment in HDP.
We also offer a variety of self-service tools through our integrated customer portal.
And of course, you can call us 24x7. We’ll always pick up, and for any tough issues, our support engineers have access to our committers.
[NEXT SLIDE]
SUPPORTING DETAIL
Hortonworks SmartSense
Why this matters to our customers: Hortonworks SmartSense helps our customers make the most of their Hadoop cluster – using data.
Proof point: Just like our customers use machine data for preventative maintenance on their telco infrastructures, oil rigs or military aircraft, they can use the same type of predictive analytics, using data from their own cluster, to optimize that resource. In fact, Hortonworks subscribers benefit (through SmartSense) from the usage data from all other Hortonworks customers.
Citation: “SmartSense’s capabilities, says Cheolho Minale, vice president of technology at The Mobile Majority, will allow his Hadoop team to optimize its HDP cluster’s ad performance, ‘At The Mobile Majority, we have been using Hortonworks Data Platform to optimize ad performance on behalf of our customers. We’re excited to look into Hortonworks SmartSense as a way to continuously optimize our HDP cluster as it grows over time.’” | Source: http://hortonworks.com/blog/introducing-availability-of-hdp-2-3-part-3/
Integrated customer portal
Why this matters to our customers: Hortonworks subscribers can scale their organization’s Hadoop expertise as quickly as they want.
TALK TRACK
IBM Power Systems, with our OpenPOWER partners, offer a full range of server solutions from public cloud to distributed scale out servers to 192 core scale up servers
LC and L line servers have been developed specifically for scale out Linux applications
LC servers provide clear price/performance leadership with form factors and price points competitive with commodity Intel
Unlike commodity Intel, Power Systems supports an open ecosystem via the OpenPOWER initiative, which ensures continuous leading innovation
For example:
Intel per core performance generally stays flat with each new generation while the number of cores per server increases, driving up SW costs
Power per core performance has demonstrated significant improvement with each successive generation.
[NEXT SLIDE]
We know that IT system downtime is much more than an inconvenience or aggravation. When your systems are down or not performing optimally, people across your company can’t do their jobs effectively—if at all. As a result, revenue, profit, company reputation and customer loyalty can suffer. We know that you need one dependable, trusted provider to help manage and resolve problems regardless of the platform or vendor. In particular, you need open source and Linux infrastructure specialists.
IBM has been committed to open technologies from the onset, bringing more than 16 years of experience supporting open source environments. Through our alliances with SUSE, Ubuntu and Red Hat, we demonstrate our commitment to the success of open technologies. We bring a virtually unparalleled expertise in supporting Linux across all IBM systems and OEM x86 platforms certified for Linux.
Our well-established global support infrastructure is key. With that you can gain:
In-depth skills and experience that encompass the many products and technologies that make up your IT environments
A streamlined process that expedites problem determination and resolution
A proven track record in delivering result-producing support services
The ability to access your support provider as often as needed, around the clock and around the world
With IBM, you can gain accountability. When you report a problem, you can feel confident that it will be handled professionally and expeditiously, and that our specialists will manage it diligently from initial report through resolution.
TALK TRACK
POWER8’s leading IO and Memory capabilities power fast data access and movement across a wide range of BDA applications.
POWER8 is the first microprocessor designed for Big Data and Analytics.
When systems are designed for big data, there are a couple of key attributes that are important to create a balanced system design.
First, having the processing capability; second, having the memory space, the workspace; and third, having the bandwidth, the ability to move information in and out of the system at the rapid speeds required.
POWER8 delivers 4 times more threads per core vs. commodity infrastructure. We can easily support a growing number of users who need reports, or to perform ad hoc analytics. This is because the processor can run more concurrent queries in parallel faster, across multiple cores with more threads per core.
We’re delivering up to 4 times more memory bandwidth, with access to up to 1 TB of memory for data operations and enlarged caches in every processor. This delivers the levels of performance your teams need to make decisions in real time.
We’re delivering faster IO and high speed caching to ingest, move and access large volumes of data so that analytics results are available faster.
Moore’s law is no longer delivering the processor speedup needed to meet the demands of Big Data; the OpenPOWER Foundation is driving innovations, such as hardware accelerators, that will deliver continuous improvement in price/performance leadership.
Power Systems provide the capabilities needed to handle the varying analytics initiatives your business requires.
Broad range of data and analytics – from operational to computational to business analytics, as well as cognitive solutions leveraging IBM Watson technology, Power Systems are optimized for performance and can scale to support demanding and growing workloads.
These solutions help you capitalize on the currency of data by finding business insights faster and more efficiently.
Supporting Data:
4X threads per core vs. x86 (up to 1536 threads per system)
4X memory bandwidth vs. x86 (up to 32TB of memory)
6X more cache vs. x86 (>19MB cache per core)
The POWER8 cache numbers:
L1 – 96 KB per core
L2 – 512 KB per core
L3 – 8 MB per core (8192 KB)
L4 – 128 MB per socket; 128/12 = 10.7 MB per core (10922 KB per core)
Total cache per core: 19.26 MB (19722 KB)
The Intel side (Broadwell-EP):
L1 – 64 KB per core
L2 – 256 KB per core
L3 – 55 MB per socket = 2.5 MB per core (2560 KB per core)
Total cache per core: 2.81 MB (2880 KB)
Ratio: 19722/2880 = 6.85, rounded down to 6X
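As a sanity check, the per-core totals and the 6X ratio quoted above reproduce directly from the listed cache sizes, and the 1536-thread figure from the supporting data is 192 cores x SMT8:

```python
# Re-deriving the per-core cache comparison from the figures above.
# POWER8: L1, L2, L3 are per core; the 128 MB L4 is per 12-core socket.
power8_l4_kb = 128 * 1024 / 12                       # ~10922 KB per core
power8_total_kb = 96 + 512 + 8192 + power8_l4_kb     # ~19722 KB (~19.26 MB)

# Intel Broadwell-EP: L1, L2 per core; 55 MB L3 per socket = 2.5 MB per core.
intel_total_kb = 64 + 256 + 2560                     # 2880 KB (~2.81 MB)

cache_ratio = power8_total_kb / intel_total_kb       # ~6.85, shown as "6X"
threads_per_system = 192 * 8                         # 1536 (192 cores x SMT8)

print(round(power8_total_kb), intel_total_kb, round(cache_ratio, 2), threads_per_system)
```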
TALK TRACK
- One good example of OpenPOWER innovation that could be exploited by GPU aware applications on Hadoop is the POWER8 NVLink support
- In partnership with IBM’s OpenPOWER partner NVIDIA, we released a POWER8 Linux server in Sept 2016 with 2.8X the I/O throughput from CPU to GPU compared to what is possible on x86
- This removes the PCIe data pipe bottleneck to ensure full exploitation of the GPUs and is only available today on OpenPOWER servers
Links:
IBM S822LC for HPC with NVLink: http://www-03.ibm.com/systems/power/hardware/s822lc-hpc/
Related press:
https://www.hpcwire.com/2016/09/08/ibm-debuts-power8-chip-nvlink-3-new-systems/
http://www.pcworld.com/article/3117718/ibms-new-power8-server-packs-in-nvidias-speedy-nvlink-interconnect.html
Note: This chart compares the 16-core S822LC to the 36-core HP DL380, so the system-level performance is even more compelling when this is considered. S822LC system-level total throughput is 20%+ higher than the DL380. The price-performance shown is the outcome of this higher performance and the lower HW street price of the S822LC.
On the Trans/$ graph, note that the VM quantities shown (20/24 VMs on Power and 16/20 VMs on x86) are intentionally different, as the goal is to show virtual machines that produce a similar per-VM performance throughput level. The slight rise in x86’s second bar is not significant; it is due to the fact that the x86 system is under-allocated on the first data point and not on the second, whereas POWER8 is already fully allocated on both data points shown.
Memory: 256 GB for S822LC and 256 GB for HP DL380
HBA: 2 x 16 Gbps FCA for S822LC and HP; Eth: 2-port 10Gb
HDD: 2 x 1TB for S822LC & 2 x 1TB for HP price (to match the configuration), but 2 x 300GB SAS 10k SFF for the HP test case
Power S822LC delivers leadership performance vs Intel Xeon E5-2699 v3
2.7X more transactions/second per core
Improve your revenue potential – Power S822LC delivers 25% more VMs in the same rack space as HP DL380
25% more virtual machines per server than HP DL380
76 more revenue generating virtual machines per rack than HP DL380
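The per-server claim follows from the 20 vs 16 similar-throughput VMs in the chart notes; the per-rack figure of 76 works out if one assumes 19 usable server slots per rack (our assumption — the deck does not state rack density):

```python
# Checking the virtualization claims above.
power_vms, intel_vms = 20, 16      # similar-throughput VMs per server (chart notes)

more_vms_per_server = power_vms / intel_vms - 1    # 0.25 -> "25% more"

servers_per_rack = 19              # ASSUMPTION: usable 2U slots in a 42U rack
extra_vms_per_rack = servers_per_rack * (power_vms - intel_vms)  # 76

print(more_vms_per_server, extra_vms_per_rack)
```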
TALK TRACK
- Spark workloads are ideal to gain advantage from the core capabilities of the POWER8 processor for multithreading and large memory
- Streaming and SQL workloads can be broken down into many parallel threads and exploit the 8 threads per core unique to POWER8
- Complex, memory intensive workloads like machine learning and graph, can benefit from the large memory bandwidth and cache
- The results shown here are from the open source Spark-Bench workload suite (contributed by IBM), which drives a set of representative workloads across machine learning, SQL and graph on Spark. (https://github.com/SparkTC/spark-bench)
- You can see that overall POWER8 provides a 2X per core advantage and a 1.5X price-performance advantage in these 7-node cluster results
TALK TRACK
- To summarize, the combination of HDP and OpenPOWER offers the fastest route to exploitation through open community innovation
- ODPi and OpenPOWER foundations ensure no risk of vendor lock-in and no other Hadoop + System combination can offer this pairing
- Hortonworks and IBM both have deep industry experience and a proven history of outstanding dedication to client success and support
- Ultimately, clients need fast time to insights and OpenPOWER delivers on performance with a reduced infrastructure cost
TALK TRACK
- As mentioned earlier, the Hortonworks and IBM Power Systems partnership was announced at IBM Edge conference in Sept 2016
- Early adopter clients will gain access to technical preview builds in late 4Q2016
- General availability for HDP 2.5 for Linux on Power will follow in 1Q2017
- Future HDP releases will provide Linux on Power support at the same time as the Linux on Intel releases
TALK TRACK
- If you are not already a member, join the Hortonworks community to access valuable training resources and member insights
- Get to know more about the value of Power Systems and OpenPOWER and how it can provide a unique, open and optimized alternative for your Big Data and Analytics workloads
- To learn more about HDP and Power Systems, please join us for an upcoming webinar or catch the replay.