Hortonworks DataFlow (HDF) is the complete solution that addresses the most complex streaming architectures of today’s enterprises. More than 20 billion IoT devices are active on the planet today, and thousands of use cases across IIoT, healthcare, and manufacturing warrant capturing data in motion and delivering actionable intelligence right now. “Data decay” happens in a matter of seconds in today’s digital enterprises.
To meet all the needs of such fast-moving businesses, we have made significant enhancements and new streaming features in HDF 3.1.
https://hortonworks.com/webinar/series-hdf-3-1-technical-deep-dive-new-streaming-features/
Accelerating Data Science and Real Time Analytics at Scale - Hortonworks
Gaining business advantages from big data is moving beyond just the efficient storage and deep analytics on diverse data sources to using AI methods and analytics on streaming data to catch insights and take action at the edge of the network.
https://hortonworks.com/webinar/accelerating-data-science-real-time-analytics-scale/
Unlock Value from Big Data with Apache NiFi and Streaming CDC - Hortonworks
Apache NiFi is an easy to use, powerful, and reliable system to process and distribute data. It provides an end-to-end platform that can collect, curate, analyze, and act on data in real time, on-premises or in the cloud, with a drag-and-drop visual interface. It’s being used across industries on large amounts of data that had been stored in isolation, which made collaboration and analysis difficult.
Join industry experts from Hortonworks and Attunity as they explain how Apache NiFi and streaming CDC technology provides a distributed, resilient platform for unlocking the value of data in new ways.
Delivering Real-Time Streaming Data for Healthcare Customers: Clearsense - Hortonworks
For years, the healthcare industry has had problems of data scarcity and latency. Clearsense solved the problem by building a solution on the open-source Hortonworks Data Platform (HDP) while providing decades’ worth of clinical expertise. Clearsense is delivering smart, real-time streaming data to its healthcare customers, enabling mission-critical data to feed clinical decisions.
https://hortonworks.com/webinar/delivering-smart-real-time-streaming-data-healthcare-customers-clearsense/
Modernize Your Existing EDW with IBM Big SQL & Hortonworks Data Platform - Hortonworks
Find out how Hortonworks and IBM help you address the challenges of EDW modernization and optimize your existing EDW environment.
https://hortonworks.com/webinar/modernize-existing-edw-ibm-big-sql-hortonworks-data-platform/
Dynamic Column Masking and Row-Level Filtering in HDP - Hortonworks
As enterprises around the world bring more of their sensitive data into Hadoop data lakes, balancing the need to democratize access to data without sacrificing strong security principles becomes paramount. In this webinar, Srikanth Venkat, director of product management for security & governance, will demonstrate two new data protection capabilities in Apache Ranger – dynamic column masking and row-level filtering of data stored in Apache Hive. These features were introduced as part of the HDP 2.5 platform release.
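The effect of these two policies can be pictured with a small sketch in plain Python (a conceptual illustration only; the data and function names are hypothetical, not Ranger’s actual policy syntax or enforcement hooks):

```python
# Conceptual illustration of dynamic column masking and row-level filtering.
# Ranger enforces equivalent policies inside Hive at query time; everything
# below is a hypothetical stand-in for that behavior.

def mask_last4(value):
    """Mask all but the last 4 characters (in the spirit of a 'show last 4' policy)."""
    return "x" * (len(value) - 4) + value[-4:]

rows = [
    {"name": "Alice", "ssn": "123-45-6789", "region": "US"},
    {"name": "Bob",   "ssn": "987-65-4321", "region": "EU"},
]

# Row-level filter: this analyst's policy only permits US rows.
visible = [r for r in rows if r["region"] == "US"]

# Dynamic column masking: the ssn column is masked before results are returned.
masked = [{**r, "ssn": mask_last4(r["ssn"])} for r in visible]

print(masked)  # [{'name': 'Alice', 'ssn': 'xxxxxxx6789', 'region': 'US'}]
```

The key property in both cases is that the policy is applied at read time, so the underlying stored data never changes and different users can see different views of the same table.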
Hortonworks Data In Motion Series Part 4 - Hortonworks
Learn how real-world enterprises leverage Hortonworks DataFlow/Apache NiFi to create real-time data flows in record time, enabling new business opportunities, improving customer retention, and accelerating big data projects from months to minutes through increased efficiency and reduced costs.
On-Demand webinar: http://hortonworks.com/webinar/paradigm-shift-business-usual-real-time-dataflows-record-time/
Risk listening: monitoring for profitable growth - DataWorks Summit
Historically, insurers used 50-, 100-, and 500-year flood models for risk evaluation and pricing. The extreme weather events we have experienced in 2017 alone prove how dated these methods really are.
To better understand their customers and potential current/future liability claims, forward-thinking insurers are monitoring, analyzing, and integrating external data sources in real time (weather feeds from USGS.gov, news and stock feeds, and satellite imagery, to name just a few). By integrating and injecting these new data sources into their risk models and underwriting, insurers are better able to identify their risk appetites and price effectively.
The session will include real-world case studies, including how a global P&C insurer is now quickly analyzing and monitoring 50,000 customers and targets, gaining new insights into the market. Another example is a global reinsurance and specialty company that now leverages digital news channels to monitor its risk portfolio for early-warning claims indicators to help drive down loss costs. CINDY MAIKE, VP Industry Solutions, GM of Insurance, Hortonworks, Inc.
EDW Optimization: A Modern Twist on an Old Favorite - Hortonworks
BI and Big Data veterans Carter Shanklin, Sr. Director of Product at Hortonworks, and Josh Klahr, VP of Product at AtScale, will deliver this interactive session, covering insights and real-world experiences and answering questions from the online audience. They’ll share real customer stories and pain points across industries to bring to life how you can use EDW Optimization today to drive insights across any and all of your enterprise data – quickly, simply, securely, and widely.
Overcoming the AI hype — and what enterprises should really focus on - DataWorks Summit
Deep learning, for all its hype, is brittle and non-generalizable, and its learnings are not readily transferable from one application to another. Since we are unlikely to see anything close to artificial general intelligence in the next few decades, we should instead focus on how enterprises can capitalize on the state of the art in machine learning, re-implement successful algorithms, and follow the data science lifecycles that generate the highest ROI.
This talk will cover the current state of the art in AI and its limits versus the hype, and discuss concrete steps that enterprises can take to achieve desired ROI by re-implementing production-ready machine learning algorithms that have been hardened and demonstrated to work very well in specific, constrained domains.
By the end of this talk, attendees should have a better grasp on how to avoid costly and unnecessary investments into yet unproven technologies, be better equipped to navigate the complex space of AI, and understand where to best focus their resources to maximize ROI. ROBERT HRYNIEWICZ, Technical Evangelist, Hortonworks
Streamline Apache Hadoop Operations with Apache Ambari and SmartSense - Hortonworks
Apache Ambari 2.5 helps customers simplify the experience of provisioning, managing, monitoring, securing, and troubleshooting Hadoop deployments. Find out how the combination of Ambari and SmartSense delivers a path to success that helps IT get Hadoop up and running effectively. The end result: you get the full business impact and benefits of Big Data for your organization.
https://hortonworks.com/webinar/streamline-apache-hadoop-operations-apache-ambari-smartsense/
Pivotal - Advanced Analytics for Telecommunications - Hortonworks
Innovative mobile operators need to mine the vast troves of unstructured data now available to them to help develop compelling customer experiences and uncover new revenue opportunities. In this webinar, you’ll learn how HDB’s in-database analytics enable advanced use cases in network operations, customer care, and marketing for better customer experience. Join us, and get started on your advanced analytics journey today!
Using Spark Streaming and NiFi for the next generation of ETL in the enterprise - DataWorks Summit
In recent years, big data has moved from batch processing to stream-based processing since no one wants to wait hours or days to gain insights. Dozens of stream processing frameworks exist today and the same trend that occurred in the batch-based big data processing realm has taken place in the streaming world so that nearly every streaming framework now supports higher level relational operations.
On paper, combining Apache NiFi, Kafka, and Spark Streaming provides a compelling architecture option for building your next-generation ETL data pipeline in near real time. But what does it take to deploy and operationalize this in an enterprise production environment?
The newer Spark Structured Streaming provides fast, scalable, fault-tolerant, end-to-end exactly-once stream processing with elegant code samples, but is that the whole story?
We discuss the drivers and expected benefits of changing the existing event processing systems. In presenting the integrated solution, we will explore the key components of using NiFi, Kafka, and Spark, then share the good, the bad, and the ugly when trying to adopt these technologies into the enterprise. This session is targeted toward architects and other senior IT staff looking to continue their adoption of open source technology and modernize ingest/ETL processing. Attendees will take away lessons learned and experience in deploying these technologies to make their journey easier.
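The “whole story” question around exactly-once semantics usually comes down to pairing at-least-once delivery with an idempotent sink. Here is a minimal, engine-agnostic sketch in plain Python (the event shape and `IdempotentSink` class are hypothetical illustrations of the pattern, not Spark or Kafka APIs):

```python
# At-least-once delivery may redeliver events (e.g., when a micro-batch is
# retried after a failure). An idempotent sink makes the end result look
# exactly-once by deduplicating on a stable event key.

class IdempotentSink:
    def __init__(self):
        self.seen_keys = set()
        self.total = 0

    def write(self, event):
        # Skip events we have already applied, such as replays after a retry.
        if event["key"] in self.seen_keys:
            return False
        self.seen_keys.add(event["key"])
        self.total += event["amount"]
        return True

sink = IdempotentSink()
# "evt-2" is delivered twice, as can happen when a batch is reprocessed.
stream = [
    {"key": "evt-1", "amount": 10},
    {"key": "evt-2", "amount": 5},
    {"key": "evt-2", "amount": 5},   # redelivery
    {"key": "evt-3", "amount": 7},
]
for event in stream:
    sink.write(event)

print(sink.total)  # 22, not 27: the duplicate was ignored
```

In a real deployment the deduplication state would itself need to be durable and scoped (for example, by partition and time window), which is part of the operational complexity the session discusses.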
How Customers are Optimizing their EDW for Fast, Secure, and Effective Insights - Hortonworks
Hortonworks’ Hadoop-powered EDW (Enterprise Data Warehouse) Optimization Solution with Syncsort DMX-h enables organizations to liberate data from across the enterprise, quickly create and populate the data lake, and deliver actionable insights.
Customer case studies across a variety of industries will bring to life how organizations are using this solution to gain bigger insights from their enterprise data – securely, cost-effectively, and with faster time to value.
Apache Hive is a rapidly evolving project loved by many in the big data ecosystem. Hive continues to expand its support for analytics, reporting, and interactive queries, and the community is striving to improve it along many other dimensions and use cases. In this talk, we introduce the latest and greatest features and optimizations that have appeared in the project over the last year, including benchmarks covering LLAP, materialized views, Apache Druid integration, workload management, ACID improvements, using Hive in the cloud, and performance improvements. We will also preview a little of what you can expect in the future.
Hortonworks Open Connected Data Platforms for IoT and Predictive Big Data Ana... - DataWorks Summit
The energy industry is well known as a laggard adopter of new technology. However, industry challenges such as aging assets and workforce, increased regulatory scrutiny, renewable energy sources, depressed commodity prices, changing customer expectations, and growing data volumes are pushing companies to explore new technologies to help solve these problems. Learn how energy companies are leveraging Hortonworks Open and Connected Data Platforms to provide the predictive analysis and data insights to optimize performance for the energy industry.
Speaker
Kenneth Smith, General Manager, Energy, Hortonworks
In this talk, we will give a high-level overview of what Deep Learning (DL) is and provide many visual examples for intuitive takeaways from the session – and no worries, no math here!
The session will:
- Illustrate what neural nets are via interactive examples and mini-demos
- Show applications of DL to solve specific real-world problems with State-of-the-Art (SOTA) results
- Demonstrate how latest research in DL can be applied today to areas of vision, language, recommender systems, and tabular data
- Provide pro-tips on how to get started
Speaker: Robert Hryniewicz, AI Evangelist, Hortonworks
Presentation from Future of Data Boston Meetup on Oct 24, 2017.
Streaming data is rich with insights, but those insights can be hard to uncover because streaming applications are difficult to develop and deploy. During this presentation we will show how to build and deploy a complex streaming application in a few minutes using open source tools. First we will build an application using Streaming Analytics Manager and Schema Registry that ingests data into Apache Druid. Then we will use Apache Superset to build beautiful, informative dashboards.
How is it that one system can query terabytes of data, yet still provide interactive query support? This talk will discuss two of the underlying technologies that allow Apache Hive to support fast query response, both on-premise in HDFS and in cloud object stores such as S3 and WASB.
LLAP was introduced in Hive 2.0. It provides standing processes that securely cache Hive’s columnar data and can do query processing without ever needing to start tasks in Hadoop. We will cover LLAP’s architecture, intended use cases, and performance numbers both on-premises and in the cloud.
The second technology is the integration of Hive with Apache Druid. Druid excels at low-latency, interactive queries over streaming data. Its method of storing data makes it very well suited for OLAP style queries. We will cover how Hive can be integrated with Druid to support real-time streaming of data from Kafka and OLAP queries.
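The storage method that suits Druid to OLAP-style queries is essentially rollup: pre-aggregating events by time bucket and dimension at ingestion time. A toy illustration of that idea in plain Python (the event shape and one-minute granularity are hypothetical, and this is not Druid’s actual ingestion spec):

```python
from collections import defaultdict

# Rollup: collapse raw events into (time_bucket, dimension) -> aggregate.
# This is roughly what Druid does at ingestion time, so that OLAP queries
# scan far fewer, pre-aggregated rows instead of every raw event.

raw_events = [
    {"ts": 1000, "page": "/home", "clicks": 1},
    {"ts": 1030, "page": "/home", "clicks": 2},
    {"ts": 1030, "page": "/docs", "clicks": 1},
    {"ts": 1090, "page": "/home", "clicks": 1},
]

BUCKET = 60  # one-minute granularity, in seconds

rollup = defaultdict(int)
for e in raw_events:
    bucket = (e["ts"] // BUCKET) * BUCKET  # truncate timestamp to its bucket
    rollup[(bucket, e["page"])] += e["clicks"]

# A query like "clicks per page per minute" now reads the rollup table
# rather than re-scanning the raw stream.
print(dict(rollup))
```

The trade-off is that rollup discards event-level detail within a bucket, which is acceptable for OLAP dashboards but not for record-level lookups.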
With the rise of IoT and the growing complexity of applications, clouds, networks, and infrastructure, it is becoming harder to protect data and infrastructure from attackers. When groups of bad actors collaborate, share information, sell unauthorized access, and offer botnets as a service, even terabit-scale attacks become easy to launch. At the same time, it is difficult to find enough security analysts to deal with and defend against such attacks.
This is where community collaboration and open source efforts like Apache Metron come in. Metron provides a comprehensive framework for application, network, and security analytics built on Apache Hadoop and open source streaming tools (e.g., Apache NiFi, Apache Kafka) in a scalable data management and processing stack. Extensions such as profiling, machine learning, visualization, and real-time streaming detection make SOC analysts more efficient, while the inherent scalability of open source lets data scientists quickly move security insights from the data lab into production.
This session explains how real-world businesses and managed service providers use Apache Metron to identify and resolve security threats at scale, and demonstrates methods and ideas for adapting the platform to your own security architecture.
Next gen tooling for building streaming analytics apps: code-less development... - DataWorks Summit
Building apps with less code and more abstractions, unit and integration test frameworks/harnesses, and continuous integration and delivery pipelines with Jenkins are all best practices and proven patterns in the enterprise for traditional apps. But they are difficult to follow with distributed systems like streaming analytics apps.
In this talk we will introduce some powerful new open source tooling based on Hortonworks Streaming Analytics Manager (SAM) that allows enterprises to easily implement these best practices. In 35 minutes, we will walk through the building of a complex streaming analytics app using a drag-and-drop approach (similar to Apache NiFi) that does the following:
• Joining streams
• Aggregations over windows (time or count based)
• Complex event processing
• Real-time streaming enrichment
• Model scoring on the stream
Once the app is built, we will create automated unit and integration tests of the streaming application, build continuous integration and delivery pipelines with Jenkins, and introduce powerful new monitoring, management, and dev-ops troubleshooting tools.
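The windowed-aggregation operator in the list above can be sketched in a few lines of plain Python (a hypothetical count-based tumbling window; SAM itself configures such operators visually rather than in code):

```python
from collections import deque

# Count-based tumbling window: emit an aggregate every `size` events.
# Time-based windows work the same way, but close on elapsed time instead
# of event count.

def tumbling_count_window(events, size):
    window, results = deque(), []
    for e in events:
        window.append(e)
        if len(window) == size:
            results.append(sum(window))  # the aggregation over this window
            window.clear()               # tumbling: windows never overlap
    return results

readings = [3, 1, 4, 1, 5, 9, 2, 6]
print(tumbling_count_window(readings, 4))  # [9, 22]
```

A sliding window would differ only in evicting the oldest element instead of clearing the whole window, which is why streaming engines expose both as configuration of the same operator.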
Speaker
George Vetticaden, VP of Product Management, Hortonworks
Stream processing has become the de facto standard for building real-time ETL and stream analytics applications. We see batch workloads move into stream processing to act on data and derive insights faster. With the explosion of data carrying “perishable insights,” such as IoT and machine-generated data, stream processing plus predictive analytics is driving tremendous business value. This is evidenced by the explosion of stream processing frameworks, from the proven and still-evolving Apache Storm to newer frameworks such as Apache Flink, Apache Apex, and Spark Streaming.
Today, users have to choose among these frameworks and try to understand the benefits of each, and on top of that they have to learn new APIs and operationalize their applications. To create value faster, we are introducing a new open source tool, Streamline. It is a self-service tool that eases building streaming applications and lets users deploy them across multiple frameworks/engines of their choice in a snap. It simplifies integration with machine learning models for scoring and classification of data for predictive analytics. And it provides an elegant way to build analytics dashboards that derive business insights from streaming data and make them easy for business users to consume.
In this talk, we will outline the fundamentals of real-time stream processing and demonstrate Streamline capabilities to show how it simplifies building real-time streaming analytics applications.
Speaker:
Priyank Shah, Staff Software Engineer, Hortonworks
Develop and deploy streaming analytics applications visually, with bindings for the streaming engine, multiple sources/sinks, a rich set of streaming operators, and operational lifecycle management. Streaming Analytics Manager makes it easy to develop and monitor streaming applications, and also provides analytics on the data that's being processed by those applications.
Its Finally Here! Building Complex Streaming Analytics Apps in under 10 min w...DataWorks Summit
Imagine if you could build and deploy an end-to-end complex streaming analytics app on a streaming engine like Storm or Flink that did the following:
1. Joining Streams
2. Aggregations over Windows (Time or Count based)
3. Complex Event Processing
4. Pattern Matching
5. Model scoring.
Now imagine implementing and deploying this without writing a single line of code in under 10 mins.
Imagine no more; it is indeed here. In this talk, we will discuss an exciting open source project led by Hortonworks on building and deploying streaming applications using a drag and drop paradigm.
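The first item on that list, joining streams, is worth making concrete. A minimal plain-Python sketch of a keyed inner join over two interleaved streams follows; the event shape, side tags, and unbounded buffers are illustrative assumptions (real engines such as Storm or Flink bound this state with windows):

```python
def join_streams(events):
    """events: sequence of (side, key, value) where side is 'L' or 'R'.
    Buffer each side by key and emit (key, left_value, right_value)
    as soon as both sides of a key have arrived."""
    left_buf, right_buf, out = {}, {}, []
    for side, key, value in events:
        if side == "L":
            left_buf[key] = value
            if key in right_buf:
                out.append((key, value, right_buf[key]))
        else:
            right_buf[key] = value
            if key in left_buf:
                out.append((key, left_buf[key], value))
    return out

events = [("L", "order-1", 99.0), ("R", "order-2", "shipped"),
          ("L", "order-2", 45.5), ("R", "order-1", "pending")]
joined = join_streams(events)
# matches are emitted in arrival order of the completing event
```

In the drag-and-drop paradigm, this logic collapses to wiring two source components into a join operator and picking the key field.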
Registry is a central metadata repository that allows users to collaboratively use Schema definitions for stream processing.
Stream Analytics Manager provides a framework to build streaming applications faster and more easily.
Streaming Analytics Manager (SAM) simplifies the development and reduces the delivery time of analytic applications geared towards data in motion. Using a drag-and-drop interface, application developers can create complex streaming analytics apps for event correlation, context enrichment, complex pattern matching, and analytical aggregations, eliminating the need for specialized skill sets. SAM also lets users easily define the streaming engine and environments their application will use for execution, while a streaming operations view gives users insight into their application's performance at runtime.
In this talk we will cover the key features of the Streaming Analytics Manager. We will then go over the new features recently added to SAM around ease of debugging and troubleshooting, log search, event sampling, the metrics view, test simulation mode, and more.
With SAM as an analytics solution, users get a rich experience for building and managing streaming analytics applications and bringing these applications to market considerably faster.
Speaker
Arun Iyer, Hortonworks, Software Engineer
Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next LevelHortonworks
The HDF 3.3 release delivers several exciting enhancements and new features. But, the most noteworthy of them is the addition of support for Kafka 2.0 and Kafka Streams.
https://hortonworks.com/webinar/hortonworks-dataflow-hdf-3-3-taking-stream-processing-next-level/
Curing Kafka Blindness with Hortonworks Streams Messaging ManagerHortonworks
With the growth of Apache Kafka adoption in all major streaming initiatives across large organizations, the operational and visibility challenges associated with Kafka are on the rise as well. Kafka users want better visibility in understanding what is going on in the clusters as well as within the stream flows across producers, topics, brokers, and consumers.
With no tools in the market that readily address the challenges of the Kafka Ops teams, the development teams, and the security/governance teams, Hortonworks Streams Messaging Manager is a game-changer.
https://hortonworks.com/webinar/curing-kafka-blindness-hortonworks-streams-messaging-manager/
Learn about the exciting integration work that has been done with YARN, Red Hat OpenShift, and Kubernetes Docker container orchestration. During this presentation we will cover the basics of this exciting YARN integration effort and then launch into a demo. You won't want to miss seeing web application Docker containers, Storm, and Hive SQL queries all running in the same HDP cluster!
APIs and Services for Fleet Management - Talks given @ APIDays Berlin and Ba...Toralf Richter
Insights and experiences from making APIs for a fleet management SaaS platform. Introduces and describes the functionality of the WEBFLEET platform and how it is made generally available through APIs. Focuses on the aspects of API design and API management, especially in a B2B environment and the fleet management industry.
Introduces a connected vehicle blueprint, a Linux Foundation Akraino project. The presentation consists of a general background introduction, application use cases, the network/technical/deployment architecture, and the future plan.
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT StrategyHortonworks
Forrester forecasts* that direct spending on the Internet of Things (IoT) will exceed $400 Billion by 2023. From manufacturing and utilities, to oil & gas and transportation, IoT improves visibility, reduces downtime, and creates opportunities for entirely new business models.
But successful IoT implementations require far more than simply connecting sensors to a network. The data generated by these devices must be collected, aggregated, cleaned, processed, interpreted, understood, and used. Data-driven decisions and actions must be taken, without which an IoT implementation is bound to fail.
https://hortonworks.com/webinar/iot-predictions-2019-beyond-data-heart-iot-strategy/
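The collect-aggregate-clean-process chain described above can be made concrete with a small sketch that drops malformed sensor readings and averages the rest per device. The `device`/`temp` field names and the plausibility range are illustrative assumptions, not taken from any specific IoT product:

```python
from collections import defaultdict

def clean_and_average(readings):
    """Keep readings whose 'temp' is numeric and in a plausible range,
    then average the surviving temperatures per device."""
    sums, counts = defaultdict(float), defaultdict(int)
    for r in readings:
        temp = r.get("temp")
        if isinstance(temp, (int, float)) and -50 <= temp <= 150:
            sums[r["device"]] += temp
            counts[r["device"]] += 1
    return {d: sums[d] / counts[d] for d in sums}

readings = [{"device": "d1", "temp": 20.0}, {"device": "d1", "temp": 22.0},
            {"device": "d2", "temp": None}, {"device": "d2", "temp": 999}]
averages = clean_and_average(readings)
# d2's readings are both rejected (missing value, out-of-range value)
```

Even this toy version shows why the webinar stresses the pipeline steps: without the cleaning pass, one stuck sensor reporting 999 would corrupt every downstream decision.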
Getting the Most Out of Your Data in the Cloud with CloudbreakHortonworks
Cloudbreak, a part of Hortonworks Data Platform (HDP), simplifies the provisioning and cluster management within any cloud environment to help your business toward its path to a hybrid cloud architecture.
https://hortonworks.com/webinar/getting-data-cloud-cloudbreak-live-demo/
Johns Hopkins - Using Hadoop to Secure Access Log EventsHortonworks
In this webinar, we talk with experts from Johns Hopkins as they share techniques and lessons learned in real-world Apache Hadoop implementation.
https://hortonworks.com/webinar/johns-hopkins-using-hadoop-securely-access-log-events/
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad GuysHortonworks
Cybersecurity today is a big data problem. There's a ton of data landing on you faster than you can load it, let alone search it. In order to make sense of it, we need to act on data-in-motion, using both machine learning and the most advanced pattern recognition system on the planet: your SOC analysts. Advanced visualization makes your analysts more efficient and helps them find the hidden gems, or bombs, in masses of logs and packets.
https://hortonworks.com/webinar/catch-hacker-real-time-live-visuals-bots-bad-guys/
We have introduced several new features as well as delivered some significant updates to keep the platform tightly integrated and compatible with HDP 3.0.
https://hortonworks.com/webinar/hortonworks-dataflow-hdf-3-2-release-raises-bar-operational-efficiency/
Interpretation Tool for Genomic Sequencing Data in Clinical EnvironmentsHortonworks
The healthcare industry—with its huge volumes of big data—is ripe for the application of analytics and machine learning. In this webinar, Hortonworks and Quanam present a tool that uses machine learning and natural language processing in the clinical classification of genomic variants to help identify mutations and determine clinical significance.
Watch the webinar: https://hortonworks.com/webinar/interpretation-tool-genomic-sequencing-data-clinical-environments/
IBM+Hortonworks = Transformation of the Big Data LandscapeHortonworks
Last year IBM and Hortonworks jointly announced a strategic and deep partnership. Join us as we take a close look at the partnership accomplishments and the conjoined road ahead with industry-leading analytics offers.
View the webinar here: https://hortonworks.com/webinar/ibmhortonworks-transformation-big-data-landscape/
In this exclusive Premier Inside Out, you will hear from Druid committer and Staff Software Engineer Slim Bouguerra and Product Manager Will Xu. These Hortonworkers will explain the vision of these components, review new features, share some best practices, and answer your questions.
View the webinar here: https://hortonworks.com/webinar/hortonworks-premier-apache-druid/
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATAHortonworks
Thanks to sensors and the Internet of Things, industrial processes now generate a sea of data. But are you plumbing its depths to find the insight it contains, or are you just drowning in it? Now, Hortonworks and Seeq team to bring advanced analytics and machine learning to time-series data from manufacturing and industrial processes.
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...Hortonworks
Trimble Transportation Enterprise is a leading provider of enterprise software to over 2,000 transportation and logistics companies. They have designed an architecture that leverages Hortonworks Big Data solutions and Machine Learning models to power up multiple Blockchains, which improves operational efficiency, cuts down costs and enables building strategic partnerships.
https://hortonworks.com/webinar/blockchain-with-machine-learning-powered-by-big-data-trimble-transportation-enterprise/
Making Enterprise Big Data Small with EaseHortonworks
Every division in an organization builds its own database to keep track of its business. When the organization grows large, those individual databases grow as well. The data in each database becomes siloed, with no visibility into the data in the other databases.
https://hortonworks.com/webinar/making-enterprise-big-data-small-ease/
Driving Digital Transformation Through Global Data ManagementHortonworks
Using your data smarter and faster than your peers could be the difference between dominating your market and merely surviving. Organizations are investing in IoT, big data, and data science to drive better customer experience and create new products, yet these projects often stall in the ideation phase due to a lack of global data management processes and technologies. Your new data architecture may be taking shape around you, but your goal of globally managing, governing, and securing your data across a hybrid, multi-cloud landscape can remain elusive. Learn how industry leaders are developing their global data management strategy to drive innovation and ROI.
Presented at Gartner Data and Analytics Summit
Speaker:
Dinesh Chandrasekhar
Director of Product Marketing, Hortonworks
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...Hortonworks
Join the Hortonworks product team as they introduce HDF 3.1 and the core components for a modern data architecture to support stream processing and analytics.
You will learn about the three main themes that HDF addresses:
Developer productivity
Operational efficiency
Platform interoperability
https://hortonworks.com/webinar/series-hdf-3-1-redefining-data-motion-modern-data-architectures/
4 Essential Steps for Managing Sensitive DataHortonworks
Data is growing in data lakes, and so are the security and compliance risks that stem from storing and processing sensitive data. In this webinar, we will go through a 4-step process to proactively discover and manage sensitive data within big data environments.
https://hortonworks.com/webinar/4-essential-steps-managing-sensitive-data-data-lake/
5 Steps to Create a Company Culture that Embraces the Power of DataHortonworks
A business culture that relies on gut checks and feelings for business decisions is a hard hurdle to overcome. Company culture is often the biggest barrier to moving a company toward data-driven decisions. There's a way to get there when the shift is driven by company leaders. Here's how you do that:
1. Get comfortable with softer data sets
2. Must come from top-down
3. A structure where goals are clear
4. Right role for technology
5. Clear stewardship around data
Exploring the Heated-and Completely Unnecessary- Data Lake DebateHortonworks
When it comes to the data lakes and data warehouses, there’s no shortage of controversy: Is one better than the other? The real answer is, there’s no need for heated debate—a data lake actually complements the data warehouse.
Integrating a data lake with your EDW is really just an evolution of architecture, one that provides a cross-environment platform for exploring data creatively to yield great business insights. However, there's a trick to making it work: EDW optimization.
https://hortonworks.com/webinar/exploring-heated-completely-unnecessary-data-lake-debate/
In this webinar, we will hear from Mark McKinney, Director – Enterprise Data Analytics at Sprint about the business drivers, key success factors, and challenges faced while undertaking Sprint’s data modernization journey. You will hear how Sprint set about establishing a Hadoop data lake, ingested data from multiple environments, and overcame key skill shortages. You will also hear from Diyotta and Hortonworks about best practices for modernizing your data architecture to support transformational business initiatives.
https://hortonworks.com/webinar/sprints-data-modernization-journey/
Enterprise Data Science at Scale Meetup - IBM and Hortonworks - Oct 2017 Hortonworks
View the recording of the meet up, including the live demos, here: https://www.youtube.com/watch?v=uaJWB3K8lkg
Data science holds tremendous potential for organizations to uncover new insights and drivers of revenue and profitability. Big Data has brought the promise of doing data science at scale to enterprises, however this promise also comes with challenges for data scientists to continuously learn and collaborate. Data scientists have many tools at their disposal, such as notebooks like Jupyter and Apache Zeppelin, IDEs such as RStudio, languages like R, Python, and Scala, and frameworks like Apache Spark. Given all the choices, how do you best collaborate to build your model and then work through the development lifecycle to deploy it from test into production?
Why Data Science on Big Data?
In this meetup we will cover the attributes of a modern data science platform that empowers data scientists to build models using all the data in their data lake and foster continuous learning and collaboration. We will show a demo of Apache Zeppelin, Apache Spark, Apache Livy and Apache Hadoop with the focus on integration, security and model deployment and management.
Data Science at Scale DEMO
The demo will cover the data science life cycle: develop the model in a team environment, train the model with all the data on a Hadoop cluster, and deploy the model into production. The model will be a Spark ML model.
Practical ML with Apache Spark
To deliver machine learning solutions, data scientists not only need to fit models but also handle familiar tasks: data collection and wrangling, labelling, feature extraction and transformation, model tuning and evaluation, etc. Apache Spark provides a unified solution for all of this under the same framework.
For example, one can use Spark SQL to generate training data from different sources and then pass it directly to MLlib for feature engineering and model tuning, instead of using Hive/Pig for the first half and then downloading the data to a single machine to train models in R. The latter is actually very common in practice but painful to maintain. Spark MLlib makes life easier for data scientists and machine learning engineers so that they can focus on building better ML models and applications.
We will discuss the underlying principles required to develop practical machine learning and data science pipelines and show some hands-on experience using Apache Spark to solve typical machine learning and data science problems. We will also have a short discussion about how Spark MLlib faces challenges from other machine learning libraries such as TensorFlow and XGBoost.
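The pipeline idea running through this abstract — feature transformation chained into model fitting under one framework — rests on a fit/transform contract. A minimal plain-Python sketch of that contract follows; the `Scaler` and `Pipeline` names mirror the MLlib/scikit-learn convention but are illustrative, not Spark's actual classes:

```python
class Scaler:
    """Learns min/max on fit; rescales values into [0, 1] on transform."""
    def fit(self, xs):
        self.lo, self.hi = min(xs), max(xs)
        return self

    def transform(self, xs):
        span = (self.hi - self.lo) or 1.0  # avoid divide-by-zero on constant data
        return [(x - self.lo) / span for x in xs]

class Pipeline:
    """Chains stages: each stage is fit on the output of the previous one."""
    def __init__(self, stages):
        self.stages = stages

    def fit_transform(self, xs):
        for stage in self.stages:
            xs = stage.fit(xs).transform(xs)
        return xs

pipe = Pipeline([Scaler()])
features = pipe.fit_transform([10.0, 20.0, 30.0])
# values rescaled onto the unit interval
```

Because every stage exposes the same two methods, the whole chain can be fit on training data and replayed unchanged on scoring data — the property that lets Spark hand one pipeline object from development to production.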
Benefits of Transferring Real-Time Data to Hadoop at ScaleHortonworks
Today’s Big Data teams demand solutions designed for Big Data that are optimized, secure, and adaptable to changing workload requirements. Working together, Hortonworks, IBM, and Attunity have designed an integrated solution that transfers large volumes of data to a platform that can handle rapid ingest, processing and analysis of data of all types from all sources, at scale.
https://hortonworks.com/webinar/benefits-transferring-real-time-data-hadoop-scale-ibm-hortonworks-attunity/
Connector Corner: Automate dynamic content and events by pushing a buttonDianaGray10
Here is something new! In our next Connector Corner webinar, we will demonstrate how you can use a single workflow to:
Create a campaign using Mailchimp with merge tags/fields
Send an interactive Slack channel message (using buttons)
Have the message received by managers and peers along with a test email for review
But there’s more:
In a second workflow supporting the same use case, you’ll see:
Your campaign sent to target colleagues for approval
If the “Approve” button is clicked, a Jira/Zendesk ticket is created for the marketing design team
But—if the “Reject” button is pushed, colleagues will be alerted via Slack message
Join us to learn more about this new, human-in-the-loop capability, brought to you by Integration Service connectors.
And...
Speakers:
Akshay Agnihotri, Product Manager
Charlie Greenberg, Host
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
JMeter webinar - integration with InfluxDB and GrafanaRTTS
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overview
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring of JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
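The JMeter-to-InfluxDB integration ultimately writes each sample as a line in InfluxDB's line protocol (`measurement,tags fields timestamp`). A small sketch of formatting one data point that way is below; the `jmeter` measurement name and the tag/field names are illustrative assumptions, and real values would need line-protocol escaping and typed field suffixes:

```python
def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Format one data point as a simplified InfluxDB line-protocol string.
    Tags and fields are sorted for a deterministic output."""
    tag_str = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    field_str = ",".join(f"{k}={v}" for k, v in sorted(fields.items()))
    return f"{measurement},{tag_str} {field_str} {timestamp_ns}"

line = to_line_protocol(
    "jmeter",
    {"label": "login", "status": "ok"},
    {"latency_ms": 42, "count": 1},
    1700000000000000000,
)
# one point: measurement + tags, then fields, then a nanosecond timestamp
```

Grafana then queries these points by measurement and tag, which is why choosing good tag names (transaction label, status) in the JMeter backend listener matters for dashboarding.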
Accelerate your Kubernetes clusters with Varnish CachingThijs Feryn
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
Key Trends Shaping the Future of Infrastructure.pdfCheryl Hung
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Ramesh Iyer
In today's fast-changing business world, companies that fail to adapt and embrace new ideas often struggle to keep up with the competition. However, fostering a culture of innovation takes much work. It takes vision, leadership, and a willingness to take risks in the right proportion. Sachin Dev Duggal, co-founder of Builder.ai, has perfected the art of this balance, creating a company culture where creativity and growth are nurtured at each stage.
Epistemic Interaction - tuning interfaces to provide information for AI supportAlan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
GraphRAG is All You need? LLM & Knowledge GraphGuy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Jeffrey Haguewood
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on the notifications, alerts, and approval requests using Slack for Bonterra Impact Management. The solutions covered in this webinar can also be deployed for Microsoft Teams.
Interested in deploying notification automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
• A tool using a graphical programming paradigm with a drag-and-drop interface, allowing customers to build and deploy complex stream applications without writing any code
• Brings complex stream apps to market considerably faster
• Allows any organization to build stream analytics apps without specialized skillsets
• Provides abstractions over the streaming engine used, with the ability to plug in open source streaming engines (Storm, Spark, Flink, etc.)
• The only open source tool in the market that provides a graphical programming paradigm to build streaming apps
• Customers are less concerned about which open source stream engine is used
• Decouples the schema from the stream app via integration with Schema Registry
• The combination of Flow Management's data flow management with Stream Processing's Stream Builder provides a truly differentiated solution in the market for building end-to-end stream analytical apps
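Decoupling schema from the stream app means the app validates events against a centrally stored definition instead of hard-coding field layouts. A toy sketch of that idea follows; the registry here is just a dict keyed by topic and version, and the `truck-events` fields are illustrative, not the actual Schema Registry API:

```python
# Stand-in for a central schema registry: (topic, version) -> field types.
REGISTRY = {("truck-events", 1): {"driver_id": int, "speed": float}}

def validate(topic, version, event):
    """Fetch the schema for a topic/version and check that every
    declared field is present in the event with the declared type."""
    schema = REGISTRY[(topic, version)]
    return all(isinstance(event.get(field), ftype)
               for field, ftype in schema.items())

ok = validate("truck-events", 1, {"driver_id": 17, "speed": 61.5})
bad = validate("truck-events", 1, {"driver_id": "17", "speed": 61.5})
# the second event fails: driver_id arrived as a string
```

Because producers and consumers both resolve the schema by (topic, version) at runtime, the stream app can evolve its payloads without redeploying every downstream component — the collaboration benefit the Registry bullet above points at.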