The talk will focus on explaining why operational databases do not scale due to limitations in legacy transactional management.
https://www.bigdataspain.org/2017/talk/end-of-the-myth-ultra-scalable-transactional-management
Big Data Spain 2017
November 16th - 17th Kinépolis Madrid
Tuning Java Driver for Apache Cassandra by Nenad Bozic at Big Data Spain 2017 - Big Data Spain
Apache Cassandra is a distributed, masterless column-store database that is becoming mainstream for analytics and IoT data.
https://www.bigdataspain.org/2017/talk/tuning-java-driver-for-apache-cassandra
Big Data Spain 2017
November 16th - 17th Kinépolis Madrid
Building Continuously Curated Ingestion Pipelines - Arvind Prabhakar
Data ingestion is a critical piece of infrastructure for any Big Data project. Learn about the key challenges in building ingestion infrastructure and how enterprises are solving them using low-level frameworks such as Apache Flume and Kafka, and high-level systems such as StreamSets.
DataOps Automation for a Kafka Streaming Platform (Andrew Stevenson + Spiros ...) - Hosted by Confluent
DataOps challenges us to build data experiences in a repeatable way. For those with Kafka, this means finding a means of deploying flows in an automated and consistent fashion.
The challenge is to make the deployment of Kafka flows consistent across different technologies and systems: the topics, the schemas, the monitoring rules, the credentials, the connectors, the stream processing apps. And ideally not coupled to a particular infrastructure stack.
In this talk we will discuss the different approaches and benefits/disadvantages to automating the deployment of Kafka flows including Git operators and Kubernetes operators. We will walk through and demo deploying a flow on AWS EKS with MSK and Kafka Connect using GitOps practices: including a stream processing application, S3 connector with credentials held in AWS Secrets Manager.
Kafka Migration for Satellite Event Streaming Data | Eric Velte, ASRC Federal - Hosted by Confluent
ASRC Federal created the Mission Operator Assist (MOA) tool to extend human capabilities through AI/ML for NOAA. MOA ingests system log data from on-orbit satellite constellations and applies machine learning to greatly improve real-time situational awareness. MOA uses a collection of tools, including Kafka for multi-subscriber communications, all hosted through AWS Cloud Services and Kubernetes Containers for microservices. Like many traditional on-premises systems, satellite ground station operations are undergoing a renaissance as they increasingly become enabled by cloud.
During this session, the audience will learn about the satellite communications chain, along with best practices and lessons learned in creating a data pipeline with Kafka for high throughput and scalability while delivering high-quality situational awareness to mission operators. We will discuss our goals centered around establishing event-driven streaming for satellite logs, so our machine learning becomes real-time, and supporting a multi-subscriber approach for various Kafka topics. Listeners will also learn how a multi-subscriber approach using Kafka helped us auto-scale Logstash and other microservices based on how many messages are in the queue.
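The queue-depth-driven autoscaling described above can be sketched in a few lines. This is a hypothetical illustration, not the ASRC Federal implementation: the capacity figures and replica bounds are invented, and the offsets would in practice come from Kafka's consumer-group metadata.

```python
# Hypothetical sketch: scaling a consumer service (e.g. Logstash) from
# Kafka consumer lag. Thresholds and names are illustrative only.

def total_lag(end_offsets, committed_offsets):
    """Lag per partition is the log-end offset minus the committed offset."""
    return sum(
        max(end_offsets[p] - committed_offsets.get(p, 0), 0)
        for p in end_offsets
    )

def desired_replicas(lag, per_replica_capacity, min_replicas=1, max_replicas=10):
    """Scale out one replica per `per_replica_capacity` backlogged messages."""
    needed = -(-lag // per_replica_capacity)  # ceiling division
    return max(min_replicas, min(needed, max_replicas))

# Example: three partitions, 25,000 messages behind in total.
ends = {0: 40_000, 1: 35_000, 2: 30_000}
committed = {0: 30_000, 1: 28_000, 2: 22_000}
lag = total_lag(ends, committed)
print(desired_replicas(lag, per_replica_capacity=10_000))  # → 3
```

In a real deployment this decision would be fed to an autoscaler (e.g. a Kubernetes HPA backed by a lag metric) rather than computed in application code.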
SIEM Modernization: Build a Situationally Aware Organization with Apache Kafka® - Confluent
Watch this talk here: https://www.confluent.io/online-talks/siem-modernization-build-a-situationally-aware-organization-with-apache-kafka
Of all security breaches, 85% are conducted with compromised credentials, often at the administration level or higher. Many IT groups think “security” means authentication, authorization, and encryption (AAE), but these are often tick-boxes that rarely stop breaches. The internal threat surfaces of data streams or of disk drives in a RAID set in a data centre are not the threat surface of interest.
Cyber or Threat organizations must conduct internal investigations of IT, subcontractors and supply chains without implicating the innocent. Therefore, they are organizationally air-gapped from IT. Some surveys indicate up to 10% of IT is under investigation at any given time.
Deploying a signal processing platform, such as Confluent Platform, allows organizations to evaluate data as soon as it becomes available enabling them to assess and mitigate risk before it arises. In Cyber or Threat Intelligence, events can be considered signals, and when analysts are hunting for threat actors, these don't appear as a single needle in a haystack, but as a series of needles. In this paradigm, streams of signals aggregate into signatures. This session shows how various sub-systems in Apache Kafka can be used to aggregate, integrate and attribute these signals into signatures of interest.
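The signals-into-signatures idea above can be sketched as a simple aggregation: individual events accumulate per actor, and a signature fires when a defined set of signal types co-occurs. This is an illustrative toy, not the session's implementation; the signal names and signature set are invented, and a real system would use windowed stream processing rather than an in-memory dict.

```python
# Illustrative sketch: aggregating individual event "signals" into a
# "signature" when several suspicious signal types are observed for the
# same actor. Signal names are hypothetical.

from collections import defaultdict

SIGNATURE = {"failed_login", "privilege_escalation", "bulk_read"}

def find_signatures(events):
    """events: iterable of (actor, signal_type) pairs. Returns the set of
    actors whose accumulated signals cover the full signature set."""
    seen = defaultdict(set)
    for actor, signal in events:
        seen[actor].add(signal)
    return {actor for actor, sigs in seen.items() if SIGNATURE <= sigs}

events = [
    ("alice", "failed_login"),
    ("alice", "privilege_escalation"),
    ("bob", "failed_login"),
    ("alice", "bulk_read"),
]
print(find_signatures(events))  # → {'alice'}
```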
In this talk you will learn:
-The current threat landscape
-The difference between Security and Threat Intelligence
-The value of Confluent Platform as an ideal complement to hardware endpoint detection systems and batch-based SIEM warehouses
Webinar: The Modern Streaming Data Stack with Kinetica & StreamSets - Kinetica
Enterprises are now faced with wrangling massive volumes of complex, streaming data from a variety of sources, a new paradigm known as extreme data. However, the traditional data integration model, based on structured batch data and stable data movement patterns, makes it difficult to analyze extreme data in real time. Join Matt Hawkins, Principal Solutions Architect at Kinetica, and Mark Brooks, Solution Engineer at StreamSets, as they share how innovative organizations are modernizing their data stacks with StreamSets and Kinetica to enable faster data movement and analysis. In this webinar we’ll explore:
The modern data architecture required for dealing with extreme data
How StreamSets enables continuous data movement and transformation across the enterprise
How Kinetica harnesses the power of GPUs to accelerate analytics on streaming data
A live demo of the StreamSets and Kinetica connector enabling high-speed data ingestion, queries, and data visualization
Project Ouroboros: Using StreamSets Data Collector to Help Manage the StreamS... - Pat Patterson
On a typical day we see hundreds of downloads of StreamSets Data Collector, our open source data integration tool. We used to wrangle our download logs using a combination of the AWS S3 command line, sed, grep, awk and other tools, all run from a shell script (on my laptop!) once a week. This was a classic example of a brittle, hard to maintain, custom data integration. One day it dawned on me, "This is crazy, we have a tool that can do all this!". In this session, I'll explain how I built a dataflow pipeline to stream content delivery network (CDN) logs from S3 to MySQL in real-time, allowing us to gain valuable insights into our open source community. You'll also learn how we use the same techniques to not only gain insights into our community on Slack, but also build tools to better serve them.
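The heart of a pipeline like the one described above is a per-line transformation from raw CDN log text into structured rows. A minimal sketch, assuming a hypothetical simplified log format (the real field layout depends on the CDN provider, and the path shown is invented):

```python
# Sketch of the per-record transformation such a pipeline performs, using
# a hypothetical whitespace-delimited CDN log format.

from datetime import datetime

def parse_cdn_line(line):
    """Turn one log line into a structured row ready for a database insert."""
    ts, ip, status, path = line.split()
    return {
        "downloaded_at": datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S"),
        "client_ip": ip,
        "status": int(status),
        "artifact": path,
    }

row = parse_cdn_line("2016-05-01T12:00:00 203.0.113.7 200 /datacollector.tgz")
print(row["status"], row["artifact"])  # → 200 /datacollector.tgz
```

In StreamSets Data Collector itself, this kind of parsing is done by configured stages rather than hand-written code, which is precisely the point of the talk.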
Battery Ventures: Simulating and Visualizing Large Scale Cassandra Deployments - DataStax Academy
The SimianViz microservices simulator contains a model of Cassandra that allows large scale global deployments to be created and exercised by simulating failure modes and connecting the simulation to real monitoring tools to visualize the effects. The simulator is open source Go code at github.com/adrianco/spigo and is developing rapidly.
Building Streaming And Fast Data Applications With Spark, Mesos, Akka, Cassan... - Lightbend
It’s become clear to many businesses that the ability to extract real-time, actionable insights from data is not only a source of competitive advantage, but also a way to defend their existing business models from disruption. So while legacy models such as nightly batch jobs aren’t disappearing, an era of fast, streaming data (aka “Fast Data”) is upon us, and it represents the state of the art for gaining real-time, perishable insights that can be used to serve existing customers better, acquire new markets, and keep the competition at bay.
That said, distributed, Fast Data architectures are much harder to build, and carry their own set of challenges. Enterprises looking to move quickly are presented with a growing ecosystem of technologies, which often delays fast decisions and provides its own set of risks:
* With so many choices, what tools should you use?
* How do you avoid making rookie mistakes?
* What are the best patterns and practices for streaming applications?
In this webinar with Sean Glover, Senior Consultant at Lightbend and industry veteran, we examine the rise of streaming systems built around Spark, Mesos, Akka, Cassandra and Kafka, their role in handling endless streams of data to gain real-time insights. Sean then reviews how the Lightbend Fast Data Platform (FDP) brings them together in a comprehensive, easy to use, integrated platform, which includes installation, integration, and monitoring tools tuned for various deployment scenarios, plus sample applications.
Big Data Day LA 2016 / Hadoop/Spark/Kafka track - Building an Event-oriented... - Data Con LA
While we frequently talk about how to build interesting products on top of machine and event data, the reality is that collecting, organizing, providing access to, and managing this data is where most people get stuck. In this session, we’ll follow the flow of data through an end to end system built to handle tens of terabytes per day of event-oriented data, providing real time streaming, in-memory, SQL, and batch access to this data. We’ll go into detail on how open source systems such as Hadoop, Kafka, Solr, and Impala/Hive are actually stitched together; describe how and where to perform data transformation and aggregation; provide a simple and pragmatic way of managing event metadata; and talk about how applications built on top of this platform get access to data and extend its functionality. This session is especially recommended for data infrastructure engineers and architects planning, building, or maintaining similar systems.
Big Data Day LA 2016 / Use Case Driven track - From Clusters to Clouds, Hardwa... - Data Con LA
Today’s software-defined environments attempt to remove the weakness of computing hardware from the operational equation. There is no doubt that this is a natural progression away from overpriced, proprietary compute and storage layers. However, at the heart of any software-defined universe there is still an underlying hardware stack that must be robust, reliable, and cost-effective. Our 20+ years of experience delivering over 2000 clusters and clouds has taught us how to properly design and engineer the right hardware solution for Big Data, cluster, and cloud environments. This presentation will share this knowledge, allowing users to make better design decisions for any deployment.
TidalScale has created a software-defined computer.
At TidalScale, we have created a simple, cost-effective way for a data scientist, an analyst, an engineer, a scientist, a database administrator, or a software developer to access a group of servers through a single operating system instance, as if it were a single supercomputer. This dramatically simplifies development and reduces software scaling complexity, not to mention dramatic cost savings in hardware and software.
We configure hosted hardware into one or more TidalPods. Each TidalPod is a virtual supercomputer comprising a set of commodity servers configured with the TidalScale HyperKernel. What the user sees is standard Linux, FreeBSD or Windows running with the sum of all memory, processors, networks, and I/O. The secret sauce is the HyperKernel that fools the guest OS into thinking it’s running directly on a huge, expensive machine when in fact it’s running on a set of smaller, less expensive servers.
We offer an incredibly simple user experience.
• Define the computer size you want (number of CPUs, amount of memory), boot the virtual machine, then log in to the computer…
Thus, we enable a simple, cost-effective way for a data scientist, an analyst, an engineer, a scientist, a database administrator, or a software developer to access a group of servers in a data center through a single operating system instance, as if it were a single supercomputer. This dramatically simplifies development and reduces software scaling complexity, not to mention dramatic cost savings in hardware and software.
Qlik and Confluent Success Stories with Kafka - How Generali and Skechers Kee... - Hosted by Confluent
Converting production databases into live data streams for Apache Kafka can be labor-intensive and costly. As Kafka architectures grow, complexity also rises as data teams begin to configure clusters for redundancy, partitions for performance, and consumer groups for correlated analytics processing. In this breakout session, you’ll hear data streaming success stories from Generali and Skechers that leverage Qlik Data Integration and Confluent. You’ll discover how Qlik’s data integration platform lets organizations automatically produce real-time transaction streams into Kafka, Confluent Platform, or Confluent Cloud, deliver faster business insights from data, and enable both streaming analytics and streaming ingestion for modern analytics. Learn how these customers use Qlik and Confluent to:
- Turn databases into live data feeds
- Simplify and automate the real-time data streaming process
- Accelerate data delivery to enable real-time analytics
Learn how Skechers and Generali breathe new life into data in the cloud and stay ahead of changing demands, while lowering over-reliance on resources, production time, and costs.
Fan-out, fan-in & the multiplexer: Replication recipes for global platform di... - Hosted by Confluent
This session will dive into our most successful (and unsuccessful!) multi-cluster event replication patterns.
An x-ray of the cross cluster distribution model that powers our globally distributed APIs will touch on the benefits that this model has provided in terms of client API experience, delivery agility and developer experience.
We will focus on recipes for effective use of MirrorMaker event replication to power platform distribution, including the challenges of managing a 'fan-in' event replication workflow - pulling events created in satellite clusters back to a mothership cluster for processing.
We will introduce the elegant technique of replication event multiplexing - which can be used to simplify the burden of managing a 'fan-in' replication topology by eliminating regional awareness from the application domain and improving replication health monitoring & observability.
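One concrete way to remove regional awareness, sketched below, is a small mapping layer that folds region-prefixed replicated topics (MirrorMaker-style `<cluster>.<topic>` names) back into a single logical topic name before processing. The region and topic names are illustrative, not from the talk:

```python
# Hedged sketch of replication event multiplexing: events replicated from
# regional clusters arrive on prefixed topics, and a mapping removes the
# regional awareness before the application consumes them.

REGION_PREFIXES = ("us-east.", "eu-west.", "ap-south.")

def demultiplex(topic):
    """Map a region-prefixed replicated topic back to its logical name."""
    for prefix in REGION_PREFIXES:
        if topic.startswith(prefix):
            return topic[len(prefix):]
    return topic  # already a local/logical topic

print(demultiplex("eu-west.orders"))  # → orders
print(demultiplex("orders"))          # → orders
```

The application domain then subscribes by logical name only, which is what lets the 'fan-in' topology stay invisible to downstream consumers.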
Availability of Kafka - Beyond the Brokers | Andrew Borley and Emma Humber, IBM - Hosted by Confluent
While Kafka has guarantees around the number of server failures a cluster can tolerate, to avoid service interruptions, or even data loss, it is prudent to have infrastructure in place for when an environment becomes unavailable during a planned or unplanned outage.
This talk describes the architectures available to you when planning for an outage. We will examine configurations including active/passive and active/active as well as availability zones and debate the benefits and limitations of each. We will also cover how to set up each configuration using the tools in Kafka.
Whether downtime while you fail over clients to a backup is acceptable or you require your Kafka clusters to be highly available, this talk will give you an understanding of the options available to mitigate the impact of the loss of an environment.
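As a concrete illustration of the active/passive option, cross-cluster replication can be configured with MirrorMaker 2. A minimal configuration sketch follows; the cluster names and broker addresses are placeholders, and a production setup would also tune sync intervals, ACL/offset syncing, and internal topic settings:

```properties
# Minimal MirrorMaker 2 sketch: one-way replication from "primary"
# to "backup" (cluster names and addresses are placeholders).
clusters = primary, backup
primary.bootstrap.servers = primary-broker:9092
backup.bootstrap.servers = backup-broker:9092

# Enable the primary -> backup flow and mirror all topics.
primary->backup.enabled = true
primary->backup.topics = .*

# Replication factor for mirrored and internal topics.
replication.factor = 3
```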
Flattening the Curve with Kafka (Rishi Tarar, Northrop Grumman Corp.) Kafka S... - Confluent
Responding to a global pandemic presents a unique set of technical and public health challenges. The real challenge is that the ability to gather data arriving via many streams in a variety of formats directly influences real-world outcomes and impacts everyone. The Centers for Disease Control and Prevention's CELR (COVID Electronic Lab Reporting) program was established to rapidly aggregate, validate, transform, and distribute laboratory testing data submitted by public health departments and other partners. Confluent Kafka with Kafka Streams and Connect plays a critical role in the program's objectives to:
- Track the threat of the COVID-19 virus
- Provide comprehensive data for local, state, and federal response
- Better understand locations with an increase in incidence
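The validate-and-transform step in a pipeline like this can be sketched as a pure function over each record: reject records missing required fields, and normalize the rest. This is a hypothetical illustration only; the field names below are invented and are not the CELR schema.

```python
# Hypothetical sketch of a validate-and-transform step in a lab-reporting
# pipeline. Field names are illustrative, not the CELR schema.

REQUIRED = ("test_date", "result", "state")

def validate_and_transform(record):
    """Return a normalized record, or None if required fields are missing."""
    if any(not record.get(field) for field in REQUIRED):
        return None  # route to a dead-letter topic in a real pipeline
    return {
        "test_date": record["test_date"],
        "result": record["result"].strip().upper(),
        "state": record["state"].strip().upper(),
    }

print(validate_and_transform(
    {"test_date": "2020-07-01", "result": " positive ", "state": "md"}
))
print(validate_and_transform({"result": "negative"}))  # → None
```

In a Kafka Streams topology this logic would sit in a `mapValues`/`filter` stage between the raw ingest topic and the validated output topic.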
Enabling Insight to Support World-Class Supercomputing (Stefan Ceballos, Oak ...) - Confluent
The Oak Ridge Leadership Facility (OLCF) in the National Center for Computational Sciences (NCCS) division at Oak Ridge National Laboratory (ORNL) houses world-class high-performance computing (HPC) resources and has a history of operating top-ranked supercomputers on the TOP500 list, including the world's current fastest, Summit, an IBM AC922 machine with a peak of 200 petaFLOPS. With the exascale era rapidly approaching, the need for a robust and scalable big data platform for operations data is more important than ever. In the past when a new HPC resource was added to the facility, pipelines from data sources spanned multiple data sinks which oftentimes resulted in data silos, slow operational data onboarding, and non-scalable data pipelines for batch processing. Using Apache Kafka as the message bus of the division's new big data platform has allowed for easier decoupling of scalable data pipelines, faster data onboarding, and stream processing with the goal to continuously improve insight into the HPC resources and their supporting systems. This talk will focus on the NCCS division's transition to Apache Kafka over the past few years to enhance the OLCF's current capabilities and prepare for Frontier, OLCF's future exascale system; including the development and deployment of a full big data platform in a Kubernetes environment from both a technical and cultural shift perspective. This talk will also cover the mission of the OLCF, the operational data insights related to high-performance computing that the organization strives for, and several use-cases that exist in production today.
Getting real-time analytics for device, application, and business monitoring from trillions of events and petabytes of data, as companies like Netflix, Uber, Alibaba, PayPal, eBay, and Metamarkets do.
The Analytic Platform behind IBM’s Watson Data Platform by Luciano Resende a... - Big Data Spain
IBM has built a “Data Science Experience” cloud service that exposes Notebook services at web scale.
https://www.bigdataspain.org/2017/talk/the-analytic-platform-behind-ibms-watson-data-platform
Big Data Spain 2017
November 16th - 17th Kinépolis Madrid
Building Streaming And Fast Data Applications With Spark, Mesos, Akka, Cassan...Lightbend
It’s become clear to many business that the ability to extract real-time actionable insights from data is not only a source of competitive advantage, but also a way to defend their existing business models from disruption. So while legacy models such as nightly batch jobs aren’t disappearing, an era of fast, streaming data (aka “Fast Data”) is upon us, and represents the state of the art for gaining real-time perishable insights that can then be used to serve existing customers better, acquiring new markets and keep the competition at bay.
That said, distributed, Fast Data architectures are much harder to build, and carry their own set of challenges. Enterprises looking to move quickly are presented with a growing ecosystem of technologies, which often delays fast decisions and provides its own set of risks:
* With so many choices, what tools should you use?
* How do you avoid making rookie mistakes?
* What are the best patterns and practices for streaming applications?
In this webinar with Sean Glover, Senior Consultant at Lightbend and industry veteran, we examine the rise of streaming systems built around Spark, Mesos, Akka, Cassandra and Kafka, their role in handling endless streams of data to gain real-time insights. Sean then reviews how the Lightbend Fast Data Platform (FDP) brings them together in a comprehensive, easy to use, integrated platform, which includes installation, integration, and monitoring tools tuned for various deployment scenarios, plus sample applications.
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Building an Event-oriented...Data Con LA
While we frequently talk about how to build interesting products on top of machine and event data, the reality is that collecting, organizing, providing access to, and managing this data is where most people get stuck. In this session, we’ll follow the flow of data through an end to end system built to handle tens of terabytes per day of event-oriented data, providing real time streaming, in-memory, SQL, and batch access to this data. We’ll go into detail on how open source systems such as Hadoop, Kafka, Solr, and Impala/Hive are actually stitched together; describe how and where to perform data transformation and aggregation; provide a simple and pragmatic way of managing event metadata; and talk about how applications built on top of this platform get access to data and extend its functionality. This session is especially recommended for data infrastructure engineers and architects planning, building, or maintaining similar systems.
Big Data Day LA 2016/ Use Case Driven track - From Clusters to Clouds, Hardwa...Data Con LA
Today’s Software Defined environments attempt to remove the weakness of computing hardware from the operational equation. There is no doubt that this is a natural progress away from overpriced, proprietary compute and storage layers. However, even at the heart of any Software Defined universe is an underlying hardware stack that must be robust, reliable and cost effective. Our 20+ years experience delivering over 2000 clusters and clouds has taught us how to properly design and engineer the right hardware solution for Big Data, Cluster and Cloud environments. This presentation will share this knowledge allowing user to make better design decisions for any deployment.
TidalScale has created a software defined computer.
At TidalScale, we have created a simple cost-effective way for a data scientist, an analyst, an engineer, a scientist, a database administrator, or a software developer to access a group of servers through a single operating system instance as if it were a single supercomputer. This dramatically simplifies development, while reducing software scaling complexity not to mention a dramatic cost saving in hardware and software.
We configure hosted hardware into one or more TidalPods. Each TidalPod is a virtual supercomputer comprising a set of commodity servers configured with the TidalScale HyperKernel. What the user sees is standard Linux, FreeBSD or Windows running with the sum of all memory, processors, networks, and I/O. The secret sauce is the HyperKernel that fools the guest OS into thinking it’s running directly on a huge, expensive machine when in fact it’s running on a set of smaller, less expensive servers.
We offer an incredibly simple user experience.
• Define the computer size you want (Number of CPU, Amount of Memory), boot the virtual machine, then login to the computer…
Thus, we enable a simple cost-effective way for a data scientist, an analyst, an engineer, a scientist, a database administrator, or a software developer to access a group of servers in a Datacenter through a single operating system instance as if it were a single supercomputer. This dramatically simplifies development, while reducing software scaling complexity not to mention a dramatic cost saving in hardware and software.
Qlik and Confluent Success Stories with Kafka - How Generali and Skechers Kee...HostedbyConfluent
Converting production databases into live data streams for Apache Kafka can be labor intensive and costly. As Kafka architectures grow, complexity also rises as data teams begin to configure clusters for redundancy, partitions for performance, as well as for consumer groups for correlated analytics processing. In this breakout session, you’ll hear data streaming success stories from Generali and Skechers that leverage Qlik Data Integration and Confluent. You’ll discover how Qlik’s data integration platform lets organizations automatically produce real-time transaction streams into Kafka, Confluent Platform, or Confluent Cloud, deliver faster business insights from data, enable streaming analytics, as well as streaming ingestion for modern analytics. Learn how these customer use Qlik and Confluent to: - Turn databases into live data feeds - Simplify and automate the real-time data streaming process - Accelerate data delivery to enable real-time analytics Learn how Skechers and Generali breathe new life into data in the cloud, stay ahead of changing demands, while lowering over-reliance on resources, production time and costs.
Fan-out, fan-in & the multiplexer: Replication recipes for global platform di...HostedbyConfluent
This session will dive into our most successful (and unsuccessful!) multi-cluster event replication patterns.
An x-ray of the cross-cluster distribution model that powers our globally distributed APIs will touch on the benefits this model has provided in terms of client API experience, delivery agility, and developer experience.
We will focus on recipes for effective use of MirrorMaker event replication to power platform distribution, including the challenges of managing a 'fan-in' event replication workflow: pulling events created in satellite clusters back to a mothership cluster for processing.
We will introduce the elegant technique of replication event multiplexing, which can be used to simplify the burden of managing a 'fan-in' replication topology by eliminating regional awareness from the application domain and improving replication health monitoring & observability.
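As an illustration of the 'fan-in' topology described above, here is a hedged sketch of a MirrorMaker 2 properties file; the cluster aliases, bootstrap addresses, and topic pattern are hypothetical, and the talk's multiplexing technique goes beyond this stock configuration:

```properties
# Hypothetical MirrorMaker 2 config sketching a 'fan-in' topology:
# two satellite clusters replicate into one central (mothership) cluster.
clusters = eu-satellite, us-satellite, mothership

eu-satellite.bootstrap.servers = eu-kafka:9092
us-satellite.bootstrap.servers = us-kafka:9092
mothership.bootstrap.servers = central-kafka:9092

# Enable only the satellite -> mothership flows (the fan-in).
eu-satellite->mothership.enabled = true
eu-satellite->mothership.topics = orders.*
us-satellite->mothership.enabled = true
us-satellite->mothership.topics = orders.*

# By default, replicated topics arrive prefixed with the source alias
# (e.g. eu-satellite.orders), which preserves regional provenance.
```

Note that the default alias-prefixing is exactly the regional awareness the multiplexing technique aims to remove from the application domain.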
Availability of Kafka - Beyond the Brokers | Andrew Borley and Emma Humber, IBMHostedbyConfluent
While Kafka has guarantees around the number of server failures a cluster can tolerate, to avoid service interruptions, or even data loss, it is prudent to have infrastructure in place for when an environment becomes unavailable during a planned or unplanned outage.
This talk describes the architectures available to you when planning for an outage. We will examine configurations including active/passive and active/active as well as availability zones and debate the benefits and limitations of each. We will also cover how to set up each configuration using the tools in Kafka.
Whether downtime while you fail over clients to a backup is acceptable or you require your Kafka clusters to be highly available, this talk will give you an understanding of the options available to mitigate the impact of the loss of an environment.
Flattening the Curve with Kafka (Rishi Tarar, Northrop Grumman Corp.) Kafka S...confluent
Responding to a global pandemic presents a unique set of technical and public health challenges. The real challenge is gathering data that arrives via many streams in a variety of formats, because it influences real-world outcomes and impacts everyone. The Centers for Disease Control and Prevention's CELR (COVID Electronic Lab Reporting) program was established to rapidly aggregate, validate, transform, and distribute laboratory testing data submitted by public health departments and other partners. Confluent Kafka with KStreams and Connect plays a critical role in meeting the program's objectives to:
o Track the threat of COVID-19 virus
o Provide comprehensive data for local, state, and federal response
o Better understand locations with an increase in incidence
Enabling Insight to Support World-Class Supercomputing (Stefan Ceballos, Oak ...confluent
The Oak Ridge Leadership Facility (OLCF) in the National Center for Computational Sciences (NCCS) division at Oak Ridge National Laboratory (ORNL) houses world-class high-performance computing (HPC) resources and has a history of operating top-ranked supercomputers on the TOP500 list, including the world's current fastest, Summit, an IBM AC922 machine with a peak of 200 petaFLOPS. With the exascale era rapidly approaching, the need for a robust and scalable big data platform for operations data is more important than ever. In the past when a new HPC resource was added to the facility, pipelines from data sources spanned multiple data sinks which oftentimes resulted in data silos, slow operational data onboarding, and non-scalable data pipelines for batch processing. Using Apache Kafka as the message bus of the division's new big data platform has allowed for easier decoupling of scalable data pipelines, faster data onboarding, and stream processing with the goal to continuously improve insight into the HPC resources and their supporting systems. This talk will focus on the NCCS division's transition to Apache Kafka over the past few years to enhance the OLCF's current capabilities and prepare for Frontier, OLCF's future exascale system; including the development and deployment of a full big data platform in a Kubernetes environment from both a technical and cultural shift perspective. This talk will also cover the mission of the OLCF, the operational data insights related to high-performance computing that the organization strives for, and several use-cases that exist in production today.
Get real-time analytics for device, application, and business monitoring from trillions of events and petabytes of data, as companies like Netflix, Uber, Alibaba, PayPal, eBay, and Metamarkets do.
The Analytic Platform behind IBM’s Watson Data Platform by Luciano Resende a...Big Data Spain
IBM has built a “Data Science Experience” cloud service that exposes Notebook services at web scale.
https://www.bigdataspain.org/2017/talk/the-analytic-platform-behind-ibms-watson-data-platform
Big Data Spain 2017
November 16th - 17th Kinépolis Madrid
AWS re:Invent 2016: Billions of Rows Transformed in Record Time Using Matilli...Amazon Web Services
Billions of Rows Transformed in Record Time Using Matillion ETL for Amazon Redshift
GE Power & Water develops advanced technologies to help solve some of the world’s most complex challenges related to water availability and quality. They had amassed billions of rows of data on on-premises databases, but decided to migrate some of their core big data projects to the AWS Cloud. When they decided to transform and store it all in Amazon Redshift, they knew they needed an ETL/ELT tool that could handle this enormous amount of data and safely deliver it to its destination. In this session, Ryan Oates, Enterprise Architect at GE Water, shares his use case, requirements, outcomes and lessons learned. He also shares the details of his solution stack, including Amazon Redshift and Matillion ETL for Amazon Redshift in AWS Marketplace. You learn best practices on Amazon Redshift ETL supporting enterprise analytics and big data requirements, simply and at scale. You learn how to simplify data loading, transformation and orchestration onto Amazon Redshift and how to build out a real data pipeline. Get the insights to deliver your big data project in record time.
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...Precisely
Tackling the challenge of designing a machine learning model and putting it into production is the key to getting value back – and the roadblock that stops many promising machine learning projects. After the data scientists have done their part, engineering robust production data pipelines has its own set of challenges. Syncsort software helps the data engineer every step of the way.
Building on the process of finding and matching duplicates to resolve entities, the next step is to set up a continuous streaming flow of data from data sources so that as the sources change, new data automatically gets pushed through the same transformation and cleansing data flow – into the arms of machine learning models.
Some of your sources may already be streaming, but the rest are sitting in transactional databases that change hundreds or thousands of times a day. The challenge is that you can’t affect the performance of data sources that run key applications, so putting something like database triggers in place is not the best idea. Using Apache Kafka or similar technologies as the backbone for moving data around doesn’t solve the problem of needing to grab changes from the source, push them into Kafka, and consume the data from Kafka to be processed. If something unexpected happens (like connectivity being lost on either the source or the target side), you don’t want to have to fix it or start over because the data is out of sync.
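The recovery requirement above can be illustrated with a toy sketch (plain Python, not the Syncsort product): a change consumer that commits a checkpoint only after successful processing, so a crash mid-stream causes neither loss nor duplication. The changelog, offsets, and `fail_at` hook are all hypothetical stand-ins.

```python
# Toy sketch: checkpoint-based change capture, so a consumer that dies
# mid-stream can resume without losing or re-processing changes.
def consume_changes(changelog, checkpoint, process, fail_at=None):
    """Process entries after `checkpoint`, returning the new checkpoint.

    `changelog` is an ordered list of (offset, change) pairs, standing in
    for a Kafka topic; `checkpoint` is the last offset durably processed.
    """
    for offset, change in changelog:
        if offset <= checkpoint:
            continue  # already processed before the crash
        if fail_at is not None and offset == fail_at:
            return checkpoint  # simulate losing connectivity here
        process(change)
        checkpoint = offset  # commit only after successful processing
    return checkpoint

changelog = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]
seen = []
# First run "crashes" before offset 3...
cp = consume_changes(changelog, checkpoint=0, process=seen.append, fail_at=3)
# ...then resumes from the committed checkpoint: no loss, no duplicates.
cp = consume_changes(changelog, checkpoint=cp, process=seen.append)
```

The key design choice is commit-after-process: committing before processing would risk data loss on a crash, while never committing would force a full restart.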
View this 15-minute webcast on-demand to learn how to tackle these challenges in large scale production implementations.
Real-time Streaming and Querying with Amazon Kinesis and Amazon Elastic MapRe...Amazon Web Services
Originally, Hadoop was used as a batch analytics tool; however, this is rapidly changing, as applications move towards real-time processing and streaming. Amazon Elastic MapReduce has made running Hadoop in the cloud easier and more accessible than ever. Each day, tens of thousands of Hadoop clusters are run on the Amazon Elastic MapReduce infrastructure by users of every size — from university students to Fortune 50 companies. We recently launched Amazon Kinesis – a managed service for real-time processing of high volume, streaming data. Amazon Kinesis enables a new class of big data applications which can continuously analyze data at any volume and throughput, in real-time. Adi will discuss each service, dive into how customers are adopting the services for different use cases, and share emerging best practices. Learn how you can architect Amazon Kinesis and Amazon Elastic MapReduce together to create a highly scalable real-time analytics solution which can ingest and process terabytes of data per hour from hundreds of thousands of different concurrent sources. Forever change how you process web site click-streams, marketing and financial transactions, social media feeds, logs and metering data, and location-tracking events.
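The kind of continuous computation such a Kinesis-plus-EMR pipeline runs can be sketched, under heavy simplification, as a tumbling-window count over a click-stream; the stream transport and the cluster are out of scope here, and the event data is invented:

```python
# Toy sketch: tumbling one-minute counts over a click-stream.
# Event times are epoch seconds; pages are URL paths.
from collections import Counter

def tumbling_counts(events, window_s=60):
    """Count events per (window start, page) bucket."""
    counts = Counter()
    for ts, page in events:
        window_start = ts - ts % window_s  # align to window boundary
        counts[(window_start, page)] += 1
    return counts

clicks = [(0, "/home"), (30, "/home"), (61, "/buy"), (90, "/home")]
counts = tumbling_counts(clicks)
# window 0-59 sees /home twice; window 60-119 sees /buy and /home once each
```

In a real deployment, each shard of the stream would feed such an aggregation in parallel, with windows flushed downstream as they close.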
How to Migrate Applications Off a MainframeVMware Tanzu
Ah, the mainframe. Peel back many transactional business applications at any enterprise and you’ll find a mainframe application under there. It’s often where the crown jewels of the business’ data and core transactions are processed. The tooling for these applications is dated and new code is infrequent, but moving off is seen as risky. No one. Wants. To. Touch. Mainframes.
But mainframe applications don’t have to be the electric third rail. Modernizing even pieces of those mainframe workloads into modern frameworks on modern platforms has huge payoffs. Developers can gain all the productivity benefits of modern tooling, not to mention the scaling, security, and cost benefits.
So, how do you get started modernizing applications off a mainframe? Join Rohit Kelapure, Consulting Practice Lead at Pivotal, as he shares lessons from projects with enterprises to move workloads off of mainframes. You’ll learn:
● How to decide what to modernize first by looking at business requirements AND the existing codebase
● How to take a test-driven approach to minimize risks in decomposing the mainframe application
● What to use as a replacement or evolution of mainframe schedulers
● How to include COBOL and other mainframe developers in the process to retain institutional knowledge and defuse project detractors
● How to replatform mainframe applications to the cloud leveraging a spectrum of techniques
Presenter : Rohit Kelapure, Consulting Practice Lead, Pivotal
Next generation business automation with the red hat decision manager and red...Masahiko Umeno
This slide deck was presented at Red Hat Tech Exchange 2018 Taiwan, covering: 1. Our focus area, 2. Application Architecture, 3. Development Method, 4. Organizing Information, 5. Business Process, 6. Case Management. The session received high evaluations (No. 1 in session content across all sessions).
SnappyData, the Spark Database. A unified cluster for streaming, transactions...SnappyData
Apache Spark 2.0 offers many enhancements that make continuous analytics quite simple. In this talk, we will discuss many other things that you can do with your Apache Spark cluster. We explain how a deep integration of Apache Spark 2.0 and in-memory databases can bring you the best of both worlds! In particular, we discuss how to manage mutable data in Apache Spark, run consistent transactions at the same speed as state-of-the-art in-memory grids, build and use indexes for point lookups, and run 100x more analytics queries at in-memory speeds. No need to bridge multiple products or manage and tune multiple clusters. We explain how one can take standard Apache Spark SQL OLAP workloads and speed them up by up to 20x using optimizations in SnappyData.
We then walk through several use-case examples, including IoT scenarios, where one has to ingest streams from many sources, cleanse them, manage the deluge by pre-aggregating and tracking metrics per minute, store all recent data in an in-memory store along with history in a data lake, and permit interactive analytic queries on this constantly growing data. Rather than stitching together multiple clusters as proposed in Lambda, we walk through a design where everything is achieved in a single, horizontally scalable Apache Spark 2.0 cluster. A design that is simpler, a lot more efficient, and lets you do everything from Machine Learning and Data Science to Transactions and Visual Analytics, all in one single cluster.
Why does big data always have to go through a pipeline, with multiple data copies and slow, complex, stale analytics? We present a unified analytics platform that brings streaming, transactions, and ad-hoc OLAP-style interactive analytics into a single in-memory cluster based on Spark.
FSI201 FINRA’s Managed Data Lake – Next Gen Analytics in the CloudAmazon Web Services
FINRA’s Data Lake unlocks the value in its data to accelerate analytics and machine learning at scale. FINRA's Technology group has changed its customer's relationship with data by creating a Managed Data Lake that enables discovery on Petabytes of capital markets data, while saving time and money over traditional analytics solutions. FINRA’s Managed Data Lake includes a centralized data catalog and separates storage from compute, allowing users to query from petabytes of data in seconds. Learn how FINRA uses Spot instances and services such as Amazon S3, Amazon EMR, Amazon Redshift, and AWS Lambda to provide the 'right tool for the right job' at each step in the data processing pipeline. All of this is done while meeting FINRA’s security and compliance responsibilities as a financial regulator.
Enterprise Data World 2018 - Building Cloud Self-Service Analytical SolutionDmitry Anoshin
This session will cover building the modern Data Warehouse by migration from the traditional DW platform into the cloud, using Amazon Redshift and Cloud ETL Matillion in order to provide Self-Service BI for the business audience. This topic will cover the technical migration path of DW with PL/SQL ETL to the Amazon Redshift via Matillion ETL, with a detailed comparison of modern ETL tools. Moreover, this talk will be focusing on working backward through the process, i.e. starting from the business audience and their needs that drive changes in the old DW. Finally, this talk will cover the idea of self-service BI, and the author will share a step-by-step plan for building an efficient self-service environment using modern BI platform Tableau.
Many organizations still lack the ability not only to monitor, but more importantly, to truly manage their ECM applications. This slideshare shares how to advance ECM monitoring much further into ECM application management.
First, let's note a few interesting statistics and information about the changing ECM environment. According to varying statistics from Gartner, Forrester and AIIM:
74% list "improve the experience of our customers" as a top business priority over the next 12 months, yet systems with 1,000+ users create 60-150 support tickets per month.
Success is measured by confident information workers and satisfied customers, yet the majority of organizations rely on support calls or incidents to alert them to system problems.
80% say content systems are just as critical to business operations as transactional systems, but only 32% have specific and measured SLAs for uptime, and only 29% are in a position to monitor trends over time against user loading, content volumes, geographical locations, and upgrades.
In addition, there is much change in the environment, including cloud, consolidation, user adoption pressures, standardization, expansion, upgrades, governance, Saas, process re-engineering and more.
That's where Reveille comes in:
Preempt issues by notifying IT about problems before your end-users are impacted.
Automate problem isolation and resolution for your business-critical content applications.
Objectively communicate your application service levels to IT and management over time.
Protect content from unwarranted behavior by employees.
Reveille's Approach is:
Comprehensive platform management + real-time user transaction management = Optimized ECM application health
The slideshow shares a product tour of Reveille's ECM application management solutions.
Modernize and Simplify IT Operations Management for DevOps SuccessDevOps.com
Whether your organization is already leveraging cloud-based tools or you are part of an organization undergoing transformation, you may face the challenge of a hybrid set of workload deployments, both cloud-based and on-premises. Join us for this webinar to explore the common operational challenges many DevOps teams are facing today, how IT operations best practices could be leveraged within a DevOps methodology, and how modern operations management tools can help you carry out those best practices to meet your goals on an ongoing basis.
Big Data, Big Quality? by Irene Gonzálvez at Big Data Spain 2017Big Data Spain
Insights can only be as good as the data. The data quality domain is enormously large, so you need to understand your company pain points to know what to focus on first.
https://www.bigdataspain.org/2017/talk/big-data-big-quality
Scaling a backend for a big data and blockchain environment by Rafael Ríos at...Big Data Spain
2gether is a financial platform based on Blockchain, Big Data and Artificial Intelligence that allows interaction between users and third-party services in a single interface.
https://www.bigdataspain.org/2017/talk/scaling-a-backend-for-a-big-data-and-blockchain-environment
Disaster Recovery for Big Data by Carlos Izquierdo at Big Data Spain 2017Big Data Spain
All modern Big Data solutions, like Hadoop, Kafka or the rest of the ecosystem tools, are designed as distributed processes and as such include some sort of redundancy for High Availability.
https://www.bigdataspain.org/2017/talk/disaster-recovery-for-big-data
Presentation: Boost Hadoop and Spark with in-memory technologies by Akmal Cha...Big Data Spain
In this presentation, attendees will see how to speed up existing Hadoop and Spark deployments by just making Apache Ignite responsible for RAM utilization. No code modifications, no new architecture from scratch!
https://www.bigdataspain.org/2017/talk/boost-hadoop-and-spark-with-in-memory-technologies
Data science for lazy people, Automated Machine Learning by Diego Hueltes at ...Big Data Spain
Discover the power of this new set of tools for data science. It is really easy to start applying these techniques in your current workflow.
https://www.bigdataspain.org/2017/talk/data-science-for-lazy-people-automated-machine-learning
Training Deep Learning Models on Multiple GPUs in the Cloud by Enrique Otero ...Big Data Spain
GPUs in the cloud as Infrastructure as a Service (IaaS) seem like a commodity. However, efficiently distributing deep learning tasks across several GPUs is challenging.
https://www.bigdataspain.org/2017/talk/training-deep-learning-models-on-multiple-gpus-in-the-cloud
Unbalanced data: Same algorithms different techniques by Eric Martín at Big D...Big Data Spain
Unbalanced data is a specific data configuration that appears commonly in nature. Applying machine learning techniques to this kind of data is a difficult process, usually addressed by unbalanced reduction techniques.
https://www.bigdataspain.org/2017/talk/unbalanced-data-same-algorithms-different-techniques
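One of the simplest unbalanced-reduction techniques, random undersampling of the majority class, can be sketched as follows (a minimal illustration, not necessarily the speaker's method; the data is invented):

```python
# Minimal sketch: random undersampling of the majority class so that
# every class ends up with as many samples as the rarest one.
import random

def undersample(samples, labels, seed=0):
    """Return a balanced list of (sample, label) pairs."""
    by_label = {}
    for x, y in zip(samples, labels):
        by_label.setdefault(y, []).append(x)
    n_min = min(len(xs) for xs in by_label.values())
    rng = random.Random(seed)  # seeded for reproducibility
    balanced = []
    for y, xs in by_label.items():
        for x in rng.sample(xs, n_min):  # keep n_min samples per class
            balanced.append((x, y))
    return balanced

X = list(range(10))
y = [0] * 8 + [1] * 2  # 8 majority-class samples, 2 minority-class
balanced = undersample(X, y)
```

Undersampling discards information from the majority class; oversampling the minority class (or synthetic methods such as SMOTE) is the usual alternative trade-off.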
State of the art time-series analysis with deep learning by Javier Ordóñez at...Big Data Spain
Time series related problems have traditionally been solved using engineered features obtained by heuristic processes.
https://www.bigdataspain.org/2017/talk/state-of-the-art-time-series-analysis-with-deep-learning
Trading at market speed with the latest Kafka features by Iñigo González at B...Big Data Spain
Not long ago, only banks and hedge funds could afford automated and high-frequency trading, that is, the ability to send buy orders for commodities at microsecond intervals.
https://www.bigdataspain.org/2017/talk/trading-at-market-speed-with-the-latest-kafka-features
Unified Stream Processing at Scale with Apache Samza by Jake Maes at Big Data...Big Data Spain
The shift to stream processing at LinkedIn has accelerated over the past few years. We now have over 200 Samza applications in production processing more than 260B events per day.
https://www.bigdataspain.org/2017/talk/apache-samza-jake-maes
Artificial Intelligence and Data-centric businesses by Óscar Méndez at Big Da...Big Data Spain
Artificial Intelligence and Data-centric businesses.
https://www.bigdataspain.org/2017/talk/tbc
Why big data didn’t end causal inference by Totte Harinen at Big Data Spain 2017Big Data Spain
Ten years ago there were rumours of the death of causal inference. Big data was supposed to enable us to rely on purely correlational data to predict and control the world.
https://www.bigdataspain.org/2017/talk/why-big-data-didnt-end-causal-inference
Meme Index. Analyzing fads and sensations on the Internet by Miguel Romero at...Big Data Spain
The Meme Index will become the new standard for analyzing and predicting the facts and sensations that circulate around the Internet.
https://www.bigdataspain.org/2017/talk/meme-index-analyzing-fads-and-sensations-on-the-internet
Vehicle Big Data that Drives Smart City Advancement by Mike Branch at Big Dat...Big Data Spain
Geotab is a leader in the expanding Internet of Things (IoT) and telematics industry, powered by Big Data.
https://www.bigdataspain.org/2017/talk/vehicle-big-data-that-drives-smart-city-advancement
Attacking Machine Learning used in AntiVirus with Reinforcement by Rubén Mart...Big Data Spain
In recent years Machine Learning (ML) and especially Deep Learning (DL) have achieved great success in many areas such as visual recognition, NLP or even aiding in medical research.
https://www.bigdataspain.org/2017/talk/attacking-machine-learning-used-in-antivirus-with-reinforcement
More people, less banking: Blockchain by Salvador Casquero at Big Data Spain ...Big Data Spain
The primary function of the banking sector is promoting economic activity, which means “commerce”: exchanging what someone produces or has for something that someone else consumes or desires.
https://www.bigdataspain.org/2017/talk/more-people-less-banking-blockchain
Make the elephant fly, once again by Sourygna Luangsay at Big Data Spain 2017Big Data Spain
Bol.com has been an early Hadoop user: since 2008, when it was first deployed for a recommendation algorithm.
https://www.bigdataspain.org/2017/talk/make-the-elephant-fly-once-again
Feature selection for Big Data: advances and challenges by Verónica Bolón-Can...Big Data Spain
In an era of growing data complexity and volume and the advent of Big Data, feature selection has a key role to play in helping reduce high-dimensionality in machine learning problems.
https://www.bigdataspain.org/2017/talk/feature-selection-for-big-data-advances-and-challenges
Deep reinforcement learning : Starcraft learning environment by Gema Parreño ...Big Data Spain
A theoretical description of reinforcement learning principles and a deep dive into the DeepMind research environment.
https://www.bigdataspain.org/2017/talk/reinforced-learning-deepmind-starcraft-learning-environment
DevOps and Testing slides at DASA ConnectKari Kakkonen
Slides by me and Rik Marselis from the DASA Connect conference on May 30, 2024. We discuss what testing is, then what agile testing is, and finally what testing in DevOps is. We closed with a lovely workshop in which participants explored different ways to think about quality and testing in different parts of the DevOps infinity loop.
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
Essentials of Automations: Optimizing FME Workflows with ParametersSafe Software
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
The Art of the Pitch: WordPress Relationships and SalesLaura Byrne
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if something changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips, and strategies for successful relationship building that leads to closing the deal.
"Impact of front-end architecture on development cost", Viktor TurskyiFwdays
I have heard many times that architecture is not important for the front-end. Many times I have also seen developers implement features on the front-end just by following the standard rules of a framework, thinking that this is enough to successfully launch the project, and then the project fails. How can you prevent this, and which approach should you choose? I have launched dozens of complex projects, and during the talk we will analyze which approaches have worked for me and which have not.
UiPath Test Automation using UiPath Test Suite series, part 4DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques.
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Jeffrey Haguewood
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on the notifications, alerts, and approval requests using Slack for Bonterra Impact Management. The solutions covered in this webinar can also be deployed for Microsoft Teams.
Interested in deploying notification automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
UiPath Test Automation using UiPath Test Suite series, part 3DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 3. In this session, we will cover desktop automation along with UI automation.
Topics covered:
UI automation Introduction,
UI automation Sample
Desktop automation flow
Pradeep Chinnala, Senior Consultant Automation Developer @WonderBotz and UiPath MVP
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
GraphRAG is All You need? LLM & Knowledge GraphGuy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
Accelerate your Kubernetes clusters with Varnish Caching - Thijs Feryn
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
JMeter webinar - integration with InfluxDB and Grafana - RTTS
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overview
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-Peris at Big Data Spain 2017
1.
2. The End of a Myth: Ultra-Scalable Transactional Management
Presented by: Ricardo Jimenez-Peris, CEO & Co-founder @ LeanXcale
3. About the Speaker
Top researcher on scalable transactional management and distributed data management, with 100+ publications in top conferences and journals
Co-author of a book on database replication
Professor of distributed systems and data management for over 25 years
Co-inventor of two granted patents and 8 new patent applications
Invited speaker at top tech companies in Silicon Valley, such as Facebook, Twitter, Salesforce, Heroku, EMC-Pivotal (formerly EMC-Greenplum), HP, and Microsoft
4. About LeanXcale
Vendor of a NewSQL ultra-scalable database: full ACID, full SQL
LeanXcale is an HTAP database, blending operational and analytical capabilities to deliver real-time data
LeanXcale leverages an ultra-efficient storage engine, which is a relational key-value data store
[Product team chart: share of PhD holders, 10-25 years of industry expertise, engineers from top technical universities, awards]
5. The Myth
"Operational databases cannot scale"
WHY?
Nobody managed to scale them in three decades.
Some say it is due to the CAP Theorem, an argument often made by vendors that do not provide ACID properties.
6. The CAP Theorem
C - Consistency
A - Availability
P - Partitions
The CAP theorem states something very well known in distributed systems: if you want to tolerate partitions, choose availability at all nodes and no consistency, OR consistency and no availability at all nodes.
Q: Where is the S of Scalability?
A: Nowhere.
7. Solved how to scale transactions to large scale (i.e. 100 million update transactions per second) in a fully seamless way
Breakthrough result of 15+ years of research by a tenacious team
The End of the Myth: Ultra-Scalable Transactions
8. Evaluation without data manager/logging, to see how much throughput the transactional processing can attain
2.35 million transactions per second
Scalability
14. Separation of commit from the visibility of committed data
Proactive pre-assignment of commit timestamps to committing transactions
Transactions can commit in parallel because:
• They do not conflict
• They already have their commit timestamp assigned, which determines their serialization order
• Visibility is regulated separately to guarantee reading of fully consistent states
Detection and resolution of conflicts before commit
Main Principles
16. Local Transaction Manager: Get start TS, Run on start TS snapshot
Conflict Manager
The transaction will read the state as of "start TS".
Write-write conflicts are detected by conflict managers on the fly.
Transactional Life Cycle: Execution
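On-the-fly write-write conflict detection, as described for the conflict managers above, can be illustrated with a short sketch. This is a hypothetical, simplified model (the `ConflictManager` class and its method names are assumptions, not LeanXcale's API): the first transaction to write a key claims it, and any concurrent writer of the same key is rejected immediately, before commit, rather than at commit time.

```python
# Illustrative sketch of on-the-fly write-write conflict detection.
# Not LeanXcale's code; names and structure are assumptions.

class ConflictManager:
    def __init__(self):
        self.writers = {}   # key -> id of the transaction currently writing it

    def check_write(self, txn_id, key):
        # The first writer of a key claims it; a concurrent second writer
        # gets False and must abort (conflict detected before commit).
        owner = self.writers.setdefault(key, txn_id)
        return owner == txn_id

    def release(self, txn_id):
        # Called on commit or abort: free all keys owned by txn_id.
        self.writers = {k: t for k, t in self.writers.items() if t != txn_id}

cm = ConflictManager()
print(cm.check_write("T1", "row42"))  # True: T1 claims row42
print(cm.check_write("T2", "row42"))  # False: conflict detected on the fly
cm.release("T1")
print(cm.check_write("T2", "row42"))  # True: row42 is free again
```

Detecting the conflict at write time, rather than at commit, is what lets non-conflicting transactions proceed to a parallel commit.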
17. Local Transaction Manager: Get start TS, Run on start TS snapshot, Commit
The local transaction manager orchestrates the commit.
Transactional Life Cycle: Commit
19. Sequence of timestamps received by the Snapshot Server over time: 11, 15, 12, 14, 13
Evolution of the current snapshot at the Snapshot Server: 11, 11, 12, 12, 15
(the current snapshot only advances to a timestamp once all lower timestamps have been received)
Transactional Life Cycle: Commit
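The snapshot evolution shown above (timestamps 11, 15, 12, 14, 13 arrive out of order; the snapshot advances 11, 11, 12, 12, 15) can be reproduced with a small sketch. This is an illustrative model only, not LeanXcale's implementation; the class and method names are assumptions.

```python
# Sketch of a snapshot server: commit timestamps may arrive out of order,
# and the current snapshot advances only to the highest timestamp ts such
# that every timestamp <= ts has been reported.

class SnapshotServer:
    def __init__(self, start_ts=10):
        self.current = start_ts   # last fully consistent snapshot
        self.pending = set()      # committed timestamps not yet contiguous

    def report_commit(self, ts):
        self.pending.add(ts)
        # Advance while the next expected timestamp has arrived.
        while self.current + 1 in self.pending:
            self.current += 1
            self.pending.remove(self.current)
        return self.current

server = SnapshotServer(start_ts=10)
snapshots = [server.report_commit(ts) for ts in [11, 15, 12, 14, 13]]
print(snapshots)  # [11, 11, 12, 12, 15]
```

Holding the snapshot at 12 while 13 is missing is what guarantees that readers only ever see fully consistent states.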
20. The approach described so far is the original, reactive approach.
It results in multiple messages per update transaction.
The adopted approach is proactive:
• The local transaction managers periodically report the number of committed update transactions per second
• The commit sequencer distributes batches of commit timestamps to the local transaction managers
• The snapshot server periodically gets batches of timestamps (both used and discarded) from the local transaction managers
• The snapshot server periodically reports the most current consistent snapshot to the local transaction managers
Increasing Efficiency
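The batching of commit timestamps described above can be sketched as follows. This is a hedged illustration of the idea, with assumed names (`CommitSequencer`, `LocalTxnManager`) rather than LeanXcale's actual components: instead of one round trip to the sequencer per transaction, each local transaction manager fetches a batch of timestamps and serves commits locally until the batch is exhausted.

```python
# Sketch of proactive batch allocation of commit timestamps.
# Assumed design for illustration, not LeanXcale's implementation.
import itertools

class CommitSequencer:
    def __init__(self):
        self._next = itertools.count(1)

    def allocate_batch(self, size):
        # One message delivers `size` commit timestamps at once.
        return [next(self._next) for _ in range(size)]

class LocalTxnManager:
    def __init__(self, sequencer, batch_size=100):
        self.sequencer = sequencer
        self.batch_size = batch_size
        self.batch = []

    def next_commit_ts(self):
        # Refill from the sequencer only when the local batch is exhausted,
        # amortizing one message over batch_size commits.
        if not self.batch:
            self.batch = self.sequencer.allocate_batch(self.batch_size)
        return self.batch.pop(0)

seq = CommitSequencer()
ltm = LocalTxnManager(seq, batch_size=3)
print([ltm.next_commit_ts() for _ in range(4)])  # [1, 2, 3, 4]
```

With a batch size of 100, messaging overhead per update transaction drops by roughly two orders of magnitude, which is the efficiency gain the slide is pointing at.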
21. The transactional management provides ultra-scalability
Fully transparent:
• No sharding.
• No a priori knowledge required about the rows to be accessed.
• Syntactically: no changes required in the application.
• Semantically: behavior equivalent to a centralized system.
Provides Snapshot Isolation (the isolation level provided by Oracle when set to "Serializable" isolation).
Transactional Processing
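The snapshot isolation guarantee mentioned above can be illustrated with a toy multi-version store (hypothetical names, not LeanXcale's KiVi engine): a transaction with start timestamp S sees only the versions committed at or before S, so later commits stay invisible to it.

```python
# Toy multi-version store illustrating snapshot isolation reads.
# Assumed structure for illustration only.

class MVCCStore:
    def __init__(self):
        self.versions = {}   # key -> list of (commit_ts, value)

    def write(self, key, value, commit_ts):
        # Append a new version stamped with its commit timestamp.
        self.versions.setdefault(key, []).append((commit_ts, value))

    def read(self, key, start_ts):
        # A transaction sees the newest version with commit_ts <= its start_ts.
        visible = [(ts, v) for ts, v in self.versions.get(key, []) if ts <= start_ts]
        return max(visible)[1] if visible else None

store = MVCCStore()
store.write("x", "v1", commit_ts=5)
store.write("x", "v2", commit_ts=9)
print(store.read("x", start_ts=7))   # 'v1': the commit at 9 is invisible to snapshot 7
print(store.read("x", start_ts=9))   # 'v2'
```

This is what makes the behavior semantically equivalent to a centralized system: every read runs against one consistent snapshot of the database.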
22. Architecture, from top to bottom:
SQL Engine: OLTP & OLAP Query Engine
Transaction Manager: Ultra-Scalable Transactions
Storage: KiVi Key-Value Data Store
23. Cutting the costs of business analytics by 80%
Real-time analytical queries; no more ETLs
Analytical queries on operational data
Instead of a separate operational database (OLTP) and data warehouse (OLAP): OLTP + OLAP in one system
Blending OLTP & OLAP: Making Decisions at the Right Time
25. LeanXcale is the first database technology that can substitute the mainframe.
It can bear the operational workloads of a mainframe while at the same time providing real-time analytics over the operational data.
It can be deployed alongside the mainframe, loaded/updated in real time, and applications can then be offloaded from the mainframe one by one.
LeanXcale is partnering with Bull Atos to provide a database appliance as a substitute for the mainframe.
Offloading/Substituting the Mainframe
26. Enables implementing Customer Experience Management (CEM) with half the number of nodes.
Leverages the computation of aggregates in real time as raw KPIs are inserted.
Analytical aggregation queries become simple single-row queries.
Elasticity substantially reduces operations personnel costs during non-working hours with low loads.
Reducing the Cost of Ownership at Telcos
27. Using the key-value interface for large-scale data ingestion in IoT applications, while the data remains accessible through SQL, reducing the required infrastructure several times over.
Real-time analytics.
Computation of aggregates in real time to reduce the cost of analytical aggregation queries, e.g., for the smart grid.
Elasticity enables adjusting resource consumption to the load received.
Large IoT Applications
28. Using the key-value interface to reduce the footprint needed to capture clicks.
Real-time analytics for implementing availability checking.
Elasticity enables adjusting resource consumption to the load received.
Full ACIDity to guarantee consistency between recorded sales and actual availability.
Disrupting Travel Tech