Abstract: It's easier than ever to power serverless architectures with managed database services like MongoDB Atlas. In this session, we will explore the rise of serverless architectures and how they've rapidly integrated into public and private cloud offerings. We will demonstrate how to build a simple REST API using AWS Lambda functions, create a highly available cluster in MongoDB Atlas, and connect the two via VPC Peering. We will then simulate load, use the monitoring and scaling features of MongoDB Atlas, and browse our database with MongoDB Compass.
Building Microservices with Apache Kafka by Colin McCabe - Data Con LA
Abstract: Building distributed systems is challenging. Luckily, Apache Kafka provides a powerful toolkit for putting together big services as a set of scalable, decoupled components. In this talk, I'll describe some of the design tradeoffs when building microservices, and how Kafka's powerful abstractions can help. I'll also talk a little bit about what the community has been up to with Kafka Streams, Kafka Connect, and exactly-once semantics.
Cloud-Based Event Stream Processing Architectures and Patterns with Apache Ka... - HostedbyConfluent
The Apache Kafka ecosystem is rich in components for designing and implementing secure, efficient, fault-tolerant and scalable event stream processing (ESP) systems. Using real-world examples, this talk covers why Apache Kafka is an excellent choice for cloud-native and hybrid architectures; how to design, implement and maintain ESP systems; best practices and patterns for migrating to cloud or hybrid configurations; when to go with PaaS or IaaS; what options are available for running Kafka in cloud or hybrid environments; and what you need to build and maintain successful ESP systems that are secure, performant, reliable, highly available and scalable.
Streaming data in the cloud with Confluent and MongoDB Atlas | Robert Waters,... - HostedbyConfluent
Are you looking for a cloud-based architecture that includes best-of-breed streaming and database technologies? In this session you will learn how to set up and configure Confluent Cloud with MongoDB Atlas. We'll start the journey by learning about the basic connectivity between the two cloud services and end with a brief look at what you can do with data once it is in MongoDB Atlas. By the end of this session you will know how to securely set up and configure the MongoDB Atlas connectors in Confluent Cloud in both a source and a sink configuration.
Accelerating Innovation with Apache Kafka, Heikki Nousiainen | Heikki Nousiai... - HostedbyConfluent
As a pioneer in the interactive gaming industry, SONY PlayStation has played a vital role in implementing technological advancements, thereby helping bring the global video gaming community together. With the recent launch of the next-generation PS-5 console, in partnership with thousands of game developers and millions of video gamers across the globe, the generation of huge volumes of data on PlayStation servers is inevitable. This presentation describes how we leveraged big data technologies along with Apache Kafka to solve some real-time data analytics problems. Two important case studies we carried out recently are "Competitive pricing analysis of game titles across online video game marketplaces" and "Understanding gamers' sentiment by streaming data from social feeds and performing NLP".
Along with Apache Kafka, the technologies that we have used to architect the solution are: REST API, ZooKeeper, D3.js visualization, DoMo, Python, SQL, NLP, AWS Cloud & JSON.
5 lessons learned for successful migration to Confluent cloud | Natan Silinit... - HostedbyConfluent
Confluent Cloud makes DevOps engineers' lives a lot easier.
Yet moving 1500 microservices, 10K topics and 100K partitions to a multi-cluster Confluent Cloud can be a challenge.
In this talk you will hear about 5 lessons that Wix has learned in order to successfully meet this challenge.
These lessons include:
1. Automation, Automation, Automation - the entire process has to be completely automated at such scale
2. Prefer a gradual approach - e.g. migrate topics in small chunks rather than all at once, which reduces risk if things go wrong
3. Cleanup first - avoid migrating unused topics or topics with too many unnecessary partitions
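The lessons listed above (automate, migrate gradually, clean up first) can be sketched as a small planning routine. This is an illustrative sketch only, not Wix's tooling: the topic metadata shape (`in_use`) and the `plan_migration`, `run_migration` and `migrate_topic` names are all hypothetical, and the actual copy step is left to a caller-supplied function.

```python
from itertools import islice


def plan_migration(topics, chunk_size):
    """Split the topics worth migrating into small, ordered batches.

    `topics` maps a topic name to a dict with an `in_use` flag
    (a hypothetical shape chosen for this example). Unused topics are
    dropped first, following the "cleanup first" lesson.
    """
    active = sorted(name for name, info in topics.items() if info["in_use"])
    iterator = iter(active)
    batches = []
    while True:
        batch = list(islice(iterator, chunk_size))
        if not batch:
            break
        batches.append(batch)
    return batches


def run_migration(batches, migrate_topic):
    """Migrate batch by batch and stop at the first failure.

    `migrate_topic` is a caller-supplied function returning True on
    success. Returns the topics migrated so far and the topic that
    failed (or None if every topic succeeded).
    """
    migrated = []
    for batch in batches:
        for topic in batch:
            if not migrate_topic(topic):
                return migrated, topic
            migrated.append(topic)
    return migrated, None
```

Stopping at the first failure keeps the blast radius to a single small batch, which is the point of the gradual approach. A fully automated pipeline would wrap `run_migration` with retries and verification before moving on to the next batch.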
How a distributed graph analytics platform uses Apache Kafka for data ingesti... - HostedbyConfluent
Using Kafka to stream data into TigerGraph, a distributed graph database, is a common pattern in our customers’ data architecture. In the TigerGraph database, the Kafka Connect framework was used to build the native S3 data loader. In TigerGraph Cloud, we will be building native integration with many data sources, such as Azure Blob Storage and Google Cloud Storage, using Kafka as an integrated component of the Cloud Portal.
In this session, we will discuss both architectures: 1. the built-in Kafka Connect framework within the TigerGraph database; 2. using a Kafka cluster for cloud-native integration with other popular data sources. A demo will be provided for both data streaming processes.
Making your Life Easier with MongoDB and Kafka (Robert Walters, MongoDB) Kafk... - HostedbyConfluent
Kafka Connect makes it possible to easily integrate data sources like MongoDB! In this session we will first explore how MongoDB enables developers to rapidly innovate through the use of the document model. We will then bring the document model to life and showcase how to integrate MongoDB and Kafka through the use of the MongoDB Connector for Apache Kafka. Finally, we will explore the different ways of using the connector, including the new Confluent Cloud integration.
Moving 150 TB of data resiliently on Kafka With Quorum Controller on Kubernet... - HostedbyConfluent
At Wells Fargo, we move 150 TB of log data from our syslogs to Splunk forwarders, where it is indexed and organized for analytic queries. As we modernize and migrate our applications to our hybrid cloud, the performance expectations for this infrastructure will increase proportionately. Those improvements include the resilience of the end-to-end infrastructure. First, we decoupled the applications from their logging interface through a log library that split the streams of logs from their sources to Kafka, which routed them to two separate destinations, Splunk and ELK. We used Prometheus and Grafana to monitor the metrics, and we deployed Kafka, Splunk, ELK, Prometheus and Grafana on the Kubernetes clusters. Confluent had released a version of Kafka without ZooKeeper, replacing its functionality with the Quorum Controller. The Quorum Controller version exhibited better disposability, one of the twelve factors that is important for cloud-nativeness. We packaged this version into a Kubernetes operator called KEDA and deployed it for auto-scaling. We tested this by simulating the amount of log data that we typically generate in production. Based on the above, we have also implemented distributed tracing and helped make it just as resilient. We will share our lessons learned, along with the patterns and practices to modernize both our underlying runtime platforms and our applications with high-performing and resilient event-driven architectures.
Building Streaming Data Pipelines with Google Cloud Dataflow and Confluent Cl... - HostedbyConfluent
We will demonstrate how easy it is to use Confluent Cloud as the data source of your Beam pipelines. You will learn how to process the information that comes from Confluent Cloud in real time, transform it, and feed it back to your Kafka topics and other parts of your architecture.
How to Discover, Visualize, Catalog, Share and Reuse your Kafka Streams (Jona... - HostedbyConfluent
As Kafka deployments grow within your organization, so do the challenges around lifecycle management. For instance, do you really know what streams exist, who is producing and consuming them? What is the effect of upstream changes? How is this information kept up to date, so it is relevant and consistent to others looking to reuse these streams? Ever wish you had a way to view and visualize graphically the relationships between schemas, topics and applications? In this talk we will show you how to do that and get more value from your Kafka Streaming infrastructure using an event portal. It’s like an API portal but specialized for event streams and publish/subscribe patterns. Join us to see how you can automatically discover event streams from your Kafka clusters, import them to a catalog and then leverage code gen capabilities to ease development of new applications.
Mind the App: How to Monitor Your Kafka Streams Applications | Bruno Cadonna,... - HostedbyConfluent
You cannot operate what you cannot measure. In this talk, I am going to present the built-in metrics framework of Kafka Streams that supports monitoring Kafka Streams applications. You will learn how to set up monitoring of metrics for your Kafka Streams applications, and you will hear about the following recent improvements to the metrics framework. KIP-444 aims to simplify and extend the built-in metrics framework. The RocksDB metrics introduced in KIP-471 and KIP-607 allow you to look directly into the built-in persistent state stores of your Kafka Streams applications. Finally, KIP-613 specifies metrics that measure end-to-end latencies in your applications. This talk will help you collect intel about the behavior of your Kafka Streams applications, and will allow you to reason about the deployment. In the end, you will be able to better understand your applications and run them in a more robust manner.
Data in Motion: Building Stream-Based Architectures with Qlik Replicate & Kaf... - HostedbyConfluent
The challenge with today’s “data explosion” is finding the most appropriate answer to the question, “So where do I put my data?” while avoiding the longer-term problem: data warehouses, data lakes, cloud storage, NoSQL databases, … are often the places where “big” data goes to die.
Enter Physics 101, and my corollary to Newton’s First Law of Motion:
Data in motion tends to stay in motion until it comes to rest on disk. Similarly, if data is at rest, it will remain at rest until an external “force” puts it in motion again.
Data inevitably comes to rest at some point. Without “external forces”, data often gets lost or becomes stale where it lands. “Modern” architectures tend to involve data pipelines where downstream consumers of data make use of data generated upstream, often with built-for-purpose repositories at each stage. This session will explore how data that has come to rest can be put in motion again; how Kafka can keep it in motion longer; and how pipelined architectures might be created to make use of that data.
Streaming Data Lakes using Kafka Connect + Apache Hudi | Vinoth Chandar, Apac... - HostedbyConfluent
Apache Hudi is a data lake platform that provides streaming primitives (upserts/deletes/change streams) on top of data lake storage. Hudi powers very large data lakes at Uber, Robinhood and other companies, while being pre-installed on four major cloud platforms.
Hudi supports exactly-once, near real-time data ingestion from Apache Kafka to cloud storage, which is typically used in place of an S3/HDFS sink connector to gain transactions and mutability. While this approach is scalable and battle-tested, it can only ingest data in mini batches, leading to lower data freshness. In this talk, we introduce a Kafka Connect Sink Connector for Apache Hudi, which writes data straight into Hudi's log format, making the data immediately queryable, while Hudi's table services such as indexing, compaction and clustering work behind the scenes to further reorganize the data for better query performance.
Exposing and Controlling Kafka Event Streaming with Kong Konnect Enterprise |... - HostedbyConfluent
Event streaming allows companies to build more scalable and loosely coupled real-time applications supporting massive concurrency demands and simplifying the construction of services.
At the same time, API management provides capabilities to securely control the consumption of upstream services, including the event processing infrastructure.
This session shows how Kong Konnect Enterprise can complement Kafka Event Streaming, exposing it to new and external consumers while applying specific and critical policies to control its consumption, including API key, OAuth/OIDC and others for authentication, rate limiting, caching, log processing, etc.
Billions of Messages in Real Time: Why Paypal & LinkedIn Trust an Engagement ... - confluent
(Bruno Simic, Solutions Engineer, Couchbase)
Breakout during Confluent’s streaming event in Munich. This three-day hands-on course focused on how to build, manage, and monitor clusters using industry best-practices developed by the world’s foremost Apache Kafka™ experts. The sessions focused on how Kafka and the Confluent Platform work, how their main subsystems interact, and how to set up, manage, monitor, and tune your cluster.
Low-latency real-time data processing at giga-scale with Kafka | John DesJard... - HostedbyConfluent
Data volumes continue to grow, demanding new, more scalable solutions for low-latency data processing. Previously, the default approach to deploying such systems was to throw a ton of hardware at the problem. However, that is no longer necessary, as newer technologies showcase a level of efficiency that enables smaller, more manageable clusters while handling extreme workloads. Processing billions of events per second on Kafka can now be done with a modest investment in compute resources. In this session, you will learn how to architect and build the fastest data processing applications that scale linearly and combine streaming data and reference data (data in motion and data at rest) with machine learning. We will take you through the end-to-end framework and an example application built on the Hazelcast Platform, an open source software engine designed for ultra-fast performance. We will also show how you can leverage SQL to further explore the operational data in the solution, including querying Kafka topics and key-value data in the in-memory data store. Attendees will also get access to the GitHub sample application shown.
Building Event-Driven Microservices using Kafka Streams (Stathis Souris, Thou... - London Microservices
Recorded at the London Microservices Meetup: https://www.meetup.com/London-Microservices/
- Date: 14th of October 2020
- Video: https://youtu.be/Arzr0T0hrCw
- Event page: https://www.meetup.com/London-Microservices/events/273266418/
Follow us on Twitter! https://twitter.com/LondonMicrosvc
---
Building Event-Driven Microservices using Kafka Streams
Stathis Souris, ThousandEyes
Streaming is all the rage these days, but can business systems be built using stream processing?
We'll explore this question by looking at Streaming Microservices using Kafka Streams.
We'll also discuss some of the patterns that we currently use in real-life production microservices at ThousandEyes (part of Cisco) and things to avoid.
Key takeaways:
- Basic Kafka concepts
- Kafka Streams
- A discussion of various event-driven services built using Kafka Streams
Stathis spent several years in Athens, Greece, as a Software Engineer before moving to London and ThousandEyes (part of Cisco now).
He enjoys working with large distributed systems using technologies like Kafka, Elasticsearch, Java, Kotlin.
Druid + Kafka: transform your data-in-motion to analytics-in-motion | Gian Me... - HostedbyConfluent
Apache Druid is a high-performance distributed analytics store for modern analytics applications. It supports ingesting millions of events per second and sub-second query processing. Druid supports various types of data sources for ingestion, including Apache Kafka. You can query stream events as soon as they are ingested into Druid. Because Kafka provides scalable and robust data delivery while Druid supports advanced, complex analysis on streams, Kafka and Druid are widely used together for BI and operational analytics use cases that require interactivity, scalability, real-time operation, and performance.
This talk is based on our real-world experiences building out streaming analytics stacks powering production use cases across many industries.
DataOps Automation for a Kafka Streaming Platform (Andrew Stevenson + Spiros ... - HostedbyConfluent
DataOps challenges us to build data experiences in a repeatable way. For those with Kafka, this means finding a means of deploying flows in an automated and consistent fashion.
The challenge is to make the deployment of Kafka flows consistent across different technologies and systems: the topics, the schemas, the monitoring rules, the credentials, the connectors, and the stream processing apps, ideally without coupling to a particular infrastructure stack.
In this talk we will discuss the different approaches and benefits/disadvantages to automating the deployment of Kafka flows including Git operators and Kubernetes operators. We will walk through and demo deploying a flow on AWS EKS with MSK and Kafka Connect using GitOps practices: including a stream processing application, S3 connector with credentials held in AWS Secrets Manager.
Why Cloud-Native Kafka Matters: 4 Reasons to Stop Managing it Yourself - DATAVERSITY
With your most talented teams bogged down managing a massive Kafka deployment, it can be challenging to move the dial on projects that drive real value for your business: for example, launching your next major feature, fueling more best-in-breed services like AI/ML on your cloud provider platform, or developing your first use cases for real-time data movement across clouds. By shifting to a fully managed, cloud-native service for Kafka, you can free your teams to work on the projects that make the best use of your data in motion.
In this webinar you will learn about:
• The increasing value of data in motion to your business
• Challenges and costs of self-managing a large-scale Kafka deployment
• Benefits of managed cloud services for non-core activities like data storage, data warehousing, and messaging
• Optimizing time usage for value-generating activity like new product launches
• Potential cost savings for your business with a cloud-native service for Kafka
Kafka in Context, Cloud, & Community (Simon Elliston Ball, Cloudera) Kafka Su... - HostedbyConfluent
Kafka has fast become the center of streaming analytics applications in the modern digital enterprise. Kafka operates in the context of a broad ecosystem of data lifecycle components which need a consistent platform of security, monitoring, management and governance. This problem becomes paramount when your streaming architectures go hybrid by spanning from on-premises to the cloud. Throw in the reality of a multi-cloud setup that a lot of enterprises are facing and now, you have a complex streaming architecture that is difficult to operationally manage, monitor, secure or govern.
Cloudera remains committed to an open community driven approach and increasing the ease of use and visibility for Kafka based solutions. Attend this session to understand more about how streaming architectures can be extended easily to the hybrid cloud and multi-cloud. Also, learn about our plans for further community contributions.
Kubernetes connectivity to Cloud Native Kafka | Evan Shortiss and Hugo Guerre... - HostedbyConfluent
If you want to build an ecosystem of streaming data around your Kafka platform, your developers need a much easier way to quickly move data from the source to your cluster. Better yet, make the connector serverless so that it does not waste resources while idle, and have a trusted partner manage your Kafka infrastructure for you. In this session, we will show you how easy we have made streaming data, with a great user experience and flexible resource management through our new secret weapon in the Apache Camel project: Kamelet. We’ll also demonstrate how Red Hat OpenShift Streams for Apache Kafka simplifies provisioning Kafka deployments in a public cloud, managing the cluster and topics, and configuring secure access to the Kafka cluster for your developers.
Testing Event Driven Architectures: How to Broker the Complexity | Frank Kilc... - HostedbyConfluent
Quality matters, and as event-driven architectures (EDA) become increasingly popular in the microservices space, ensuring the delivery and performance of your EDA increases in importance. But while it’s a powerful architecture, it does come with its challenges, especially from a testing perspective. For example, most organizations do not rely on Kafka alone, but on a multitude of interconnected APIs like REST, GraphQL and gRPC. One of the questions that arises from this challenge: how do you build end-to-end tests when the APIs are completely different technologies, without relying on fragile scripts? In our talk, we’ll tackle this question and many more when it comes to testing Apache Kafka endpoints and your services architecture. We’ll cover what makes testing in EDA difficult; technologies that can help you; and how we at SmartBear are thinking about these testing problems and, most importantly, how we are trying to solve them.
Streaming Data in the Cloud with Confluent and MongoDB Atlas | Robert Walters... - HostedbyConfluent
Are you looking for a cloud-based architecture that includes best-of-breed streaming and database technologies? In this session you will learn how to set up and configure Confluent Cloud with MongoDB Atlas. We'll start the journey by learning about the basic connectivity between the two cloud services and end with a brief look at what you can do with data once it is in MongoDB Atlas. By the end of this session you will know how to securely set up and configure the MongoDB Atlas connectors in Confluent Cloud in both a source and a sink configuration.
Webinar: Serverless Architectures with AWS Lambda and MongoDB Atlas - MongoDB
It’s easier than ever to power serverless architectures with our managed MongoDB as a service, MongoDB Atlas. In this session, we will explore the rise of serverless architectures and how they’ve rapidly integrated into public and private cloud offerings.
With AWS Lambda, you can easily build scalable microservices for mobile, web, and IoT applications or respond to events from other AWS services without managing infrastructure. In this session, you’ll see demonstrations and hear more about newly launched features. We’ll show you how to use Lambda to build web, mobile, or IoT backends and voice-enabled apps, and we'll show you how to extend both AWS and third party services by triggering Lambda functions. We’ll also provide productivity and performance tips for getting the most out of your Lambda functions and show how cloud native architectures use Lambda to eliminate “cold servers” and excess capacity without sacrificing scalability or responsiveness.
Building Streaming Data Pipelines with Google Cloud Dataflow and Confluent Cl...HostedbyConfluent
We will demonstrate how easy it is to use Confluent Cloud as the data source of your Beam pipelines. You will learn how to process the information that comes from Confluent Cloud in real time, make transformations on such information and feed it back to your Kafka topics and other parts of your architecture.
How to Discover, Visualize, Catalog, Share and Reuse your Kafka Streams (Jona...HostedbyConfluent
As Kafka deployments grow within your organization, so do the challenges around lifecycle management. For instance, do you really know what streams exist, who is producing and consuming them? What is the effect of upstream changes? How is this information kept up to date, so it is relevant and consistent to others looking to reuse these streams? Ever wish you had a way to view and visualize graphically the relationships between schemas, topics and applications? In this talk we will show you how to do that and get more value from your Kafka Streaming infrastructure using an event portal. It’s like an API portal but specialized for event streams and publish/subscribe patterns. Join us to see how you can automatically discover event streams from your Kafka clusters, import them to a catalog and then leverage code gen capabilities to ease development of new applications.
Mind the App: How to Monitor Your Kafka Streams Applications | Bruno Cadonna,...HostedbyConfluent
You cannot operate what you cannot measure. In this talk, I am going to present the built-in metrics framework of Kafka Streams that supports monitoring Kafka Streams applications. You will learn how to setup monitoring of metrics for your Kafka Streams applications and you will hear about the following recent improvements to the metrics framework that aim to extend and simplify monitoring. KIP-444 aims to simplify and extend the built-in metrics framework. The RocksDB metrics introduced in KIP-471 and KIP-607 allow you to look directly into the built-in persistent state stores of your Kafka Streams applications. Finally, KIP-613 specifies metrics that measure end-to-end latencies in your applications. This talk will help you collect intel about the behavior of your Kafka Streams applications, and will allow you to reason about the deployment. In the end, you will be able to better understand your applications and run them in a more robust manner.
Data in Motion: Building Stream-Based Architectures with Qlik Replicate & Kaf...HostedbyConfluent
The challenge with today’s “data explosion” is finding the most appropriate answer to the question, “So where do I put my data?” while avoiding the longer-term problem: data warehouses, data lakes, cloud storage, NoSQL databases, … are often the places where “big” data goes to die.
Enter Physics 101, and my corollary to Newton’s First Law of Motion:
Data in motion tends to stay in motion until it comes to rest on disk. Similarly, if data is at rest, it will remain at rest until an external “force” puts it in motion again.
Data inevitably comes to rest at some point. Without “external forces”, data often gets lost or becomes stale where it lands. “Modern” architectures tend to involve data pipelines where downstream consumers of data make use of data generated upstream, often with built-for-purpose repositories at each stage. This session will explore how data that has come to rest can be put in motion again; how Kafka can keep it in motion longer; and how pipelined architectures might be created to make use of that data.
Streaming Data Lakes using Kafka Connect + Apache Hudi | Vinoth Chandar, Apac...HostedbyConfluent
Apache Hudi is a data lake platform that provides streaming primitives (upserts/deletes/change streams) on top of data lake storage. Hudi powers very large data lakes at Uber, Robinhood and other companies, while being pre-installed on four major cloud platforms.
Hudi supports exactly-once, near real-time data ingestion from Apache Kafka to cloud storage, which is typically used in place of an S3/HDFS sink connector to gain transactions and mutability. While this approach is scalable and battle-tested, it can only ingest data in mini batches, leading to lower data freshness. In this talk, we introduce a Kafka Connect Sink Connector for Apache Hudi, which writes data straight into Hudi's log format, making the data immediately queryable, while Hudi's table services like indexing, compaction, and clustering work behind the scenes to further re-organize the data for better query performance.
Exposing and Controlling Kafka Event Streaming with Kong Konnect Enterprise |...HostedbyConfluent
Event streaming allows companies to build more scalable and loosely coupled real-time applications supporting massive concurrency demands and simplifying the construction of services.
At the same time, API management provides capabilities to securely control the upstream services consumption, including the event processing infrastructure.
This session shows how Kong Konnect Enterprise can complement Kafka Event Streaming, exposing it to new and external consumers while applying specific and critical policies to control its consumption, including API key, OAuth/OIDC and others for authentication, rate limiting, caching, log processing, etc.
Billions of Messages in Real Time: Why Paypal & LinkedIn Trust an Engagement ...confluent
(Bruno Simic, Solutions Engineer, Couchbase)
Breakout during Confluent’s streaming event in Munich. This three-day hands-on course focused on how to build, manage, and monitor clusters using industry best-practices developed by the world’s foremost Apache Kafka™ experts. The sessions focused on how Kafka and the Confluent Platform work, how their main subsystems interact, and how to set up, manage, monitor, and tune your cluster.
Low-latency real-time data processing at giga-scale with Kafka | John DesJard...HostedbyConfluent
Data volumes continue to grow, demanding new, more scalable solutions for low-latency data processing. Previously, the default approach to deploying such systems was to throw a ton of hardware at the problem. However, that is no longer necessary, as newer technologies showcase a level of efficiency that enables smaller, more manageable clusters while handling extreme workloads. Processing billions of events per second on Kafka can now be done with a modest investment in compute resources. In this session, you will learn how to architect and build the fastest data processing applications that scale linearly, and combine streaming data and reference data (data-in-motion and data-at-rest) with machine learning. We will take you through the end-to-end framework and example application, built on the Hazelcast Platform, an open source software engine designed for ultra-fast performance. We will also show how you can leverage SQL to further explore the operational data in the solution, including querying Kafka topics and key-value data on the in-memory data store. Attendees will also get access to the GitHub sample application shown.
Building Event-Driven Microservices using Kafka Streams (Stathis Souris, Thou...London Microservices
Recorded at the London Microservices Meetup: https://www.meetup.com/London-Microservices/
- Date: 14th of October 2020
- Video: https://youtu.be/Arzr0T0hrCw
- Event page: https://www.meetup.com/London-Microservices/events/273266418/
Follow us on Twitter! https://twitter.com/LondonMicrosvc
---
Building Event-Driven Microservices using Kafka Streams
Stathis Souris, ThousandEyes
Streaming is all the rage these days, but can business systems be built using stream processing?
We'll explore this question by looking at Streaming Microservices using Kafka Streams.
We'll also discuss some of the patterns that we currently use in real-life production microservices at ThousandEyes (part of Cisco) and things to avoid.
Key takeaways:
- Basic Kafka concepts
- Kafka Streams
- Discuss various event-driven services built using Kafka Streams
Stathis spent several years in Athens, Greece, as a Software Engineer before moving to London and ThousandEyes (part of Cisco now).
He enjoys working with large distributed systems using technologies like Kafka, Elasticsearch, Java, Kotlin.
Druid + Kafka: transform your data-in-motion to analytics-in-motion | Gian Me...HostedbyConfluent
Apache Druid is a high-performance distributed analytics store for modern analytics applications. It supports ingesting millions of events per second and sub-second query processing. Druid supports various types of data sources for ingestion, including Apache Kafka. You can immediately query on stream events once they get ingested into Druid. Since Kafka provides scalable and robust data delivery while Druid supports advanced complex analysis on streams, Kafka and Druid are widely used together for BI and operational analytics use cases, which require interactivity, scalability, real-time, and performance.
This talk is based on our real-world experiences building out streaming analytics stacks powering production use cases across many industries.
DataOps Automation for a Kafka Streaming Platform (Andrew Stevenson + Spiros ...HostedbyConfluent
DataOps challenges us to build data experiences in a repeatable way. For those with Kafka, this means finding a means of deploying flows in an automated and consistent fashion.
The challenge is to make the deployment of Kafka flows consistent across different technologies and systems: the topics, the schemas, the monitoring rules, the credentials, the connectors, the stream processing apps. And ideally not coupled to a particular infrastructure stack.
In this talk we will discuss the different approaches and benefits/disadvantages to automating the deployment of Kafka flows including Git operators and Kubernetes operators. We will walk through and demo deploying a flow on AWS EKS with MSK and Kafka Connect using GitOps practices: including a stream processing application, S3 connector with credentials held in AWS Secrets Manager.
Why Cloud-Native Kafka Matters: 4 Reasons to Stop Managing it YourselfDATAVERSITY
With your most talented teams bogged down managing a massive Kafka deployment, it can be challenging to move the dial on projects that drive real value for your business. For example, launching your next major feature, fueling more best-in-breed services like AI/ML on your cloud provider platform, or developing your first use cases for real-time data movement across clouds. By shifting to a fully managed, cloud-native service for Kafka you can unlock your teams to work on the projects that make the best use of your data in motion.
In this webinar you will learn about:
• The increasing value of data in motion to your business
• Challenges and costs of self-managing a large-scale Kafka deployment
• Benefits of managed cloud services for non-core activities like data storage, data warehousing, and messaging
• Optimizing time usage for value-generating activity like new product launches
• Potential cost savings for your business with a cloud-native service for Kafka
Kafka in Context, Cloud, & Community (Simon Elliston Ball, Cloudera) Kafka Su...HostedbyConfluent
Kafka has fast become the center of streaming analytics applications in the modern digital enterprise. Kafka operates in the context of a broad ecosystem of data lifecycle components which need a consistent platform of security, monitoring, management and governance. This problem becomes paramount when your streaming architectures go hybrid by spanning from on-premises to the cloud. Throw in the reality of a multi-cloud setup that a lot of enterprises are facing and now, you have a complex streaming architecture that is difficult to operationally manage, monitor, secure or govern.
Cloudera remains committed to an open community driven approach and increasing the ease of use and visibility for Kafka based solutions. Attend this session to understand more about how streaming architectures can be extended easily to the hybrid cloud and multi-cloud. Also, learn about our plans for further community contributions.
Kubernetes connectivity to Cloud Native Kafka | Evan Shortiss and Hugo Guerre...HostedbyConfluent
If you want to build an ecosystem of streaming data to your Kafka platform, you will need a much easier way for your developers to quickly move what’s on the source to your cluster. Better yet, make the connector serverless so it does NOT waste any resources while idle, and have a trusted partner manage your Kafka infrastructure for you. In this session, we will show you how easy we have made streaming data, with a great user experience and flexible resource management through our new secret weapon in the Apache Camel project: Kamelet. We’ll also demonstrate how Red Hat OpenShift Streams for Apache Kafka simplifies provisioning Kafka deployments in a public cloud, managing the cluster and topics, and configuring secure access to the Kafka cluster for your developers.
Testing Event Driven Architectures: How to Broker the Complexity | Frank Kilc...HostedbyConfluent
Quality Matters … and as event-driven architectures (EDA) become increasingly popular in the microservices space, ensuring the delivery and performance of your EDA increases in importance. But while it’s a powerful architecture, it does come with its challenges, especially from a testing perspective. For example, most organizations are not reliant on Kafka alone, but on a multitude of interconnected APIs like REST, GraphQL and gRPC. One of the questions that arise from this challenge: How do you build end-to-end tests when the APIs are completely different technologies—without relying on fragile scripts? In our talk, we’ll tackle this question and many more when it comes to the testing of Apache Kafka endpoints and your services architecture. We’ll cover what makes testing in EDA difficult; technologies that can help you; and how we at SmartBear are thinking about these testing problems and, most importantly, how we are trying to solve them.
Streaming Data in the Cloud with Confluent and MongoDB Atlas | Robert Walters...HostedbyConfluent
Are you looking for a cloud-based architecture that includes the best of breed streaming and database technologies? In this session you will learn how to setup and configure the Confluent Cloud with MongoDB Atlas. We'll start the journey learning about the basic connectivity between the two cloud services and end with a brief discovery of what you can do with data once it is in MongoDB Atlas. By the end of this session you will know how to securely setup and configure the MongoDB Atlas connectors in the Confluent Cloud in both a source and sink configuration.
Webinar: Serverless Architectures with AWS Lambda and MongoDB AtlasMongoDB
It’s easier than ever to power serverless architectures with our managed MongoDB as a service, MongoDB Atlas. In this session, we will explore the rise of serverless architectures and how they’ve rapidly integrated into public and private cloud offerings.
With AWS Lambda, you can easily build scalable microservices for mobile, web, and IoT applications or respond to events from other AWS services without managing infrastructure. In this session, you’ll see demonstrations and hear more about newly launched features. We’ll show you how to use Lambda to build web, mobile, or IoT backends and voice-enabled apps, and we'll show you how to extend both AWS and third party services by triggering Lambda functions. We’ll also provide productivity and performance tips for getting the most out of your Lambda functions and show how cloud native architectures use Lambda to eliminate “cold servers” and excess capacity without sacrificing scalability or responsiveness.
Join us to learn about the state of serverless computing from Dr. Tim Wagner, General Manager of AWS Lambda. Dr. Wagner discusses the latest developments from AWS Lambda and the serverless computing ecosystem. He talks about how serverless computing is becoming a core component in how companies build and run their applications and services, and he also discusses how serverless computing will continue to evolve.
How to build and deploy serverless apps - AWS Summit Cape Town 2018Amazon Web Services
Speaker: Alex Casalboni, AWS
Customer Speaker: Impression Signatures
Serverless computing allows you to build and run applications without the need for provisioning or managing servers. It means that you can build web, mobile, and IoT backends, run stream processing or big data workloads, build chatbots, run code at the edge, and more. In this session, learn how to get started with serverless computing with AWS Lambda and managed services such as Amazon API Gateway, Amazon Kinesis, and Amazon DynamoDB. We introduce you to the basics of building with AWS Lambda, as well as how to properly perform CI/CD for your serverless application. We will discuss a method for automating the deployment of serverless applications using services such as AWS CodePipeline and AWS CodeBuild, and techniques such as canary deployments and automatic rollbacks.
Ever wished you had a list of cheat codes to unleash the full power of AWS Lambda for your production workload? Come learn how to build a robust, scalable, and highly available serverless application using AWS Lambda. In this session, we discuss hacks and tricks for maximizing your AWS Lambda performance, such as leveraging container reuse, using the 500 MB scratch space and local cache, creating custom metrics for managing operations, aligning upstream and downstream services to scale along with Lambda, and many other workarounds and optimizations across your entire function lifecycle.
You also learn how Hearst converted its real-time clickstream analytics data pipeline from a server-based model to a serverless one. The infrastructure of the data pipeline relied on Amazon EC2 instances and cron jobs to shepherd data through the process. In 2016, Hearst converted its data pipeline architecture to a serverless process that relies on event triggers and the power of AWS Lambda. By moving from a time-based process to a trigger-based process, Hearst improved its pipeline latency times by 50%.
Build and run applications without thinking about serversAmazon Web Services
Organizations need to gain insight and knowledge from a growing number of Internet of Things (IoT), API, clickstream, unstructured, and log data sources. However, organizations are often limited by legacy data warehouses and ETL processes that were designed for transactional data. In this session, we’ll introduce the key ETL features of AWS Glue through use cases ranging from scheduled nightly data warehouse loads to near real-time, event-driven ETL flows for your data lake. We’ll also discuss how to build scalable, efficient and serverless ETL pipelines using AWS Glue.
What is this serverless development thing? Which languages can I use? How do I do things like authentication, saving to a database, or sending notifications? Does it scale? In this session I will try to answer all of these questions, and a few more, with a short demo of building a very simple app and deploying it to the cloud without worrying about managing infrastructure. Talk first given at AlcarriaConf 2021.
12 Factor Serverless Applications - Mike Morain, AWS - Cloud Native Day Tel A...Cloud Native Day Tel Aviv
The “Twelve-Factor” application model has come to represent twelve best practices for building modern, cloud-native applications. With guidance on things like configuration, deployment, runtime, and multiple service communication, the Twelve-Factor model prescribes best practices that apply to everything from web applications to APIs to data processing applications. Although Serverless computing and AWS Lambda have changed how application development is done, the “Twelve-Factor” best practices remain relevant and applicable in a Serverless world. In this talk, we’ll apply the “Twelve-Factor” model to Serverless application development with AWS Lambda and Amazon API Gateway and show you how these services enable you to build scalable, low cost, and low administration applications.
Migrating your .NET Applications to the AWS Serverless PlatformAmazon Web Services
Windows and .NET-based workloads are first-class citizens on AWS. In this session, we show how you can easily move an existing .NET application to the AWS cloud and take advantage of its serverless capabilities. We will cover migration and architectural considerations for porting your C# application to AWS Lambda, and using API Gateway to create a façade for your application to safely make changes as you migrate.
Speakers:
Stephen Liedig, Public Sector Solutions Architect, Amazon Web Services
Shane Baldacchino, Solutions Architect, Amazon Web Services
SMC305 Building CI/CD Pipelines for Serverless ApplicationsAmazon Web Services
Continuous Integration and Continuous Delivery help developers rapidly and reliably release updates for their applications in a standardized and safe manner. The faster you can release new features and fix bugs, the quicker you can innovate and respond to customer needs. Serverless computing has changed the game for application development, including how to properly perform CI/CD for your application. AWS provides developer tools that help you automate the end-to-end lifecycle of your serverless application. In this session, we’ll discuss how to build multi-stage pipelines that let you build and test your application in an automated way using AWS CodePipeline and AWS CodeBuild. We’ll also cover the built-in capabilities of AWS Lambda and Amazon API Gateway that allow you to create multiple versions, stages, and environments for your serverless applications.
Data Con LA 2022 - Using Google trends data to build product recommendationsData Con LA
Mike Limcaco, Analytics Specialist / Customer Engineer at Google
Measure trends in a particular topic or search term on Google Search across the US, down to the city level. Integrate these data signals into analytic pipelines to drive product, retail, and media (video, audio, digital content) recommendations tailored to your audience segment. We'll discuss how Google's unique datasets can be used with Google Cloud smart analytic services to process, enrich and surface the most relevant product or content that matches the ever-changing interests of your local customer segment.
Melinda Thielbar, Data Science Practice Lead and Director of Data Science at Fidelity Investments
From corporations to governments to private individuals, most of the AI community has recognized the growing need to incorporate ethics into the development and maintenance of AI models. Much of the current discussion, though, is meant for leaders and managers. This talk is directed to data scientists, data engineers, ML Ops specialists, and anyone else who is responsible for the hands-on, day-to-day work of building, productionalizing, and maintaining AI models. We'll give a short overview of the business case for why technical AI expertise is critical to developing an AI Ethics strategy. Then we'll discuss the technical problems that cause AI models to behave unethically, how to detect problems at all phases of model development, and the tools and techniques that are available to support technical teams in Ethical AI development.
Data Con LA 2022 - Improving disaster response with machine learningData Con LA
Antje Barth, Principal Developer Advocate, AI/ML at AWS & Chris Fregly, Principal Engineer, AI & ML at AWS
The frequency and severity of natural disasters are increasing. In response, governments, businesses, nonprofits, and international organizations are placing more emphasis on disaster preparedness and response. Many organizations are accelerating their efforts to make their data publicly available for others to use. Repositories such as the Registry of Open Data on AWS and Humanitarian Data Exchange contain troves of data available for use by developers, data scientists, and machine learning practitioners. In this session, see how a community of developers came together through the AWS Disaster Response hackathon to build models to support natural disaster preparedness and response.
Data Con LA 2022 - What's new with MongoDB 6.0 and AtlasData Con LA
Sig Narvaez, Executive Solution Architect at MongoDB
MongoDB is now a Developer Data Platform. Come learn what's new in the 6.0 release and Atlas following all the recent announcements made at MongoDB World 2022. Topics will include
- Atlas Search which combines 3 systems into one (database, search engine, and sync mechanisms) letting you focus on your product's differentiation.
- Atlas Data Federation to seamlessly query, transform, and aggregate data from one or more MongoDB Atlas databases, Atlas Data Lake and AWS S3 buckets
- Queryable Encryption lets you run expressive queries on fully randomized encrypted data to meet the most stringent security requirements
- Relational Migrator which analyzes your existing relational schemas and helps you design a new MongoDB schema.
- And more!
Data Con LA 2022 - Real world consumer segmentationData Con LA
Jaysen Gillespie, Head of Analytics and Data Science at RTB House
1. Shopkick has over 30M downloads, but the userbase is very heterogeneous. Anecdotal evidence indicated a wide variety of users for whom the app holds long-term appeal.
2. Marketing and other teams challenged Analytics to get beyond basic summary statistics and develop a holistic segmentation of the userbase.
3. Shopkick's data science team used SQL and python to gather data, clean data, and then perform a data-driven segmentation using a k-means algorithm.
4. Interpreting the results is more work -- and more fun -- than running the algo itself. We'll discuss how we transform from "segment 1", "segment 2", etc. to something that non-analytics users (Marketing, Operations, etc.) could actually benefit from.
5. So what? How did teams across Shopkick change their approach given what Analytics had discovered?
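As a rough illustration of the k-means step mentioned in point 3, here is a minimal sketch in plain Python. It is not Shopkick's actual pipeline: the two features per user, the sample values, and the choice to seed centroids from the first k points are all assumptions made for this example.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means; the first k points seed the centroids (an assumption)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical per-user features: (sessions per week, rewards redeemed per month).
users = [(1, 0), (2, 1), (1, 1), (10, 8), (11, 9), (12, 8)]
labels, centroids = kmeans(users, k=2)
print(labels)
print(centroids)
```

The algorithm only returns numeric cluster ids. Turning those ids into segments that Marketing or Operations can act on still means inspecting each centroid and naming it, which is the interpretation work point 4 describes.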
Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...Data Con LA
Ravi Pillala, Chief Data Architect & Distinguished Engineer at Intuit
TurboTax is one of the well-known consumer software brands, serving 385K+ concurrent users at its peak. In this session, we start by looking at how user behavioral data and tax domain events are captured in real time using the event bus and analyzed to drive real-time personalization with various TurboTax data pipelines. We will also look at solutions performing analytics which make use of these events, with the help of Kafka, Apache Flink, Apache Beam, Spark, Amazon S3, Amazon EMR, Redshift, Athena and AWS Lambda functions. Finally, we look at how SageMaker is used to create the TurboTax model to predict if a customer is at risk or needs help.
Data Con LA 2022 - Moving Data at Scale to AWSData Con LA
George Mansoor, Chief Information Systems Officer at California State University
Overview of the CSU Data Architecture on moving on-prem ERP data to the AWS Cloud at scale using Delphix for Data Replication/Virtualization and AWS Data Migration Service (DMS) for data extracts
Data Con LA 2022 - Collaborative Data Exploration using Conversational AIData Con LA
Anand Ranganathan, Chief AI Officer at Unscrambl
Conversational AI is getting more and more widely used for customer support and employee support use-cases. In this session, I'm going to talk about how it can be extended for data analysis and data science use-cases ... i.e., how users can interact with a bot to ask analytical questions on data in relational databases.
This allows users to explore complex datasets using a combination of text and voice questions, in natural language, and then get back results in a combination of natural language and visualizations. Furthermore, it allows collaborative exploration of data by a group of users in a channel in platforms like Microsoft Teams, Slack or Google Chat.
For example, a group of users in a channel can ask questions to a bot in plain English like "How many cases of Covid were there in the last 2 months by state and gender" or "Why did the number of deaths from Covid increase in May 2022", and jointly look at the results that come back. This facilitates data awareness, data-driven collaboration and joint decision making among teams in enterprises and outside.
In this talk, I'll describe how we can bring together various features including natural-language understanding, NL-to-SQL translation, dialog management, data story-telling, semantic modeling of data and augmented analytics to facilitate collaborative exploration of data using conversational AI.
Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...Data Con LA
Anil Inamdar, VP & Head of Data Solutions at Instaclustr
The most modernized enterprises utilize polyglot architecture, applying the best-suited database technologies to each of their organization's particular use cases. To successfully implement such an architecture, though, you need a thorough knowledge of the expansive NoSQL data technologies now available.
Attendees of this Data Con LA presentation will come away with:
-- A solid understanding of the decision-making process that should go into vetting NoSQL technologies and how to plan out their data modernization initiatives and migrations.
-- They will learn the types of functionality that best match the strengths of NoSQL key-value stores, graph databases, columnar databases, document-type databases, time-series databases, and more.
-- Attendees will also understand how to navigate database technology licensing concerns, and to recognize the types of vendors they'll encounter across the NoSQL ecosystem. This includes sniffing out open-core vendors that may advertise as “open source,” but are driven by a business model that hinges on achieving proprietary lock-in.
-- Attendees will also learn to determine if vendors offer open-code solutions that apply restrictive licensing, or if they support true open source technologies like Hadoop, Cassandra, Kafka, OpenSearch, Redis, Spark, and many more that offer total portability and true freedom of use.
Data Con LA 2022 - Intro to Data ScienceData Con LA
Zia Khan, Computer Systems Analyst and Data Scientist at LearningFuze
Data Science tutorial is designed for people who are new to Data Science. This is a beginner level session so no prior coding or technical knowledge is required. Just bring your laptop with WiFi capability. The session starts with a review of what is data science, the amount of data we generate and how companies are using that data to get insight. We will pick a business use case, define the data science process, followed by hands-on lab using python and Jupyter notebook. During the hands-on portion we will work with pandas, numpy, matplotlib and sklearn modules and use a machine learning algorithm to approach the business use case.
Data Con LA 2022 - How are NFTs and DeFi Changing EntertainmentData Con LA
Mariana Danilovic, Managing Director at Infiom, LLC
We will address:
(1) Community creation and engagement using tokens and NFTs
(2) Organization of DAO structures and ways to incentivize Web3 communities
(3) DeFi business models applied to Web3 ventures
(4) Why Metaverse matters for new entertainment and community engagement models.
Data Con LA 2022 - Why Data Quality vigilance requires an End-to-End, Automat...Data Con LA
Curtis ODell, Global Director Data Integrity at Tricentis
Join me to learn about a new end-to-end data testing approach designed for modern data pipelines that fills dangerous gaps left by traditional data management tools—one designed to handle structured and unstructured data from any source. You'll hear how you can use unique automation technology to reach up to 90 percent test coverage rates and deliver trustworthy analytical and operational data at scale. Several real world use cases from major banks/finance, insurance, health analytics, and Snowflake examples will be presented.
Key Learning Objective
1. Data journeys are complex and you have to ensure integrity of the data end to end across this journey from source to end reporting for compliance
2. Data Management tools do not test data, they profile and monitor at best, and leave serious gaps in your data testing coverage
3. Automation with integration to DevOps and DataOps' CI/CD processes is key to solving this.
4. How this approach has impact in your vertical
Data Con LA 2022-Perfect Viral Ad prediction of Superbowl 2022 using Tease, T...Data Con LA
Arif Ansari, Professor at University of Southern California
A Super Bowl ad costs $7 million, and each year a few Super Bowl ads go viral. Traditional A/B testing does not predict virality. Some highly shared ones reach over 60 million organic views, which can be more valuable than views on TV. Not only are these voluntary, but they are typically without distraction, and win viewer engagement in the form of likes, comments, or shares. A Super Bowl ad that wins 69 million views on YouTube (e.g., Alexa Mind Reader) costs less than 10 cents per quality view! However, the challenge is triggering virality. We developed a method to predict virality and engineer virality into Ads.
1. Prof. Gerard J. Tellis and co-authors recommended that advertisers use YouTube to tease, test, and tweak (TTT) their ads to maximize sharing and viewing. 2022 saw that maxim put into practice.
2. We developed viral Ads prediction using two scientific models:
a. Prof. Gerard Tellis et al.'s model for viral prediction
b. Deep Learning viral prediction using social media effect
3. The model identified all of the top 15 viral ads and performed better than the traditional agencies.
4. New proposed method is Tease, Test, Tweak, Target and Spots Ad.
Data Con LA 2022- Embedding medical journeys with machine learning to improve...Data Con LA
Jai Bansal, Senior Manager, Data Science at Aetna
This talk describes an internal data product called Member Embeddings that facilitates modeling of member medical journeys with machine learning.
Medical claims are the key data source we use to understand health journeys at Aetna. Claims are the data artifacts that result from our members' interactions with the healthcare system. Claims contain data like the amount the provider billed, the place of service, and provider specialty. The primary medical information in a claim is represented in codes that indicate the diagnoses, procedures, or drugs for which a member was billed. These codes give us a semi-structured view into the medical reason for each claim and so contain rich information about members' health journeys. However, since the codes themselves are categorical and high-dimensional (10K cardinality), it's challenging to extract insight or predictive power directly from the raw codes on a claim.
To transform claim codes into a more useful format for machine learning, we turned to the concept of embeddings. Word embeddings are widely used in natural language processing to provide numeric vector representations of individual words.
We use a similar approach with our claims data. We treat each claim code as a word or token and use embedding algorithms to learn lower-dimensional vector representations that preserve the original high-dimensional semantic meaning.
This process converts the categorical features into dense numeric representations. In our case, we use sequences of anonymized member claim diagnosis, procedure, and drug codes as training data. We tested a variety of algorithms to learn embeddings for each type of claim code.
We found that the trained embeddings showed relationships between codes that were reasonable from the point of view of subject matter experts. In addition, using the embeddings to predict future healthcare-related events outperformed other basic features, making this tool an easy way to improve predictive model performance and save data scientist time.
Data Con LA 2022 - Data Streaming with KafkaData Con LA
Jie Chen, Manager Advisory, KPMG
Data is the new oil. However, many organizations have fragmented data in siloed lines of business. In this talk, we will focus on identifying the legacy patterns and their limitations and introducing the new patterns backed by Kafka's core design ideas. The goal is to tirelessly pursue better solutions that help organizations overcome bottlenecks in their data pipelines and modernize their digital assets so they are ready to scale their businesses. In summary, we will walk through three use cases and recommend dos and don'ts and takeaways for data engineers, data scientists, and data architects developing cutting-edge data-oriented skills.
Securing your Kubernetes cluster_ a step-by-step guide to success !KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
JMeter webinar - integration with InfluxDB and GrafanaRTTS
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overview
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring of JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
Key Trends Shaping the Future of Infrastructure.pdfCheryl Hung
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
State of ICS and IoT Cyber Threat Landscape Report 2024 previewPrayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio's cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors, and newer malware including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
Connector Corner: Automate dynamic content and events by pushing a buttonDianaGray10
Here is something new! In our next Connector Corner webinar, we will demonstrate how you can use a single workflow to:
Create a campaign using Mailchimp with merge tags/fields
Send an interactive Slack channel message (using buttons)
Have the message received by managers and peers along with a test email for review
But there’s more:
In a second workflow supporting the same use case, you’ll see:
Your campaign sent to target colleagues for approval
If the “Approve” button is clicked, a Jira/Zendesk ticket is created for the marketing design team
But—if the “Reject” button is pushed, colleagues will be alerted via Slack message
Join us to learn more about this new, human-in-the-loop capability, brought to you by Integration Service connectors.
And...
Speakers:
Akshay Agnihotri, Product Manager
Charlie Greenberg, Host
DevOps and Testing slides at DASA ConnectKari Kakkonen
Slides by Rik Marselis and me from the DASA Connect conference on 30.5.2024. We discuss what testing is, then what agile testing is, and finally what testing in DevOps is. We closed with a lovely workshop in which the participants explored different ways to think about quality and testing in different parts of the DevOps infinity loop.
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Tobias Schneck
As AI technology pushes into IT, I was wondering, as an “infrastructure container Kubernetes guy”, how this fancy AI technology can be managed from an infrastructure operations point of view. Is it possible to apply our beloved cloud native principles as well? What benefits could the two technologies bring to each other?
Let me take these questions and offer you a short journey through existing deployment models and use cases for AI software. Using practical examples, we discuss which cloud/on-premise strategy we may need to apply it to our own infrastructure and make it work from an enterprise perspective. I want to give an overview of the infrastructure requirements and technologies that could benefit or limit your AI use cases in an enterprise environment. An interactive demo will give you some insights into the approaches I already have working for real.
UiPath Test Automation using UiPath Test Suite series, part 4DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques.
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Essentials of Automations: Optimizing FME Workflows with ParametersSafe Software
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview, including the concepts of Customer Key and Double Key Encryption.
5. 1. Cloud services have matured
2. BaaS → “SaaS-ification”
3. APIs are the glue
4. Containers – now per function
5. SysOps → DevOps → NoOps
Less Ops, More Engineering
Forbes: 5 factors fuelling Serverless Computing
https://www.forbes.com/sites/janakirammsv/2016/02/28/five-factors-that-are-fueling-serverless-computing-part-1
@SigNarvaez | @MongoDB | @BigDataDayLA
13. Customer Single View - Insurance Industry (hypothetical)
Shape
• Person
• Insurance Policies
• Shape changes per policy type
• Addresses
Operations via API
• GET Customers with soon-to-expire policies, within a geo radius
• GET Customers / by SSN, id, etc.
• PATCH Update basic contact info (cell, email, …)
High-level architecture of a single view platform
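The first GET operation on this slide (soon-to-expire policies within a geo radius) could be expressed as a MongoDB query filter. The sketch below is only an illustration, not the speaker's code: the field names `policies`, `expires`, and `address.location` and the helper `soon_to_expire_near` are hypothetical, and it assumes a 2dsphere index on the location field, which `$nearSphere` requires.

```python
from datetime import datetime, timedelta


def soon_to_expire_near(lon, lat, radius_m, now, days=30):
    """Build a filter for customers with a policy expiring within `days`
    whose address lies within `radius_m` meters of (lon, lat)."""
    cutoff = now + timedelta(days=days)
    return {
        # $elemMatch makes both bounds apply to the same policy sub-document.
        "policies": {"$elemMatch": {"expires": {"$gte": now, "$lte": cutoff}}},
        # GeoJSON points are [longitude, latitude]; $maxDistance is in meters.
        "address.location": {
            "$nearSphere": {
                "$geometry": {"type": "Point", "coordinates": [lon, lat]},
                "$maxDistance": radius_m,
            }
        },
    }


# Hypothetical call: 5 km around a point in Los Angeles.
query = soon_to_expire_near(-118.2437, 34.0522, 5000, datetime(2017, 8, 5))
```

With PyMongo, a filter like this would be passed to `collection.find(query)`.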
14. CQRS on Serverless Microservices
COMMAND: Update basic info
QUERY: Soon to Expire, GEO, By SSN, ID, …
CUD API Key → Lambda Function(s) → Majority Writes
Read API Key → Lambda Function(s) → Secondary Reads
VPC Peering
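One way to realize the command/query split on this slide is to give the two sets of Lambda functions different connection settings. The following is a minimal sketch under assumptions: the base URI and the helper `with_options` are hypothetical, and `secondaryPreferred` is chosen here (rather than strict `secondary`) so that reads still succeed when no secondary is reachable. `w` and `readPreference` are standard MongoDB connection string options.

```python
from urllib.parse import urlencode


def with_options(base_uri, options):
    """Append connection string options to a MongoDB URI."""
    separator = "&" if "?" in base_uri else "?"
    return base_uri + separator + urlencode(options)


# Hypothetical base URI; a real one would point at the Atlas cluster.
BASE_URI = "mongodb://cluster.example.net/singleview"

# Command side: writes are acknowledged by a majority of replica set members.
COMMAND_URI = with_options(BASE_URI, {"w": "majority"})

# Query side: reads are routed to secondaries when one is available.
QUERY_URI = with_options(BASE_URI, {"readPreference": "secondaryPreferred"})
```

Keeping the two URIs separate means the command functions and the query functions can each be deployed with only the setting they need.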
19. Required AWS Services
IAM
• Role with Lambda execute policies
VPC
• VPC Peering Connection
• Security Group
Lambda
• Set VPC, Security Group and IAM role
• Upload deployment package (.zip)
API Gateway
• API definition (Resources & HTTP Methods)
• Map Routes to Lambda functions
• API Keys & Usage Plans
28. Connections & Containers
http://docs.aws.amazon.com/lambda/latest/dg/lambda-introduction.html
… AWS Lambda maintains the container for some time in anticipation of another Lambda function invocation. … the service freezes the container after a function completes, and thaws the container for reuse. If AWS Lambda chooses to reuse the container, this has the following implications:
- Any declarations in your Lambda function code (outside the handler code, see Programming Model) remains initialized, providing additional optimization when the function is invoked again. For example, if your Lambda function establishes a database connection, instead of reestablishing the connection, the original connection is used in subsequent invocations. You can add logic in your code to check if a connection already exists before creating one.
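The "check if a connection already exists" advice quoted above can be sketched as follows. This is a minimal illustration rather than the speaker's code: the `MONGODB_URI` environment variable is an assumed name, and the `connect` parameter is a hypothetical hook that lets the pattern be exercised without a database.

```python
import os

# Declared outside the handler, so it stays initialized while the
# container is reused across invocations.
_client = None


def get_client(connect=None):
    """Return the cached client, creating it only on first use."""
    global _client
    if _client is None:
        if connect is None:
            # Imported lazily so the sketch loads without the driver installed.
            from pymongo import MongoClient
            connect = lambda: MongoClient(os.environ["MONGODB_URI"])
        _client = connect()
    return _client


def handler(event, context, connect=None):
    """Lambda entry point; `connect` is only a hook for local testing."""
    client = get_client(connect)
    # ... use `client` to serve the request ...
    return {"statusCode": 200}
```

On a cold start the handler pays the connection cost once; warm invocations in the same container reuse `_client`.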
31. Local Emulators : Test on EC2 instance against Atlas
python-lambda-local at https://pypi.python.org/pypi/python-lambda-local
lambda-local (node.js) at https://www.npmjs.com/package/lambda-local
34. Upload & configure function
The handler function
The role with lambda permissions
The VPC (peered with Atlas)
The security group that allows traffic
At least 2 subnets
50. Scaling?
Scaling Lambda
No user intervention required - Default safety throttle of 100 concurrent executions per account per region.
Functions invoked synchronously throw 429 error code.
Functions invoked asynchronously can absorb reasonable bursts for approx. 15-30 minutes. If exhausted, consider using Simple Queue Service (SQS) or Simple Notification Service (SNS) as the Dead Letter Queue (DLQ).
Read more at https://aws.amazon.com/lambda/faqs/
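Because synchronous callers see a 429 when the throttle is hit, a client may want to retry with backoff. The sketch below shows one possible approach under assumptions: `Throttled` and `call_with_backoff` are hypothetical names, and the caller's own invoke wrapper is assumed to raise `Throttled` when it receives a 429.

```python
import random
import time


class Throttled(Exception):
    """Hypothetical error raised by the caller's invoke wrapper on HTTP 429."""


def call_with_backoff(invoke, max_attempts=5, base_delay=0.1, sleep=time.sleep):
    """Call `invoke`, retrying with exponential backoff and jitter on Throttled."""
    for attempt in range(max_attempts):
        try:
            return invoke()
        except Throttled:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the throttle to the caller
            # Delay doubles each attempt; jitter spreads out competing clients.
            sleep(base_delay * (2 ** attempt) * random.uniform(0.5, 1.0))
```

The `sleep` parameter defaults to `time.sleep` and exists so the retry logic can be tested without waiting.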
Scaling MongoDB Atlas
On-Demand
Zero downtime
Upscale/Downscale:
• Instance size
• Storage size
• IOPS
• Replication factor.
51. Serverless Architectures with AWS Lambda and MongoDB Atlas
Sig Narváez
Sr. Solutions Architect
sig@mongodb.com
@SigNarvaez
Q & A
https://github.com/snarvaez/bddla17
https://resources.mongodb.com/serverless-architectures