Apache Ignite is a high-performance, integrated and distributed in-memory platform for computing and transacting on large-scale data sets in real-time, orders of magnitude faster than possible with traditional disk-based or flash technologies.
This session introduces the Kafka Connector for Redis Enterprise. During the presentation we will discover how Kafka in combination with the multi-model database platform from Redis Labs opens up new possibilities for developers. We will conclude this presentation with a live demo of the connector that showcases the multiple database models enabled by the Redis Enterprise Kafka connector.
Designing Apache Hudi for Incremental Processing With Vinoth Chandar and Etha...HostedbyConfluent
Designing Apache Hudi for Incremental Processing With Vinoth Chandar and Ethan Guo | Current 2022
Back in 2016, Apache Hudi brought transactions, change capture on top of data lakes, what is today referred to as the Lakehouse architecture. In this session, we first introduce Apache Hudi and the key technology gaps it fills in the modern data architecture. Bridging traditional data lakes and warehouses, Hudi helps realize the Lakehouse vision, by bringing transactions, optimized table metadata to data lakes and powerful storage layout optimizations, moving them closer to cloud warehouses of today. Viewed from a data engineering lens, Hudi also plays a key unifying role between the batch and stream processing worlds, by acting as a columnar, server-less ""state store"" for batch jobs, ushering in what we call the incremental processing model, where batch jobs can consume new data, update/delete intermediate results in a Hudi table, instead of re-computing/re-write entire output like old-school big batch jobs.
Rest of talk focusses on a deep dive into the some of the time-tested design choices and tradeoffs in Hudi, that helps power some of the largest transactional data lakes on the planet today. We will start by describing a tour of the storage format design, including data, metadata layouts and of course Hudi's timeline, an event log that is central to implementing ACID transactions and concurrency control. We will delve deeper into the practical concurrency control pitfalls in data lakes, and show how Hudi's hybrid approach combining MVCC with optimistic concurrency control, lowers contention and unlocks minute-level near real-time commits to Hudi tables. We will conclude with code examples that showcase Hudi's rich set of table services that perform vital table management such as cleaning older file versions, compaction of delta logs into base files, dynamic re-clustering for faster query performance, or the more recently introduced indexing service that maintains Hudi's multi-modal indexing capabilities.
Cluster computing frameworks such as Hadoop or Spark are tremendously beneficial in processing and deriving insights from data. However, long query latencies make these frameworks sub-optimal choices to power interactive applications. Organizations frequently rely on dedicated query layers, such as relational databases and key/value stores, for faster query latencies, but these technologies suffer many drawbacks for analytic use cases. In this session, we discuss using Druid for analytics and why the architecture is well suited to power analytic applications.
User-facing applications are replacing traditional reporting interfaces as the preferred means for organizations to derive value from their datasets. In order to provide an interactive user experience, user interactions with analytic applications must complete in an order of milliseconds. To meet these needs, organizations often struggle with selecting a proper serving layer. Many serving layers are selected because of their general popularity without understanding the possible architecture limitations.
Druid is an analytics data store designed for analytic (OLAP) queries on event data. It draws inspiration from Google’s Dremel, Google’s PowerDrill, and search infrastructure. Many enterprises are switching to Druid for analytics, and we will cover why the technology is a good fit for its intended use cases.
Speaker
Nishant Bangarwa, Software Engineer, Hortonworks
What is elastic data warehousing, and how does Snowflake uniquely enable it? Learn about the requirements needed to support flexible, elastic data warehousing using cloud infrastructure.
Keeping Spark on Track: Productionizing Spark for ETLDatabricks
ETL is the first phase when building a big data processing platform. Data is available from various sources and formats, and transforming the data into a compact binary format (Parquet, ORC, etc.) allows Apache Spark to process it in the most efficient manner. This talk will discuss common issues and best practices for speeding up your ETL workflows, handling dirty data, and debugging tips for identifying errors.
Speakers: Kyle Pistor & Miklos Christine
This talk was originally presented at Spark Summit East 2017.
Hot Topics: The DuraSpace Community Webinar Series,
“Introducing DSpace 7: Next Generation UI”
Curated by Claire Knowles, Library Digital Development Manager, The University of Edinburgh.
Introducing DSpace 7
February 28, 2017 presented by: Claire Knowles - The University of Edinburgh, Art Lowel - Atmire, Andrea Bollini - 4Science, Tim Donohue – DuraSpace
Every day, businesses across a wide variety of industries share data to support insights that drive efficiency and new business opportunities. However, existing methods for sharing data involve great effort on the part of data providers to share data, and involve great effort on the part of data customers to make use of that data.
However, existing approaches to data sharing (such as e-mail, FTP, EDI, and APIs) have significant overhead and friction. For one, legacy approaches such as e-mail and FTP were never intended to support the big data volumes of today. Other data sharing methods also involve enormous effort. All of these methods require not only that the data be extracted, copied, transformed, and loaded, but also that related schemas and metadata must be transported as well. This creates a burden on data providers to deconstruct and stage data sets. This burden and effort is mirrored for the data recipient, who must reconstruct the data.
As a result, companies are handicapped in their ability to fully realize the value in their data assets.
Snowflake Data Sharing allows companies to grant instant access to ready-to-use data to any number of partners or data customers without any data movement, copying, or complex pipelines.
Using Snowflake Data Sharing, companies can derive new insights and value from data much more quickly and with significantly less effort than current data sharing methods. As a result, companies now have a new approach and a powerful new tool to get the full value out of their data assets.
This session introduces the Kafka Connector for Redis Enterprise. During the presentation we will discover how Kafka in combination with the multi-model database platform from Redis Labs opens up new possibilities for developers. We will conclude this presentation with a live demo of the connector that showcases the multiple database models enabled by the Redis Enterprise Kafka connector.
Designing Apache Hudi for Incremental Processing With Vinoth Chandar and Etha...HostedbyConfluent
Designing Apache Hudi for Incremental Processing With Vinoth Chandar and Ethan Guo | Current 2022
Back in 2016, Apache Hudi brought transactions, change capture on top of data lakes, what is today referred to as the Lakehouse architecture. In this session, we first introduce Apache Hudi and the key technology gaps it fills in the modern data architecture. Bridging traditional data lakes and warehouses, Hudi helps realize the Lakehouse vision, by bringing transactions, optimized table metadata to data lakes and powerful storage layout optimizations, moving them closer to cloud warehouses of today. Viewed from a data engineering lens, Hudi also plays a key unifying role between the batch and stream processing worlds, by acting as a columnar, server-less ""state store"" for batch jobs, ushering in what we call the incremental processing model, where batch jobs can consume new data, update/delete intermediate results in a Hudi table, instead of re-computing/re-write entire output like old-school big batch jobs.
Rest of talk focusses on a deep dive into the some of the time-tested design choices and tradeoffs in Hudi, that helps power some of the largest transactional data lakes on the planet today. We will start by describing a tour of the storage format design, including data, metadata layouts and of course Hudi's timeline, an event log that is central to implementing ACID transactions and concurrency control. We will delve deeper into the practical concurrency control pitfalls in data lakes, and show how Hudi's hybrid approach combining MVCC with optimistic concurrency control, lowers contention and unlocks minute-level near real-time commits to Hudi tables. We will conclude with code examples that showcase Hudi's rich set of table services that perform vital table management such as cleaning older file versions, compaction of delta logs into base files, dynamic re-clustering for faster query performance, or the more recently introduced indexing service that maintains Hudi's multi-modal indexing capabilities.
Cluster computing frameworks such as Hadoop or Spark are tremendously beneficial in processing and deriving insights from data. However, long query latencies make these frameworks sub-optimal choices to power interactive applications. Organizations frequently rely on dedicated query layers, such as relational databases and key/value stores, for faster query latencies, but these technologies suffer many drawbacks for analytic use cases. In this session, we discuss using Druid for analytics and why the architecture is well suited to power analytic applications.
User-facing applications are replacing traditional reporting interfaces as the preferred means for organizations to derive value from their datasets. In order to provide an interactive user experience, user interactions with analytic applications must complete in an order of milliseconds. To meet these needs, organizations often struggle with selecting a proper serving layer. Many serving layers are selected because of their general popularity without understanding the possible architecture limitations.
Druid is an analytics data store designed for analytic (OLAP) queries on event data. It draws inspiration from Google’s Dremel, Google’s PowerDrill, and search infrastructure. Many enterprises are switching to Druid for analytics, and we will cover why the technology is a good fit for its intended use cases.
Speaker
Nishant Bangarwa, Software Engineer, Hortonworks
What is elastic data warehousing, and how does Snowflake uniquely enable it? Learn about the requirements needed to support flexible, elastic data warehousing using cloud infrastructure.
Keeping Spark on Track: Productionizing Spark for ETLDatabricks
ETL is the first phase when building a big data processing platform. Data is available from various sources and formats, and transforming the data into a compact binary format (Parquet, ORC, etc.) allows Apache Spark to process it in the most efficient manner. This talk will discuss common issues and best practices for speeding up your ETL workflows, handling dirty data, and debugging tips for identifying errors.
Speakers: Kyle Pistor & Miklos Christine
This talk was originally presented at Spark Summit East 2017.
Hot Topics: The DuraSpace Community Webinar Series,
“Introducing DSpace 7: Next Generation UI”
Curated by Claire Knowles, Library Digital Development Manager, The University of Edinburgh.
Introducing DSpace 7
February 28, 2017 presented by: Claire Knowles - The University of Edinburgh, Art Lowel - Atmire, Andrea Bollini - 4Science, Tim Donohue – DuraSpace
Every day, businesses across a wide variety of industries share data to support insights that drive efficiency and new business opportunities. However, existing methods for sharing data involve great effort on the part of data providers to share data, and involve great effort on the part of data customers to make use of that data.
However, existing approaches to data sharing (such as e-mail, FTP, EDI, and APIs) have significant overhead and friction. For one, legacy approaches such as e-mail and FTP were never intended to support the big data volumes of today. Other data sharing methods also involve enormous effort. All of these methods require not only that the data be extracted, copied, transformed, and loaded, but also that related schemas and metadata must be transported as well. This creates a burden on data providers to deconstruct and stage data sets. This burden and effort is mirrored for the data recipient, who must reconstruct the data.
As a result, companies are handicapped in their ability to fully realize the value in their data assets.
Snowflake Data Sharing allows companies to grant instant access to ready-to-use data to any number of partners or data customers without any data movement, copying, or complex pipelines.
Using Snowflake Data Sharing, companies can derive new insights and value from data much more quickly and with significantly less effort than current data sharing methods. As a result, companies now have a new approach and a powerful new tool to get the full value out of their data assets.
Splunk for JMX App overview (configuration, deployment, tips and tricks). Developing JMX logic in your application. Splunking other JVM logs and profiling traces. The JVM application landscape and why it's such a rich source of Splunkable machine data. Developing new Splunkbase apps to leverage Splunk for JMX.
Using Time Window Compaction Strategy For Time Series WorkloadsJeff Jirsa
Cassandra is a great fit for high write use cases, which makes it a popular choice for storing time series and sensor-collection workloads. At Crowdstrike, we've been using Cassandra for just that purpose, collecting petabytes of expiring time series data. In this talk, I'll discuss compaction in time series workloads, and the TimeWindowCompactionStrategy we developed specifically for this purpose. I'll detail TWCS specific configuration properties, some lesser known compaction sub-properties that apply to all compaction strategies, and also cover other general tricks and tuning that are useful for very large time-series workloads.
Azure Synapse Analytics is Azure SQL Data Warehouse evolved: a limitless analytics service, that brings together enterprise data warehousing and Big Data analytics into a single service. It gives you the freedom to query data on your terms, using either serverless on-demand or provisioned resources, at scale. Azure Synapse brings these two worlds together with a unified experience to ingest, prepare, manage, and serve data for immediate business intelligence and machine learning needs. This is a huge deck with lots of screenshots so you can see exactly how it works.
Serverless Kafka and Spark in a Multi-Cloud Lakehouse ArchitectureKai Wähner
Apache Kafka in conjunction with Apache Spark became the de facto standard for processing and analyzing data. Both frameworks are open, flexible, and scalable.
Unfortunately, the latter makes operations a challenge for many teams. Ideally, teams can use serverless SaaS offerings to focus on business logic. However, hybrid and multi-cloud scenarios require a cloud-native platform that provides automated and elastic tooling to reduce the operations burden.
This session explores different architectures to build serverless Apache Kafka and Apache Spark multi-cloud architectures across regions and continents.
We start from the analytics perspective of a data lake and explore its relation to a fully integrated data streaming layer with Kafka to build a modern data Data Lakehouse.
Real-world use cases show the joint value and explore the benefit of the "delta lake" integration.
Pinterest’s Story of Streaming Hundreds of Terabytes of Pins from MySQL to S3...confluent
(Henri Cai, Pinterest) Kafka Summit SF 2018
With the rise of large-scale real-time computation, there is a growing need to link legacy MySQL systems with real-time platforms. Pinterest has a hundred billion pins stored in MySQL at the scale of a 100TB and most of this data is needed for building data-driven products for machine learning and data analytics.
This talk discusses how Pinterest designed and built a continuous database (DB) ingestion system for moving MySQL data into near-real-time computation pipelines with only 15 minutes of latency to support our dynamic personalized recommendations and search indices. Pinterest helps people discover and do things that they love. We have billions of core objects (pins/boards/users) stored in MySQL at the scale of 100TB. All this data needs to be ingested onto S3/Hadoop for machine learning and data analytics. As Pinterest is moving towards real-time computation, we are facing a stringent service-level agreement requirement such as making the MySQL data available on S3/Hadoop within 15 minutes, and serving the DB data incrementally in stream processing. We designed WaterMill: a continuous DB ingestion system to listen for MySQL binlog changes, publish the MySQL changelogs as an Apache Kafka® change stream and ingest and compact the stream into Parquet columnar tables in S3/Hadoop within 15 minutes.
We would like to share how we solved the problem of:
-Scalable data partitioning, efficient compaction algorithm
-Stories on schema migration, rewind and recovery
-PII (personally identifiable information) processing
-Columnar storage for efficient incremental query
-How the DB change stream powers other use cases such as cache invalidation in multi-datacenter
-How we deal with the issue of S3 eventual consistency and rate limiting; related technologies: Apache Kafka, stream processing, MySQL binlog processing, Amazon S3, Hadoop and Parquet columnar storage
A 30 day plan to start ending your data struggle with SnowflakeSnowflake Computing
Organizations everywhere are struggling to load, integrate, analyze and collaborate with data. This is largely thanks to their antiquated data platform, designed in a time when few people had the desire or need to interact with the database. Snowflake, the data warehouse built for the cloud, can help.
How Hess Has Continued to Optimize the AWS Cloud After Migrating - ENT218 - r...Amazon Web Services
Hess Corporation is a leading global independent energy company engaged in exploration for and production of crude oil and natural gas. Early in Hess's journey to the cloud, they operated the AWS platform in a manner similar to how they operated their on-premises data centers, creating a number of challenges. In this session, Hess Corporation discusses how they worked to further optimize their use of the AWS Cloud following their data center migration. They also cover technical strategies implemented to improve security, governance, and financial reporting and examine changes to their corporate culture that encourage innovation while improving cost controls.
Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse service that makes it simple and cost-effective to efficiently analyze all your data using your existing business intelligence tools. You can start small for just $0.25 per hour with no commitment or upfront costs and scale to a petabyte or more for $1,000 per terabyte per year, less than a tenth of most other data warehousing solutions.
In this Masterclass presentation we will:
• Explore the architecture and fundamental characteristics of Amazon Redshift
• Show you how to launch Redshift clusters and to load data into them
• Explain out how to use the AWS Console to monitor and manage Redshift clusters
• Help you to discover best practices and other resources to help you get the most from Redshift
Watch the recording here: http://youtu.be/-FmCWcxRvXY
Azure Synapse Analytics is Azure SQL Data Warehouse evolved: a limitless analytics service, that brings together enterprise data warehousing and Big Data analytics into a single service. It gives you the freedom to query data on your terms, using either serverless on-demand or provisioned resources, at scale. Azure Synapse brings these two worlds together with a unified experience to ingest, prepare, manage, and serve data for immediate business intelligence and machine learning needs. This is a huge deck with lots of screenshots so you can see exactly how it works.
Building Event Streaming Architectures on Scylla and KafkaScyllaDB
Event streaming architectures require high-throughput, low-latency components to consistently and smoothly transfer data between heterogenous transactional and analytical systems. Join us and Confluent's Tim Berglund to learn how the Scylla and Confluent Kafka interoperate as a foundation upon which you can build enterprise-grade, event-driven applications, plus a use case from Numberly.
Apache Kafka is the de facto standard for data streaming to process data in motion. With its significant adoption growth across all industries, I get a very valid question every week: When NOT to use Apache Kafka? What limitations does the event streaming platform have? When does Kafka simply not provide the needed capabilities? How to qualify Kafka out as it is not the right tool for the job?
This session explores the DOs and DONTs. Separate sections explain when to use Kafka, when NOT to use Kafka, and when to MAYBE use Kafka.
No matter if you think about open source Apache Kafka, a cloud service like Confluent Cloud, or another technology using the Kafka protocol like Redpanda or Pulsar, check out this slide deck.
A detailed article about this topic:
https://www.kai-waehner.de/blog/2022/01/04/when-not-to-use-apache-kafka/
Databricks is a Software-as-a-Service-like experience (or Spark-as-a-service) that is a tool for curating and processing massive amounts of data and developing, training and deploying models on that data, and managing the whole workflow process throughout the project. It is for those who are comfortable with Apache Spark as it is 100% based on Spark and is extensible with support for Scala, Java, R, and Python alongside Spark SQL, GraphX, Streaming and Machine Learning Library (Mllib). It has built-in integration with many data sources, has a workflow scheduler, allows for real-time workspace collaboration, and has performance improvements over traditional Apache Spark.
Docker Inside/Out: the ‘real’ real-world of stacking containers in production...Codemotion
So you’ve already containerized the shit out of your code, broken down monoliths, microserviced the hell out of your app and have run some awesome workloads in your local, dev and test environments. It’s all looking good, but now what? Running Docker commands is one thing, but maintaining containers in production is a whole other ballgame. So, during this talk, I’ll show you the REAL wild world of Docker in production. With the added benefit of talking to and observing how over 900 of our customers have been using Docker in production.
S3, Cassandra or Outer Space? Dumping Time Series Data using Spark - Demi Be...Codemotion
Vast volume of our processed data is Time Series data and once you start working with distributed systems, you start tackling many scale and performance problems: How to handle missing data?Should I handle both serving and backed process or separating them out? Best Performance for Money? In the talk we will tell the tale of all of the transformations we’ve made to our data model@Windward, some of the problems we’ve handled, review the multiple data persistency layers like: S3, MongoDB, Apache Cassandra, MySQL. And I’ll try my best NOT to answer the question “Which one of them is the Best?"
Splunk for JMX App overview (configuration, deployment, tips and tricks). Developing JMX logic in your application. Splunking other JVM logs and profiling traces. The JVM application landscape and why it's such a rich source of Splunkable machine data. Developing new Splunkbase apps to leverage Splunk for JMX.
Using Time Window Compaction Strategy For Time Series WorkloadsJeff Jirsa
Cassandra is a great fit for high write use cases, which makes it a popular choice for storing time series and sensor-collection workloads. At Crowdstrike, we've been using Cassandra for just that purpose, collecting petabytes of expiring time series data. In this talk, I'll discuss compaction in time series workloads, and the TimeWindowCompactionStrategy we developed specifically for this purpose. I'll detail TWCS specific configuration properties, some lesser known compaction sub-properties that apply to all compaction strategies, and also cover other general tricks and tuning that are useful for very large time-series workloads.
Azure Synapse Analytics is Azure SQL Data Warehouse evolved: a limitless analytics service, that brings together enterprise data warehousing and Big Data analytics into a single service. It gives you the freedom to query data on your terms, using either serverless on-demand or provisioned resources, at scale. Azure Synapse brings these two worlds together with a unified experience to ingest, prepare, manage, and serve data for immediate business intelligence and machine learning needs. This is a huge deck with lots of screenshots so you can see exactly how it works.
Serverless Kafka and Spark in a Multi-Cloud Lakehouse ArchitectureKai Wähner
Apache Kafka in conjunction with Apache Spark became the de facto standard for processing and analyzing data. Both frameworks are open, flexible, and scalable.
Unfortunately, the latter makes operations a challenge for many teams. Ideally, teams can use serverless SaaS offerings to focus on business logic. However, hybrid and multi-cloud scenarios require a cloud-native platform that provides automated and elastic tooling to reduce the operations burden.
This session explores different architectures to build serverless Apache Kafka and Apache Spark multi-cloud architectures across regions and continents.
We start from the analytics perspective of a data lake and explore its relation to a fully integrated data streaming layer with Kafka to build a modern data Data Lakehouse.
Real-world use cases show the joint value and explore the benefit of the "delta lake" integration.
Pinterest’s Story of Streaming Hundreds of Terabytes of Pins from MySQL to S3...confluent
(Henri Cai, Pinterest) Kafka Summit SF 2018
With the rise of large-scale real-time computation, there is a growing need to link legacy MySQL systems with real-time platforms. Pinterest has a hundred billion pins stored in MySQL at the scale of a 100TB and most of this data is needed for building data-driven products for machine learning and data analytics.
This talk discusses how Pinterest designed and built a continuous database (DB) ingestion system for moving MySQL data into near-real-time computation pipelines with only 15 minutes of latency to support our dynamic personalized recommendations and search indices. Pinterest helps people discover and do things that they love. We have billions of core objects (pins/boards/users) stored in MySQL at the scale of 100TB. All this data needs to be ingested onto S3/Hadoop for machine learning and data analytics. As Pinterest is moving towards real-time computation, we are facing a stringent service-level agreement requirement such as making the MySQL data available on S3/Hadoop within 15 minutes, and serving the DB data incrementally in stream processing. We designed WaterMill: a continuous DB ingestion system to listen for MySQL binlog changes, publish the MySQL changelogs as an Apache Kafka® change stream and ingest and compact the stream into Parquet columnar tables in S3/Hadoop within 15 minutes.
We would like to share how we solved the problem of:
-Scalable data partitioning, efficient compaction algorithm
-Stories on schema migration, rewind and recovery
-PII (personally identifiable information) processing
-Columnar storage for efficient incremental query
-How the DB change stream powers other use cases such as cache invalidation in multi-datacenter
-How we deal with the issue of S3 eventual consistency and rate limiting; related technologies: Apache Kafka, stream processing, MySQL binlog processing, Amazon S3, Hadoop and Parquet columnar storage
A 30 day plan to start ending your data struggle with SnowflakeSnowflake Computing
Organizations everywhere are struggling to load, integrate, analyze and collaborate with data. This is largely thanks to their antiquated data platform, designed in a time when few people had the desire or need to interact with the database. Snowflake, the data warehouse built for the cloud, can help.
How Hess Has Continued to Optimize the AWS Cloud After Migrating - ENT218 - r...Amazon Web Services
Hess Corporation is a leading global independent energy company engaged in exploration for and production of crude oil and natural gas. Early in Hess's journey to the cloud, they operated the AWS platform in a manner similar to how they operated their on-premises data centers, creating a number of challenges. In this session, Hess Corporation discusses how they worked to further optimize their use of the AWS Cloud following their data center migration. They also cover technical strategies implemented to improve security, governance, and financial reporting and examine changes to their corporate culture that encourage innovation while improving cost controls.
Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse service that makes it simple and cost-effective to efficiently analyze all your data using your existing business intelligence tools. You can start small for just $0.25 per hour with no commitment or upfront costs and scale to a petabyte or more for $1,000 per terabyte per year, less than a tenth of most other data warehousing solutions.
In this Masterclass presentation we will:
• Explore the architecture and fundamental characteristics of Amazon Redshift
• Show you how to launch Redshift clusters and to load data into them
• Explain out how to use the AWS Console to monitor and manage Redshift clusters
• Help you to discover best practices and other resources to help you get the most from Redshift
Watch the recording here: http://youtu.be/-FmCWcxRvXY
Azure Synapse Analytics is Azure SQL Data Warehouse evolved: a limitless analytics service, that brings together enterprise data warehousing and Big Data analytics into a single service. It gives you the freedom to query data on your terms, using either serverless on-demand or provisioned resources, at scale. Azure Synapse brings these two worlds together with a unified experience to ingest, prepare, manage, and serve data for immediate business intelligence and machine learning needs. This is a huge deck with lots of screenshots so you can see exactly how it works.
Building Event Streaming Architectures on Scylla and KafkaScyllaDB
Event streaming architectures require high-throughput, low-latency components to consistently and smoothly transfer data between heterogenous transactional and analytical systems. Join us and Confluent's Tim Berglund to learn how the Scylla and Confluent Kafka interoperate as a foundation upon which you can build enterprise-grade, event-driven applications, plus a use case from Numberly.
Apache Kafka is the de facto standard for data streaming to process data in motion. With its significant adoption growth across all industries, I get a very valid question every week: When NOT to use Apache Kafka? What limitations does the event streaming platform have? When does Kafka simply not provide the needed capabilities? How to qualify Kafka out as it is not the right tool for the job?
This session explores the DOs and DONTs. Separate sections explain when to use Kafka, when NOT to use Kafka, and when to MAYBE use Kafka.
No matter if you think about open source Apache Kafka, a cloud service like Confluent Cloud, or another technology using the Kafka protocol like Redpanda or Pulsar, check out this slide deck.
A detailed article about this topic:
https://www.kai-waehner.de/blog/2022/01/04/when-not-to-use-apache-kafka/
Databricks is a Software-as-a-Service-like experience (or Spark-as-a-service) that is a tool for curating and processing massive amounts of data and developing, training and deploying models on that data, and managing the whole workflow process throughout the project. It is for those who are comfortable with Apache Spark as it is 100% based on Spark and is extensible with support for Scala, Java, R, and Python alongside Spark SQL, GraphX, Streaming and Machine Learning Library (Mllib). It has built-in integration with many data sources, has a workflow scheduler, allows for real-time workspace collaboration, and has performance improvements over traditional Apache Spark.
Docker Inside/Out: the ‘real’ real-world of stacking containers in production...Codemotion
So you’ve already containerized the shit out of your code, broken down monoliths, microserviced the hell out of your app and have run some awesome workloads in your local, dev and test environments. It’s all looking good, but now what? Running Docker commands is one thing, but maintaining containers in production is a whole other ballgame. So, during this talk, I’ll show you the REAL wild world of Docker in production. With the added benefit of talking to and observing how over 900 of our customers have been using Docker in production.
S3, Cassandra or Outer Space? Dumping Time Series Data using Spark - Demi Be...Codemotion
Vast volume of our processed data is Time Series data and once you start working with distributed systems, you start tackling many scale and performance problems: How to handle missing data?Should I handle both serving and backed process or separating them out? Best Performance for Money? In the talk we will tell the tale of all of the transformations we’ve made to our data model@Windward, some of the problems we’ve handled, review the multiple data persistency layers like: S3, MongoDB, Apache Cassandra, MySQL. And I’ll try my best NOT to answer the question “Which one of them is the Best?"
Component-Based UI Architectures for the Web - Andrew Rota - Codemotion Rome...Codemotion
Today UI frameworks for the web are embracing the concept of “components”. But what does a component-focused architecture really mean? In this talk we’ll dive into the theory behind component-based UIs and what it means for the future of user interfaces on the web. At the conclusion of this talk, attendees will have an understanding of what makes component-based architectures distinct, and why such an approach might be the ideal solution for building web-based UIs.
Pronti per la legge sulla data protection GDPR? No Panic! - Domenico Maracci,...Codemotion
L’Application Economy obbliga l’IT a correre alla stessa velocità del business. Nel contempo l’entrata in vigore di nuove stringenti normative in ambito sicurezza impone l’adeguamento del Software Delivery LifeCycle affinché queste possano essere implementate e testate già dalle fasi iniziale dello sviluppo, ottimizzando i tempi di delivery e minimizzando il time to market.
Monitoring Big Data Systems Done "The Simple Way" - Demi Ben-Ari - Codemotion...Codemotion
Once you start working with Big Data systems, you discover a whole bunch of problems you won’t find in monolithic systems. Monitoring all of the components becomes a big data problem itself. In the talk, we’ll mention all of the aspects that you should take into consideration when monitoring a distributed system using tools like Web Services, Spark, Cassandra, MongoDB, AWS. Not only the tools, what should you monitor about the actual data that flows in the system? We’ll cover the simplest solution with your day to day open source tools, the surprising thing, that it comes not from an Ops Guy.
Full-Text Search Explained - Philipp Krenn - Codemotion Rome 2017Codemotion
Today’s applications are expected to provide powerful full-text search. But how does that work in general and how do I implement it on my site or in my application? Actually, this is not as hard as it sounds at first. This talk covers: * How full-text search works in general and what the differences to databases are. * How the score or quality of a search result is calculated. * How to implement this with Elasticsearch. Attendees will learn how to add common search patterns to their applications without breaking a sweat.
From a Developer's POV: is Machine Learning Reshaping the World? - Simone Sca...Codemotion
There is no denying that machine learning is rapidly reshaping the technological horizon, fueled by increasing availability of data, computing power, and software (e.g., TensorFlow). Classical ML techniques are becoming a common tool for the everyday programmer, at the same time that sophisticated deep learning models are fueling driverless cars, advanced AI players, and more. This talk will survey the ways in which ML is impacting the programming world, as we try to answer the following questions: are we truly witnessing a new AI resurgence? If yes, what should any developer be aware of?
Microservices in GO - Massimiliano Dessì - Codemotion Rome 2017Codemotion
In this talk we'll see how to write a cloud native microservice with Go language, the microservices will be: Cloud native A twelve factor app Scalable with the GO built in concurrency Monitored with a distributed tracing system to check the latency Testable with a load test during the development Communications with different protocols.
Comics and immersive storytelling in Virtual Reality - Fabio Corrirossi - Cod...Codemotion
Virtual Reality is an undoubtedly ideal storytelling platform, whichever the story. After starting with the very first VR comic in the world, "Magnetique", a GearVR exclusive, we'll focus on telling virtual reality stories without resorting to 360° videos. Drawing techniques, stereoscopic coding, sequential art tips and tricks. And more. Our times allow for a unique opportunity to tell old stories, anew.
Anche per te "Open Source" = "qualcuno ha già fatto il lavoro al posto mio, e per di più gratis"? Ottimo, allora sei nel posto giusto e con l'approccio giusto! In questo talk, attraverso tanti episodi di vita vissuta come utente, contributor e maintainer, discuteremo di come trarre una serie di altri vantaggi da questo magico mondo, di come approcciarsi alle community e, perché no, anche delle gioie e dei dolori che ti aspettano se decidi di saltare la staccionata e di rendere (veramente) open il tuo codice.
Il game audio come processo ingegneristico - Davide Pensato - Codemotion Rome...Codemotion
Mostrare come il game audio è a tutti gli effetti una professionalità che unisce ad aspetti artistico creativi, forti competenze tecnico informatiche. L'audio designer può a tutti gli effetti considerarsi un ingegnere del suono, che applica modelli, regole e metodi rigorosi per ottenere il risultato. Tutto questo all'interno del ciclo di produzione, integrandosi con grafici, designer e programmatori
Cyber Security in Multi Cloud Architecture - Luca Di Bari - Codemotion Rome 2017Codemotion
Nuovi modelli di sicurezza in ambienti multi-cloud. Ridefinizione del concetto di Front-End. Nuovi approcci alle tematiche di sicurezza in scenari magmatici.
The busy developer guide to Docker - Maurice de Beijer - Codemotion Rome 2017Codemotion
Docker is all the rage these days and you are told all the time you need to use Docker to host your applications. But what is Docker and why has it become such a hot topic? Why is Microsoft updating Windows 2016 so be a Docker container host? What does using Docker mean for your application architecture or can you just take any application and host it using Docker? In this session Maurice de Beijer will explain the history of Docker as well as explain how you could use it with your applications. He will also explain what else, besides Docker, you will need to add to your architecture.
Xamarin.Forms is a framework for building cross-platform applications that share most of the UI codebase among the UWP, iOS and Android platforms. Due to the higher level of abstraction compared to Xamarin.Native, Xamarin.Forms applications may suffer from memory leaks and slow rendering times at the expense of the final user experience. In the session, we will explore the mechanisms used by Xamarin.Forms to translate abstract UI components into native ones, highlight with demos what are the main bottlenecks met by developer, how to solve them and get close to native performances.
Barbarians at the Gate(way) - Dave Lewis - Codemotion Rome 2017Codemotion
This talk will examine the tools, methods and data behind the DDoS attacks that are prevalent in the news headlines. Using information collected, I will demonstrate what the attackers are using to cause their mischief and mayhem and examine the timeline and progression of attackers as they move from the historical page defacers to the motivated DDoS attacker. I will look at the motivations and rationale that they have and try to share some sort of understanding as to what patterns to be aware of for their own protection.
Commodore 64 Mon Amour(2): sprite multiplexing. Il caso Catalypse e altre sto...Codemotion
Continuiamo il viaggio iniziato lo scorso anno nel magico mondo della moderna programmazione del Commodore 64. La scena italiana e romana è molto attiva. Dopo una brevissima introduzione sugli sprite in generale, il mitico Andrea Pompili, autore dello sparattutto Catalypse pubblicato da Genias nel 1992, ci spiegherà la tecnica dello sprite multiplexing, utilizzata per superare il noto limite degli 8 sprite contemporanei a schermo, applicata al suo gioco.
Container orchestration: the cold war - Giulio De Donato - Codemotion Rome 2017Codemotion
L’ecosistema degli orchestratori di container è in rapido movimento, una galassia di piattaforme e framework. Come si fa a scegliere quello giusto per le vostre esigenze? Vediamo tutti gli orchestratori in commercio, con i loro pro e contro: DC/OS, Kubernetes, Docker e anche quelli meno famosi ma saranno promesse, e anche le dinamiche e le scelte fatte.
Web Based Virtual Reality - Tanay Pant - Codemotion Rome 2017Codemotion
There has been a surge in the development of virtual reality applications with the production of easily accessible and sophisticated VR devices such as Oculus Rift, HTC Vive and Samsung Gear. Frameworks like A-Frame developed by the MozVR team combined with cheap alternatives such as Google Cardboard allows the developers to leverage the power of the web. The attendees of this talk would learn about the WebVR API, using A-Frame to build virtual worlds, creating virtual worlds for modern content display (such as reddit posts, news feeds, Instagram photos) as well as game development.
Handle insane devices traffic using Google Cloud Platform - Andrea Ulisse - C...Codemotion
In a world of connected devices it is really important to be prepared receiving and managing a huge amount of messages. In this context what is making the real difference is the backend that has to be able to handle safely every request in real time. In this talk we will show how the broad spectrum of highly scalable services makes Google Cloud Platform the perfect habitat for such as workloads.
Apache Ignite is an integrated and distributed In-Memory Data Fabric for computing and transacting on large-scale data sets in real-time, orders of magnitude faster than possible with traditional disk-based or flash technologies. It is designed to easily power both existing and new applications in a distributed, massively parallel architecture on affordable, industry-standard hardware. Apache Ignite addresses today's Fast Data and Big Data needs by providing a comprehensive in-memory data fabric, which includes a data grid with SQL and transactional capabilities, in-memory streaming, an in-memory file system, and more.
From Data to Services at the Speed of BusinessAli Hodroj
From Data to Services at the Speed of Business: Applying cloud-native paradigm to combine fast data analytics with microservices architecture for hybrid workloads.
PayPal datalake journey | teradata - edge of next | san diego | 2017 october ...Deepak Chandramouli
PayPal Data Lake Journey | 2017-Oct | San Diego | Teradata Edge of Next
Gimel [http://www.gimel.io] is a Big Data Processing Library, open sourced by PayPal.
https://www.youtube.com/watch?v=52PdNno_9cU&t=3s
Gimel empowers analysts, scientists, data engineers alike to access a variety of Big Data / Traditional Data Stores - with just SQL or a single line of code (Unified Data API).
This is possible via the Catalog of Technical properties abstracted from users, along with a rich collection of Data Store Connectors available in Gimel Library.
A Catalog provider can be Hive or User Supplied (runtime) or UDC.
In addition, PayPal recently open sourced UDC [Unified Data Catalog], which can host and serve the Technical Metatada of the Data Stores & Objects. Visit http://www.unifieddatacatalog.io to experience first hand.
In memory computing principles by Mac Moore of GridGainData Con LA
In the presentation, we will provide an overview of general in-memory computing principles and the drivers behind it. We will start with a summary of the technical drivers (abundant hardware resources) and market forces (the rise of Big Data). We will cover popular and emerging use cases for in-memory computing, from financial industry trading platforms to mobile payment processing, online advertising, online/mobile gaming back-ends and more. We will then present some foundational concepts and terminology, and discuss considerations around any in-memory solution. From there, we will illustrate how a complete in-memory computing stack like GridGain combines clustering, high performance computing, in-memory data grids, stream processing and Hadoop acceleration into one unified and easy to use platform.
Apache Ignite: In-Memory Hammer for Your Data Science ToolkitDenis Magda
Machine learning is a method of data analysis that automates the building of analytical models. By using algorithms that iteratively learn from data, computers are able to find hidden insights without the help of explicit programming. These insights bring tremendous benefits into many different domains. For business users, in particular, these insights help organizations improve customer experience, become more competitive, and respond much faster to opportunities or threats.
The availability of very powerful in-memory computing platforms, such as Apache Ignite, means that more organizations can benefit from machine learning today. In this presentation, we will discuss how the Compute Grid, Data Grid, and Machine Learning Grid components of Apache Ignite work together to enable your business to start reaping the benefits of machine learning. Through examples, attendees will learn how Apache Ignite can be used for data analysis and be the in-memory hammer in your machine learning toolkit.
In this talk I’ll present the SharedRDD – a high-performance in-memory caching layer for Spark jobs. We’ll work through 1) design & architecture of this component, 2) configuration and 3) actual Java and Scala usage examples.
Simplifying Real-Time Architectures for IoT with Apache KuduCloudera, Inc.
3 Things to Learn About:
*Building scalable real time architectures for managing data from IoT
*Processing data in real time with components such as Kudu & Spark
*Customer case studies highlighting real-time IoT use cases
(ARC346) Scaling To 25 Billion Daily Requests Within 3 Months On AWSAmazon Web Services
What if you were told that within three months, you had to scale your existing platform from 1,000 req/sec (requests per second) to handle 300,000 req/sec with an average latency of 25 milliseconds? And that you had to accomplish this with a tight budget, expand globally, and keep the project confidential until officially announced by well-known global mobile device manufacturers? That’s what exactly happened to us. This session explains how The Weather Company partnered with AWS to scale our data distribution platform to prepare for unpredictable global demand. We cover the many challenges that we faced as we worked on architecture design, technology and tools selection, load testing, deployment and monitoring, and how we solved these challenges using AWS.
Apache Spark and Apache Ignite: Where Fast Data Meets the IoTDenis Magda
It is not enough to build a mesh of sensors or embedded devices to obtain more insights about the surrounding environment and optimize your production systems. Usually, your IoT solution needs to be capable of transferring enormous amounts of data to storage or the cloud where the data have to be processed further. Quite often, the processing of the endless streams of data has to be done in real-time so that you can react on the IoT subsystem's state accordingly.
This session will show attendees how to build a Fast Data solution that will receive endless streams from the IoT side and will be capable of processing the streams in real-time using Apache Ignite's cluster resources.
Solving enterprise challenges through scale out storage & big compute finalAvere Systems
Google Cloud Platform, Avere Systems, and Cycle Computing experts will share best practices for advancing solutions to big challenges faced by enterprises with growing compute and storage needs. In this “best practices” webinar, you’ll hear how these companies are working to improve results that drive businesses forward through scalability, performance, and ease of management.
The slides were from a webinar presented January 24, 2017. The audience learned:
- How enterprises are using Google Cloud Platform to gain compute and storage capacity on-demand
- Best practices for efficient use of cloud compute and storage resources
- Overcoming the need for file systems within a hybrid cloud environment
- Understand how to eliminate latency between cloud and data center architectures
- Learn how to best manage simulation, analytics, and big data workloads in dynamic environments
- Look at market dynamics drawing companies to new storage models over the next several years
Presenters communicated a foundation to build infrastructure to support ongoing demand growth.
From Mainframe to Microservices: Vanguard’s Move to the Cloud - ENT331 - re:I...Amazon Web Services
Maintaining control of sensitive data is critical in the highly regulated financial investments environment that Vanguard operates in. This need for data control complicated Vanguard's move to the cloud. They needed to expand globally to provide a great user experience while at the same time maintaining their mainframe-based backend data architecture. In this session, Vanguard discusses the creative approach they took to decouple their monolithic backend architecture to empower a microservices architecture while maintaining compliance with regulations. They also cover solutions implemented to successfully meet their requirements for security, latency, and end-state consistency.
If you're like most of the world, you're on an aggressive race to implement machine learning applications and on a path to get to deep learning. If you can give better service at a lower cost, you will be the winners in 2030. But infrastructure is a key challenge to getting there. What does the technology infrastructure look like over the next decade as you move from Petabytes to Exabytes? How are you budgeting for more colossal data growth over the next decade? How do your data scientists share data today and will it scale for 5-10 years? Do you have the appropriate security, governance, back-up and archiving processes in place? This session will address these issues and discuss strategies for customers as they ramp up their AI journey with a long term view.
Fuzz-testing: A hacker's approach to making your code more secure | Pascal Ze...Codemotion
Increased complexity makes it very hard and time-consuming to keep your software bug-free and secure. We introduce fuzz-testing as a method for automatically and continuously discovering vulnerabilities hidden in your code. The talk will explain how fuzzing works and how to integrate fuzz-testing into your Software Development Life Cycle to increase your code’s security.
Pompili - From hero to_zero: The FatalNoise neverending storyCodemotion
It was 1993 when we decided to venture in a beat'em up game for Amiga. The Catalypse's success story pushed me and my comrade to create something astonishing for this incredible game machine... but things went harder, assumptions were slightly different, and italian competitors appeared out of nowhere... the project died in 1996. Story ended? Probably not...
Il Commodore 65 è un prototipo di personal computer che Commodore avrebbe dovuto mettere in commercio quale successore del Commodore 64. Purtroppo la sua realizzazione si fermò appunto allo stadio prototipale. Racconterò l'affascinante storia del suo sviluppo ed il perchè della soppressione del progetto ormai ad un passo dalla immissione in commercio.
Rivivere l'ebbrezza di progettare un vecchio computer o una consolle da bar è oggi possibile sfruttando le FPGA, ovvero logiche programmabili che consentono a chiunque di progettare il proprio hardware o di ricrearne uno del passato. In questa sessione si racconta come dal reverse engineering dell'hardware di vecchie glorie come il Commodore 64 e lo ZX Spectrum sia stato possibile farle rivivere attraverso tecnologie oggi alla portata di tutti.
Michel Schudel - Let's build a blockchain... in 40 minutes! - Codemotion Amst...Codemotion
There's a lot of talk about blockchain, but how does the technology behind it actually work? For developers, getting some hands-on experience is the fastest way to get familiair with new technologies. So let's build a blockchain, then! In this session, we're going to build one in plain old Java, and have it working in 40 minutes. We'll cover key concepts of a blockchain: transactions, blocks, mining, proof-of-work, and reaching consensus in the blockchain network. After this session, you'll have a better understanding of core aspects of blockchain technology.
Richard Süselbeck - Building your own ride share app - Codemotion Amsterdam 2019Codemotion
When was the last time you were truly lost? Thanks to the maps and location technology in our phones, a whole generation has now grown up in a world where getting lost is truly a thing of the past. Location technology goes far beyond maps in the palm of our hand, however. In this talk, we will explore how a ridesharing app works. How do we discover our destination?How do we find the closest driver? How do we display this information on a map? How do we find the best route?To answer these questions,we will be learning about a variety of location APIs, including Maps, Positioning, Geocoding etc.
Eward Driehuis - What we learned from 20.000 attacks - Codemotion Amsterdam 2019Codemotion
Eward Driehuis, SecureLink's research chief, will guide you through the bumpy ride we call the cyber threat landscape. As the industry has over a decade of experience of dealing with increasingly sophisticated attacks, you might be surprised to hear more attacks slip through the cracks than ever. From analyzing 20.000 of them in 2018, backed by a quarter of a million security events and over ten trillion data points, Eward will outline why this happens, how attacks are changing, and why it doesn't matter how neatly or securely you code.
Francesco Baldassarri - Deliver Data at Scale - Codemotion Amsterdam 2019 - Codemotion
IoT revolution is ended. Thanks to hardware improvement, building an intelligent ecosystem is easier than never before for both startups and large-scale enterprises. The real challenge is now to connect, process, store and analyze data: in the cloud, but also, at the edge. We’ll give a quick look on frameworks that aggregate dispersed devices data into a single global optimized system allowing to improve operational efficiency, to predict maintenance, to track asset in real-time, to secure cloud-connected devices and much more.
Martin Förtsch, Thomas Endres - Stereoscopic Style Transfer AI - Codemotion A...Codemotion
What if Virtual Reality glasses could transform your environment into a three-dimensional work of art in realtime in the style of a painting from Van Gogh? One of the many interesting developments in the field of Deep Learning is the so called "Style Transfer". It describes a possibility to create a patchwork (or pastiche) from two images. While one of these images defines the the artistic style of the result picture, the other one is used for extracting the image content. A team from TNG Technology Consulting managed to build an AI showcase using OpenCV and Tensorflow to realize such goggles.
Melanie Rieback, Klaus Kursawe - Blockchain Security: Melting the "Silver Bul...Codemotion
Blockchain (and Cryptocurrency) is an evolution of 20-year old research from scientists like Chaum, Lamport, and Castro & Liskov. Due to the current hype, it's hard to distinguish beneficial aspects of the technology from a desire for a "silver bullet" for device security, verifiable logistics, or "saving democracy". The problem: blockchain introduces new security challenges - and blind adoption without understanding reduces overall security. In this talk, Melanie Rieback and Klaus Kursawe explain the pitfalls and limits of blockchain, so you can avoid making your applications LESS secure.
Angelo van der Sijpt - How well do you know your network stack? - Codemotion ...Codemotion
Networking is a core part of computing in the digital world we inhabit. But, how well do you know how it works? Do you understand all the moving parts of the OSI stack inside your computer, and how the network is actually put together? How can this ever work? This guided safari of layers, standards, protocols, and happenstance will bring us close to the copper wire, and up through the layers of CDMA/CD, ARP, routing and HTTP. We will make a few excursions through patchworks that still work forty years later, and cleverly designed mechanisms that show that simplicity is the only way to last.
Lars Wolff - Performance Testing for DevOps in the Cloud - Codemotion Amsterd...Codemotion
Performance tests are not only an important instrument for understanding a system and its runtime environment. It is also essential in order to check stability and scalability – non-functional requirements that might be decisive for success. But won't my cloud hosting service scale for me as long as I can afford it? Yes, but… It only operates and scales resources. It won't automatically make your system fast, stable and scalable. This talk shows how such and comparable questions can be clarified with performance tests and how DevOps teams benefit from regular test practise.
Sascha Wolter - Conversational AI Demystified - Codemotion Amsterdam 2019Codemotion
Sascha will demonstrate the opportunities and challenges of Conversational AI learned from the practice. Both Technology and User Experience will be covered introducing a process finding micro-moments, writing happy paths, gathering intents, designing the conversational flow, and finally publishing on almost all channels including Voice Services and Chatbots. Valuable for enterprises, developers, and designers. All live on stage in just minutes and with almost no code.
Michele Tonutti - Scaling is caring - Codemotion Amsterdam 2019Codemotion
A key challenge we face at Pacmed is quickly calibrating and deploying our tools for clinical decision support in different hospitals, where data formats may vary greatly. Using Intensive Care Units as a case study, I’ll delve into our scalable Python pipeline, which leverages Pandas’ split-apply-combine approach to perform complex feature engineering and automatic quality checks on large time-varying data, e.g. vital signs. I’ll show how we use the resulting flexible and interpretable dataframes to quickly (re)train our models to predict mortality, discharge, and medical complications.
Pat Hermens - From 100 to 1,000+ deployments a day - Codemotion Amsterdam 2019Codemotion
Coolblue is a proud Dutch company, with a large internal development department; one that truly takes CI/CD to heart. Empowerment through automation is at the heart of these development teams, and with more than 1000 deployments a day, we think it's working out quite well. In this session, Pat Hermens (a Development Managers) will step you through what enables us to move so quickly, which tools we use, and most importantly, the mindset that is required to enable development teams to deliver at such a rapid pace.
James Birnie - Using Many Worlds of Compute Power with Quantum - Codemotion A...Codemotion
Quantum computers can use all of the possible pathways generated by quantum decisions to solve problems that will forever remain intractable to classical compute power. As the mega players vie for quantum supremacy and Rigetti announces its $1M "quantum advantage" prize, we live in exciting times. IBM-Q and Microsoft Q# are two ways you can learn to program quantum computers so that you're ready when the quantum revolution comes. I'll demonstrate some quantum solutions to problems that will forever be out of reach of classical, including organic chemistry and large number factorisation.
Don Goodman-Wilson - Chinese food, motor scooters, and open source developmen...Codemotion
Chinese food exploded across America in the early 20th century, rapidly adapting to local tastes while also spreading like wildfire. How was it able to spread so fast? The GY6 is a family of scooter engines that has achieved near total ubiquity in Europe. It is reliable and cheap to manufacture, and it's made in factories across China. How are these factories able to remain afloat? Chinese-American food and the GY6 are both riveting studies in product-market fit, and both are the product of a distributed open source-like development model. What lessons can we learn for open source software?
Pieter Omvlee - The story behind Sketch - Codemotion Amsterdam 2019Codemotion
The design space has exploded in size within the last few years and Sketch is one of the most important milestones to represent the phenomenon. But behind the scenes of this growing reality there is a remote team that revolutionizes the design space all without leaving the home office. This talk will present how Sketch has grown to become a modern, product designer's tool.
Dave Farley - Taking Back “Software Engineering” - Codemotion Amsterdam 2019Codemotion
Would you fly in a plane designed by a craftsman or would you prefer your aircraft to be designed by engineers? We are learning that science and empiricism works in software development, maybe now is the time to redefine what “Software Engineering” really means. Software isn't bridge-building, it is not car or aircraft development either, but then neither is Chemical Engineering. Engineering is different in different disciplines. Maybe it is time for us to begin thinking about retrieving the term "Software Engineering" maybe it is time to define what our "Engineering" discipline should be.
Joshua Hoffman - Should the CTO be Coding? - Codemotion Amsterdam 2019Codemotion
What is the job of a CTO and how does it change as a startup grows in size and scale? As a CTO, where should you spend your focus? As an engineer aspiring to be a CTO, what skills should you pursue? In this inspiring and personal talk, I describe my journey from early Red Hat engineer to CTO at Bloomon. I will share my view on what it means to be a CTO, and ultimately answer the question: Should the CTO be coding?
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualityInflectra
In this insightful webinar, Inflectra explores how artificial intelligence (AI) is transforming software development and testing. Discover how AI-powered tools are revolutionizing every stage of the software development lifecycle (SDLC), from design and prototyping to testing, deployment, and monitoring.
Learn about:
• The Future of Testing: How AI is shifting testing towards verification, analysis, and higher-level skills, while reducing repetitive tasks.
• Test Automation: How AI-powered test case generation, optimization, and self-healing tests are making testing more efficient and effective.
• Visual Testing: Explore the emerging capabilities of AI in visual testing and how it's set to revolutionize UI verification.
• Inflectra's AI Solutions: See demonstrations of Inflectra's cutting-edge AI tools like the ChatGPT plugin and Azure Open AI platform, designed to streamline your testing process.
Whether you're a developer, tester, or QA professional, this webinar will give you valuable insights into how AI is shaping the future of software delivery.
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
UiPath Test Automation using UiPath Test Suite series, part 3DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 3. In this session, we will cover desktop automation along with UI automation.
Topics covered:
UI automation Introduction,
UI automation Sample
Desktop automation flow
Pradeep Chinnala, Senior Consultant Automation Developer @WonderBotz and UiPath MVP
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
Essentials of Automations: Optimizing FME Workflows with ParametersSafe Software
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Jeffrey Haguewood
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on the notifications, alerts, and approval requests using Slack for Bonterra Impact Management. The solutions covered in this webinar can also be deployed for Microsoft Teams.
Interested in deploying notification automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
Connector Corner: Automate dynamic content and events by pushing a buttonDianaGray10
Here is something new! In our next Connector Corner webinar, we will demonstrate how you can use a single workflow to:
Create a campaign using Mailchimp with merge tags/fields
Send an interactive Slack channel message (using buttons)
Have the message received by managers and peers along with a test email for review
But there’s more:
In a second workflow supporting the same use case, you’ll see:
Your campaign sent to target colleagues for approval
If the “Approve” button is clicked, a Jira/Zendesk ticket is created for the marketing design team
But—if the “Reject” button is pushed, colleagues will be alerted via Slack message
Join us to learn more about this new, human-in-the-loop capability, brought to you by Integration Service connectors.
And...
Speakers:
Akshay Agnihotri, Product Manager
Charlie Greenberg, Host
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Ramesh Iyer
In today's fast-changing business world, Companies that adapt and embrace new ideas often need help to keep up with the competition. However, fostering a culture of innovation takes much work. It takes vision, leadership and willingness to take risks in the right proportion. Sachin Dev Duggal, co-founder of Builder.ai, has perfected the art of this balance, creating a company culture where creativity and growth are nurtured at each stage.
Search and Society: Reimagining Information Access for Radical FuturesBhaskar Mitra
The field of Information retrieval (IR) is currently undergoing a transformative shift, at least partly due to the emerging applications of generative AI to information access. In this talk, we will deliberate on the sociotechnical implications of generative AI for information access. We will argue that there is both a critical necessity and an exciting opportunity for the IR community to re-center our research agendas on societal needs while dismantling the artificial separation between the work on fairness, accountability, transparency, and ethics in IR and the rest of IR research. Instead of adopting a reactionary strategy of trying to mitigate potential social harms from emerging technologies, the community should aim to proactively set the research agenda for the kinds of systems we should build inspired by diverse explicitly stated sociotechnical imaginaries. The sociotechnical imaginaries that underpin the design and development of information access technologies needs to be explicitly articulated, and we need to develop theories of change in context of these diverse perspectives. Our guiding future imaginaries must be informed by other academic fields, such as democratic theory and critical theory, and should be co-developed with social science scholars, legal scholars, civil rights and social justice activists, and artists, among others.
Collection of logical in-memory components that solve high performance and scalability challenges
Collection of logical in-memory components that solve high performance and scalability challenges
Collection of logical in-memory components that solve high performance and scalability challenges
application level soft-locking using versioning
deadlock protection
When it comes to querying and acting on data — including in Big Data/Fast Data environments — SQL still dominates. And no other database-agnostic in-memory solution handles SQL functionality like the Apache Ignite In-Memory Data Fabric.
application level soft-locking using versioning
deadlock protection (Serializable)
Apache Ignite allows for most of the data structures from the java.util.concurrent framework to be used in a distributed fashion
Collection of logical in-memory components that solve high performance and scalability challenges
ComputeTask with service injection
Direct API for Fork/Join
Collection of logical in-memory components that solve high performance and scalability challenges
Direct API for Fork/Join
Affinity run of spark jobs
- supports multiple infrastructure including Google Cloud, AWS & Docker