This document discusses how caching can help address performance, scalability, and autonomy challenges for microservices architectures. It introduces Pivotal Cloud Cache (PCC) as a caching solution for microservices on Pivotal Cloud Foundry. PCC provides an in-memory cache that can scale horizontally and increase performance. It also allows for data autonomy between microservices and teams while providing high availability. PCC offers an easy and cost-effective way to cache data and adopt microservices on Pivotal Cloud Foundry.
4. Ability to handle a large number of concurrent requests
Performance Drivers for Modern Applications
▪ More users of new mobile and web applications
▪ Users expect real-time response, even during peak usage
▪ Increasing number of requests from other applications
▪ New use cases from new data sources, ex: IoT and streaming data
▪ Scaling the application logic results in the need for scaling the data access layer
5. Caches Provide Blazing Fast Performance
▪ Memory is orders of magnitude faster than disk
▪ Caches can present a structural view of data optimized for performance
▪ Maximizing cache hits
- Preloading cache (cache warming)
- Expiration and eviction
• Application driven
• Time based
• Notifications and events
(Diagram: Microservice Instance → Cache → Database)
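As a rough illustration of time-based expiration in a look-aside cache (this is not PCC's actual implementation; the class and field names are hypothetical), each entry can carry a load timestamp that is checked on every read, with a separate evict call for application-driven invalidation:

import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Minimal look-aside cache with time-based expiration (illustrative only).
public class TtlLookAsideCache<K, V> {
    private record Entry<V>(V value, Instant loadedAt) {}

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final Duration ttl;
    private final Function<K, V> backingStoreLoader; // e.g. a database query

    public TtlLookAsideCache(Duration ttl, Function<K, V> backingStoreLoader) {
        this.ttl = ttl;
        this.backingStoreLoader = backingStoreLoader;
    }

    public V get(K key) {
        Entry<V> entry = entries.get(key);
        // Cache miss or expired entry: reload from the backing store (look-aside).
        if (entry == null || entry.loadedAt().plus(ttl).isBefore(Instant.now())) {
            V value = backingStoreLoader.apply(key);
            entries.put(key, new Entry<>(value, Instant.now()));
            return value;
        }
        return entry.value();
    }

    // Application-driven eviction, e.g. after a write to the backing store.
    public void evict(K key) {
        entries.remove(key);
    }
}

Cache warming, in this picture, would simply mean calling get (or a bulk loader) for the hottest keys at startup so the first real requests already hit the cache.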
6. Externalizing state is a requirement for microservice instances to scale
Microservices Need Performance and Scalability
▪ Externalize microservices state for performance and scalability of the business logic
- Store application state information in cache for fast retrieval
- Adheres to 12-factor principles
▪ Dynamically change the number of application instances without losing state information
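A minimal sketch of what externalizing state can look like in code: instead of keeping conversational state in instance fields, the service reads and writes it through a cache client, so any instance can handle the next request. The CartCache interface and its methods are hypothetical stand-ins for whichever caching client is actually used.

import java.util.ArrayList;
import java.util.List;

// Hypothetical cache-backed store for shopping cart state (illustrative only).
interface CartCache {
    List<String> getItems(String cartId);
    void putItems(String cartId, List<String> items);
}

// Stateless service: all per-user state lives in the external cache,
// so instances can be added or removed without losing information.
public class CartService {
    private final CartCache cache;

    public CartService(CartCache cache) {
        this.cache = cache;
    }

    public List<String> addItem(String cartId, String item) {
        List<String> items = new ArrayList<>(cache.getItems(cartId));
        items.add(item);
        cache.putItems(cartId, items);
        return items;
    }
}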
7. Microservices with large, frequently accessed data sets need a cache layer
Microservices Need Performance and Scalability
Performance and scalability of data
▪ Add servers to a shared cluster
▪ Reduces the pressure to scale rigid backing stores
▪ Enables availability and resilience
9. Fosters an agile, dynamic application culture
Team Autonomy Equates to Velocity
▪ Separate development and release cycles
- Evolve each microservice independently
- Independent development, test, production cycles
- Continuous integration, continuous delivery
▪ Independent technology decisions, including data layer
- Polyglot persistence
- Independent data model decisions
▪ Changes should be non-breaking for other teams and microservices
10. Extreme ends of data sharing continuum present challenges
Autonomy in the Context of the Data Layer
(Diagram: a continuum of data sharing)
▪ Shared Database: no autonomy; development and runtime coupling
▪ Database Per Service: autonomous; distributed workload and data management challenges
11. Data APIs Present Autonomous Views of Data
● Define a Data API that projects a data model to match the needs of the consuming microservices
● Data API is the access point to a microservice that’s primarily responsible for accessing data
● Data API provides a contract for accessing data
● Teams create data caches optimized for their microservices
● Allows more flexibility for (and isolation from) changes to backing stores
Caches provide data for each autonomous view
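One way a Data API can look in practice, sketched here with Spring MVC; the endpoint, projection record, and cache interface are invented for illustration. The controller exposes a data shape tailored to its consumers and serves it from the team-owned cache rather than from the shared backing store.

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;

// Hypothetical projection of customer data shaped for the consuming microservices.
record CustomerSummary(String customerId, String displayName, String tier) {}

// Hypothetical cache abstraction owned by the team.
interface CustomerSummaryCache {
    CustomerSummary findSummary(String customerId);
}

@RestController
public class CustomerDataApi {

    private final CustomerSummaryCache cache;

    CustomerDataApi(CustomerSummaryCache cache) {
        this.cache = cache;
    }

    // The Data API is the contract; consumers never reach the backing store directly.
    @GetMapping("/customers/{id}/summary")
    CustomerSummary summary(@PathVariable("id") String id) {
        return cache.findSummary(id);
    }
}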
12. Data APIs evolve in support of the evolution of the microservice(s)
Versioned APIs Facilitate Change Management
▪ Analogous to the notion of versioned microservices
▪ Parallel deployment of versions creates the possibility of a managed evolution
▪ Allows for data transformations within the microservice as an alternative to changing the backing store(s)
(Diagram: V1 and V2 deployed side by side)
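A sketch of how parallel API versions might be exposed so consumers can migrate at their own pace; the paths, record types, and transformation are hypothetical. The v2 handler reshapes the data inside the microservice instead of forcing a change to the backing store.

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class OrderDataApi {

    // v1: original shape, kept alive for existing consumers.
    @GetMapping("/v1/orders/{id}")
    OrderV1 getOrderV1(@PathVariable("id") String id) {
        return lookupV1(id);
    }

    // v2: new shape, produced by transforming the data inside the microservice
    // rather than changing the backing store.
    @GetMapping("/v2/orders/{id}")
    OrderV2 getOrderV2(@PathVariable("id") String id) {
        OrderV1 legacy = lookupV1(id);
        return new OrderV2(legacy.id(), legacy.amountCents() / 100.0, "USD");
    }

    private OrderV1 lookupV1(String id) {
        return new OrderV1(id, 1999); // placeholder for a cache lookup
    }

    record OrderV1(String id, long amountCents) {}
    record OrderV2(String id, double amount, String currency) {}
}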
13. Caching Can Present an Autonomous View of Data
● Provides a surface area to:
○ implement access control
○ implement throttling
○ perform logging
○ enforce other policies
Teams create data caches optimized for their microservices
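The surface area described above can enforce these policies before a request ever reaches the cache. The sketch below uses Spring's HandlerInterceptor; the rate limiter and the logging details are hypothetical placeholders for whatever policy components a team actually uses.

import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;
import org.springframework.web.servlet.HandlerInterceptor;

// Illustrative interceptor enforcing logging and throttling at the Data API surface.
public class DataApiPolicyInterceptor implements HandlerInterceptor {

    private final RateLimiter rateLimiter; // hypothetical throttling component

    public DataApiPolicyInterceptor(RateLimiter rateLimiter) {
        this.rateLimiter = rateLimiter;
    }

    @Override
    public boolean preHandle(HttpServletRequest request, HttpServletResponse response, Object handler) {
        // Logging: record who is asking for what.
        System.out.printf("data-api access: %s %s%n", request.getMethod(), request.getRequestURI());

        // Throttling: reject callers that exceed their quota.
        if (!rateLimiter.tryAcquire(request.getRemoteAddr())) {
            response.setStatus(429);
            return false;
        }
        return true;
    }

    interface RateLimiter {
        boolean tryAcquire(String caller);
    }
}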
14. Caching Can Present an Autonomous View of Data
● Data APIs project a bounded context
○ Each bounded context has a single, unified model
○ Relationships between models are explicitly defined
○ Teams are typically given full responsibility over one or more bounded contexts
16. Several points of failure
Large Number of ‘Moving Parts’
▪ Single request can touch several components: servers, distinct clusters, microservice instances
▪ Availability zones can fail
▪ Regions can become unstable
▪ Network is unreliable
▪ Cloud native architecture components are ephemeral by design
- Instances added and removed dynamically
Enterprise Readiness Requires the Ability to Tolerate Failures
17. Highly Available Caching Layer Offers Protection
● Cache serves as the ‘primary’ data store for the application
● High availability: copy data for failure protection
● Immune to lapses in backing store availability
● Backing stores kept up-to-date through the cache
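One common way to keep the backing store up to date through the cache is a write-through: the application writes to the caching layer, which synchronously persists to the backing store before acknowledging. This is a minimal conceptual sketch with hypothetical interfaces, not PCC's actual mechanism.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative write-through wrapper: every write goes through the cache and is
// synchronously propagated to the backing store, so the store never lags behind.
public class WriteThroughCache<K, V> {

    public interface BackingStore<K, V> { // hypothetical persistence interface
        void save(K key, V value);
        V load(K key);
    }

    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final BackingStore<K, V> store;

    public WriteThroughCache(BackingStore<K, V> store) {
        this.store = store;
    }

    public void put(K key, V value) {
        store.save(key, value); // write-through: persist first...
        cache.put(key, value);  // ...then update the in-memory copy
    }

    public V get(K key) {
        // Reads are served from memory; fall back to the store on a miss.
        return cache.computeIfAbsent(key, store::load);
    }
}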
19. Expensive and Brittle
Legacy Application Infrastructures
▪ High startup costs
▪ Steep pricing curve for adding capacity
- Mainframe MIPS pricing
- Legacy RDBMS data stores are expensive to scale
▪ Complex deployments
▪ Easily disrupted
▪ Points of failure
▪ Scalability bottlenecks
20. Legacy Modernization is key to success
(Diagram: existing workloads are replatformed, modernized, or migrated to run on PCF; the resulting ROI ($$$$) funds further transformation; new initiatives are built for PCF as cloud native)
21. Legacy Systems: Part of a Cloud Native Evolution
Create microservices around the edges of the legacy system
● Caching layer mediates between the old and the new
● Optionally, re-platform the legacy application
● Optionally, reduce reliance on the legacy application over time
(Diagram: microservices in front of legacy middleware, a legacy application, and a monolithic application)
23. Prepackaged for Simple Consumption
● Plans (use cases) based on caching patterns
● Look-aside pattern supported out of the box
○ Cache is controlled and managed by the application
○ Good for saving application state, microservices architectures, reducing load on legacy systems, etc.
○ Perfect match for the Spring Framework @Cacheable annotation (see the sketch below)
● Other caching patterns and options to come (WAN replication, Session State Caching, Inline caching pattern, etc.)
Look-Aside Cache: look-aside pattern supported out of the box
(Diagram: App Instance → Cache → Database)
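The @Cacheable fit mentioned above can be sketched as follows; the cache name, repository, and entity are placeholders, and a real Spring Boot application would also need a CacheManager bound to the cache service. With look-aside semantics, Spring checks the cache first and only invokes the method body (the repository call) on a miss.

import org.springframework.cache.annotation.CacheEvict;
import org.springframework.cache.annotation.Cacheable;
import org.springframework.stereotype.Service;

@Service
public class ProductService {

    private final ProductRepository repository; // hypothetical data access layer

    public ProductService(ProductRepository repository) {
        this.repository = repository;
    }

    // Look-aside read: return the cached value if present,
    // otherwise call the repository and cache the result.
    @Cacheable(cacheNames = "products", key = "#id")
    public Product findById(String id) {
        return repository.findById(id);
    }

    // Application-driven eviction after a write, so the next read reloads fresh data.
    @CacheEvict(cacheNames = "products", key = "#id")
    public void update(String id, Product product) {
        repository.save(id, product);
    }

    public record Product(String id, String name) {}

    public interface ProductRepository { // hypothetical
        Product findById(String id);
        void save(String id, Product product);
    }
}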
24. Pivotal Cloud Cache
Extending the Pivotal Cloud Foundry Platform for Microservices Architectures
• Easy accessibility through Marketplace
• Instant Provisioning
• Bind to apps through easy to use interface
• Lifecycle management
• Common access control and audit trails across services
(Diagram: Services Marketplace tiles, including Pivotal Cloud Cache alongside MySQL, New Relic, Single Sign-On, RabbitMQ, Config Server, Service Directory, Circuit Breaker, Signal Sciences, Crunchy PostgreSQL, Dynatrace, and more)
25. Easily Provisioned for Developer Self Service
Operators create and register service plans with the Services Marketplace
(Diagram: from the OpsMan tile, operators create service plans (defining VMs, memory, CPU, and disk size), set quotas such as max cluster size and max number of clusters, deploy the PCC broker, and register it with the Marketplace)
26. In-memory, and horizontally scalable for parallel execution
Pivotal Cloud Cache Performance
Grow cluster dynamically with no interruption of service or data loss
Data is sharded or replicated across servers
27. This is how an in-memory cache can horizontally scale
Partitioning (aka Sharding)
Take advantage of the memory and network bandwidth of all members of the cluster
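As a conceptual sketch of how keys can be spread across cluster members by hashing (PCC/GemFire's actual bucket-based partitioning is more sophisticated, and the names below are invented):

import java.util.List;

// Illustrative hash-based partitioning: each key is routed to one server,
// so the memory and network bandwidth of all members are used in parallel.
public class KeyPartitioner {

    private final List<String> servers; // e.g. ["cache-0", "cache-1", "cache-2"]

    public KeyPartitioner(List<String> servers) {
        this.servers = servers;
    }

    public String serverFor(Object key) {
        // Math.floorMod keeps the index non-negative even for negative hash codes.
        int partition = Math.floorMod(key.hashCode(), servers.size());
        return servers.get(partition);
    }
}

With three servers, a given key always maps to the same member until the cluster is resized; production systems use bucket or consistent hashing schemes precisely to limit data movement when members are added or removed.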
29. High Availability
Spanning Servers and Availability Zones
Stretched cluster across availability zones
Replication for high availability of data in cache
Pivotal Cloud Foundry resurrects lost VMs
31. Integrated Security
Pre-configured Authentication and Authorization
● Role-based, configurable authorization for administrative activities
● Pre-defined, pre-configured roles
● Consistent mechanism for authenticating and authorizing actions
● Every administrative function can require authorization
● Every data access can require authorization
● Some users can read/write data
● Others can start/stop servers
● Still others can configure the cluster
(Diagram: users are assigned to groups, which determine their experience; groups map to roles, and roles determine permissions)
32. Summary
Speed up your apps on Pivotal Cloud Foundry
● PCC can overcome the performance, elasticity, and scaling challenges of microservices architectures
● Using PCC with data APIs can increase autonomy between teams
● PCC has rock solid availability and failure recovery
● PCC provides an evolutionary approach to adopting microservices that can extend the life of legacy systems
● The combination of PCC and PCF makes it possible to get started quickly and easily adjust cache capacity as needed