This document summarizes the presentation "Caching 101". It covers caching theory, patterns, and scaling approaches: caching on the JVM with JSR 107; the cache-aside, cache-through, and write-behind strategies; and scaling a cache by moving data off-heap or clustering it across multiple machines, with the consistency trade-offs that follow (strong consistency, eventual consistency, and probabilistically bounded staleness).
1. @alexsnaps @ljacomet #Devoxx #cache101
Caching 101
Caching on the JVM
and beyond
Alex Snaps - Principal Software Engineer at Terracotta
Louis Jacomet - Lead Software Engineer at Terracotta
3. Introductory Poll
• Who knows nothing about caching?
• Who already uses caching in production?
• Who has had caching-related problems in production?
• Who is interested in advanced caching patterns?
7. Cache coherence
• P writes A to X, then P reads X => the read returns A if no other write happened in between (this must hold for a single processor too)
• P2 writes A to X, then P1 reads X => the read returns A if “enough time” has elapsed
• P1 writes A to X, then P2 writes B to X => no one can ever see B then A at X
9. Numbers explained
Operation                        Scaled time   Comparable to
L1 cache reference               1 s           2 heartbeats
Branch mispredict                3 s           Yawn
L2 cache reference               4 s           (Longer) Yawn
Mutex lock / unlock              17 s          Coffee preparation
Main memory reference            100 s         Brushing your teeth
Send 2k over commodity network   4.2 s         A song
10. Numbers explained (continued)
Operation                        Scaled time   Comparable to
Compress 1kB with zippy          33 m          Sitcom episode
Read 1MB from memory             2 h           Bad commute
SSD random read                  4 h           Half day at work
Read 1MB from SSD                2.6 d         Long week-end
Round trip in datacenter         10 d          Two weeks delivery
Packet roundtrip CA to Be        4.8 y         PhD thesis
14. Further reading
• Gustafson’s law states that computations involving arbitrarily large data sets can be efficiently parallelized.
• As computing power grows, larger problems can be solved in a given time.
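For reference, Gustafson's law is commonly written as the scaled speedup below. The symbols are introduced here and do not appear on the slide: N is the number of processors and s is the fraction of the run time spent in the serial part of the workload.

```latex
S(N) = s + (1 - s)\,N = N - s\,(N - 1)
```

For example, with s = 0.1 and N = 10 the scaled speedup is 0.1 + 0.9 × 10 = 9.1: the serial part stays fixed while the parallel part of the problem grows with the machine.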
16. What is a cache in an application?
• Data structure holding a temporary copy of some data
• Trades higher memory usage for reduced latency
• Targets:
  • Data which is reused
  • Data which is expensive to compute or retrieve
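The definition above can be illustrated with a minimal sketch, assuming a size-bounded map in front of a slow lookup. The class name `BoundedCacheSketch`, the capacity of 2, and the `slowSource` function are hypothetical choices made for this illustration, not something taken from the talk.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration: a small, size-bounded cache in front of a slow lookup.
class BoundedCacheSketch {

    // A LinkedHashMap in access order drops the least recently used entry
    // once the capacity is exceeded, which caps the extra memory used.
    static <K, V> Map<K, V> lruMap(int capacity) {
        return new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity;
            }
        };
    }

    private final Map<String, String> copies = lruMap(2); // assumed capacity
    private final Function<String, String> slowSource;    // stand-in for the expensive call
    int sourceCalls = 0;                                   // counts calls to the slow source

    BoundedCacheSketch(Function<String, String> slowSource) {
        this.slowSource = slowSource;
    }

    // Not thread-safe: this sketch only shows the temporary-copy idea.
    String get(String key) {
        String value = copies.get(key);
        if (value == null) {
            sourceCalls++;
            value = slowSource.apply(key); // pay the latency once
            copies.put(key, value);        // keep a temporary copy for reuse
        }
        return value;
    }

    public static void main(String[] args) {
        BoundedCacheSketch cache = new BoundedCacheSketch(k -> k.toUpperCase());
        cache.get("a");
        cache.get("a"); // served from the cache, no call to the source
        System.out.println("source calls: " + cache.sourceCalls);
    }
}
```

Reused keys are served from memory, while the capacity bound limits how much memory is traded for that latency gain.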
19. JSR 107
• Java Community Process driven standard
• Specifies API and semantics for temporary, in-memory caching of Java objects, including object creation, shared access, spooling, invalidation, and consistency across JVMs
26. Cache aside conclusions
• The solution seen most often out there: Spring, Play, Grails, …
• Most often based on annotations
• Tricky to get the concurrency and / or atomicity right
  • Especially when rolling your own
• Does not prevent multiple real invocations for the same key while the cache is warming
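The concurrency point above can be sketched as follows, assuming a hand-rolled cache-aside around a hypothetical `loadFromSystemOfRecord` call. The class and method names are illustrative only. The naive variant shows the check-then-act gap; the second variant relies on `ConcurrentHashMap.computeIfAbsent`, which applies the loader at most once per key.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration of hand-rolled cache-aside.
class CacheAsideSketch {
    private final ConcurrentMap<Long, String> cache = new ConcurrentHashMap<>();
    final AtomicInteger realInvocations = new AtomicInteger();

    // Stand-in for the expensive call (database, remote service, ...).
    private String loadFromSystemOfRecord(Long id) {
        realInvocations.incrementAndGet();
        return "user-" + id;
    }

    // Naive cache-aside: check, load, put. Two threads that miss at the
    // same time will both perform the real invocation.
    String findNaive(Long id) {
        String cached = cache.get(id);
        if (cached != null) {
            return cached;
        }
        String loaded = loadFromSystemOfRecord(id);
        cache.put(id, loaded);
        return loaded;
    }

    // Atomic variant: the loader runs at most once per key, and
    // concurrent callers for that key wait for its result.
    String findAtomic(Long id) {
        return cache.computeIfAbsent(id, this::loadFromSystemOfRecord);
    }

    // With cache-aside the application must invalidate explicitly
    // after changing the system of record.
    void update(Long id) {
        // ... the write to the system of record would happen here ...
        cache.remove(id);
    }

    public static void main(String[] args) {
        CacheAsideSketch users = new CacheAsideSketch();
        users.findAtomic(1L);
        users.findAtomic(1L);
        System.out.println("real invocations: " + users.realInvocations.get());
    }
}
```

The atomic variant removes duplicate loads inside one JVM only; it does nothing for other processes that share the same system of record.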
33. Cache through conclusions
• Requires different abstraction / modelling
• Viewing the system of record through the cache may not be easy
• Provides better guarantees and consistency as invalidation is no longer required
• Still falls apart if data is modified by other applications / processes
• Cost of writing is paid by the thread putting into the cache
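A minimal sketch of the cache-through idea, assuming the cache is handed a loader and a writer for the system of record. `CacheThroughSketch`, `loader`, and `writer` are names invented for this illustration; real cache-through products expose their own loader / writer contracts.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiConsumer;
import java.util.function.Function;

// Hypothetical illustration: callers only talk to the cache, and the
// cache talks to the system of record on their behalf.
class CacheThroughSketch<K, V> {
    private final Map<K, V> store = new ConcurrentHashMap<>();
    private final Function<K, V> loader;   // reads the system of record
    private final BiConsumer<K, V> writer; // writes the system of record

    CacheThroughSketch(Function<K, V> loader, BiConsumer<K, V> writer) {
        this.loader = loader;
        this.writer = writer;
    }

    // Read-through: a miss is loaded by the cache itself.
    V get(K key) {
        return store.computeIfAbsent(key, loader);
    }

    // Write-through: the calling thread pays for the write to the system
    // of record before the cached copy is updated, so no invalidation is
    // needed. The two steps are not atomic in this sketch.
    void put(K key, V value) {
        writer.accept(key, value);
        store.put(key, value);
    }

    public static void main(String[] args) {
        Map<String, Integer> systemOfRecord = new HashMap<>();
        systemOfRecord.put("a", 1);
        CacheThroughSketch<String, Integer> cache =
                new CacheThroughSketch<>(systemOfRecord::get, systemOfRecord::put);
        cache.put("b", 2);
        System.out.println(cache.get("a") + " " + systemOfRecord.get("b"));
    }
}
```

If another process changes `systemOfRecord` directly, the cached copy goes stale, which is the failure mode the slide calls out.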
39. Write behind conclusions
• Scales out your writes
• Batching and coalescing
• The write queue can be persistent or not
• Idempotent operations are important in distributed and failure conditions
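Batching and coalescing can be sketched as below, assuming an in-memory (non-persistent) pending map and an explicit `flush()` that a background thread would call periodically. `WriteBehindSketch` and `batchWriter` are hypothetical names used only for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Hypothetical illustration of write-behind with coalescing and batching.
class WriteBehindSketch<K, V> {
    private final Map<K, V> store = new ConcurrentHashMap<>();
    private final Map<K, V> pending = new LinkedHashMap<>(); // guarded by "this"
    private final Consumer<Map<K, V>> batchWriter;           // writes one batch to the system of record

    WriteBehindSketch(Consumer<Map<K, V>> batchWriter) {
        this.batchWriter = batchWriter;
    }

    // The caller only pays for two in-memory updates.
    void put(K key, V value) {
        store.put(key, value);
        synchronized (this) {
            pending.put(key, value); // coalescing: a later write replaces an earlier one
        }
    }

    V get(K key) {
        return store.get(key);
    }

    // In a real setup a background thread would call this on a schedule.
    // The pending map lives in memory only, so a crash loses queued writes.
    void flush() {
        Map<K, V> batch;
        synchronized (this) {
            if (pending.isEmpty()) {
                return;
            }
            batch = new LinkedHashMap<>(pending);
            pending.clear();
        }
        // The writer should be idempotent: after a failure the same batch
        // may be replayed.
        batchWriter.accept(batch);
    }

    public static void main(String[] args) {
        List<Map<String, Integer>> batches = new ArrayList<>();
        WriteBehindSketch<String, Integer> cache = new WriteBehindSketch<>(batches::add);
        cache.put("a", 1);
        cache.put("a", 2);
        cache.put("b", 3);
        cache.flush();
        System.out.println(batches);
    }
}
```

Three `put` calls reach the system of record as a single batch with two entries, because the two writes to `"a"` are coalesced into the latest value.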
45. Moving off heap
• JVMs have issues with large heaps due to garbage collection
• Off-heap storage is a nice alternative for data which has a well-known lifecycle
• Perfect match for cache data
• Performance hit due to the binary representation, compared to objects on heap
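The cost of the binary representation can be seen in a minimal sketch, assuming values are stored as UTF-8 bytes in a direct `ByteBuffer` with a small on-heap index. `OffHeapSketch` is a hypothetical, append-only toy with no eviction or space reclamation, not how any particular product manages off-heap memory.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration: values live outside the Java heap as bytes.
class OffHeapSketch {
    private final ByteBuffer region;                          // direct buffer, not scanned by the garbage collector
    private final Map<String, int[]> index = new HashMap<>(); // key -> {offset, length}, kept on heap

    OffHeapSketch(int capacityBytes) {
        this.region = ByteBuffer.allocateDirect(capacityBytes);
    }

    // Every write serializes the value: this is the performance hit
    // compared to keeping a plain object reference on heap.
    boolean put(String key, String value) {
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        if (bytes.length > region.remaining()) {
            return false; // append-only toy: overwritten entries are never reclaimed
        }
        int offset = region.position();
        region.put(bytes);
        index.put(key, new int[] {offset, bytes.length});
        return true;
    }

    // Every read copies the bytes back and deserializes them.
    String get(String key) {
        int[] location = index.get(key);
        if (location == null) {
            return null;
        }
        byte[] bytes = new byte[location[1]];
        ByteBuffer view = region.duplicate(); // independent position, same memory
        view.position(location[0]);
        view.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        OffHeapSketch store = new OffHeapSketch(1024);
        store.put("greeting", "hello");
        System.out.println(store.get("greeting"));
    }
}
```

Only the small index stays on heap, so the garbage collector has far fewer objects to trace, at the price of an encode on every write and a decode on every read.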
47. Clustering
• Cache shared between multiple machines
• Different topologies
  • Peer to peer
  • Client server - with or without client cache
• Consistency becomes a much harder problem
51. Probabilistically Bounded Staleness
• How “long” until a write is “readable” by all?
• How many “old versions” are still around?
• Depending on
  • Network delay
  • Node processing time
  • … delayed replication (e.g. batching)
• LinkedIn’s data stores returned consistent data 99.9 percent of the time within 13.6 ms, and on SSDs (solid-state drives) within 1.63 ms.
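The first question above can be explored with a toy Monte Carlo sketch. It assumes a deliberately simple model that is not the one behind the LinkedIn figures: each read hits a single replica, and a write reaches that replica after an exponentially distributed delay. `StalenessSketch`, the 5 ms mean delay, and the seed are all made-up values for illustration.

```java
import java.util.Random;

// Hypothetical toy model of probabilistically bounded staleness.
class StalenessSketch {

    // Estimates the probability that a read issued tMillis after a write
    // sees that write, when the replication delay to the replica being
    // read is exponentially distributed with the given mean.
    static double probabilityFresh(double tMillis, double meanDelayMillis, int trials, long seed) {
        Random random = new Random(seed);
        int fresh = 0;
        for (int i = 0; i < trials; i++) {
            // Inverse-transform sampling of an exponential delay.
            double delay = -meanDelayMillis * Math.log(1.0 - random.nextDouble());
            if (delay <= tMillis) {
                fresh++;
            }
        }
        return (double) fresh / trials;
    }

    public static void main(String[] args) {
        double meanDelayMillis = 5.0; // assumed, not a measured value
        for (double t : new double[] {1, 5, 10, 25, 50}) {
            double p = probabilityFresh(t, meanDelayMillis, 100_000, 42L);
            System.out.printf("t = %4.0f ms  P(fresh) ~ %.3f%n", t, p);
        }
    }
}
```

Under this model the estimate approaches 1 − e^(−t/mean), so the "within t ms, consistent p percent of the time" statement on the slide corresponds to reading one point off such a curve. Batching or slower nodes would shift the delay distribution and therefore the curve.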