The document discusses "LDAP at Lightning Speed" and describes LMDB, a key-value database created by Howard Chu and optimized for LDAP backends. LMDB uses a single-level-store design with memory-mapped files and copy-on-write to provide fully transactional, ACID-compliant access without the need for caching, locking, or write-ahead logging. It has a simple configuration and outperforms Berkeley DB while using a fraction of the code size.
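The copy-on-write idea behind that design can be illustrated with a toy sketch. This is a deliberately simplified Python model, not LMDB's actual implementation (the real library is written in C, operates on memory-mapped B+-tree pages, and is exposed to Python via bindings such as py-lmdb):

```python
class CowStore:
    """Toy copy-on-write key-value store: readers always see an immutable
    committed snapshot, while a writer prepares a private copy and publishes
    it with a single reference swap (no locks, no write-ahead log)."""

    def __init__(self):
        self._committed = {}  # current committed version; never mutated in place

    def snapshot(self):
        # Readers share the committed version; it is never modified after commit.
        return self._committed

    def begin_write(self):
        # The single writer works on a private copy of the committed state.
        return dict(self._committed)

    def commit(self, working):
        # Publishing is one reference assignment, so readers never observe
        # a half-applied transaction.
        self._committed = working


store = CowStore()
reader = store.snapshot()        # snapshot taken before the write commits
txn = store.begin_write()
txn["uid=howard"] = "cn=Howard Chu"
store.commit(txn)
# 'reader' still sees the old (empty) snapshot; new snapshots see the commit.
```

The key property mirrored here is that readers never block writers and vice versa: a reader keeps its snapshot for as long as it needs, and a commit is an atomic publication of a new version.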
CGI SENEGAL annoté - Version Juin 2022.pdf (MorGaye1)
The General Tax Code (CGI) of Senegal, updated as of June 2022. The code is annotated, and the implementing circular for the law establishing the CGI, as well as the implementing orders, are included.
A review of the state of play and the challenges related to the GDPR (General Data Protection Regulation), known in French as the RGPD (Règlement Général sur la Protection des Données).
A practical look at the compliance challenges facing companies ahead of the May 25, 2018 deadline.
Capital increase through new contributions (Zak Area)
Increasing capital in exchange for new contributions (in cash or in kind) means placing additional resources at the company's disposal.
The accounting treatment of these transactions is relatively straightforward: asset accounts (the contributed resources) are debited against a credit to equity accounts.
The financial treatment of the consideration for the contributions can follow one of two approaches:
- increase the nominal value of the existing shares: the contributions are then reserved for existing shareholders, a frequent situation in partnerships, where the shares are not negotiable. This situation is rare in companies limited by shares, however, because unanimous shareholder approval is required to change the nominal value of the shares;
- issue new shares: this allows new shareholders to join, particularly in companies limited by shares, which raises difficulties in protecting the interests of existing shareholders. It is this latter approach, frequently used in practice, that we will consider here in the context of joint-stock companies.
Redis Developers Day 2014 - Redis Labs Talks (Redis Labs)
These are the slides the Redis Labs team used to accompany our session at the first ever Redis Developers Day, held on October 2nd, 2014, in London. They include some of the ideas we've come up with to tackle operational challenges in the hyper-dense, multi-tenant Redis deployments that make up our service, Redis Cloud.
Today, a significant number of our customers have taken the plunge and are running Oracle 12c for production applications. Among them, we are seeing growing interest in the Multitenant option.
Although moving to the Multitenant option is not difficult in itself, there are many pitfalls to avoid and a few points to clarify, particularly regarding performance.
Through a concrete case with one of our customers, this presentation provides answers to help you approach the adoption of this option with confidence; it represents a significant and very interesting change from an architectural standpoint.
Database as a Service on the Oracle Database Appliance Platform (Maris Elsins)
Speaker: Marc Fielding, Co-speaker: Maris Elsins.
Oracle Database Appliance provides a robust, highly available, cost-effective, and surprisingly scalable platform for a database-as-a-service environment. By leveraging Oracle Enterprise Manager's self-service features, databases can be provisioned on a self-service basis to a cluster of Oracle Database Appliance machines. Discover how multiple ODA devices can be managed together to provide both high availability and incremental, cost-effective scalability. Hear real-world lessons learned from successful database consolidation implementations.
DB2 10 Webcast #3: The Secrets Of Scalability (Laura Hood)
The third in the Migration Month webcast series looking at DB2 10 migration planning. This webcast goes into the scalability benefits available in DB2 10, with Julian Stuhler of Triton Consulting & Jeff Josten of IBM.
Running Production CDC Ingestion Pipelines With Balaji Varadarajan and Pritam... (HostedbyConfluent)
Running Production CDC Ingestion Pipelines With Balaji Varadarajan and Pritam K Dey | Current 2022
Robinhood’s mission is to democratize finance for all. Data-driven decision making is key to achieving this goal. The data needed is hosted in various OLTP databases. Reliably replicating this data in near real time to the data lakehouse powers many critical use cases for the company. At Robinhood, CDC is not only used for ingestion to the data lake but is also being adopted for inter-system message exchanges between different online microservices.
In this talk, we will describe the evolution of change-data-capture-based ingestion at Robinhood, not only in terms of the scale of data stored and queries made, but also the use cases it supports. We will go in depth into the CDC architecture built around our Kafka ecosystem using the open-source systems Debezium and Apache Hudi. We will cover online inter-system message exchange use cases, our experience running this service at scale at Robinhood, and lessons learned.
Best practices for DB2 for z/OS log based recovery (Florence Dubois)
The need to perform a DB2 log-based recovery of multiple objects is a very rare event, but statistically it is more frequent than a true disaster recovery event (flood, fire, etc.). Taking regular backups is necessary but far from sufficient for anything beyond minor application recovery. If not prepared, practiced and optimised, log-based recovery can lead to extended application service downtimes, possibly many hours to several days. This presentation will provide many hints and tips on how to plan, design intelligently, stress test and optimise DB2 log-based recovery.
An Elastic Metadata Store for eBay’s Media Platform (MongoDB)
In order to build robust, multi-tenant, highly available storage services that meet the business's SLA, your databases have to be sharded. But if your service has to scale continuously through incremental additions of storage, without service interruption or human intervention, basic static sharding is not enough. At eBay, we are building MStore to solve this problem, with MongoDB as the storage engine. In this presentation, we will dive into the key design concepts of this solution.
Improving Apache Spark by Taking Advantage of Disaggregated Architecture (Databricks)
Shuffle in Apache Spark is an intermediate phase that redistributes data across computing units, and it relies on one important primitive: shuffle data is persisted on local disks. This architecture suffers from scalability and reliability issues. Moreover, the assumption of collocated storage does not always hold in today’s data centers. The hardware trend is moving to a disaggregated storage and compute architecture for better cost efficiency and scalability.
To address the issues of Spark shuffle and support disaggregated storage and compute architecture, we implemented a new remote Spark shuffle manager. This new architecture writes shuffle data to a remote cluster with different Hadoop-compatible filesystem backends.
Firstly, the failure of compute nodes will no longer cause shuffle data recomputation. Spark executors can also be allocated and recycled dynamically which results in better resource utilization.
Secondly, for most customers currently running Spark with collocated storage, it is usually challenging to upgrade the disks on every node to the latest hardware, such as NVMe SSDs and persistent memory, because of cost considerations and system compatibility. With this new shuffle manager, they are free to build a separate cluster for storing and serving the shuffle data, leveraging the latest hardware to improve performance and reliability.
Thirdly, in the HPC world, more customers are trying Spark as their high-performance data analytics tool, while storage and compute in HPC clusters are typically disaggregated. This work will make their lives easier.
In this talk, we will present an overview of the issues of the current Spark shuffle implementation, the design of new remote shuffle manager, and a performance study of the work.
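A remote shuffle manager like the one described above would typically be wired in through Spark configuration. The sketch below is hypothetical: only `spark.shuffle.manager` and `spark.dynamicAllocation.enabled` are standard Spark properties; the plugin class name and the remote-storage property are assumptions, since the abstract does not give them.

```properties
# Hypothetical spark-defaults.conf fragment -- the shuffle plugin class and
# the remote storage URI property are illustrative, not the actual plugin's.
spark.shuffle.manager                org.apache.spark.shuffle.remote.RemoteShuffleManager
spark.shuffle.remote.storage.uri     hdfs://shuffle-cluster:9000/spark-shuffle
# With shuffle data off the local disks, executors can be recycled freely.
spark.dynamicAllocation.enabled      true
```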
Streaming a Million Likes/Second: Real-Time Interactions on Live Video (C4Media)
Video and slides synchronized, mp3 and slide download available at URL https://bit.ly/39NIjLV.
Akhilesh Gupta does a technical deep-dive into how LinkedIn uses the Play/Akka Framework and a scalable distributed system to enable live interactions like likes/comments at massive scale at extremely low costs across multiple data centers. Filmed at qconlondon.com.
Akhilesh Gupta is the technical lead for LinkedIn's Real-time delivery infrastructure and LinkedIn Messaging. He has been working on the revamp of LinkedIn’s offerings to instant, real-time experiences. Before this, he was the head of engineering for the Ride Experience program at Uber Technologies in San Francisco.
Next Generation Client APIs in Envoy Mobile (C4Media)
Video and slides synchronized, mp3 and slide download available at URL https://bit.ly/2x0Fav8.
Jose Nino guides the audience through the journey of Mobile APIs at Lyft. He focuses on how the team has reaped the benefits of API generation to experiment with the network transport layer. He also discusses recent developments the team has made with Envoy Mobile and the roadmap ahead. Filmed at qconlondon.com.
Jose Nino works as a Software Engineer at Lyft.
Software Teams and Teamwork Trends Report Q1 2020 (C4Media)
How do we cope with an environment that has been radically disrupted, where people are suddenly thrust into remote work in a chaotic state? What are the emerging good practices and new ideas that are shaping the way in which software development teams work? What can we do to make the workplace a more secure and diverse one while increasing the productivity of our teams? This report aims to assist technical leaders in making mid- to long-term decisions that will have a positive impact on their organisations and teams and help individual contributors find the practices, approaches, tools, techniques, and frameworks that can help them get a better experience at work - irrespective of where they are working from.
Understand the Trade-offs Using Compilers for Java Applications (C4Media)
Video and slides synchronized, mp3 and slide download available at URL https://bit.ly/2QCmmJ0.
Mark Stoodley examines some of the strengths and weaknesses of the different Java compilation technologies, if one was to apply them in isolation. Stoodley discusses how production JVMs are assembling a combination of these tools that work together to provide excellent performance across the large spectrum of applications written in Java and JVM based languages. Filmed at qconsf.com.
Mark Stoodley joined IBM Canada to build Java JIT compilers for production use and led the team that delivered AOT compilation in the IBM SDK for Java 6. He spent the last five years leading the effort to open source nearly 4.3 million lines of source code from the IBM J9 Java Virtual Machine to create the two open source projects Eclipse OMR and Eclipse OpenJ9, and now co-leads both projects.
Video and slides synchronized, mp3 and slide download available at URL https://bit.ly/2y2yPiS.
Colin McCabe talks about the ongoing effort to replace the use of Zookeeper in Kafka: why they want to do it and how it will work. He discusses the limitations they have found and how Kafka benefits both in terms of stability and scalability by bringing consensus in house. He talks about their progress, what work is remaining, and how contributors can help. Filmed at qconsf.com.
Colin McCabe is a Kafka committer at Confluent, working on the scalability and extensibility of Kafka. Previously, he worked on the Hadoop Distributed Filesystem and the Ceph Filesystem.
Video and slides synchronized, mp3 and slide download available at URL https://bit.ly/2SXXXiD.
Katharina Probst talks about what it means to act like an owner and why teams need ownership to be high-performing. When team members, regardless of whether they have a formal leadership role or not, act like owners, magical things can happen. She shares ideas that we can apply to our own work, and talks about how to recognize when we don’t live up to our own expectations of acting like an owner. Filmed at qconsf.com.
Katharina Probst is a Senior Engineering Leader, Kubernetes & SaaS, at Google. Before this, she led engineering teams at Netflix, where she was responsible for the Netflix API, which helps bring Netflix streaming to millions of people around the world. Prior to joining Netflix, she was on the cloud computing team at Google, where she saw cloud computing from the provider side.
Does Java Need Inline Types? What Project Valhalla Can Bring to Java (C4Media)
Video and slides synchronized, mp3 and slide download available at URL https://bit.ly/2T04Lw4.
Sergey Kuksenko talks about the performance benefits inline types bring to Java and how to exploit them. Inline/value types are the key part of experimental project Valhalla, which should bring new abilities to the Java language. Filmed at qconsf.com.
Sergey Kuksenko is a Java Performance Engineer at Oracle working on a variety of Java and JVM performance enhancements. He started working as Java Engineer in 1996 and as Java Performance Engineer in 2005. He has had a passion for exploring how Java works on modern hardware.
Do you need service meshes in your tech stack?
This online guide aims to answer pertinent questions for software architects and technical leaders, such as: What is a service mesh? Do I need a service mesh? How do I evaluate the different service mesh offerings? In software architecture, a service mesh is a dedicated infrastructure layer that facilitates service-to-service communication between microservices, often using a sidecar proxy.
Video and slides synchronized, mp3 and slide download available at URL https://bit.ly/2UgQ3BU.
Christie Wilson describes what to expect from CI/CD in 2019, and how Tekton is helping bring that to as many tools as possible, such as Jenkins X and Prow. Wilson talks about Tekton itself and performs a live demo that shows how cloud native CI/CD can help debug, surface and fix mistakes faster. Filmed at qconsf.com.
Christie Wilson is a software engineer at Google, currently leading the Tekton project. Over the past decade, she has worked in the mobile, financial and video game industries. Prior to working at Google she led a team of software developers to build load testing tools for AAA video game titles, and founded the Vancouver chapter of PyLadies.
Video and slides synchronized, mp3 and slide download available at URL https://bit.ly/2S7lDiS.
Sasha Rosenbaum shows how a CI/CD pipeline for Machine Learning can greatly improve both productivity and reliability. Filmed at qconsf.com.
Sasha Rosenbaum is a Program Manager on the Azure DevOps engineering team, focused on improving the alignment of the product with open source software. She is a co-organizer of the DevOps Days Chicago and the DeliveryConf conferences, and recently published a book on Serverless computing in Azure with .NET.
Video and slides synchronized, mp3 and slide download available at URL https://bit.ly/36epVKg.
Todd Montgomery discusses the techniques and lessons learned from implementing Aeron Cluster. His focus is on how Raft can be implemented on Aeron, minimizing the network round trip overhead, and comparing single process to a fully distributed cluster. Filmed at qconsf.com.
Todd Montgomery is a networking hacker who has researched, designed, and built numerous protocols, messaging-oriented middleware systems, and real-time data systems, done research for NASA, contributed to the IETF and IEEE, and co-founded two startups. He currently works as an independent consultant and is active in several open source projects.
Architectures That Scale Deep - Regaining Control in Deep Systems (C4Media)
Video and slides synchronized, mp3 and slide download available at URL https://bit.ly/2FWc5Sk.
Ben Sigelman talks about "Deep Systems", their common properties and re-introduces the fundamentals of control theory from the 1960s, including the original conceptualizations of Observability & Controllability. He uses examples from Google & other companies to illustrate how deep systems have damaged people's ability to observe software, and what needs to be done in order to regain control. Filmed at qconsf.com.
Ben Sigelman is a co-founder and the CEO at LightStep, a co-creator of Dapper (Google’s distributed tracing system), and co-creator of the OpenTracing and OpenTelemetry projects (both part of the CNCF). His work and interests gravitate towards observability, especially where microservices, high transaction volumes, and large engineering organizations are involved.
ML in the Browser: Interactive Experiences with Tensorflow.js (C4Media)
Video and slides synchronized, mp3 and slide download available at URL https://bit.ly/39SddUL.
Victor Dibia provides a friendly introduction to machine learning, covers concrete steps on how front-end developers can create their own ML models and deploy them as part of web applications. He discusses his experience building Handtrack.js - a library for prototyping real time hand tracking interactions in the browser. Filmed at qconsf.com.
Victor Dibia is a Research Engineer with Cloudera’s Fast Forward Labs. Prior to this, he was a Research Staff Member at the IBM TJ Watson Research Center, New York. His research interests are at the intersection of human computer interaction, computational social science, and applied AI.
Video and slides synchronized, mp3 and slide download available at URL https://bit.ly/2s9T3Vl.
Colin Eberhardt looks at some of the internals of WebAssembly, explores how it works “under the hood”, and looks at how to create a (simple) compiler that targets this runtime. Filmed at qconsf.com.
Colin Eberhardt is the Technology Director at Scott Logic, a UK-based software consultancy where they create complex applications for their financial services clients. He is an avid technology enthusiast, spending his evenings contributing to open source projects, writing blog posts and learning as much as he can.
User & Device Identity for Microservices @ Netflix Scale (C4Media)
Video and slides synchronized, mp3 and slide download available at URL https://bit.ly/2S9tOgy.
Satyajit Thadeshwar provides useful insights on how Netflix implemented a secure, token-agnostic, identity solution that works with services operating at a massive scale. He shares some of the lessons learned from this process, both from architectural diagrams and code. Filmed at qconsf.com.
Satyajit Thadeshwar is an engineer on the Product Edge Access Services team at Netflix, where he works on some of the most critical services focusing on user and device authentication. He has more than a decade of experience building fault-tolerant and highly scalable, distributed systems.
Video and slides synchronized, mp3 and slide download available at URL https://bit.ly/2Ezs08q.
Justin Ryan talks about Netflix’s scalability issues and some of the ways they addressed them. He shares successes they’ve had from unintuitively partitioning computation into multiple services to get better runtime characteristics. He introduces useful probabilistic data structures, innovative bi-directional data passing, and open-source projects from Netflix that make this all possible. Filmed at qconsf.com.
Justin Ryan works on Playback Edge Engineering at Netflix. He works on some of the most critical services at Netflix, specifically focusing on user and device authentication. Years of building developer tools have also given him a healthy set of opinions on developer productivity.
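The talk above does not name the probabilistic data structures it covers; as one illustrative assumption, a Bloom filter is the classic choice for cheap set-membership checks at this kind of scale. A minimal sketch:

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: a fixed-size bit array and k hash functions.
    It never reports a false negative, but may report false positives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _indexes(self, item):
        # Derive k deterministic bit positions from the item.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for idx in self._indexes(item):
            self.bits |= 1 << idx

    def might_contain(self, item):
        # False means definitely absent; True means probably present.
        return all(self.bits >> idx & 1 for idx in self._indexes(item))


bf = BloomFilter()
bf.add("device-123")
bf.might_contain("device-123")   # True: no false negatives
bf.might_contain("device-999")   # almost certainly False for an unseen key
```

The appeal at edge scale is the footprint: membership for millions of keys fits in a few kilobits per shard, at the cost of a tunable false-positive rate.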
Make Your Electron App Feel at Home Everywhere (C4Media)
Video and slides synchronized, mp3 and slide download available at URL https://bit.ly/2Z4ZJjn.
Kilian Valkhof discusses the process of making an Electron app feel at home on all three platforms: Windows, macOS and Linux, making devs aware of the pitfalls and how to avoid them. Filmed at qconsf.com.
Kilian Valkhof is a Front-end Developer & User-experience Designer at Firstversionist. He writes about various topics, from design to machine learning, on his personal website, kilianvalkhof.com, and is a frequent contributor to open source software. He is part of the Electron governance team that oversees the development of the Electron framework.
Video and slides synchronized, mp3 and slide download available at URL https://bit.ly/344PnB1.
Steve Klabnik goes over the deep details of how async/await works in Rust, covering concepts like coroutines, generators, stack-less vs stack-ful, "pinning", and more. Filmed at qconsf.com.
Steve Klabnik is on the core team of Rust, leads the documentation team, and is an author of "The Rust Programming Language." He is a frequent speaker at conferences and is a prolific open source contributor, previously working on projects such as Ruby and Ruby on Rails.
Video and slides synchronized, mp3 and slide download available at URL https://bit.ly/2OUz6dt.
Chris Riccomini talks about the current state-of-the-art in data pipelines and data warehousing, and shares some of the solutions to current problems dealing with data streaming and warehousing. Filmed at qconsf.com.
Chris Riccomini works as a Software Engineer at WePay.
Automated Testing for Terraform, Docker, Packer, Kubernetes, and More (C4Media)
Video and slides synchronized, mp3 and slide download available at URL https://bit.ly/2rm4hFD.
Yevgeniy Brikman talks about how to write automated tests for infrastructure code, including the code written for use with tools such as Terraform, Docker, Packer, and Kubernetes. Topics covered include: unit tests, integration tests, end-to-end tests, dependency injection, test parallelism, retries and error handling, static analysis, property testing and CI/CD for infrastructure code. Filmed at qconsf.com.
Yevgeniy Brikman is the co-founder of Gruntwork, a company that provides DevOps as a Service. He is the author of two books published by O'Reilly Media: Hello, Startup and Terraform: Up & Running. Previously, he worked as a software engineer at LinkedIn, TripAdvisor, Cisco Systems, and Thomson Financial.
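Among the topics the talk lists is "retries and error handling". A minimal sketch of the kind of retry wrapper infrastructure tests commonly use, since freshly provisioned resources become ready asynchronously; the names and defaults here are illustrative, not the API of any particular testing library:

```python
import time

def retry(fn, attempts=3, delay=0.5):
    """Call fn() until it succeeds or attempts are exhausted,
    sleeping between tries; re-raise the last failure."""
    last_exc = None
    for attempt in range(attempts):
        try:
            return fn()
        except Exception as exc:  # real tests should catch narrower exceptions
            last_exc = exc
            if attempt < attempts - 1:
                time.sleep(delay)
    raise last_exc


# Example: a flaky check that succeeds on the third call,
# like polling an endpoint that is still being provisioned.
calls = {"n": 0}

def flaky_check():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("endpoint not ready yet")
    return "ok"

result = retry(flaky_check, attempts=5, delay=0.01)
# result == "ok", after calls["n"] == 3 attempts
```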
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf (91mobiles)
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
Securing your Kubernetes cluster_ a step-by-step guide to success !KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Key Trends Shaping the Future of Infrastructure.pdfCheryl Hung
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
UiPath Test Automation using UiPath Test Suite series, part 3DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 3. In this session, we will cover desktop automation along with UI automation.
Topics covered:
UI automation Introduction,
UI automation Sample
Desktop automation flow
Pradeep Chinnala, Senior Consultant Automation Developer @WonderBotz and UiPath MVP
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdfPeter Spielvogel
Building better applications for business users with SAP Fiori.
• What is SAP Fiori and why it matters to you
• How a better user experience drives measurable business benefits
• How to get started with SAP Fiori today
• How SAP Fiori elements accelerates application development
• How SAP Build Code includes SAP Fiori tools and other generative artificial intelligence capabilities
• How SAP Fiori paves the way for using AI in SAP apps
PHP Frameworks: I want to break free (IPC Berlin 2024)Ralf Eggert
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk is aimed at encouraging a more independent approach to using PHP frameworks, moving towards a more flexible and future-proof approach to PHP development.
Epistemic Interaction - tuning interfaces to provide information for AI supportAlan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
DevOps and Testing slides at DASA ConnectKari Kakkonen
My and Rik Marselis slides at 30.5.2024 DASA Connect conference. We discuss about what is testing, then what is agile testing and finally what is Testing in DevOps. Finally we had lovely workshop with the participants trying to find out different ways to think about quality and testing in different parts of the DevOps infinity loop.
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™UiPathCommunity
In questo evento online gratuito, organizzato dalla Community Italiana di UiPath, potrai esplorare le nuove funzionalità di Autopilot, il tool che integra l'Intelligenza Artificiale nei processi di sviluppo e utilizzo delle Automazioni.
📕 Vedremo insieme alcuni esempi dell'utilizzo di Autopilot in diversi tool della Suite UiPath:
Autopilot per Studio Web
Autopilot per Studio
Autopilot per Apps
Clipboard AI
GenAI applicata alla Document Understanding
👨🏫👨💻 Speakers:
Stefano Negro, UiPath MVPx3, RPA Tech Lead @ BSP Consultant
Flavio Martinelli, UiPath MVP 2023, Technical Account Manager @UiPath
Andrei Tasca, RPA Solutions Team Lead @NTT Data
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
LDAP at Lightning Speed
1. LDAP at Lightning Speed
Howard Chu
CTO, Symas Corp. hyc@symas.com
Chief Architect, OpenLDAP hyc@openldap.org
2015-03-05
2. InfoQ.com: News & Community Site
• 750,000 unique visitors/month
• Published in 4 languages (English, Chinese, Japanese and Brazilian
Portuguese)
• Post content from our QCon conferences
• News 15-20 / week
• Articles 3-4 / week
• Presentations (videos) 12-15 / week
• Interviews 2-3 / week
• Books 1 / month
Watch the video with slide
synchronization on InfoQ.com!
http://www.infoq.com/presentations/lmdb
3. Presented at QCon London
www.qconlondon.com
Purpose of QCon
- to empower software development by facilitating the spread of
knowledge and innovation
Strategy
- practitioner-driven conference designed for YOU: influencers of
change and innovation in your teams
- speakers and topics driving the evolution and innovation
- connecting and catalyzing the influencers and innovators
Highlights
- attended by more than 12,000 delegates since 2007
- held in 9 cities worldwide
4. 2
OpenLDAP Project
● Open source code project
● Founded 1998
● Three core team members
● A dozen or so contributors
● Feature releases every 12-18 months
● Maintenance releases as needed
5. 3
A Word About Symas
● Founded 1999
● Founders from Enterprise Software world
– platinum Technology (Locus Computing)
– IBM
● Howard joined OpenLDAP in 1999
– One of the Core Team members
– Appointed Chief Architect January 2007
● No debt, no VC investments: self-funded
6. 4
Intro
● Howard Chu
– Founder and CTO Symas Corp.
– Developing Free/Open Source software since
1980s
● GNU compiler toolchain, e.g. "gmake -j", etc.
● Many other projects...
– Worked for NASA/JPL, wrote software for Space
Shuttle, etc.
8. 6
(1) Background
● API inspired by Berkeley DB (BDB)
– OpenLDAP has used BDB extensively since 1999
– Deep experience with pros and cons of BDB design
and implementation
– Omits BDB features that were found to be of no benefit
● e.g. extensible hashing
– Avoids BDB characteristics that were problematic
● e.g. cache tuning, complex locking, transaction logs,
recovery
9. 7
(2) Features
LMDB At A Glance
● Key/Value store using B+trees
● Fully transactional, ACID compliant
● MVCC, readers never block
● Uses memory-mapped files, needs no tuning
● Crash-proof, no recovery needed after restart
● Highly optimized, extremely compact
– under 40KB object code, fits in CPU L1 I$
● Runs on most modern OSs
– Linux, Android, *BSD, MacOSX, iOS, Solaris, Windows, etc...
10. 8
Features
● Concurrency Support
– Both multi-process and multi-thread
– Single Writer + N readers
● Writers don't block readers
● Readers don't block writers
● Reads scale perfectly linearly with available CPUs
● No deadlocks
– Full isolation with MVCC - Serializable
– Nested transactions
– Batched writes
11. 9
Features
● Uses Copy-on-Write
– Live data is never overwritten
– DB structure cannot be corrupted by incomplete
operations (system crashes)
– No write-ahead logs needed
– No transaction log cleanup/maintenance
– No recovery needed after crashes
12. 10
Features
● Uses Single-Level Store
– Reads are satisfied directly from the memory map
● No malloc or memcpy overhead
– Writes can be performed directly to the memory map
● No write buffers, no buffer tuning
– Relies on the OS/filesystem cache
● No wasted memory in app-level caching
– Can store live pointer-based objects directly
● using a fixed address map
● minimal marshalling, no unmarshalling required
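The single-level read path described above can be sketched in plain C. This is an illustrative use of POSIX mmap, not LMDB source; the function names and the scratch-file path are invented for the demo. The point is that a reader gets a pointer straight into the mapped file, the way LMDB returns pointers into its map, with no malloc and no copy into an app-level buffer cache.

```c
/* Illustrative sketch (not LMDB code): reads served straight from a
   read-only memory map -- no malloc, no memcpy into a buffer cache. */
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map 'path' read-only and return a pointer to its first 'len' bytes.
   The caller can hand this pointer out directly; the OS page cache is
   the only cache involved. */
static const char *map_readonly(const char *path, size_t len)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0) return NULL;
    void *p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                /* the mapping outlives the descriptor */
    return p == MAP_FAILED ? NULL : (const char *)p;
}

/* Demo helper: create a scratch file holding 'len' bytes of 'data'. */
static int write_file(const char *path, const char *data, size_t len)
{
    int fd = open(path, O_CREAT | O_TRUNC | O_WRONLY, 0600);
    if (fd < 0) return -1;
    ssize_t n = write(fd, data, len);
    close(fd);
    return n == (ssize_t)len ? 0 : -1;
}
```

Because the map is PROT_READ, any stray write through such a pointer faults immediately, which is the same protection property the slides describe for the read-only map.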
15. 13
Motivation
● BDB slapd backend always required careful,
complex tuning
– Data comes through 3 separate layers of caches
– Each layer has different size and speed traits
– Balancing the 3 layers against each other can be a
difficult juggling act
– Performance without the backend caches is
unacceptably slow - over an order of magnitude
16. 14
Motivation
● Backend caching significantly increased the
overall complexity of the backend code
– Two levels of locking required, since BDB database
locks are too slow
– Deadlocks occurring routinely in normal operation,
requiring additional backoff/retry logic
17. 15
Motivation
● The caches were not always beneficial, and were
sometimes detrimental
– Data could exist in 3 places at once - filesystem, DB,
and backend cache - wasting memory
– Searches with result sets that exceeded the configured
cache size would reduce the cache effectiveness to
zero
– malloc/free churn from adding and removing entries in
the cache could trigger pathological heap
fragmentation in libc malloc
18. 16
Obvious Solutions
● Cache management is a hassle, so don't do any
caching
– The filesystem already caches data; there's no
reason to duplicate the effort
● Lock management is a hassle, so don't do any
locking
– Use Multi-Version Concurrency Control (MVCC)
– MVCC makes it possible to perform reads with no
locking
19. 17
Obvious Solutions
● BDB supports MVCC, but still requires complex
caching and locking
● To get the desired results, we need to abandon BDB
● Surveying the landscape revealed no other DB
libraries with the desired characteristics
● Thus LMDB was created in 2011
– "Lightning Memory-Mapped Database"
– BDB is now deprecated in OpenLDAP
20. 18
Design Approach
● Based on the "Single-Level Store" concept
– Not new, first implemented in Multics in 1964
– Access a database by mapping the entire DB into
memory
– Data fetches are satisfied by direct reference to the
memory map; there is no intermediate page or
buffer cache
21. 19
Single-Level Store
● Only viable if process address spaces are
larger than the expected data volumes
– For 32 bit processors, the practical limit on data
size is under 2GB
– For common 64 bit processors which only
implement 48 bit address spaces, the limit is 47 bits
or 128 terabytes
– The upper bound at 63 bits is 8 exabytes
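The quoted limits are straightforward powers of two, which a one-line helper can verify: 2^47 bytes is 128 terabytes and 2^63 bytes is 8 exabytes (binary units).

```c
/* Back-of-envelope check of the address-space limits quoted above. */
#include <assert.h>
#include <stdint.h>

static uint64_t addr_space_bytes(int bits) { return (uint64_t)1 << bits; }
```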
22. 20
Design Approach
● Uses a read-only memory map
– Protects the DB structure from corruption due to stray
writes in memory
– Any attempts to write to the map will cause a SEGV,
allowing immediate identification of software bugs
● Can optionally use a read-write mmap
– Slight performance gain for fully in-memory data sets
– Should only be used on fully-debugged application
code
23. 21
Design Approach
● Keith Bostic (BerkeleyDB author, personal email, 2008)
– "The most significant problem with building an mmap'd back-end is implementing
write-ahead-logging (WAL). (You probably know this, but just in case: the way
databases usually guarantee consistency is by ensuring that log records
describing each change are written to disk before their transaction commits, and
before the database page that was changed. In other words, log record X must hit
disk before the database page containing the change described by log record X.)
– In Berkeley DB WAL is done by maintaining a relationship between the database
pages and the log records. If a database page is being written to disk, there's a
look-aside into the logging system to make sure the right log records have already
been written. In a memory-mapped system, you would do this by locking modified
pages into memory (mlock), and flushing them at specific times (msync), otherwise
the VM might just push a database page with modifications to disk before its log
record is written, and if you crash at that point it's all over but the screaming."
24. 22
Design Approach
● Implement MVCC using copy-on-write
– In-use data is never overwritten, modifications are
performed by copying the data and modifying the copy
– Since updates never alter existing data, the DB
structure can never be corrupted by incomplete
modifications
● Write-ahead transaction logs are unnecessary
– Readers always see a consistent snapshot of the DB,
they are fully isolated from writers
● Read accesses require no locks
25. 23
MVCC Details
● "Full" MVCC can be extremely resource intensive
– DBs typically store complete histories reaching far back into time
– The volume of data grows extremely fast, and grows without
bound unless explicit pruning is done
– Pruning the data using garbage collection or compaction requires
more CPU and I/O resources than the normal update workload
● Either the server must be heavily over-provisioned, or updates must be
stopped while pruning is done
– Pruning requires tracking of in-use status, which typically
involves reference counters, which require locking
26. 24
Design Approach
● LMDB nominally maintains only two versions of the DB
– Rolling back to a historical version is not interesting for
OpenLDAP
– Older versions can be held open longer by reader transactions
● LMDB maintains a free list tracking the IDs of unused pages
– Old pages are reused as soon as possible, so data volumes
don't grow without bound
● LMDB tracks in-use status without locks
27. 25
Implementation Highlights
● LMDB library started from the append-only btree
code written by Martin Hedenfalk for his ldapd,
which is bundled in OpenBSD
– Stripped out all the parts we didn't need (page cache
management)
– Borrowed a couple pieces from slapd for expedience
– Changed from append-only to page-reclaiming
– Restructured to allow adding ideas from BDB that we
still wanted
28. 26
Implementation Highlights
● Resulting library was under 32KB of object
code
– Compared to the original btree.c at 39KB
– Compared to BDB at 1.5MB
● API is loosely modeled after the BDB API to
ease migration of back-bdb code
36. 34
Btree Operation
Pgno: 1
Misc...
offset: 4000
1,foo
Data Page
Pgno: 0
Misc...
Root : 1
Meta Page
Write-Ahead Logger
Add 1,foo to
page 1
Commit
Add 2,bar to
page 1
Write-Ahead Log
37. 35
Btree Operation
Pgno: 1
Misc...
offset: 4000
offset: 3000
2,bar
1,foo
Data Page
Pgno: 0
Misc...
Root : 1
Meta Page
Add 1,foo to
page 1
Commit
Add 2,bar to
page 1
Write-Ahead Log
Write-Ahead Logger
38. 36
Btree Operation
Pgno: 1
Misc...
offset: 4000
offset: 3000
2,bar
1,foo
Data Page
Pgno: 0
Misc...
Root : 1
Meta Page
Add 1,foo to
page 1
Commit
Add 2,bar to
page 1
Commit
Write-Ahead Log
Write-Ahead Logger
39. 37
Btree Operation
Pgno: 1
Misc...
offset: 4000
offset: 3000
2,bar
1,foo
Data Page
Add 1,foo to
page 1
Commit
Add 2,bar to
page 1
Commit
Checkpoint
Write-Ahead Log
Write-Ahead Logger
Pgno: 1
Misc...
offset: 4000
offset: 3000
2,bar
1,foo
Data Page
Pgno: 0
Misc...
Root : 1
Meta Page
Pgno: 0
Misc...
Root : 1
Meta Page
40. 38
Btree Operation
How Append-Only/Copy-On-Write Works
● Updates are always performed bottom up
● Every branch node from the leaf to the root
must be copied/modified for any leaf update
● Any node not on the path from the leaf to the
root is unaltered
● The root node is always written last
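The bottom-up copy discipline can be modelled with a toy tree in which each page has at most one child. This is an invented structure for illustration, not LMDB's real page format; the point it demonstrates is that an update copies every page on the leaf-to-root path, writes the new root last, and never touches the old pages, so the old tree stays fully usable.

```c
/* Toy model (not LMDB internals) of a copy-on-write leaf update:
   every page on the leaf-to-root path is copied to a fresh page,
   the copies are linked bottom-up, and the old path stays intact. */
#include <assert.h>
#include <string.h>

#define NPAGES 16

typedef struct {
    int  child;            /* page number of the single child, -1 for leaf */
    char data[16];         /* leaf payload */
} page_t;

static page_t pages[NPAGES];
static int next_free = 0;

static int alloc_page(void) { return next_free++; }

/* Update the leaf below 'root' to hold 'val'; return the NEW root.
   Existing pages are never modified -- an interrupted update leaves
   the old root, and the whole old tree, intact. */
static int cow_update(int root, const char *val)
{
    int np;
    if (pages[root].child < 0) {            /* leaf: copy, then modify copy */
        np = alloc_page();
        pages[np] = pages[root];
        strncpy(pages[np].data, val, sizeof pages[np].data - 1);
        return np;
    }
    int new_child = cow_update(pages[root].child, val);  /* bottom up */
    np = alloc_page();                      /* copy the interior page */
    pages[np] = pages[root];
    pages[np].child = new_child;            /* new root written last */
    return np;
}
```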
49. 47
Btree Operation
In the Append-Only tree, new pages are always appended sequentially to
the DB file
● While there's significant overhead for making complete copies of
modified pages, the actual I/O is linear and relatively fast
● The root node is always the last page of the file, unless there was a
crash
● Any root node can be found by seeking backward from the end of the
file, and checking the page's header
● Recovery from a crash is relatively easy
– Everything from the last valid root to the beginning of the file is always pristine
– Anything between the end of the file and the last valid root is discarded
54. 52
Btree Operation
Append-Only
Pgno: 1
Misc...
offset: 4000
1,foo
Data Page
Pgno: 0
Misc...
Root : EMPTY
Meta Page
Pgno: 2
Misc...
Root : 1
Meta Page
Pgno: 3
Misc...
offset: 4000
offset: 3000
2,bar
1,foo
Data Page
Pgno: 4
Misc...
Root : 3
Meta Page
55. 53
Btree Operation
Append-Only
Pgno: 1
Misc...
offset: 4000
1,foo
Data Page
Pgno: 0
Misc...
Root : EMPTY
Meta Page
Pgno: 2
Misc...
Root : 1
Meta Page
Pgno: 3
Misc...
offset: 4000
offset: 3000
2,bar
1,foo
Data Page
Pgno: 4
Misc...
Root : 3
Meta Page
Pgno: 5
Misc...
offset: 4000
offset: 3000
2,bar
1,blah
Data Page
56. 54
Btree Operation
Append-Only
Pgno: 1
Misc...
offset: 4000
1,foo
Data Page
Pgno: 0
Misc...
Root : EMPTY
Meta Page
Pgno: 2
Misc...
Root : 1
Meta Page
Pgno: 3
Misc...
offset: 4000
offset: 3000
2,bar
1,foo
Data Page
Pgno: 4
Misc...
Root : 3
Meta Page
Pgno: 5
Misc...
offset: 4000
offset: 3000
2,bar
1,blah
Data Page
Pgno: 6
Misc...
Root : 5
Meta Page
57. 55
Btree Operation
Append-Only
Pgno: 1
Misc...
offset: 4000
1,foo
Data Page
Pgno: 0
Misc...
Root : EMPTY
Meta Page
Pgno: 2
Misc...
Root : 1
Meta Page
Pgno: 3
Misc...
offset: 4000
offset: 3000
2,bar
1,foo
Data Page
Pgno: 4
Misc...
Root : 3
Meta Page
Pgno: 5
Misc...
offset: 4000
offset: 3000
2,bar
1,blah
Data Page
Pgno: 6
Misc...
Root : 5
Meta Page
Pgno: 7
Misc...
offset: 4000
offset: 3000
2,xyz
1,blah
Data Page
58. 56
Btree Operation
Append-Only
Pgno: 1
Misc...
offset: 4000
1,foo
Data Page
Pgno: 0
Misc...
Root : EMPTY
Meta Page
Pgno: 2
Misc...
Root : 1
Meta Page
Pgno: 3
Misc...
offset: 4000
offset: 3000
2,bar
1,foo
Data Page
Pgno: 4
Misc...
Root : 3
Meta Page
Pgno: 5
Misc...
offset: 4000
offset: 3000
2,bar
1,blah
Data Page
Pgno: 6
Misc...
Root : 5
Meta Page
Pgno: 7
Misc...
offset: 4000
offset: 3000
2,xyz
1,blah
Data Page
Pgno: 8
Misc...
Root : 7
Meta Page
59. 57
Btree Operation
Append-Only disk usage is very inefficient
● Disk space usage grows without bound
● 99+% of the space will be occupied by old versions of
the data
● The old versions are usually not interesting
● Reclaiming the old space requires a very expensive
compaction phase
● New updates must be throttled until compaction
completes
60. 58
Btree Operation
The LMDB Approach
● Still Copy-on-Write, but using two fixed root nodes
– Page 0 and Page 1 of the file, used in double-buffer fashion
– Even faster cold-start than Append-Only, no searching
needed to find the last valid root node
– Any app always reads both pages and uses the one with
the greater Transaction ID stamp in its header
– Consequently, only 2 outstanding versions of the DB exist,
not fully "multi-version"
67. 65
Btree Operation
32
After this step the old yellow page is no longer referenced by
anything else in the database, so it can also be reclaimed
68. 66
Free Space Management
LMDB maintains two B+trees per root node
● One storing the user data, as illustrated above
● One storing lists of IDs of pages that have been freed
in a given transaction
● Old, freed pages are used in preference to new
pages, so the DB file size remains relatively static
over time
● No compaction or garbage collection phase is ever
needed
69. 67
Free Space Management
Pgno: 0
Misc...
TXN: 0
FRoot: EMPTY
DRoot: EMPTY
Meta Page
Pgno: 1
Misc...
TXN: 0
FRoot: EMPTY
DRoot: EMPTY
Meta Page
70. 68
Free Space Management
Pgno: 0
Misc...
TXN: 0
FRoot: EMPTY
DRoot: EMPTY
Meta Page
Pgno: 1
Misc...
TXN: 0
FRoot: EMPTY
DRoot: EMPTY
Meta Page
Pgno: 2
Misc...
offset: 4000
1,foo
Data Page
71. 69
Free Space Management
Pgno: 0
Misc...
TXN: 0
FRoot: EMPTY
DRoot: EMPTY
Meta Page
Pgno: 1
Misc...
TXN: 1
FRoot: EMPTY
DRoot: 2
Meta Page
Pgno: 2
Misc...
offset: 4000
1,foo
Data Page
72. 70
Free Space Management
Pgno: 0
Misc...
TXN: 0
FRoot: EMPTY
DRoot: EMPTY
Meta Page
Pgno: 1
Misc...
TXN: 1
FRoot: EMPTY
DRoot: 2
Meta Page
Pgno: 2
Misc...
offset: 4000
1,foo
Data Page
Pgno: 3
Misc...
offset: 4000
offset: 3000
2,bar
1,foo
Data Page
73. 71
Free Space Management
Pgno: 0
Misc...
TXN: 0
FRoot: EMPTY
DRoot: EMPTY
Meta Page
Pgno: 1
Misc...
TXN: 1
FRoot: EMPTY
DRoot: 2
Meta Page
Pgno: 2
Misc...
offset: 4000
1,foo
Data Page
Pgno: 3
Misc...
offset: 4000
offset: 3000
2,bar
1,foo
Data Page
Pgno: 4
Misc...
offset: 4000
txn 2,page 2
Data Page
74. 72
Free Space Management
Pgno: 0
Misc...
TXN: 2
FRoot: 4
DRoot: 3
Meta Page
Pgno: 1
Misc...
TXN: 1
FRoot: EMPTY
DRoot: 2
Meta Page
Pgno: 2
Misc...
offset: 4000
1,foo
Data Page
Pgno: 3
Misc...
offset: 4000
offset: 3000
2,bar
1,foo
Data Page
Pgno: 4
Misc...
offset: 4000
txn 2,page 2
Data Page
75. 73
Free Space Management
Pgno: 0
Misc...
TXN: 2
FRoot: 4
DRoot: 3
Meta Page
Pgno: 5
Misc...
offset: 4000
offset: 3000
2,bar
1,blah
Data Page
Pgno: 1
Misc...
TXN: 1
FRoot: EMPTY
DRoot: 2
Meta Page
Pgno: 2
Misc...
offset: 4000
1,foo
Data Page
Pgno: 3
Misc...
offset: 4000
offset: 3000
2,bar
1,foo
Data Page
Pgno: 4
Misc...
offset: 4000
txn 2,page 2
Data Page
81. 79
Free Space Management
● Caveat: If a read transaction is open on a
particular version of the DB, that version and
every version after it are excluded from page
reclaiming.
● Thus, long-lived read transactions should be
avoided, otherwise the DB file size may grow
rapidly, devolving into Append-Only behavior
until the transactions are closed
82. 80
Transaction Handling
● LMDB supports a single writer concurrent with many readers
– A single mutex serializes all write transactions
– The mutex is shared/multiprocess
● Readers run lockless and never block
– But for page reclamation purposes, readers are tracked
● Transactions are stamped with an ID which is a monotonically
increasing integer
– The ID is only incremented for Write transactions that actually modify
data
– If a Write transaction is aborted, or committed with no changes, the
same ID will be reused for the next Write transaction
83. 81
Transaction Handling
● Transactions take a snapshot of the currently valid
meta page at the beginning of the transaction
● No matter what write transactions follow, a read
transaction's snapshot will always point to a valid
version of the DB
● The snapshot is totally isolated from subsequent
writes
● This provides the Consistency and Isolation in ACID
semantics
84. 82
Transaction Handling
● The currently valid meta page is chosen based
on the greatest transaction ID in each meta
page
– The meta pages are page and CPU cache aligned
– The transaction ID is a single machine word
– The update of the transaction ID is atomic
– Thus, the Atomicity semantics of transactions are
guaranteed
85. 83
Transaction Handling
● During Commit, the data pages are written and
then synchronously flushed before the meta page
is updated
– Then the meta page is written synchronously
– Thus, when a commit returns "success", it is
guaranteed that the transaction has been written intact
– This provides the Durability semantics
– If the system crashes before the meta page is updated,
then the data updates are irrelevant
86. 84
Transaction Handling
● For tracking purposes, Readers must acquire a slot in the
readers table
– The readers table is also in a shared memory map, but separate
from the main data map
– This is a simple array recording the Process ID, Thread ID, and
Transaction ID of the reader
– The array elements are CPU cache aligned
– The first time a thread opens a read transaction, it must acquire
a mutex to reserve a slot in the table
– The slot ID is stored in Thread Local Storage; subsequent read
transactions performed by the thread need no further locks
87. 85
Transaction Handling
● Write transactions use pages from the free list before
allocating new disk pages
– Pages in the free list are used in order, oldest transaction first
– The readers table must be scanned to see if any reader is
referencing an old transaction
– The writer doesn't need to lock the reader table when
performing this scan - readers never block writers
● The only consequence of scanning with no locks is that the writer
may see stale data
● This is irrelevant, newer readers are of no concern; only the oldest
readers matter
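The lockless reuse test can be modelled as below. This is a simplified sketch with invented names, not LMDB's actual free-list code: a page freed in transaction F is treated as reusable only once F is older than every open reader's snapshot, and the writer's unlocked scan of the readers table is safe because a stale value can only delay reuse, never permit it too early.

```c
/* Toy model of page reclamation gated by the oldest open reader. */
#include <assert.h>
#include <stdint.h>

#define MAX_READERS 8

/* Reader slots, as in LMDB's readers table; 0 means the slot is empty. */
static uint64_t reader_txn[MAX_READERS];

/* Oldest transaction any reader still has open (UINT64_MAX if none).
   Scanning without locks is fine: a stale entry only delays reuse. */
static uint64_t oldest_reader(void)
{
    uint64_t oldest = UINT64_MAX;
    for (int i = 0; i < MAX_READERS; i++)
        if (reader_txn[i] && reader_txn[i] < oldest)
            oldest = reader_txn[i];
    return oldest;
}

/* A page freed in txn F is safe to reuse once every open reader's
   snapshot is newer than F (conservative form of LMDB's rule). */
static int can_reuse(uint64_t freed_in_txn)
{
    return freed_in_txn < oldest_reader();
}
```

This also makes the earlier caveat concrete: one long-lived reader pins `oldest_reader()` low, so nothing newer can be reclaimed and the file grows append-only until that reader closes.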
88. 86
(5) Special Features
● Reserve Mode
– Allocates space in write buffer for data of user-
specified size, returns address
– Useful for data that is generated dynamically
instead of statically copied
– Allows generated data to be written directly to DB,
avoiding unnecessary memcpy
89. 87
Special Features
● Fixed Mapping
– Uses a fixed address for the memory map
– Allows complex pointer-based data structures to be
stored directly with minimal serialization
– Objects using persistent addresses can thus be
read back and used directly, with no deserialization
90. 88
Special Features
● Sub-Databases
– Store multiple independent named B+trees in a single
LMDB environment
– A Sub-DB is simply a key/data pair in the main DB,
where the data item is the root node of another tree
– Allows many related databases to be managed easily
● Transactions may span all of the Sub-DBs
● Used in back-mdb for the main data and all of the indices
● Used in SQLightning for multiple tables and indices
91. 89
Special Features
● Sorted Duplicates
– Allows multiple data values for a single key
– Values are stored in sorted order, with customizable
comparison functions
– When the data values are all of a fixed size, the values are
stored contiguously, with no extra headers
● maximizes storage efficiency and performance
– Implemented by the same code as SubDB support
● maximum coding efficiency
– Can be used to efficiently implement inverted indices and sets
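The fixed-size duplicate layout can be sketched as a packed, sorted array: values for one key sit contiguously with no per-value header, so a set-membership test is just a binary search. This is an illustrative model with invented names, not LMDB's page code, and it uses a fixed 4-byte value size for the demo.

```c
/* Toy model of fixed-size sorted duplicates stored contiguously. */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define VAL_SIZE 4      /* fixed value size, headerless packing */
#define MAX_VALS 32

static char vals[MAX_VALS][VAL_SIZE];
static int  nvals = 0;

static int cmp_val(const void *a, const void *b)
{
    return memcmp(a, b, VAL_SIZE);
}

/* Insert keeping sorted order: shift the tail up, then copy in place
   (analogous to rewriting a page on insert). */
static void dup_insert(const char *v)
{
    int i = 0;
    while (i < nvals && memcmp(vals[i], v, VAL_SIZE) < 0) i++;
    memmove(vals[i + 1], vals[i], (size_t)(nvals - i) * VAL_SIZE);
    memcpy(vals[i], v, VAL_SIZE);
    nvals++;
}

/* Set membership: binary search over the packed, sorted values. */
static int dup_contains(const char *v)
{
    return bsearch(v, vals, (size_t)nvals, VAL_SIZE, cmp_val) != NULL;
}
```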
92. 90
Special Features
● Atomic Hot Backup
– The entire database can be backed up live
– No need to stop updates while backups run
– The backup runs at the maximum speed of the
target storage medium
– Essentially: write(outfd, map, mapsize);
● No memcpy's in or out of user space
● Pure DMA from the database to the backup
93. 91
(6) Results
● Support for LMDB is available in scores of open source
projects and all major Linux and BSD distros
● In OpenLDAP slapd
– LMDB reads are 5-20x faster than BDB
– Writes are 2-5x faster than BDB
– Consumes 1/4 as much RAM as BDB
● In MemcacheDB
– LMDB reads are 2-200x faster than BDB
– Writes are 5-900x faster than BDB
– Multi-thread reads are 2-8x faster than pure-memory Memcached
94. 92
Results
● LMDB has been tested exhaustively by multiple
parties
– Symas has tested on all major filesystems: btrfs, ext2,
ext3, ext4, jfs, ntfs, reiserfs, xfs, zfs
– ext3, ext4, jfs, reiserfs, xfs also tested with external
journalling
– Testing on physical servers, VMs, HDDs, SSDs, PCIe NVM
– Testing crash reliability as well as performance and
efficiency - LMDB is proven corruption-proof in real world
conditions
98. 96
Results
● Microbenchmarks
– On-disk, 1.6Billion records, 16 byte keys, 96 byte
values, 160GB on disk with 32GB RAM, VM
[Chart: Load Time (User / Sys / Wall), smaller is better]
109. 107
Results
● HyperDex/YCSB
[Chart: Random Update/Read latency in msec, log scale; Min / Avg / 95% / 99% / Max for LMDB update, LMDB read, LevelDB update, LevelDB read]
110. 108
Results
● An Interview with Armory Technologies CEO Alan Reiner
– JMC: For more normal users, who have been frustrated with long load times. In my testing of the latest beta build, using bitcoin 0.10 and the new headers-first format, I've seen you optimise the load time from 3 days to less than 2 hours now. Well done! Can you talk us through how you did this?
– AR: It really comes down to the new database engine (LMDB instead of LevelDB) and really hard [work] by some of our developers to reshape the architecture and the optimizations of the databases
● http://bitcoinsinireland.com/an-interview-with-armory-technologies-ceo-alan-reiner/
111. 109
Results
● LDAP Benchmarks - compared to:
– OpenLDAP 2.4 back-mdb and -hdb
– OpenLDAP 2.4 back-mdb on Windows 2012 x64
– OpenDJ 2.4.6, 389DS, ApacheDS 2.0.0-M13
– Latest proprietary servers from CA, Microsoft,
Novell, and Oracle
– Test on a VM with 32GB RAM, 10M entries
112. 110
Results
● LDAP Benchmarks
[Chart: LDAP Performance in ops/second (Search, Mixed Search, Modify, Mixed Mod; scale 0-40000) for OL mdb 2.5, OL mdb, OL hdb, OL mdb W64, OpenDJ, 389DS, Other #1-#4, AD LDS 2012, ApacheDS]
113. 111
Results
● Full benchmark reports are available on the
LMDB page
– http://www.symas.com/mdb/
● Supported builds of LMDB-based packages are
available from Symas
– http://www.symas.com/
– OpenLDAP, Cyrus-SASL, Heimdal Kerberos
114. 112
Conclusions
● The combination of memory-mapped operation with MVCC
is extremely potent
– Reduced administrative overhead
● no periodic cleanup / maintenance required
● no particular tuning required
– Reduced developer overhead
● code size and complexity drastically reduced
– Enhanced efficiency
● minimal CPU and I/O use
– allows for longer battery life on mobile devices
– allows for lower electricity/cooling costs in data centers
– allows more work to be done with less hardware