This document provides an overview of PHP memory usage and management. It introduces key concepts like the Zend Memory Manager (ZendMM), which handles memory allocation and freeing for each PHP request. The document demonstrates how to monitor memory usage from PHP using functions like memory_get_usage() and from the OS perspective using /proc. It also discusses potential memory issues like references and circular references, and how to track reference counts. The goal is to help understand and optimize PHP memory consumption.
Adapting and adopting SQL Plan Management (SPM) to achieve execution plan stability for sub-second queries on a high-rate OLTP mission-critical application
Talk by Brendan Gregg for USENIX LISA 2019: Linux Systems Performance. Abstract: "
Systems performance is an effective discipline for performance analysis and tuning, and can help you find performance wins for your applications and the kernel. However, most of us are not performance or kernel engineers, and have limited time to study this topic. This talk summarizes the topic for everyone, touring six important areas of Linux systems performance: observability tools, methodologies, benchmarking, profiling, tracing, and tuning. Included are recipes for Linux performance analysis and tuning (using vmstat, mpstat, iostat, etc), overviews of complex areas including profiling (perf_events) and tracing (Ftrace, bcc/BPF, and bpftrace/BPF), and much advice about what is and isn't important to learn. This talk is aimed at everyone: developers, operations, sysadmins, etc, and in any environment running Linux, bare metal or the cloud."
Free and useful tools have proliferated since the launch of the CodePlex and SourceForge websites. Join Kevin Kline, long-time author of the SQL Server Magazine column "Tool Time", as he profiles the very best of the free tools covered in his monthly column - dozens of free tools and utilities! Some of the cover tools help to:
- Track database growth
- Implement logging in SSIS job steps
- Stress test your database applications
- Automate important preventative maintenance tasks
- Automate maintenance tasks for Analysis Services
- Help protect against SQL Injection attacks
- Graphically manage Extended Events
- Utilize PowerShell scripts to ease administration
And much more. These tools are all free and independently supported by SQL Server enthusiasts around the world.
End-to-end Troubleshooting Checklist for Microsoft SQL ServerKevin Kline
Learning how to detect, diagnose and resolve performance problems in SQL Server is tough. Often, years are spent learning how to use the tools and techniques that help you detect when a problem is occurring, diagnose the root-cause of the problem, and then resolve the problem.
In this session, attendees will see demonstrations of the tools and techniques which make difficult troubleshooting scenarios much faster and easier, including:
• XEvents, Profiler/Traces, and PerfMon
• Using Dynamic Management Views (DMVs)
• Advanced Diagnostics Using Wait Stats
• Reading SQL Server execution plan
Every DBA needs to know how to keep their SQL Server in tip-top condition, and you’ll need skills the covered in this session to do it.
In Java 9, Garbage First Garbage Collector (G1 GC) will be the default GC. This presentation makes an effort to help Hotspot VM users to understand the concept of G1 GC as well as provides some tuning advice.
MySQL Parallel Replication: All the 5.7 and 8.0 Details (LOGICAL_CLOCK)Jean-François Gagné
To get better replication speed and less lag, MySQL implements parallel replication in the same schema, also known as LOGICAL_CLOCK. But fully benefiting from this feature is not as simple as just enabling it.
In this talk, I explain in detail how this feature works. I also cover how to optimize parallel replication and the improvements made in MySQL 8.0 and back-ported in 5.7 (Write Sets), greatly improving the potential for parallel execution on replicas (but needing RBR).
Come to this talk to get all the details about MySQL 5.7 and 8.0 Parallel Replication.
Adapting and adopting SQL Plan Management (SPM) to achieve execution plan stability for sub-second queries on a high-rate OLTP mission-critical application
Talk by Brendan Gregg for USENIX LISA 2019: Linux Systems Performance. Abstract: "
Systems performance is an effective discipline for performance analysis and tuning, and can help you find performance wins for your applications and the kernel. However, most of us are not performance or kernel engineers, and have limited time to study this topic. This talk summarizes the topic for everyone, touring six important areas of Linux systems performance: observability tools, methodologies, benchmarking, profiling, tracing, and tuning. Included are recipes for Linux performance analysis and tuning (using vmstat, mpstat, iostat, etc), overviews of complex areas including profiling (perf_events) and tracing (Ftrace, bcc/BPF, and bpftrace/BPF), and much advice about what is and isn't important to learn. This talk is aimed at everyone: developers, operations, sysadmins, etc, and in any environment running Linux, bare metal or the cloud."
Free and useful tools have proliferated since the launch of the CodePlex and SourceForge websites. Join Kevin Kline, long-time author of the SQL Server Magazine column "Tool Time", as he profiles the very best of the free tools covered in his monthly column - dozens of free tools and utilities! Some of the cover tools help to:
- Track database growth
- Implement logging in SSIS job steps
- Stress test your database applications
- Automate important preventative maintenance tasks
- Automate maintenance tasks for Analysis Services
- Help protect against SQL Injection attacks
- Graphically manage Extended Events
- Utilize PowerShell scripts to ease administration
And much more. These tools are all free and independently supported by SQL Server enthusiasts around the world.
End-to-end Troubleshooting Checklist for Microsoft SQL ServerKevin Kline
Learning how to detect, diagnose and resolve performance problems in SQL Server is tough. Often, years are spent learning how to use the tools and techniques that help you detect when a problem is occurring, diagnose the root-cause of the problem, and then resolve the problem.
In this session, attendees will see demonstrations of the tools and techniques which make difficult troubleshooting scenarios much faster and easier, including:
• XEvents, Profiler/Traces, and PerfMon
• Using Dynamic Management Views (DMVs)
• Advanced Diagnostics Using Wait Stats
• Reading SQL Server execution plan
Every DBA needs to know how to keep their SQL Server in tip-top condition, and you’ll need skills the covered in this session to do it.
In Java 9, Garbage First Garbage Collector (G1 GC) will be the default GC. This presentation makes an effort to help Hotspot VM users to understand the concept of G1 GC as well as provides some tuning advice.
MySQL Parallel Replication: All the 5.7 and 8.0 Details (LOGICAL_CLOCK)Jean-François Gagné
To get better replication speed and less lag, MySQL implements parallel replication in the same schema, also known as LOGICAL_CLOCK. But fully benefiting from this feature is not as simple as just enabling it.
In this talk, I explain in detail how this feature works. I also cover how to optimize parallel replication and the improvements made in MySQL 8.0 and back-ported in 5.7 (Write Sets), greatly improving the potential for parallel execution on replicas (but needing RBR).
Come to this talk to get all the details about MySQL 5.7 and 8.0 Parallel Replication.
MySQL/MariaDB replication is asynchronous. You can make replication faster by using better hardware (faster CPU, more RAM, or quicker disks), or you can use parallel replication to remove it single-threaded limitation; but lag can still happen. This talk is not about making replication faster, it is how to deal with its asynchronous nature, including the (in-)famous lag.
We will start by explaining the consequences of asynchronous replication and how/when lag can happen. Then, we will present the solution used at Booking.com to avoid both creating lag and minimize the consequence of stale reads on slaves (hint: this solution does not mean reading from the master because this does not scale).
Once all above is well understood, we will discuss how Booking.com’s solution can be improved: this solution was designed years ago and we would do this differently if starting from scratch today. Finally, I will present an innovative way to avoid lag: the no-slave-left-behind MariaDB patch.
Note: When you view the the slide deck via web browser, the screenshots may be blurred. You can download and view them offline (Screenshots are clear).
USENIX ATC 2017: Visualizing Performance with Flame GraphsBrendan Gregg
Talk by Brendan Gregg for USENIX ATC 2017.
"Flame graphs are a simple stack trace visualization that helps answer an everyday problem: how is software consuming resources, especially CPUs, and how did this change since the last software version? Flame graphs have been adopted by many languages, products, and companies, including Netflix, and have become a standard tool for performance analysis. They were published in "The Flame Graph" article in the June 2016 issue of Communications of the ACM, by their creator, Brendan Gregg.
This talk describes the background for this work, and the challenges encountered when profiling stack traces and resolving symbols for different languages, including for just-in-time compiler runtimes. Instructions will be included generating mixed-mode flame graphs on Linux, and examples from our use at Netflix with Java. Advanced flame graph types will be described, including differential, off-CPU, chain graphs, memory, and TCP events. Finally, future work and unsolved problems in this area will be discussed."
There exist some valid reasons to rebuild indexes on an Oracle database (not many). This presentation is about some of those reasons and how to automate such online index rebuild.
We will show the advantages of having a geo-distributed database cluster and how to create one using Galera Cluster for MySQL. We will also discuss the configuration and status variables that are involved and how to deal with typical situations on the WAN such as slow, untrusted or unreliable links, latency and packet loss. We will demonstrate a multi-region cluster on Amazon EC2 and perform some throughput and latency measurements in real-time (video http://galeracluster.com/videos/using-galera-replication-to-create-geo-distributed-clusters-on-the-wan-webinar-video-3/)
My talk for "MySQL, MariaDB and Friends" devroom at Fosdem on February 2, 2019
Born in 2010 in MySQL 5.5.3 as "a feature for monitoring server execution at a low level," grown in 5.6 times with performance fixes and DBA-faced features, in MySQL 5.7 Performance Schema is a mature tool, used by humans and more and more monitoring products. It becomes more popular over the years. In this talk I will give an overview of Performance Schema, focusing on its tuning, performance, and usability.
Performance Schema helps to troubleshoot query performance, complicated locking issues, memory leaks, resource usage, problematic behavior, caused by inappropriate settings and much more. It comes with hundreds of options which allow precisely tune what to instrument. More than 100 consumers store collected data.
Performance Schema is a potent tool. And very complicated at the same time. It does not affect performance in most cases and can slow down server dramatically if configured without care. It collects a lot of data, and sometimes this data is hard to read.
This talk will start from the introduction of how Performance Schema designed, and you will understand why it slowdowns server in some cases and does not affect your queries in others. Then we will discuss which information you can retrieve from Performance Schema and how to do it effectively.
I will cover its companion sys schema and graphical monitoring tools.
Parallel search or multithreaded search is a way to increase search speed by using additional processors.
Parallel search algorithms are classified by their scalability (that means the behavior of the algorithm as the number of processors becomes large) and their speed up.
Lei Shi & Mei Wang, Qihoo 360
Virtualization is one of the most complicated software in the world. The VMware workstation is very popular in many fields. The windows 10 has a lot of mitigation technology to get avoid of exploitation. It's a great challenge to make a vm escape in VMware workstation under Win 10. Especially when the guest and host are both win 10 and the guest user are NO-ADMIN. This talk will present how to make a vm escape and execute arbitrary code in the host from a NO-ADMIN guest user under Win 10(both the guest and host are Win 10). They have developed three different exploitation. This talk will introduce them and show a very elegant exploitation technology of vm escape. Besides the vm escape technology, this talk will also show the exploitation technology in Win 10. It is quite attractive because there's a process continuation, saying that the guest can execute the exploitation without crashing/disturbing the host process(VMware workstation virtual machine process). The exploitation is very reliable, it reaches nearly 100% successful rate.
PHP may seem to be a very easy language but many of don't know how PHP works. We will discuss the less known facts about PHP and we will also cover some common type of software design patterns used with PHP
MySQL/MariaDB replication is asynchronous. You can make replication faster by using better hardware (faster CPU, more RAM, or quicker disks), or you can use parallel replication to remove it single-threaded limitation; but lag can still happen. This talk is not about making replication faster, it is how to deal with its asynchronous nature, including the (in-)famous lag.
We will start by explaining the consequences of asynchronous replication and how/when lag can happen. Then, we will present the solution used at Booking.com to avoid both creating lag and minimize the consequence of stale reads on slaves (hint: this solution does not mean reading from the master because this does not scale).
Once all above is well understood, we will discuss how Booking.com’s solution can be improved: this solution was designed years ago and we would do this differently if starting from scratch today. Finally, I will present an innovative way to avoid lag: the no-slave-left-behind MariaDB patch.
Note: When you view the the slide deck via web browser, the screenshots may be blurred. You can download and view them offline (Screenshots are clear).
USENIX ATC 2017: Visualizing Performance with Flame GraphsBrendan Gregg
Talk by Brendan Gregg for USENIX ATC 2017.
"Flame graphs are a simple stack trace visualization that helps answer an everyday problem: how is software consuming resources, especially CPUs, and how did this change since the last software version? Flame graphs have been adopted by many languages, products, and companies, including Netflix, and have become a standard tool for performance analysis. They were published in "The Flame Graph" article in the June 2016 issue of Communications of the ACM, by their creator, Brendan Gregg.
This talk describes the background for this work, and the challenges encountered when profiling stack traces and resolving symbols for different languages, including for just-in-time compiler runtimes. Instructions will be included generating mixed-mode flame graphs on Linux, and examples from our use at Netflix with Java. Advanced flame graph types will be described, including differential, off-CPU, chain graphs, memory, and TCP events. Finally, future work and unsolved problems in this area will be discussed."
There exist some valid reasons to rebuild indexes on an Oracle database (not many). This presentation is about some of those reasons and how to automate such online index rebuild.
We will show the advantages of having a geo-distributed database cluster and how to create one using Galera Cluster for MySQL. We will also discuss the configuration and status variables that are involved and how to deal with typical situations on the WAN such as slow, untrusted or unreliable links, latency and packet loss. We will demonstrate a multi-region cluster on Amazon EC2 and perform some throughput and latency measurements in real-time (video http://galeracluster.com/videos/using-galera-replication-to-create-geo-distributed-clusters-on-the-wan-webinar-video-3/)
My talk for "MySQL, MariaDB and Friends" devroom at Fosdem on February 2, 2019
Born in 2010 in MySQL 5.5.3 as "a feature for monitoring server execution at a low level," grown in 5.6 times with performance fixes and DBA-faced features, in MySQL 5.7 Performance Schema is a mature tool, used by humans and more and more monitoring products. It becomes more popular over the years. In this talk I will give an overview of Performance Schema, focusing on its tuning, performance, and usability.
Performance Schema helps to troubleshoot query performance, complicated locking issues, memory leaks, resource usage, problematic behavior, caused by inappropriate settings and much more. It comes with hundreds of options which allow precisely tune what to instrument. More than 100 consumers store collected data.
Performance Schema is a potent tool. And very complicated at the same time. It does not affect performance in most cases and can slow down server dramatically if configured without care. It collects a lot of data, and sometimes this data is hard to read.
This talk will start from the introduction of how Performance Schema designed, and you will understand why it slowdowns server in some cases and does not affect your queries in others. Then we will discuss which information you can retrieve from Performance Schema and how to do it effectively.
I will cover its companion sys schema and graphical monitoring tools.
Parallel search or multithreaded search is a way to increase search speed by using additional processors.
Parallel search algorithms are classified by their scalability (that means the behavior of the algorithm as the number of processors becomes large) and their speed up.
Lei Shi & Mei Wang, Qihoo 360
Virtualization is one of the most complicated software in the world. The VMware workstation is very popular in many fields. The windows 10 has a lot of mitigation technology to get avoid of exploitation. It's a great challenge to make a vm escape in VMware workstation under Win 10. Especially when the guest and host are both win 10 and the guest user are NO-ADMIN. This talk will present how to make a vm escape and execute arbitrary code in the host from a NO-ADMIN guest user under Win 10(both the guest and host are Win 10). They have developed three different exploitation. This talk will introduce them and show a very elegant exploitation technology of vm escape. Besides the vm escape technology, this talk will also show the exploitation technology in Win 10. It is quite attractive because there's a process continuation, saying that the guest can execute the exploitation without crashing/disturbing the host process(VMware workstation virtual machine process). The exploitation is very reliable, it reaches nearly 100% successful rate.
PHP may seem to be a very easy language but many of don't know how PHP works. We will discuss the less known facts about PHP and we will also cover some common type of software design patterns used with PHP
PHP is one of the most popular open source programming languages in the world. It powers some of the highest traffic sites in the world, and at the same time it powers some of the lowest traffic sites in the world. But have you ever wondered how it works under the hood? Have you been overwelmed by the thought of looking at the C code that runs PHP? Well, this talk is for you!
We're going to explore how PHP works under the hood, by looking at a PHP implementation of it: PHPPHP! Have you ever wondered what an OPCODE Cache is really doing? Have you ever wondered what a T_PAAMAYIM_NEKUDOTAYIM is? Have you ever wondered why an interpreted languages has a compiler? We'll explore all of these topics, and more! And the best part of it all? You don't need to know C to understand the details! Using PHPPHP, we can explore the language details in a high level language, where things like memory management don't get in the way of the real content. If you've ever wanted to know how PHP works, this is the talk for you!
ارائه «گزارشی از برنامهنویسی موازی در پیاچپی» در اولین همایش پیاچپی ایران
برای کسب اطلاعات بیشتر و یا سوالی در زمینه این اسلاید میتوانید به آدرس زیر مراجعه کنید.
http://goo.gl/C9pYi8
ارائه «استفاده از شبکه عصبی مصنوعی» در همایش کدرکانف - مرداد ۱۳۹۵
در این ارائه ابتدا در زمینه یادگیری ماشین توضیحاتی ارائه داده شد و در ادامه راجعبه شیوه کارکرد الگوریتم صحبت شد. در این ارائه سعی شد بدون توجه به زبان برنامهنویسی خاص تنها مقدمات استفاده از این الگوریتم شرح داده شود
در صورت علاقه به این مبحث میتوانید مقاله منتشر شده در همایش را از آدرس زیر دریافت کنید.
(آدرس به زودی معرفی میشود)
A talk I gave at WordCamp Sofa 2016 on measuring and optimizing memory usage, dealing with memory related errors, as well as monitoring server memory health.
Linux memory consumption - Why memory utilities show a little amount of free RAM? How does Linux kernel utilizes free RAM? What is the real amount of free RAM in the system?
Developing High Performance and Scalable ColdFusion Applications Using Terrac...Shailendra Prasad
1. How to scale – options (pros and cons)
2. Caching basics (various options available)
3. Recent updates of Open source Ehcache project.
4. Scaling your existing application with Ehcache, Terracotta OSS
5. Advance caching techniques for scaling using Terracotta BigMemory
6. Customer use cases where caching was mission critical
Let's look at ways how to make PHP applications remember what your user did on the last web page.
The first half will be about the classic PHP sessions, what the SessionHandlerInterface is about and why all of it's methods are important for a successful alternative implementation.
The second half will explore alternatives to the problems created using PHP sessions, including alternatives to sessions: Splitting the problem on the server or moving the data to the client.
Built-in query caching for all PHP MySQL extensions/APIsUlf Wendel
Query caching boosts the performance of PHP MySQL applications. Caching can be done on the database server or at the web clients. A new mysqlnd plugin adds query caching to all PHP MySQL extension: written in C, immediately usable with any PHP application because of no API changes, supports Memcache, APC, SQLite and main memory storage, integrates itself smoothless into existing PHP deployment infrastructure, helps you to scale by client, ... Enjoy!
Caching and tuning fun for high scalabilityWim Godden
Caching has been a 'hot' topic for a few years. But caching takes more than merely taking data and putting it in a cache : the right caching techniques can improve performance and reduce load significantly. But we'll also look at some major pitfalls, showing that caching the wrong way can bring down your site.
If you're looking for a clear explanation about various caching techniques and tools like Memcached, Nginx and Varnish, as well as ways to deploy them in an efficient way, this talk is for you.
ECECS 472572 Final Exam ProjectRemember to check the errata EvonCanales257
ECE/CS 472/572 Final Exam Project
Remember to check the errata section (at the very bottom of the page) for updates.
Your submission should be comprised of two items: a .pdf file containing your written report and a .tar file containing a directory structure with your C or C++ source code. Your grade will be reduced if you do not follow the submission instructions.
All written reports (for both 472 and 572 students) must be composed in MS Word, LaTeX, or some other word processor and submitted as a PDF file.
Please take the time to read this entire document. If you have questions there is a high likelihood that another section of the document provides answers.
Introduction
In this final project you will implement a cache simulator. Your simulator will be configurable and will be able to handle caches with varying capacities, block sizes, levels of associativity, replacement policies, and write policies. The simulator will operate on trace files that indicate memory access properties. All input files to your simulator will follow a specific structure so that you can parse the contents and use the information to set the properties of your simulator.
After execution is finished, your simulator will generate an output file containing information on the number of cache misses, hits, and miss evictions (i.e. the number of block replacements). In addition, the file will also record the total number of (simulated) clock cycles used during the situation. Lastly, the file will indicate how many read and write operations were requested by the CPU.
It is important to note that your simulator is required to make several significant assumptions for the sake of simplicity.
1. You do not have to simulate the actual data contents. We simply pretend that we copied data from main memory and keep track of the hypothetical time that would have elapsed.
2. Accessing a sub-portion of a cache block takes the exact same time as it would require to access the entire block. Imagine that you are working with a cache that uses a 32 byte block size and has an access time of 15 clock cycles. Reading a 32 byte block from this cache will require 15 clock cycles. However, the same amount of time is required to read 1 byte from the cache.
3. In this project assume that main memory RAM is always accessed in units of 8 bytes (i.e. 64 bits at a time).
When accessing main memory, it's expensive to access the first unit. However, DDR memory typically includes buffering which means that the RAM can provide access to the successive memory (in 8 byte chunks) with minimal overhead. In this project we assume an overhead of 1 additional clock cycle per contiguous unit.
For example, suppose that it costs 255 clock cycles to access the first unit from main memory. Based on our assumption, it would only cost 257 clock cycles to access 24 bytes of memory.
4. Assume that all caches utilize a "fetch-on-write" scheme if a miss occurs on a Store operation. This means that you must always fetch ...
ECECS 472572 Final Exam ProjectRemember to check the errat.docxtidwellveronique
ECE/CS 472/572 Final Exam Project
Remember to check the errata section (at the very bottom of the page) for updates.
Your submission should be comprised of two items:
a
.pdf
file containing your written report and a
.tar
file containing a directory structure with your C or C++ source code. Your grade will be reduced if you do not follow the submission instructions.
All written reports (for both 472 and 572 students) must be composed in MS Word, LaTeX, or some other word processor and submitted as a PDF file.
Please take the time to read this entire document. If you have questions there is a high likelihood that another section of the document provides answers.
Introduction
In this final project you will implement a cache simulator. Your simulator will be configurable and will be able to handle caches with varying capacities, block sizes, levels of associativity, replacement policies, and write policies. The simulator will operate on trace files that indicate memory access properties. All input files to your simulator will follow a specific structure so that you can parse the contents and use the information to set the properties of your simulator.
After execution is finished, your simulator will generate an output file containing information on the number of cache misses, hits, and miss evictions (i.e. the number of block replacements). In addition, the file will also record the total number of (simulated) clock cycles used during the situation. Lastly, the file will indicate how many read and write operations were requested by the CPU.
It is important to note that your simulator is required to make several significant assumptions for the sake of simplicity.
1. You do not have to simulate the actual data contents. We simply pretend that we copied data from main memory and keep track of the hypothetical time that would have elapsed.
2. Accessing a sub-portion of a cache block takes the exact same time as it would require to access the entire block. Imagine that you are working with a cache that uses a 32 byte block size and has an access time of 15 clock cycles. Reading a 32 byte block from this cache will require 15 clock cycles. However, the same amount of time is required to read 1 byte from the cache.
3. In this project assume that main memory RAM is always accessed in units of 8 bytes (i.e. 64 bits at a time).
When accessing main memory, it's expensive to access the first unit. However, DDR memory typically includes buffering which means that the RAM can provide access to the successive memory (in 8 byte chunks) with minimal overhead. In this project we assume an
overhead of 1 additional clock cycle per contiguous unit
.
For example, suppose that it costs 255 clock cycles to access the first unit from main memory. Based on our assumption, it would only cost 257 clock cycles to access 24 bytes of memory.
4. Assume that all caches utilize a "fetch-on-write" scheme if a miss occurs on a Store operation. This means that .
ECECS 472572 Final Exam ProjectRemember to check the err.docxtidwellveronique
ECE/CS 472/572 Final Exam Project
Remember to check the errata section (at the very bottom of the page) for updates.
Your submission should be comprised of two items:
a
.pdf
file containing your written report and a
.tar
file containing a directory structure with your C or C++ source code. Your grade will be reduced if you do not follow the submission instructions.
All written reports (for both 472 and 572 students) must be composed in MS Word, LaTeX, or some other word processor and submitted as a PDF file.
Please take the time to read this entire document. If you have questions there is a high likelihood that another section of the document provides answers.
Introduction
In this final project you will implement a cache simulator. Your simulator will be configurable and will be able to handle caches with varying capacities, block sizes, levels of associativity, replacement policies, and write policies. The simulator will operate on trace files that indicate memory access properties. All input files to your simulator will follow a specific structure so that you can parse the contents and use the information to set the properties of your simulator.
After execution is finished, your simulator will generate an output file containing information on the number of cache misses, hits, and miss evictions (i.e. the number of block replacements). In addition, the file will also record the total number of (simulated) clock cycles used during the situation. Lastly, the file will indicate how many read and write operations were requested by the CPU.
It is important to note that your simulator is required to make several significant assumptions for the sake of simplicity.
1. You do not have to simulate the actual data contents. We simply pretend that we copied data from main memory and keep track of the hypothetical time that would have elapsed.
2. Accessing a sub-portion of a cache block takes the exact same time as it would require to access the entire block. Imagine that you are working with a cache that uses a 32 byte block size and has an access time of 15 clock cycles. Reading a 32 byte block from this cache will require 15 clock cycles. However, the same amount of time is required to read 1 byte from the cache.
3. In this project assume that main memory RAM is always accessed in units of 8 bytes (i.e. 64 bits at a time).
When accessing main memory, it's expensive to access the first unit. However, DDR memory typically includes buffering which means that the RAM can provide access to the successive memory (in 8 byte chunks) with minimal overhead. In this project we assume an
overhead of 1 additional clock cycle per contiguous unit
.
For example, suppose that it costs 255 clock cycles to access the first unit from main memory. Based on our assumption, it would only cost 257 clock cycles to access 24 bytes of memory.
4. Assume that all caches utilize a "fetch-on-write" scheme if a miss occurs on a Store operation. This means that.
Today's high-traffic web sites must implement performance-boosting measures that reduce data processing and reduce load on the database, while increasing the speed of content delivery. One such method is the use of a cache to temporarily store whole pages, database recordsets, large objects, and sessions. While many caching mechanisms exist, memcached provides one of the fastest and easiest-to-use caching servers. Coupling memcached with the alternative PHP cache (APC) can greatly improve performance by reducing data processing time. In this talk, Ben Ramsey covers memcached and the pecl/memcached and pecl/apc extensions for PHP, exploring caching strategies, a variety of configuration options to fine-tune your caching solution, and discusses when it may be appropriate to use memcached vs. APC to cache objects or data.
Epistemic Interaction - tuning interfaces to provide information for AI supportAlan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
GraphRAG is All You need? LLM & Knowledge GraphGuy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfPaige Cruz
Monitoring and observability aren’t traditionally found in software curriculums and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is a part of your current company’s observability stack.
While the dev and ops silo continues to crumble….many organizations still relegate monitoring & observability as the purview of ops, infra and SRE teams. This is a mistake - achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party will share these foundational concepts to build on:
Welcome to the first live UiPath Community Day Dubai! Join us for this unique occasion to meet our local and global UiPath Community and leaders. You will get a full view of the MEA region's automation landscape and the AI Powered automation technology capabilities of UiPath. Also, hosted by our local partners Marc Ellis, you will enjoy a half-day packed with industry insights and automation peers networking.
📕 Curious on our agenda? Wait no more!
10:00 Welcome note - UiPath Community in Dubai
Lovely Sinha, UiPath Community Chapter Leader, UiPath MVPx3, Hyper-automation Consultant, First Abu Dhabi Bank
10:20 A UiPath cross-region MEA overview
Ashraf El Zarka, VP and Managing Director MEA, UiPath
10:35: Customer Success Journey
Deepthi Deepak, Head of Intelligent Automation CoE, First Abu Dhabi Bank
11:15 The UiPath approach to GenAI with our three principles: improve accuracy, supercharge productivity, and automate more
Boris Krumrey, Global VP, Automation Innovation, UiPath
12:15 To discover how Marc Ellis leverages tech-driven solutions in recruitment and managed services.
Brendan Lingam, Director of Sales and Business Development, Marc Ellis
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
Removing Uninteresting Bytes in Software FuzzingAftab Hussain
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly in terms of how long it takes to uncover interesting behaviour in your code. We introduce DIAR, a technique designed to speedup fuzzing campaigns by pinpointing and eliminating those uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined two critical Linux libraries -- Libxml's xmllint, a tool for parsing xml documents, and Binutil's readelf, an essential debugging and security analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format). Our preliminary results show that AFL+DIAR does not only discover new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean and optimized seeds can lead to faster, more comprehensive fuzzing campaigns -- and DIAR helps you find such seeds.
- These are slides of the talk given at IEEE International Conference on Software Testing Verification and Validation Workshop, ICSTW 2022.
Elevating Tactical DDD Patterns Through Object CalisthenicsDorra BARTAGUIZ
After immersing yourself in the blue book and its red counterpart, attending DDD-focused conferences, and applying tactical patterns, you're left with a crucial question: How do I ensure my design is effective? Tactical patterns within Domain-Driven Design (DDD) serve as guiding principles for creating clear and manageable domain models. However, achieving success with these patterns requires additional guidance. Interestingly, we've observed that a set of constraints initially designed for training purposes remarkably aligns with effective pattern implementation, offering a more ‘mechanical’ approach. Let's explore together how Object Calisthenics can elevate the design of your tactical DDD patterns, offering concrete help for those venturing into DDD for the first time!
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
2. Hello everybody
Julien PAULI
Programming with PHP since early 2000s
Programming in C
PHP Internals programmer/reviewer
PHP 5.5 & 5.6 Release Manager
@julienpauli
Tech blog at jpauli.github.io
jpauli@php.net
3. What about you ?
Got some Unix/Linux understandings ?
Have already experienced C programming ?
Have already experienced PHP programming ?
4. What we'll cover together
Memory , what's that ?
bytes, stack, heap, etc.
Measuring a process memory consumption
memory image map analysis
Understanding PHP memory consumption
Zend Memory Manager coming
Measuring PHP memory consumption
from PHP land or from system land
7. Memory in a user process
The Virtual Memory image is divided in segments
text
data
heap
stack
8. Memory usage can grow
Stack will grow as functions will get called
And will narrow as the calls stop and return
Heap will grow as the programmer will decide
Using dynamic allocation functions (malloc, mmap)
Programmer has to free memory by hand
If not : memory leak
10. Linux memory monitoring
'top' or /proc FS
> cat /proc/28754/status
VmPeak: 20452 kB
VmSize: 20324 kB
VmLck: 0 kB
VmPin: 0 kB
VmHWM: 316 kB
VmRSS: 316 kB
VmData: 16440 kB
VmStk: 136 kB
VmExe: 4 kB
VmLib: 1664 kB
VmPTE: 28 kB
VmSwap: 0 kB
Size of the VM map (out of total mem)
Resident Set Size : Size actually in PM
Size of the data segment in VM
Size of the stack segment in VM
Size of the text segment in VM
pid
11. Going even deeper
Let's show the detailed process memory map :
> cat /proc/28754/smaps
shared segment
private mem
shared mem
13. PHP, just a process
PHP is just a process like any other
You then can monitor its memory usage by asking
the OS
<?php
passthru(sprintf('cat /proc/%d/status', getmypid()));
<?php
function heap() {
return shell_exec(sprintf('grep "VmRSS:" /proc/%d/status', getmypid()));
}
15. The PHP model
One same process may treat many requests
If the process leaks memory, you'll suffer from that
Request n+1 must know nothing about request n
Need to "flush" the request-allocated memory
Need to track request-bound memory claims
ZendMM is the layer that does the job
Share-nothing architecture : by design.
17. Why a custom layer ?
ZendMM
Allows monitoring of request-bound heap usage by
basically counting malloc/free calls
Allows the PHP user to limit the heap mem usage
Allows caching of alloced blocks to prevent memory
fragmentation and syscalls
Allows preallocating blocks of well-known-sizes for
PHP internal structures to fit-in in an aligned way
Ease memory leaks debugging in core and exts
18. ZendMM guards and leak tracker
Only works in debug mode PHP
report_memleaks=1 in php.ini
Does NOT replace valgrind
19. Zend MM test script
<?php
ini_set('memory_limit', -1); /* unlimited ZendMM heap */
function heap() {
return shell_exec(sprintf('grep "VmRSS:" /proc/%d/status', getmypid()));
}
echo heap();
$a = range(1, 1024*1024); /* Stress memory by allocating */
echo heap();
unset($a); /* Stress memory by freeing */
echo heap();
20. Zend MM launch test
> time USE_ZEND_ALLOC=1 php zendmm.php
VmRSS: 9640 kB
VmRSS: 159296 kB
VmRSS: 10068 kB
real 0m0.237s
user 0m0.148s
sys 0m0.080s
> time USE_ZEND_ALLOC=0 php zendmm.php
VmRSS: 9608 kB
VmRSS: 148988 kB
VmRSS: 140804 kB
real 0m0.288s
user 0m0.176s
sys 0m0.108s
21. Valgrind tests for Symfony2 command
app/console runs lots of PHP code under SF2
>USE_ZEND_ALLOC=1 valgrind php app/console debug:container
total heap usage: 84,000 allocs, 84,000 frees, 25,966,154 bytes allocated
>USE_ZEND_ALLOC=0 valgrind php app/console debug:container
total heap usage: 813,570 allocs, 813,570 frees, 74,579,732 bytes allocated
23. ZendMM benefits
Better memory management
More memory efficient
Far less malloc/free calls
Less context switches, less Kernel stress
Less CPU usage
Less heap fragmentation / compaction
A PHP ~10% faster with ZendMM enabled
Really depends on use-case
24. Memory manager internals
Layer on top of the heap
Will allocate memory from the heap by chunks of
customizable size (segments)
Will use a customizable low level heap (malloc /
mmap)
25. A quick word on ZendMM internals
ZEND_MM_SEG_SIZE env variable to customize
segment size
Must be power-of-two aligned
Default value is 256Kb
26. ZendMM in PHP user land
memory_limit (INI setting)
memory_get_usage(true)
Returns the size of all the allocated segments
memory_get_usage()
Returns the occupied size in all the allocated
segments
memory_get_[peak]_usage([real])
Returns the max memory that has been
allocated/used. Could have been freed meantime
30. memory_get_usage() ?
memory_get_usage() tells you how much your
allocated blocs consume
They usually don't fill their segment entirely
Thus memory_get_usage(true) shows more
This doesn't count stack
This does only count request-bound memory
This doesn't count linked libraries present in the
process memory map
This doesn't show non-request-bound memory
34. Recommendations / statements
Understand memory_get_usage()
It only shows request-bound allocations, not
persistent allocations (that reside through requests)
PHP extensions may allocate persistent memory
Do NOT activate extensions you will not use
Libraries used by PHP may also allocate persistent
memory
Use your OS to monitor your process memory
accurately
36. Master your PHP mem usage
In PHP land ...
all variable types consume memory
every script asked for compilation will eat memory
This memory will be allocated using ZendMM
The memory for parsed script is freed when the
request ends
The memory for user variable is freed when the
data is not used any more
And here comes the challenge
When isn't the data needed any more ??
37. Compilation eats memory
Compiling a script eats request-bound memory
If you compile a class, that eats lots of memory
You'd better use that class at runtime
Use an autoloader to be sure
<?php
$mem = memory_get_usage();
require __DIR__ . '/../vendor/autoload.php';
require __DIR__ . '/../src/Symfony/Component/DependencyInjection/ContainerBuilder.php'
echo memory_get_usage() - $mem . "n";
php app/bar.php
246,692
39. PHP Variables
What eats memory is what is stored into the zval,
not really the zval itself :
A huge string
Tip : a file_get_contents('huge-file') is a huge string
A complex array or object
Resources don't really consume mem in zval
<?php
/* This consumes sizeof(zval) + 1024 bytes */
$a = str_repeat('a', 1024);
40. PHP Variables
What you want to avoid is have PHP duplicate the
zval
But PHP is kind about that
What you want to happen in PHP freeing the
memory ASAP
Should you know when PHP duplicates or frees zval
, that's the most important !
41. zvals and refcount
PHP simply counts how many symbols (PHP
vars) point to a zval
This is called the refcount
<?php
$a = "foo";
$b = $a;
42. zvals and refcount
PHP uses a CopyOnWrite (COW) system for
zvals
Memory is saved
Memory gets allocated only on changes
<?php
$a = "foo";
$b = $a;
$a = 17;
43. zvals and refcount
PHP frees memory for a zval
when its refcount reaches 0
Yes, unset() just refcount-- ,
that's all
<?php
$a = "foo";
$b = $a;
$c = $b;
$b = "bar";
unset($a);
44. No references needed
You see how smart PHP is with memory ?
It's been designed with that in mind
No references needed to hint PHP !
Don't try to hint PHP with references
References can lead to adverse effects
Force PHP to copy a zval
Prevents PHP from freeing memory of a zval
&
45. Tracking refcount
xdebug_debug_zval()
symfony_zval_info()
namespace Foo;
class A
{
public $var = 'varA';
}
$a = new A();
xdebug_debug_zval('a');
a: (refcount=1, is_ref=0)=class FooA { public $var =
(refcount=2, is_ref=0)='varA'; }
46. Tracking refcount
namespace Foo;
class C
{
public $b;
public function __construct(B $b) {
$this->b = $b;
}
}
$c = new C($b = new B);
xdebug_debug_zval('c');
c: (refcount=1, is_ref=0)=class FooC { public $b = (refcount=2, is_ref=0)=class FooB { } }
namespace Foo;
class B
{
}
unset($b);
xdebug_debug_zval('c');
c: (refcount=1, is_ref=0)=class FooC { }
c: (refcount=1, is_ref=0)=class FooC { public $b = (refcount=1, is_ref=0)=class FooB { } }
unset($c->b);
xdebug_debug_zval('c');
47. Garbage collector ?
As of PHP5.3 , a garbage collector exists
Used to free circular references
And that's all !
PHP already frees itself your vars as their
refcount reaches 0
And it's always been like that
48. Circular references
$a = new A;
$b = new B
(object) 'A'
refcount = 1
$a (object) 'B'
refcount = 1
$b
50. Circular references
Objects are still in memory but no more PHP var
point to them
We can call that a "PHP Userland memory leak"
unset($a, $b);
(object) 'A'
refcount = 1
$b->a (object) 'B'
refcount = 1
$a->b
53. PHP references main line
Using references (&) in PHP can really fool you
They usually force the engine to duplicate memory
containers
Which is bad for performance
Especially when the var container is huge
54. References mismatch
<?php
function foo($data)
{
echo "in function : " . memory_get_usage() . "n";
}
echo "Initial memory : " . memory_get_usage() . "n";
$r = range(1, 1024);
echo "Array created : " . memory_get_usage() . "n";
foo($r);
echo "End of function " . memory_get_usage() . "n";
Initial memory : 227.136
Array created : 374.912
in function : 374.944
End of function 374.944
55. References mismatch
<?php
function foo($data)
{
echo "in function : " . memory_get_usage() . "n";
}
echo "Initial memory : " . memory_get_usage() . "n";
$r = range(1, 1024);
$r2 = &$r;
echo "Array created : " . memory_get_usage() . "n";
foo($r);
echo "End of function " . memory_get_usage() . "n";
Initial memory : 227.208
Array created : 375.096
in function : 473.584
End of function 375.128
56. When does the engine separate ?
In any mismatch case :
zval passed to function arginfo decl. zval received in function separated by engine?
is_ref=0
refcount = 1
pass_by_ref=0 is_ref=0
refcount = 2
NO
is_ref=1
refcount > 1
pass_by_ref=0 is_ref=1
refcount =1
YES
is_ref=0
refcount > 1
pass_by_ref=0 is_ref=0
refcount > 1 ++
NO
is_ref=0
refcount = 1
pass_by_ref=1 is_ref=1
refcount = 2
YES
is_ref=1
refcount > 1
pass_by_ref=1 is_ref=1
refcount > 1 ++
NO
is_ref=0
refcount > 1
pass_by_ref=1 is_ref=1
refcount = 2
YES
57. Symfony_debug to notice mismatches
https://github.com/symfony/debug
function bar($var) { }
$a = "foo";
$b = &$a;
bar($a);
Notice: Separating zval for call to function 'bar' in /tmp/memory.php on line 20
59. Foreach separation behavior
It happens sometimes foreach() duplicates your
variable for iteration
This will eat performances in case of big or
complex arrays being iterated
There is nothing special to say about objects
implementing Traversable
The behavior is then just yours
60. foreach iterating array #1
If the variable has a refcount of 1, no duplication is
performed by foreach()
$a = range(1,1024);
echo memory_get_usage() . "n" ;
foreach ($a as $v) {
if ($v == 1) {
echo memory_get_usage() . "n" ;
}
}
echo memory_get_usage() . "n" ;
373936
374056
374056
61. foreach iterating array #2
If the variable has a refcount >1, foreach() will
duplicate it fully for iteration
$a = range(1,1024);
$b = $a;
echo memory_get_usage() . "n" ;
foreach ($a as $v) {
if ($v == 1) {
echo memory_get_usage() . "n" ;
}
}
echo memory_get_usage() . "n" ;
373936
472512
374056
62. foreach iterating array #3
If the variable is a reference, foreach() will work
onto that array and won't perform duplication
$a = range(1,1024);
$b = &$a;
echo memory_get_usage() . "n" ;
foreach ($a as $v) {
if ($v == 1) {
echo memory_get_usage() . "n" ;
}
}
echo memory_get_usage() . "n" ;
373936
374056
374056
63. Monitoring memory usage
Not much tools exist (for PHP)
memory_get_usage()
OS' help (/proc , pmap , etc...)
Valgrind with massif tool
PHP Extensions
Xdebug
memprof
memtrack
64. Quick intro to memprof
https://github.com/arnaud-lb/php-memory-profiler
$b = range(1, 1024 * 1024); /* a lot of memory */
$b[] = foo();
loader('/Zend/Date.php'); /* a lot of PHP source code */
memprof_dump_callgrind(fopen('/tmp/trace.out', 'w'));