There is increasing interest in understanding and analyzing the use of resources in software and hardware systems. Certifying memory consumption is vital to ensure safety in embedded systems as well as proper administration of their power consumption; understanding the number of messages sent through a network helps detect performance bottlenecks or reduce communication costs; and so on. Assessing resource usage is indeed a cornerstone in a wide variety of software-intensive systems, ranging from embedded to cloud computing. It is well known that inferring, and even checking, quantitative bounds is difficult (in fact undecidable). Memory consumption is a particularly challenging case of resource-usage analysis due to its non-accumulative nature: inferring memory consumption requires not only computing bounds for allocations but also accounting for the memory recovered by the garbage collector. In this talk I will present some of the work our group has been doing to automatically analyze heap memory requirements. In particular, I will show some basic ideas at the core of our techniques and how they were applied to different problems, ranging from inferring sizes of memory regions in real-time Java to analyzing heap memory requirements in Java/.NET. Then I will introduce our new compositional approach, which we use to analyze (infer/verify) Java and .NET programs. Finally, I will explain some limitations of our approach and discuss key challenges and directions for future research.
ByteCode 2012 Talk: Quantitative analysis of Java/.Net like programs to understand heap memory requirements
1. Quantitative analysis of Java/.Net like programs to understand heap memory requirements
Diego Garbervetsky
Departamento de Computación, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires (UBA)
ByteCode 2012
3. Static analysis of heap memory in Java-like programs
• Analysis of memory allocations is very hard
– The problem is undecidable in general
– It is impossible to find an exact expression for the number of allocated objects
• Predicting actual heap memory requirements is even harder
– With garbage collection, memory required <= memory requested/allocated (live objects <= allocated objects)
– It requires analysis of object lifetimes
4. #Live objects != memory required
• Analyzing actual memory consumption also requires understanding the internals of the underlying VM, the memory manager (and potentially the operating system).
• We still believe that analyzing the number of allocations and live objects is the cornerstone.
5. Stack allocation vs. heap allocation
• Heap allocation: associated with data structures created by the application, controlled by the GC
– requires analysis of object lifetimes
• Stack allocation: frame bounding + stack depth
– requires analysis of recursive method calls
• Popeea [CNPQ08] (ISMM08), Albert [ISMM07], and approaches for functional languages
6. Example
public static D[][] init(int n, int m) {
  D[][] matrix = new D[n][m];
  for(int i = 0; i < n; i++)
    for(int j = 0; j < m; j++)
      matrix[i][j] = new D(i);
  return matrix;
}
• n*m objects of type D and an array of n*m references (D[][])
• Ignoring types, we can consider the total allocated as 2(n·m) or n·m+1 objects (depending on how we count arrays)
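A quick sanity check of the two counting conventions (my arithmetic, following the slide's choice of counting an array either as n·m references or as a single object):
  n·m (D objects) + n·m (array slots) = 2·n·m
  n·m (D objects) + 1 (the array itself) = n·m + 1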
7. Verification
public static D[][] init(int n, int m)
  ensures memoryAlloc <= 2*n*m;
{
  D[][] matrix = new D[n][m];
  for(int i = 0; i < n; i++)
    invariant memoryAlloc <= n*m + m*i;
    for(int j = 0; j < m; j++)
      invariant memoryAlloc <= n*m + m*i + j;
      matrix[i][j] = new D(i);
  return matrix;
}
• Non-linear SMT solver + invariants
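A minimal sketch of the proof obligations behind this slide, assuming the standard inductive-invariant rule (the decomposition is mine, not the tool's output):
  Initiation: after the array allocation, memoryAlloc = n·m <= n·m + m·0.
  Preservation (inner loop): each iteration allocates one D, so (n·m + m·i + j) + 1 <= n·m + m·i + (j+1).
  Exit: when i = n, memoryAlloc <= n·m + m·n = 2·n·m, which is exactly the postcondition.
The only non-linear reasoning is the product m·i, which is why a non-linear SMT solver is needed.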
8. Verification
public static D[][] init(int n, int m)
  ensures memoryReq <= 2*n*m + 1;
{
  escapes D[][] matrix = new D[n][m];
  for(int i = 0; i < n; i++)
    invariant memoryReq <= n*m + m*i;
    for(int j = 0; j < m; j++)
      invariant memoryReq <= n*m + m*i + j + 1;
      escapes matrix[i][j] = new D(i);
      collectable A a = new A();
  return matrix;
}
• Include lifetime/sharing/shape information
• Include lifetime/sharing/shape information
9. Inference of bounds in imperative languages
Some approaches for imperative languages:
• Abstract Interpretation [e.g., BCJP09]
• Recurrence equations [e.g., AGG07/09]
• Iteration patterns / Ranking functions [e.g., GZ10]
• Counting / Iteration spaces [BGY05/06,BFGY08,…]
10. Abstract interpretation [e.g., BCJP09]
• Use counters to represent memory allocation (and deallocation)
• Compute program invariants (using a lattice and fixpoint)
public static D[][] init(int n, int m)
{
  D[][] matrix = new D[n][m];   // objects += n*m
  for(int i = 0; i < n; i++)
    for(int j = 0; j < m; j++)
      matrix[i][j] = new D(i);  // objects += 1
  return matrix;
}
• Requires inferring non-linear invariants
– Rodríguez-Carbonell, Müller-Olm, Cachera, etc.
11. Recurrence equations [e.g., AGG07/09]
• Computes a set of recurrence equations
– Then tries to find a closed-form solution to the recurrence equations
public static D[][] init(int n, int m) {
  D[][] matrix = new D[n][m];
  for(int i = 0; i < n; i++)
    for(int j = 0; j < m; j++)
      matrix[i][j] = new D(i);
  return matrix;
}
init(n,m) = loop1(n,m,0)
loop1(n,m,i) = loop2(m,0) + loop1(n,m,i')   {i < n, i' = i+1}
loop1(n,m,i) = 0                            {i >= n}
loop2(m,j) = 4 + c(D) + loop2(m,j')         {j < m, j' = j+1}
loop2(m,j) = 0                              {j >= m}
Solution: init(n,m) = 2·n·m
- Albert and collaborators / COSTA analyzer
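Unfolding the equations gives the closed form directly; assuming, as the slide's solution suggests, that the per-iteration cost 4 + c(D) is normalized to 2 units (one for the array slot, one for the D instance):
  loop2(m,j) = (m − j)·(4 + c(D))   so loop2(m,0) = m·(4 + c(D))
  loop1(n,m,i) = (n − i)·loop2(m,0)   so loop1(n,m,0) = n·m·(4 + c(D))
  init(n,m) = n·m·(4 + c(D)) = 2·n·m   when 4 + c(D) = 2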
12. BGY05/06
for(i=0; i<n; i++)
  for(j=0; j<i; j++)
    new C();
{0 ≤ i < n, 0 ≤ j < i}: a set of constraints describing an iteration space
Dynamic memory requested =
  number of visits to new statements =
  number of possible variable assignments at its control location =
  number of integer solutions of a predicate constraining variable assignments at its control location (i.e., an invariant)
For linear invariants, # of integer solutions = # of integer points = Ehrhart polynomial
Here: size(C)·(½n² − ½n)
[Figure: the iteration space as the triangle of integer points {0 ≤ j < i < n} in the (i, j) plane]
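A quick way to sanity-check such Ehrhart counts is to enumerate the iteration space for small n and compare against the closed form. A throwaway sketch (the class and method names are mine; run with java -ea to enable the assertions):

import java.util.stream.LongStream;

public class EhrhartCheck {
    // Closed-form count of {0 <= i < n, 0 <= j < i}: (n^2 - n) / 2
    static long closedForm(long n) {
        return (n * n - n) / 2;
    }

    // Brute-force enumeration of the same iteration space
    static long bruteForce(int n) {
        long visits = 0;
        for (int i = 0; i < n; i++)
            for (int j = 0; j < i; j++)
                visits++;   // one visit to the `new C()` site
        return visits;
    }

    public static void main(String[] args) {
        LongStream.rangeClosed(0, 100).forEach(n -> {
            assert bruteForce((int) n) == closedForm(n) : "mismatch at n=" + n;
        });
        System.out.println("closed form matches enumeration for n = 0..100");
    }
}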
13. Outline
• Inferring parametric upper-bounds of heap
memory usage (or live objects)
• Our new compositional approach
• Verification of .NET programs
• Conclusions
15. Example
• How much memory is required to run m0?
void m0(int mc) {
1: m1(mc);
2: B[] m2Arr=m2(2 * mc);
}
void m1(int k) {
3: for (int i = 1; i <= k; i++){
4: A a = new A();
5: B[] dummyArr= m2(i);
}
}
B[] m2(int n) {
6: B[] arrB = new B[n];
7: for (int j = 1; j <= n; j++) {
8: arrB[j-1] = new B();
9: C c = new C();
10: c.value = arrB[j-1];
}
11: return arrB;
}
16. Example
• How much memory is required to run m0?
(Same code as the previous slide.)
[Charts: “ideal” consumption over time for m0(2), y-axis 0..14, and for m0(7), y-axis 0..80; the “ret m1” marker shows memory being released when m1 returns]
17. Our goal
An expression over-approximating the peak amount of memory consumed by a method:
• Parametric
• Easy to evaluate at run time
  E.g.: Required(m)(p1,p2) = 2p2 + p1
• Evaluation cost known “a priori”
Given a method m(p1,...,pn),
peak(m): an expression in terms of p1,...,pn for the max amount of memory consumed by m
18. In a nutshell
How: performing a good approximation of memory requirements using a region-based memory manager (RTSJ)
1. Infer total allocations by counting visits to new statements
2. Compute memory regions using escape analysis
3. Compute peak consumption using a region-based memory manager
19. BGY05/06
Inferring parametric upper-bounds of heap memory usage (or live objects)
I) Computing dynamic memory allocations
Víctor A. Braberman, Diego Garbervetsky, Sergio Yovine: A Static Analysis for Synthesizing Parametric Specifications of Dynamic Memory Consumption. Journal of Object Technology 5(5): 31-58 (2006)
20. Memory requested by a method
void m0(int mc) {
 1: m1(mc);
 2: B[] m2Arr = m2(2 * mc);
}
void m1(int k) {
 3: for (int i = 1; i <= k; i++){
 4:   A a = new A();
 5:   B[] dummyArr = m2(i);
    }
}
B[] m2(int n) {
 6: B[] arrB = new B[n];
 7: for (int j = 1; j <= n; j++) {
 8:   arrB[j-1] = new B();
 9:   C c = new C();
 10:  c.value = arrB[j-1];
    }
 11: return arrB;
}
1: Identify allocation sites. E.g., m0.1.m1.5.m2.8 is the creation site for new B with stack (m0.1.m1.5); m0.2.m2.8 is the same statement with stack (m0.2).
   Key of the technique: manipulate linear expressions, which are easier to handle and less expensive than generating non-linear expressions.
2: Find invariants for creation sites:
   I_m0(m0.1.m1.5.m2.8) = {k = mc ∧ 1 ≤ i ≤ k ∧ n = i ∧ 1 ≤ j ≤ n}
3: Count the number of solutions (in terms of the MUA parameters):
   #{(k,i,j,n) | k = mc ∧ 1 ≤ i ≤ k ∧ n = i ∧ 1 ≤ j ≤ n} = ½mc² + ½mc
4: Transform the number of visits into memory consumption:
   size(B)·(½mc² + ½mc)
5: Sum up the resulting expressions:
   (size(B[]) + size(B) + size(C))·(½mc² + 5/2·mc) + size(A)·mc
Creation sites reachable from m0:
CS_m0 = {m0.1.m1.4, m0.1.m1.5.m2.6, m0.1.m1.5.m2.8, m0.1.m1.5.m2.9, m0.2.m2.6, m0.2.m2.8, m0.2.m2.9}
21. Memory requested by a method
How much memory (in terms of m0's parameters) is requested/allocated by m0:
totAlloc(m0)(mc) = Σ_{cs ∈ CS_m0} S(m0, cs)
                 = (size(B[]) + size(B) + size(C))·(½mc² + 5/2·mc) + size(A)·mc
                 = 3/2·mc² + 17/2·mc   [considering size(T) = 1 for all types T]
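The last step is plain arithmetic; with every size set to 1:
  3·(½mc² + 5/2·mc) + mc = 3/2·mc² + 15/2·mc + mc = 3/2·mc² + 17/2·mc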
22. Problem
• Memory is released by a garbage collector
– Very difficult to predict when, where, and how many objects are collected
[Charts: ideal consumption of m0(2), y-axis 0..15, and of m0(7), y-axis 0..80]
• Our approach: approximate the GC using a scope-based region memory manager
24. RTSJ (Real-Time Specification for Java)
• Scoped memory management
– Dynamic memory organized in regions associated with particular scopes
• Methods, threads, etc.
• Advantages
– Better time predictability (compared with non-real-time GCs)
– More controlled object allocation and deallocation
• Useful for memory-consumption predictability
• But… you must respect the scoping restrictions
– Potential dangling references
25. Region-based memory management
void m1(int k) {
  SM.enter(Regions.rm1);
  for (int i = 1; i <= k; i++) {
    A a = SM.newInstance(CSs.m1_4, A.class);
    B[] dummyArr = m2(i);
  }
  SM.exit();
}
B[] m2(int n) {
  SM.enter(Regions.rm2);
  B[] arrB = (B[])SM.newAInstance(CSs.m2_6, B.class, n);
  for (int j = 1; j <= n; j++) {
    arrB[j - 1] = (B)SM.newInstance(CSs.m2_8, B.class);
    C c = (C)SM.newInstance(CSs.m2_9, C.class);
    c.value = arrB[j - 1];
  }
  SM.exit();
  return arrB;
}
• Region-based program
[Diagram: region Rm1 holds creation sites m1.4, m2.6, m2.8; region Rm2 holds m2.9]
Regions.rm1 = <"rm1", {CSs.m1_4, CSs.m2_6, CSs.m2_8}>
Regions.rm2 = <"rm2", {CSs.m2_9}>
Diego Garbervetsky, Chaker Nakhli, Sergio Yovine, Hichem Zorgati: Program Instrumentation and Run-Time Analysis of Scoped Memory in Java. Electr. Notes Theor. Comput. Sci. 113: 105-121 (2005)
26. Region-based memory management
• Memory organized using m-regions
void m0(int mc) {
1: m1(mc);
2: B[] m2Arr=m2(2 * mc);
}
void m1(int k) {
3: for (int i = 1; i <= k; i++){
4: A a = new A();
5: B[] dummyArr= m2(i);
}
}
B[] m2(int n) {
6: B[] arrB = new B[n];
7: for (int j = 1; j <= n; j++) {
8: arrB[j-1] = new B();
9: C c = new C();
10: c.value = arrB[j-1];
}
11: return arrB;
}
27. Region-based memory management
• Escape analysis to infer regions
• Escape(m): objects that live beyond m
– Escape(m0) = {}
– Escape(m1) = {}
– Escape(m2) = {m2.6, m2.8}
• Capture(m): objects that do not live longer than m
– Capture(m0) = {m0.2.m2.6, m0.2.m2.8}
– Capture(m1) = {m1.4, m0.1.m1.5.m2.6, m0.1.m1.5.m2.8}
– Capture(m2) = {m2.9}
• Region(m) = Capture(m)
(Same example code as before.)
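A minimal sketch of how Capture can be derived once an escape analysis has produced Escape sets. The set algebra follows the slide's definitions; the class name, method names, and the encoding of creation sites as strings are mine:

import java.util.*;

public class CaptureSketch {
    // allocatedDuring: creation sites reached while m runs;
    // escape: the subset that outlives m, as computed by some escape analysis.
    static Set<String> capture(Set<String> allocatedDuring, Set<String> escape) {
        Set<String> captured = new HashSet<>(allocatedDuring);
        captured.removeAll(escape);   // captured = allocated \ escaping
        return captured;              // Region(m) = Capture(m)
    }

    public static void main(String[] args) {
        // For m2: sites m2.6, m2.8 escape (returned array and its elements), m2.9 does not
        Set<String> allocM2 = new HashSet<>(Arrays.asList("m2.6", "m2.8", "m2.9"));
        Set<String> escM2   = new HashSet<>(Arrays.asList("m2.6", "m2.8"));
        System.out.println(capture(allocM2, escM2));  // prints [m2.9]
    }
}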
28. Region-based memory management
• Memory organized using m-regions
(Same example code as before, with the inferred m-regions highlighted.)
29. Obtaining region sizes
• Region(m) = Capture(m)
• memCap(m): an expression in terms of p1,…,pn for the amount of memory required for the region associated with m
• memCap(m) is totAlloc(m) applied only to captured allocations
• memCap(m0) = (size(B[]) + size(B))·2mc
• memCap(m1) = (size(B[]) + size(B))·(½k² + ½k) + size(A)·k
• memCap(m2) = size(C)·n
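These expressions follow from the counting step restricted to each capture set; for instance, m1 captures its A instances plus the B[] and B allocations escaping each call m2(i) for i = 1..k:
  memCap(m1) = size(A)·k + (size(B[]) + size(B))·Σ_{i=1..k} i
             = size(A)·k + (size(B[]) + size(B))·(½k² + ½k)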
30. Inferring parametric upper-bounds of heap memory usage (or live objects)
III) Approximating peak consumption
- Víctor A. Braberman, Federico Javier Fernández, Diego Garbervetsky, Sergio Yovine: Parametric Prediction of Heap Memory Requirements. ISMM 2008: 141-150
- Philippe Clauss, Federico Javier Fernández, Diego Garbervetsky, Sven Verdoolaege: Symbolic Polynomial Maximization Over Convex Sets and Its Application to Memory Requirement Estimation. IEEE Trans. VLSI Syst. 17(8): 983-996 (2009)
31. Approximating peak consumption
We over-approximate an ideal memory manager using scope-based memory regions
– m-regions: one region per method
• When & where:
– created at the beginning of the method
– destroyed at the end
• How much memory is allocated/deallocated in each region:
– memCap(m) >= actual region size of m for any call context
• How much memory is allocated in outer regions:
– memEsc(m) >= actual memory allocated in the callers' regions
32. Approximating peak(m)
Regions' stack evolution
Some region configurations cannot happen at the same time, e.g., m0.1.m1.m2 and m0.2.m2
[Diagram: evolution of the region stack (rm0, rm1, rm2) along the execution]
peak(m0) = max over reachable region configurations σ of Σ_{r ∈ σ} size(r)
33. Approximating peak(m)
Region sizes may vary according to the method's calling context
rsize(m2) = n   (assume size(C) = 1)
For the call chain m0.1.m1.5.m2 the binding invariant is {k = mc, 1 ≤ i ≤ k, n = i};
{k = mc = n} maximizes it, so
maxrsize(m0.1.m1.5.m2, m0) = mc   (in terms of m0's parameters!)
[Diagram: region stacks for the successive calls to m2 from m1, with rm2 growing up to its maximum]
34. Approximating peak(m)
3. Maximizing instantiated regions
maxrsize(π.m, m0)(P_m0) = maximize rsize(m) subject to I(P_m0, P_m, W)
• The m-region is expressed in terms of m's parameters
– rsize(m2)(n) = n
• A complex non-linear maximization problem
– Maximum according to the calling context and in terms of the MUA parameters, even when parameters are instantiated (at run time)
– maxrsize(m0.1.m1.5.m2, m0)(mc) = mc
– maxrsize(m0.2.m2, m0)(mc) = 2mc
• Too expensive
• Execution time difficult to predict
35. Solving maxrsize
• Solution: an approach based on Bernstein basis over polyhedral domains (Clauss et al. 2004)
– Enables bounding a polynomial over a parametric domain given as a set of linear constraints
– Obtains a parametric solution
• Bernstein(pol, I):
– Input: a polynomial pol and a set of linear (parametric) constraints I
– Returns a set of polynomials (candidates) that bound the maximum value of pol in the domain given by I
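A tiny worked instance (my example, not one from the talk): on a 1-D parametric domain the Bernstein coefficients themselves bound the polynomial. Take pol(n) = n on the domain I = {1 ≤ n ≤ mc}, for mc > 1. Writing n in the degree-1 Bernstein basis over [1, mc]:
  n = 1·B_0(n) + mc·B_1(n),  with B_0(n) = (mc − n)/(mc − 1), B_1(n) = (n − 1)/(mc − 1)
Since B_0, B_1 ≥ 0 and B_0 + B_1 = 1 on the domain, max pol ≤ max{1, mc} = mc, which matches maxrsize(m0.1.m1.5.m2, m0)(mc) = mc from the previous slide.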
36. maxrsize
maxrsize(m0, π.mk) = max{q(P_m0) ∈ C_1} if D_1(P_m0)
                     …
                     max{q(P_m0) ∈ C_k} if D_k(P_m0)
where {C_i, D_i} = Bernstein(rsize(mk), I_{π.mk}, P_m0)
• maxrsize(m0, m0)(mc) = (size(B[]) + size(B))·2mc
• maxrsize(m0.1.m1, m0)(mc) = (size(B[]) + size(B))·(½mc² + ½mc) + size(A)·mc
• maxrsize(m0.1.m1.5.m2, m0)(mc) = size(C)·mc
• maxrsize(m0.2.m2, m0)(mc) = size(C)·2mc
37. Approximating peak(m)
We consider the largest region for the same calling context:
peak(m0)(mc) ≤ max over call chains m0.m1…mk of Σ_{i=0..k} maxrsize(m0…mi, m0)(mc) = mem(m0)
[Diagram: candidate region stacks built from maxrm0, maxrm1 (via m0.1.m1) and maxrm2 (via m0.1.m1.5.m2 and m0.2); mem(m0) is the maximum over these stacks]
38. Dynamic memory required to run a method
memReq_m0(mc) = mc² + 7mc
[Chart: evolution of regions M0, M1, M2 and the “ideal” consumption along the trace of memRq(4): init, start m0, call m1, repeated call/ret of m2, ret m1, call m2, ret m2, end m0]
39. Inferring parametric upper-bounds of heap memory usage (or live objects)
Tool support
Diego Garbervetsky, Sergio Yovine, Víctor A. Braberman, Martín Rouaux, Alejandro Taboada: Quantitative dynamic-memory analysis for Java. Concurrency and Computation: Practice and Experience 23(14): 1665-1678 (2011)
43. Inferring total allocations
Counting:
• Ensures that variables concerning visits to statements are considered in the counting (to ensure soundness)
• Tries to compute a minimal set of variables (to filter out irrelevant variables, for better precision)
44. Inferring regions
Our tool:
1. Automatic inference of m-regions
– Using escape analysis
2. Translation to region-based bytecode
– RC (regions library)
– RTSJ
– JikesRVM
45. Refining memory regions
• Escape analysis over-approximates object lifetime (to be safe)
– It may impact the memory regions
• JScoper: a tool for region edition and visualization
– Call-graph visualization
– Region edition
– Interfacing with escape analysis
– Region-based code generation
– Region-based memory-manager simulator
Andrés Ferrari, Diego Garbervetsky, Víctor A. Braberman, Pablo Listingart, Sergio Yovine: JScoper: Eclipse support for research on scoping and instrumentation for real time Java applications. ETX 2005: 50-54
46. Peak memory computation component
• Non-linear maximization problem solved using an approach based on Bernstein basis over polyhedral domains (Clauss et al. 2004-2009)
• Enables bounding a polynomial over a parametric domain given as a set of linear constraints
• Yields a set of candidate polynomials
48. Limitations
• For loop-intensive programs the tool performs very well
– Not well suited for memory-allocating recursion
• Imprecision comes mainly from:
– Escape analysis
– Program invariants
– Inductive-variable analysis
– Approximations of maximum region sizes
49. More limitations
• The global approach makes the analysis and tuning of results a hard task
– Too many variables, parameters, and bindings
• Affects scalability and usability
• Sometimes it is necessary to provide bounds manually
– Recursion, non-analyzable methods, code that is easy to understand but has non-linear invariants
51. Why?
• Scalability
– The complexity of symbolic-manipulation algorithms heavily depends on the number of involved variables
• Usability
– Manual inspection and tuning of program invariants is much easier when dealing with local invariants
52. More reasons
• Dealing with non-analyzable methods
– User-provided annotations (applies also to mutually recursive components)
• Enables the use of other counting mechanisms
• Ability to analyze program fragments
• Better support for polymorphism
53. Compositional analysis = method summaries
B[] m2(int n) {
 6: B[] arrB = new B[n];
 7: for (int j = 1; j <= n; j++) {
 8:   arrB[j-1] = new B();
 9:   C c = new C();
 10:  c.value = arrB[j-1];
    }
 11: return arrB;
}
Summary: MR_m2(n) = 3n
 6: n (new B[n]); 8: n (new B(), count {j = 1..n} = n); 9: n (new C(), count {j = 1..n} = n)
void m1(int k) {
 3: for (int i = 1; i <= k; i++){
 4:   A a = new A();
 5:   B[] dummyArr = m2(i);
    }
}
Summary: MR_m1(k) = 3/2k² + 5/2k
 4: k (new A(), count {i = 1..k} = k)
 5: call to m2(i) with {i = 1..k}: sum{i=1..k} 3i = 3·k(k+1)/2 = 3/2(k² + k)   (symbolic operation on polynomials)
void m0(int mc) {
 1: m1(mc);
 2: B[] m2Arr = m2(2 * mc);
}
Summary: MR_m0(mc) = 3/2mc² + 17/2mc
 1: call to m1(mc) = 3/2mc² + 5/2mc
 2: call to m2(2*mc) = 3·(2mc) = 6mc
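To make the bottom-up composition concrete, here is a throwaway numeric sketch (the class and method names are mine) that evaluates the summaries and checks them against the closed forms from the slide; run with java -ea:

public class SummaryCheck {
    // Summary of m2: MR_m2(n) = 3n (one array slot, one B, one C per iteration; sizes = 1)
    static long mrM2(long n) { return 3 * n; }

    // Summary of m1: k allocations of A plus sum_{i=1..k} MR_m2(i)
    static long mrM1(long k) {
        long sum = k;                         // the k instances of A
        for (long i = 1; i <= k; i++) sum += mrM2(i);
        return sum;                           // closed form: 3/2 k^2 + 5/2 k
    }

    // Summary of m0: MR_m1(mc) + MR_m2(2*mc)
    static long mrM0(long mc) { return mrM1(mc) + mrM2(2 * mc); }

    public static void main(String[] args) {
        for (long mc = 0; mc <= 50; mc++) {
            assert 2 * mrM1(mc) == 3 * mc * mc + 5 * mc;    // 3/2 mc^2 + 5/2 mc
            assert 2 * mrM0(mc) == 3 * mc * mc + 17 * mc;   // 3/2 mc^2 + 17/2 mc
        }
        System.out.println("summaries match the closed forms for mc = 0..50");
    }
}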
54. Challenge
Do not lose too much precision!!
Compositional: 3/2mc² + 17/2mc vs. global: mc² + 7mc
• Specification of memory reclaiming in a compositional fashion
• Symbolic manipulation of summaries
– Maximize polynomials over iteration spaces
– Sum polynomials over iteration spaces
56. Modeling object reclaiming
B[] m2(int n) {
 6: B[] arrB = new B[n];
 7: for (int j = 1; j <= n; j++) {
 8:   arrB[j-1] = new B();
 9:   C c = new C();
 10:  c.value = arrB[j-1];
    }
 11: return arrB;
}
• 6 and 8 live longer than m2: a responsibility of the caller
• 9 can be safely collected when m2 finishes
Idea: enrich summaries in order to distinguish escaping objects from captured objects
• 6 and 8 are escaping (residual); 9 is auxiliary
57. Compositional analysis (simplified)
B[] m2(int n) {
 6: B[] arrB = new B[n];
 7: for (int j = 1; j <= n; j++) {
 8:   arrB[j-1] = new B();
 9:   C c = new C();
 10:  c.value = arrB[j-1];
    }
 11: return arrB;
}
Summary: MR_m2(n) = 3n, ME_m2(n) = 2n   (MT = 1)
 6, 8: n + n = 2n (escape); 9: n (does not escape)
Escaping memory is accumulative!
void m1(int k) {
 3: for (int i = 1; i <= k; i++){
 4:   A a = new A();
 5:   B[] dummyArr = m2(i);
    }
}
Summary: MR_m1(k) = k² + 3k, ME_m1(k) = 0
 4: k (does not escape)
 5: call to m2(i) with {i = 1..k}:
    sum{i=1..k} ME_m2(i) = 2·k(k+1)/2 = k² + k
    max{i=1..k} (MR − ME)_m2(i) = k
 MR_m1 = k + (k² + k) + k
void m0(int mc) {
 1: m1(mc);
 2: B[] m2Arr = m2(2 * mc);
}
Summary: MR_m0(mc) = mc² + 7mc, ME_m0(mc) = 0
 1: call to m1(mc) = mc² + 3mc
 2: call to m2(2*mc): from ME_m2, 2·(2mc) = 4mc; from (MR − ME)_m2, 2mc
 MR_m0 = max(mc² + 3mc, 2mc) + 4mc
(symbolic operations on polynomials)
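Spelling out the arithmetic (all sizes 1, as in the slide):
  escaping part of m1: Σ_{i=1..k} ME_m2(i) = Σ_{i=1..k} 2i = k(k+1) = k² + k
  peak of the collectable part: max_{i=1..k} (MR_m2 − ME_m2)(i) = max_{i=1..k} i = k
  MR_m1(k) = k (the A instances) + (k² + k) + k = k² + 3k
  MR_m0(mc) = max(mc² + 3mc, 2mc) + 4mc = mc² + 7mc   (for mc ≥ 0)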
58. Precision vs. granularity
• How can we help the caller identify whether escaping objects can be collected?
• Subheap: a set of objects that have similar lifetimes
– Subheap descriptor: an identifier for a subheap
• Esc can be organized in terms of subheaps
public B two() {
 1: this.g = new A();
 2: b = new B();
 3: b.f = new C();
 4: return b;
}
• Esc(all) = 3
• Esc({This}) = 1
• Esc({Ret}) = 2
• Or, refining: Esc({Ret}) = 1, Esc({Ret_f}) = 1
59. Subheap descriptors and Esc analysis
public B two() {
 1: this.g = new A();
 2: b = new B();
 3: b.f = new C();
 4: return b;
}
Salcianu's points-to graph: sh descriptor = inside node
• Esc({1}) = 1, Esc({2}) = 1, Esc({3}) = 1
Steensgaard-like equivalences: sh descriptor = equivalence class, e.g. {b~2~3~ret}, {this~1}
• Esc(ret) = 2, Esc(this) = 1
[Diagram: points-to graph with inside nodes 1, 2, 3 and fields f, g reachable from this and b]
60. From Code Contracts to Memory Contracts
static public int GCD(int x, int y) {
Contract.Requires(x > 0 && y > 0);
Contract.Ensures(Contract.Result<int>()>0);
while (true) {
if (x < y){
y %= x;
if (y == 0) return x;
} else {
x %= y;
if (x == 0) return y;
}
}
}
• Code Contracts: pre/postconditions, loop and object invariants
– Runtime checking
– Static checking using an abstract interpreter
• Automatic inference of loop invariants!!
61. From Code Contracts to Memory Contracts
• Memory contracts
– Contract.Memory.X / X = {MemReq, RSD, TMP}
• Annotations for specifying object lifetime
• DestRsd: whether an allocation escapes through a subheap
• AddRsd: determines the destination of a callee subheap in the caller
• BindRsd: associates a subheap name with an expression in code
62. [Annotated screenshot]
• DestTmp() indicates the next object is temporary: logger is a temporary object since it can be collected when the method finishes its execution
• Rsd(sh,n) (or Esc) specifies that the number of objects escaping through sh is at most n
• node is a residual object because its lifetime exceeds that of the method that creates it
• DestRsd(sh) indicates the next object is escaping, tagged with sh
• Squiggles mean errors
63. How annotations are checked?
• Code is instrumented to include counters
– Each subheap has its own counter
– Annotations determine which counter should be incremented
• Lifetime annotations are checked using a points-to and escape analysis
Jonathan Tapicer, Diego Garbervetsky, Martín Rouaux: "Resource Usage Contracts for .NET", TOPI 2011: 1st Workshop on Developing Tools as Plug-ins, 2011
64. How annotations are checked? (simplified)
Annotated code:
B[] m2(int n) {
  Contract.Memory.Rsd(ret, 2n)
  Contract.Memory.MR(3n)
  Contract.Memory.DestRsd();
 6: B[] arrB = new B[n];
 7: for (int j = 1; j <= n; j++) {
    Contract.Memory.DestRsd();
 8:   arrB[j-1] = new B();
 9:   C c = new C();
 10:  c.value = arrB[j-1];
    }
 11: return arrB;
}
Instrumented code:
B[] m2(int n) {
  Contract.Ensures(ret_m2 <= 2n)
  Contract.Ensures(mr <= 3n)
  ret_m2 += n; mr += n;
 6: B[] arrB = new B[n];
 7: for (int j = 1; j <= n; j++) {
    ret_m2++; mr++;
 8:   arrB[j-1] = new B();
    mr++;
 9:   C c = new C();
 10:  c.value = arrB[j-1];
    }
 11: return arrB;
}
• We leverage the automatic invariant inference of Clousot (the Code Contracts static checker) to prove this automatically
65. How annotations are checked? (problem)
Summary of m2:
B[] m2(int n) {
  Contract.Memory.Rsd(Ret, 2n)
  Contract.Memory.MR(3n)
  …
}
Annotated code:
void m1(int k) {
  Contract.Memory.Rsd(this, k*k+k)
  Contract.Memory.MR(k*k+3k)
 3: for (int i = 1; i <= k; i++){
 4:   A a = new A();
      Contract.Memory.AddRsd(This, Ret);
 5:   this.f[i] = m2(i);
    }
}
Instrumented code:
void m1(int k) {
  Contract.Ensures(this_m1 <= k*k+k)
  Contract.Ensures(mr <= k*k+3k)
 3: for (int i = 1; i <= k; i++){
    mr++;
 4:   A a = new A();
    this_m1 += 2i; mr += 2i;
    mr += max{1,i};
 5:   this.f[i] = m2(i);
    }
}
• To prove this_m1 <= k*k+k we need a non-linear invariant!
• Beyond the capabilities of Clousot
66. How annotations are checked?
Good news!
• We know how to count/sum solutions to iteration spaces
• We just need linear invariants
void m1(int k) {
  Contract.Ensures(this_m1 <= k*k+k)   // inferred by Clousot
  Contract.Ensures(mr <= k*k+3k)
 3: for (int i = 1; i <= k; i++){
    Contract.itSpace(1 <= i <= k)
    mr++;
 4:   A a = new A();
    loop = sum(itSpace, 2i);           // computed using the symbolic calculator
    mr += max{1,i};
 5:   this.f[i] = m2(i);
    }
  Contract.Assume(loop <= k*k+k)       // we force the checker to accept the bound
  this_m1 += loop; mr += loop;
}
68. Polymorphism
class A1 extends A {
  public int do(int n) { .. }          // ME_A1.do = 2
}
class A2 extends A {
  public int do(int n) { .. }          // ME_A2.do = n
}
public void test(List l) {
  foreach(e : l) {
    if (cond) a = new A1();
    else a = new A2();
    a.do(e);
  }
}
sum{i=1..size(l)} (max{ME_A1.do(l[i]), ME_A2.do(l[i])})
• Hard to solve symbolically…
– Need to over-approximate the max operation
69. Polymorphism
public void test(List l) {
  if (cond) a = new A1();
  else a = new A2();
  foreach(e : l) {
    a.do(e);
  }
}
• The object remains the same in all iterations
loop_A1 = sum{i=1..size(l)} ME_A1.do(l[i])
loop_A2 = sum{i=1..size(l)} ME_A2.do(l[i])
max(loop_A1, loop_A2)
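Why hoisting the receiver out of the loop is sound (and usually tighter): for any non-negative sequences a_i, b_i,
  max(Σ_i a_i, Σ_i b_i) ≤ Σ_i max(a_i, b_i)
The left side is this slide's max(loop_A1, loop_A2) = max(2·size(l), Σ_i l[i]); the right side is slide 68's per-iteration max. The inversion is only valid because the receiver cannot change inside the loop.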
70. About flow sensitivity
public void m1_m2() {          public void m1_m2() {
  m1();                          m2();
  m2();                          m1();
}                              }
[Diagram: the combined peak of m1_m2 is built from M1 then M2 in the first version, and M2 then M1 in the second]
It is more complicated with loops and conditionals…
public void for_m1_m2(int n) {        public void for_m1_m2(int n) {
  for(int i = 0; i < n; i++) {          for(int i = 0; i < n; i++) {
    m1(i);                                if (i%2) m1(i);
    m2(i);                                m2(i);
  }                                     }
}                                     }
[Sequences of active summaries: M1,M2,M1,M2,… vs. M1,M2 / M2,M1 / M2,M2,…]
71. From objects to actual memory consumption
• Consider all objects
– Model VM behavior
– VM memory vs. system memory
• Improve lifetime inference
– Reachable objects, live objects
– (parametric in the GC)
• Consumption patterns
– Lazy initialization, global fields that are constantly overwritten
• Specification language (independent of the EA)
– For non-analyzable code, interfaces
72. Current + future work
• Improving the interprocedural inference analysis
– “Pluggable” object-lifetime analysis
– Support for recursive method invocations
• Better invariant inference
– Tool support for annotations / checking
• A new specification language
73. Current + future work
• Experimenting with static verification of memory consumption
– Code Contracts, SMT solver + barvinok library
• An intermediate language (a sort of Boogie)
– With translations to Java, C#, C, etc.
• Inference for new memory models
– ISMM 2011: Short-term memory for self-collecting mutators
74. Conclusions
• The analysis of memory requirements is feasible (but very difficult…)
• We (and other groups) have made good progress
• We need to improve in order to analyze real programs
– Compute actual consumption
– Seriously improve scalability
• We believe compositional approaches are a promising direction
75. Credits
• Víctor Braberman
• Sergio Yovine
• Philippe Clauss
• Samuel Hym
• Daniel Gorin
• Federico Fernández
• Matías Grumberg
• Javier Tapicer
• Martín Rouaux
• Andrés Ferrari
• Alejandro Taboada
• Guillaume Salagnac
Editor's Notes
Nowadays we can see more and more embedded systems and mobile computers surrounding us. Even though they are becoming powerful devices, they have limited resources: limited battery life, limited memory, and they usually communicate with other computers, incurring communication costs. So it is becoming more and more necessary to understand and control the consumption of these resources. Even in other settings, like computer farms or clusters, it is vital for the business to make proper use of them.
In this talk the focus is on understanding memory consumption in Java-like languages, which provide automatic memory management that takes care of the deallocation of unused objects. These kinds of features make program development easier and less error-prone. However, in this setting, where objects are allocated dynamically, predicting memory utilization is very difficult. In fact, just predicting memory allocations is hard, indeed undecidable: it is a problem similar to the halting problem, assuming that every program statement consumes some amount of memory. Even harder is predicting actual memory requirements, taking into account that memory can be reused once unused objects are collected by the GC.
Even though we know it is not the complete solution, we believe that being able to understand things like the number of object allocations and the maximum number of live objects is one of the foundations for understanding real memory consumption. In some systems (like some region-based memory managers) this information is closer to what we need to produce real bounds.
In this work we do not handle recursive programs and do not generate bounds on the size of the stack. There are nice works like the… and for functional languages. Stack allocation: Chin Wei Ngan, Corneliu Popeea and colleagues from Singapore.
Consider this example that returns a matrix filled with objects of type D. It allocates an array of n times m references and n times m objects of type D. I will consider arrays not as one object but as a collection of many references to objects. Taking this into account, this method allocates 2 times n times m objects.
Suppose we want to verify this program, which simply specifies the number of objects allocated. We will need to generate the verification condition in order to prove that the program satisfies the spec. Since the program has loops, we will need at least to provide loop invariants describing how memory evolves in order to prove the spec.
In general, if we want to compute actual memory requirements, we will need to include several annotations to explain object lifetime and the shape of the data structures.
Related to symbolic complexity computation.
Requires computing non-linear invariants, like polynomial invariants, which requires the definition of a complex lattice; this tends to be expensive in terms of computational cost, or may need widening operators. More recent works (2011) show improvements in this topic.
Thus, look at the following code. It basically allocates several objects in the bodies of two loops. For instance, when method m0 calls m1, it allocates k objects of type A and then makes several calls to method m2, which allocates n objects of type B and C and an array of type B. Later, m0 calls m2 again. The pictures on the right show the memory consumption of two different executions of the same program using a sort of “ideal” GC which releases objects when they are no longer reachable. The amount of memory required for this program depends on the parameter “mc”, which is m0's parameter. Notice that not only the peak memory consumption of the program changes when mc changes; the place where this peak is reached also changes according to the calling context.
Our goal is to obtain an expression that over-approximates peak memory consumption. We want the expression to be in terms of the parameters of the method under analysis. We also want the expression to be easy to evaluate; that means the evaluation cost has to be low, or at least known beforehand. In this work we propose a technique to obtain such an expression. That is, given a method m, we obtain a parametric expression over-approximating the maximum amount of memory consumed by any run starting at m.
Then, to obtain the certificates that approximate the requested memory, we apply the following algorithm…
Remember that creation sites are paths that may involve several methods, so the invariants should predicate over the variables in those methods. Let's take a look at our running example. The first case is the creation site representing m0 calling m1 and creating an object of type A inside the loop. There, we have the following invariant representing the binding in the call from m0 to m1 and the iteration space of the loop where objects are created. The second case is when m0 calls m1 and m1 calls m2.
Recall that to count visits to statements we basically use invariants. However, as invariants only constrain variables, they lose information about the call-stack configuration, and the call stack is relevant for counting visits. For example: examining the example's call graph, we immediately observe that, from a static view, method m2 is called at least twice, so its allocation sites will be executed twice. To cope with this issue, we decided to distinguish program locations not only by their local control location, but also by their history, that is, the call chain that leads to that allocation site. We introduce the notion of a creation site: a path from the method under analysis to a new statement reachable from that method. The creation site denotes not only the new statement but also a control projection of the call stack when reaching this statement. For example, this is the set of creation sites for the program assuming that we are analyzing method m0. Let's take a quick tour of the parts of the algorithm. The first thing we do is identify the points where objects are created. An important aspect is that our technique distinguishes allocation points not only by the place in the method where the new is executed, but by the whole path from the method under analysis to that point. In the example, we can distinguish the objects created along the branch m0,m1,m2 from those of the branch m0,m2. We call the allocation points extended with the calling context creation sites.
Now we are ready to count the number of visits to a creation site. For example, we want an expression in terms of method m0's parameters (that is, the variable mc) that over-approximates the number of visits to this statement for the following stack configuration. To solve that, we take the invariant for that creation site and count the number of possible variable assignments satisfying it, assuming that the value of mc (the m0 parameter) is fixed. Assuming the invariant can be expressed as a polyhedron, using the technique I mentioned before, we can get a polynomial expressing the number of solutions of this formula. In this case k has only one possible value, “mc”, and n is determined by i; but i ranges between 1 and k, and j between 1 and i. So the number of possible solutions is the following expression. Note that counting the number of solutions of an invariant for a creation site cs = π.l over-approximates the number of visits to the new statement at l when the program stack is π.
We know how to approximate the number of visits to a statement that allocates memory. Now we have to take into account the size of the allocated objects. Following the same example, we already know the number of visits. To get the desired expression, we just multiply it by a symbolic expression that denotes the runtime size of the allocated object. (Optional) For array instantiations, we treat them as a set of nested loops that create single-instance objects (of the size of a reference to the type of the objects contained in the array). We do that by adding new constraints to the invariant describing these loops' iteration spaces.
For instance, we can synthesize the following expression approximating the total amount of dynamic memory allocated by m0. However, this expression is too conservative if we want to predict actual memory requirements, since object deallocation is not taken into account.
The problem is that it is difficult to predict when objects are actually released by the GC, and it is even more complicated to know how many. Thus, our approach is to approximate an ideal GC using a more coarse-grained one based on memory regions.
The Real-Time Specification for Java proposes a memory-management mechanism based on scoped memories, where objects are organized in regions associated with a particular scope. This approach allows better time predictability, compared with GCs, and more control over the way objects are allocated and deallocated.
Let's quickly see how a program works using the API.
Let's go back to our example. The picture shows a graph that approximates the references between the objects created by this program. Each node in this graph represents all the objects that may be created at that point. There are as many nodes as creation sites.
Thus, to synthesize memory regions we use escape analysis as a way to approximate object lifetimes. Basically, the region of each method is composed of the creation sites that do not escape the method but escape from some of its callees.
In this case we define a memory manager with one region associated with each method, so the lifetime of objects in a region is directly tied to the lifetime of the associated method. For instance, for our example we can build this region organization. Boxes represent creation sites, which are an abstraction of all objects created at a program location. Light green boxes correspond to objects created in method m2 along the call chain that passes through m1, and the dark green ones correspond to objects from m2 when it is called directly from m0. To respect scoping rules, an object has to be allocated in a region whose lifetime is equal to or longer than the object's lifetime. In this case, objects refer to objects in the same region or to objects with a longer lifetime.
A method's associated regions can be synthesized using escape analysis. Thus, we can adapt our technique for computing total allocations to consider only the creation sites captured by the method under analysis. Using this approach we are able to obtain parametric expressions that over-approximate region sizes.
OK… we can model a memory manager using scope-based regions, synthesize memory regions, and compute their sizes. In fact, using similar reasoning, we can also compute the amount of memory escaping a method. We will use all this information to model peak memory consumption in this setting…
Suppose we want to analyze the peak consumption of a method m. When m is executed, a region for m will be created, and also a region for each method that m calls. During the execution of those methods, some objects will be allocated in one of the newly created regions or, if their lifetime exceeds method m's lifetime, in a pre-existing region. So, when computing the peak consumption for m, we distinguish the maximum consumption produced in newly created regions from the consumption produced in already existing regions. With that approach, we focus on a technique to over-approximate the peak amount of memory consumed by newly created regions; the consumption in the pre-existing regions can be approximated using the estimation of the amount of memory escaping the method under analysis (memEsc(m)).
So what happens with the regions created by the method under analysis? This is how memory regions evolve following the execution of the methods. Observe that the maximum number of active regions is bounded by the length of the longest path in the call graph. Thus, the maximum consumption of newly created regions can be computed by taking the region configuration whose sum of region sizes is the largest. The problem is that the number of configurations can be infinite… and of course every region size depends on the state of the program when the associated method is called.
Notice that we have bounds on regions in terms of their associated method's parameters, and region sizes evolve according to their calling context. For instance, we can analyze the evolution of the regions for m2 when it is called from m1. The size of m2's region is actually bounded by the maximum value the loop in m1 can reach, which depends on the parameter used when m1 is invoked. All this is expressed in the invariant that models the calling context. We can approximate a set of region configurations within a calling context by considering the one with the largest region size. We denote this maximum “maxrsize”. In this case the largest region for m2 is produced when n is equal to mc. We want to express these expressions in terms of the root method's parameters in order to be able to compare them with other region sizes later.
Thus, the most important problem now is computing the largest region size for a given calling context. In particular, we need an expression representing the largest size a region for a method m can reach when called from the method under analysis (in this case m0) in a particular calling context given by the call chain leading to m. The calling context is given by a binding invariant, and the region size is a function of its associated method's parameters. Remember that we already have a method to approximate region sizes; we can use it or provide these specifications by other means. Since our technique provides polynomials in terms of the associated method's parameters, we are dealing with a non-linear optimization problem. Non-linear optimization problems are hard to solve and their execution time is difficult to predict, so we cannot solve this problem at run time; we need to solve it at compile time. This makes the problem even more complicated, because we need a parametric solution that can be instantiated when parameter values become known at run time.
Luckily, there is a solution to this problem, due to Clauss, which extends an original technique of Bernstein for bounding polynomials. Clauss's technique receives a polynomial and a domain expressed as a polytope, and produces a set of polynomials which are candidates for the maximum and minimum bounds of the input polynomial within that domain.
Thus, if we can describe binding invariants using linear constraints and region sizes using polynomials, we can use this technique to compute maxrsize. For our example we can produce the following expressions, which are in terms of method m0's parameters even though region sizes are expressed in terms of their own method's parameters. The whole problem is now reduced to max comparisons between polynomials over the same parameters. This variable renaming can be performed thanks to the Bernstein transformation and because the invariant binds the parameters of the method under analysis with the parameters of the method for which we want to perform the maximization.
For instance, consider the following polynomial, domain, and set of parameters. Bernstein applied to this polynomial returns the following two sets of candidates, determined by a parametric domain. Notice that the result is in terms of the selected parameters. This is a remarkable feature of this symbolic technique: the result is a parametric expression which doesn't have to be in terms of the parameters of the input polynomial. However, there are still some problems: the output is not a simple expression; it is a set of candidate polynomials that need to be evaluated to determine which is the largest. This set can be reduced by applying some symbolic techniques, or it can be done directly at run time. In any case the evaluation cost is known “a priori”.
Once we approximate the region sizes using maxrsize, for non-recursive programs we obtain a finite set of region stacks. Thus, we just need to compute the maximum among these configurations.
So far we approximate the peak consumption by considering the sum of the largest regions of every region configuration among the possible call chains starting from the method under analysis. Notice that is not n
Instead of comparing the region configurations for all paths in the program, we can compute the same maximum recursively by traversing the call graph and generating an evaluation tree whose leaves are non-linear maximization problems and whose nodes are sum or max operations. Max is used for branches in the call graph, and sums when we go deeper to generate a region configuration.
We can try to simplify the tree offline until we get an expression that we cannot, or don't want to, simplify further. This evaluation tree can then be translated into code that is executed when the parameters become available.
For instance, for our example we can compute the following expression for the peak consumption of method m0. If we analyze the peak consumption using a sort of ideal GC, we see that we are over-approximating. Nevertheless, once we assume our region-based memory manager, the obtained expression is quite accurate.
We integrated this technique into a tool that computes total allocations and region sizes, and also generates region-based code out of conventional Java code by synthesizing the memory regions using escape analysis.
These are the components of the tool performing this task.
In particular, an important part is the one taking care of computing the variables which are actually relevant in terms of the number of visits.
The regions are inferred automatically using escape analysis and a tool that allows manual editing. Once we have the regions, we automatically generate new code that has the same functionality but uses the regions. In our case the code uses an ad-hoc region-based memory manager, but, for example, another student generates code for RTSJ using the same idea.
So, we were able to compute the peak consumption for the application, but maybe we can do better. In particular, the computed expression strongly depends on the strategy used to allocate objects in regions. Since escape analysis is sometimes too coarse when approximating object lifetimes, it can lead to coarser regions. Thus, we provide a mechanism to manually refine the regions. We call this tool JScoper; it allows region refinement and also the generation of region-based code out of standard Java code. We also developed a tool that assists in the process of generating region-based programs. It can automatically convert a conventional Java program into a region-based one. To do so, it draws on region information that it can obtain by connecting to an escape-analysis tool or by allowing manual region editing. The tool is integrated into the Eclipse IDE, also allowing the debugging of applications using a region simulator.
Let me show you a detailed view of the components that implement the technique. The most important component is the one responsible for solving the non-linear maximization problem. For that we use a technique based on the Bernstein transformation which, given a polynomial and a restriction over its parameters, yields a set of candidate polynomials bounding the maximum subject to the restriction. The technique relies on having a call graph of the application, to generate the potential memory-region configurations, and a component that provides invariants, which are used as binding invariants when we model the non-linear maximization problem of obtaining the largest region sizes.
Too conservative for recursive methods or recursive data structures whose values affect future memory allocations.
The complexity of symbolic tools depends on the number of variables, and debugging invariants is hard when looking at global states.
To overcome the aforementioned problems we propose the use of a compositional analysis
Moreover, it enables the analysis of non-analyzable methods by providing summaries, and even of some fragments, by using summaries as stubs.
We start by analyzing the methods in a bottom-up fashion. We analyze each allocation using only local invariants or some counting mechanism, and generate a summary. When a method invokes another method, we use the summary. Notice we need a symbolic calculator capable of supporting arithmetic operations on polynomials within an iteration space.
The challenge here is to follow a compositional approach without losing too much precision. This implies we need some means to include certain lifetime information in the summaries. We also need a symbolic calculator to deal with non-linear constraints such as polynomials.
So now we take the example and include not only the amount of live objects required but also how many of them actually escape the method. Then we analyze the invocation of method m2 in the loop of m1: for each iteration we sum only the objects that escape. For the remaining objects we know they can be collected, so we only consider the iteration that consumes the most. A similar idea is applied to m0: the summary of m1 says that no object escapes, so we can assume that after the call to m1 that memory can be recovered and reused for the requirements of m2. So the consumption of m0 is solved by computing a max between the summaries of m1 and m2. (Should I mention this?) Actually we add the escaping objects from m2 because we are flow insensitive.
OK, I lied a little bit. In order to be more precise, we need to help the caller understand whether some objects escaping from the callee escape the caller's own scope or not. One way we propose to deal with this is by grouping sets of objects according to their expected lifetime. For instance, we can say that all objects escape, or that one object escapes through this and two through the return, or even split that subheap to give more detailed information.
In our analysis we treat the escape analysis as an oracle, and a subheap descriptor can actually be any element that can be mapped to the EA. Example: using Salcianu's PTG, a subheap descriptor can be what they call an inside node, which represents all objects allocated at a program point. Other analyses can use completely different descriptors, like an element of the equivalence classes representing possibly connected objects and references.
Squiggles mean errors: for instance, the method consumes more memory than specified, or an object is declared as non-escaping but we cannot prove it because the analysis says it escapes. (Remove the part about temporary.)
So, how do we check the annotations? Essentially by transforming them into Code Contracts assertions, and by checking lifetime annotations using a points-to and escape analysis.
The idea is to include a counter for each subheap descriptor that is updated at allocations.
For method invocations we use the information from the summary and add the amount to the subheap indicated in the annotation. The problem is that in order to prove the ensures clause we need non-linear arithmetic, which is beyond Clousot.
What we do to overcome this is use our inference technique. We only require linear invariants, which can be inferred or checked by Clousot.
If we think of Tmp as the complement of escaping, we should say 0 even though only one object is escaping; this semantics means that Tmp is a lower bound, which is complex to compute using counting. If we think of it as an over-approximation, we are saying that it is an upper bound.
Regarding polymorphism, in the case of a virtual call we should compute a summary representing all callees. It can be tricky when you have different subheaps, but it is essentially computing a symbolic maximum between polynomials and then applying the symbolic sum to the result. The problem is that the max operation sometimes cannot be resolved completely and cannot be expressed as a polynomial; in these cases it doesn't combine well with the sum operator.
What we do is try to discover whether the receiver actually changes during the loop. If it doesn't change, we can invert the operations: apply the sum for each receiver and then apply the max.
At some point I might say the analysis is flow insensitive. Actually, the analysis would be much more precise if we considered the flow of operations. For instance, having…. The problem is that in some situations it is not very easy to determine the order of execution statically.