Do you want to check the efficiency of the new, state-of-the-art GraalVM JIT compiler against the older but most widely used C2 JIT compiler? Let’s have a side-by-side comparison from a performance standpoint on the same source code.
The talk reveals how the traditional just-in-time compiler from HotSpot/OpenJDK (JIT C2) internally manages runtime optimizations for hot methods, in comparison to the new, state-of-the-art GraalVM JIT compiler on the same source code, emphasizing the internals and strategies each compiler uses to achieve better performance in the most common situations (or code patterns). For each optimization, there is Java source code and the corresponding generated assembly code, to prove what really happens under the hood.
Each test is covered by a dedicated benchmark (JMH), timings and conclusions. Main topics of the agenda: - Scalar replacement - Null checks - Virtual calls - Lock coarsening - Lock elision - Lambdas - Vectorization (a few cases)
The tools used during my research study are JITWatch, the Java Microbenchmark Harness (JMH), and perf. All test scenarios are launched against the latest official Java release (version 11).
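The first agenda item, scalar replacement, can be sketched with a hedged toy example (the class and method names here are mine, not the talk's): when escape analysis proves that a short-lived object never leaves a method, the JIT compiler can replace it with its scalar fields and skip the heap allocation entirely.

```java
// Hypothetical example: the short-lived Point below never escapes
// distanceSquared(), so escape analysis in C2 or the GraalVM compiler can
// scalar-replace it, keeping x and y in registers instead of allocating
// a Point object on the heap.
public class ScalarReplacementDemo {
    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    static long distanceSquared(int x, int y) {
        Point p = new Point(x, y);          // candidate for scalar replacement
        return (long) p.x * p.x + (long) p.y * p.y;
    }

    public static void main(String[] args) {
        long sum = 0;
        for (int i = 0; i < 1_000; i++) {   // hot loop: once compiled, no
            sum += distanceSquared(i, i);   // Point instances should survive
        }
        System.out.println(sum);            // prints 665667000
    }
}
```

Under a JMH benchmark, the GC profiler (`-prof gc`) would be expected to show near-zero allocation per call once scalar replacement kicks in.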
Slides of my talk at QCon Hangzhou 2011, on the things that my team has been doing on the JVM, esp. on customization. Video available at http://www.infoq.com/cn/presentations/ms-jvm-taobao
Postgresql 12 streaming replication hol - Vijay Kumar N
This is a step-by-step, hands-on lab for PostgreSQL 12: setting up replication and replication slots, failing over to (promoting) a standby as the new master cluster, and covering the scenario where the old master has to be reinstated using the "pg_rewind" utility.
What’s the Best PostgreSQL High Availability Framework? PAF vs. repmgr vs. Pa... - ScaleGrid.io
Compare top PostgreSQL high availability frameworks - PostgreSQL Automatic Failover (PAF), Replication Manager (repmgr) and Patroni to improve your app uptime. ScaleGrid blog - https://scalegrid.io/blog/whats-the-best-postgresql-high-availability-framework-paf-vs-repmgr-vs-patroni-infographic/
This talk covers various advanced topics in the area of backups:
- incremental backups;
- archive management;
- backup validation;
- retention policies;
etc.
Based on these features, we'll compare various backup/recovery solutions for PostgreSQL.
This information will help you to choose the most appropriate tool for your system.
My presentation in SDCC 2012 (http://sdcc.csdn.net/index_en.html). The video recording of this session is available at http://v.csdn.hudong.com/s/article.html?arcid=2810640
Decompressed vmlinux: linux kernel initialization from page table configurati... - Adrian Huang
A talk about how the Linux kernel initializes the page table.
Note: When you view the slide deck in a web browser, the screenshots may be blurred. You can download the deck and view it offline, where the screenshots are clear.
Elastic JVM for Scalable Java EE Applications Running in Containers #Jakart... - Jelastic Multi-Cloud PaaS
When configured smartly, Java can be scalable and cost-effective for all ranges of projects, from cloud-native startups to legacy enterprise applications. During this session, we will share our experience in tuning RAM usage in a Java process to make it more elastic and gain the benefits of faster scaling and lower total cost of ownership (TCO). With microservices, cloud hosting, and vertical scaling in mind, we'll compare the top Java garbage collectors to see how efficiently they handle memory resources. The provided results of testing the G1, Parallel, ConcMarkSweep, Serial, Shenandoah, ZGC and OpenJ9 garbage collectors while scaling Java EE applications vertically will help you make the right choice for your own projects.
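As a companion to the comparison above, which collector a running JVM actually selected (via flags such as -XX:+UseG1GC or -XX:+UseZGC) can be inspected at runtime through the standard java.lang.management API. A minimal sketch (the class name is mine):

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Minimal sketch: list the garbage-collector beans the running JVM
// registered, plus current heap usage. The bean names printed depend on
// the JVM vendor and on which GC was selected at startup.
public class GcInspector {
    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount() + " collections");
        }
        long heapUsed = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getUsed();
        System.out.println("heap used (bytes) > 0: " + (heapUsed > 0));
    }
}
```

Running the same program under different -XX GC flags makes the collector swap visible without any third-party tooling.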
More details about Garbage Collector types https://jelastic.com/blog/garbage-collection/
Free registration at Jelastic https://jelastic.com/
EnterpriseDB's Best Practices for Postgres DBAs - EDB
This presentation reviews techniques to become a high performance Postgres DBA such as:
- Day to day monitoring
- Ongoing maintenance tasks including bloat and index maintenance
- Database and OS parameter tuning for performance
- Security practices
- Planning for production deployment
- High availability best practices including strategies for backup and recovery
- Ideas for professional development
To listen to the recording of this presentation, visit Enterprisedb.com > Resources > Webcasts > On Demand Webcasts
An overview of reference architectures for Postgres - EDB
EDB Reference Architectures are designed to help new and existing users alike quickly design a deployment architecture that suits their needs. They can be used either as the blueprint for a deployment, or as the basis for a design that enhances and extends the functionality and features offered.
Add-on architectures allow users to easily extend their core database server deployment to add additional features and functionality "building block" style.
In this webinar, we will review the following architectures:
- Single Node
- Multi Node with Asynchronous Replication
- Multi Node with Synchronous Replication
- Add-on Architectures
Speaker:
Michael Willer
Sales Engineer, EDB
100 bugs in Open Source C/C++ projects - Andrey Karpov
This article demonstrates the capabilities of the static code analysis methodology. Readers are invited to study samples of one hundred errors found in open-source C/C++ projects, all of them detected with the PVS-Studio static code analyzer.
Five cool ways the JVM can run Apache Spark faster - Tim Ellison
The IBM JVM runs Apache Spark fast! This talk explains some of the findings and optimizations from our experience of running Spark workloads.
The talk was originally presented at the SparkEU Summit 2015 in Amsterdam.
The Performance Engineer's Guide To HotSpot Just-in-Time Compilation - Monica Beckwith
Adaptive compilation and the runtime in the OpenJDK HotSpot VM offer significant performance enhancements for our tools and applications in Java and other JVM languages. Understanding how they work gives developers critical insight into HotSpot JIT compilation and runtime techniques, such as vectorization and compressed oops, to assist in understanding performance for both client and server applications. We will focus on the internals of OpenJDK 8, the reference implementation for Java SE 8.
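As a hedged illustration of one technique mentioned above, the reduction loop below (my own example, not from the talk) is the kind of simple counted int loop HotSpot's C2 can auto-vectorize with SIMD instructions on supported hardware; the generated code can be inspected with -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly (which requires the hsdis disassembler plugin).

```java
// Sketch: a reduction over an int[] that the HotSpot JIT can compile to
// SIMD instructions once the method gets hot. The result is the same
// either way; only the generated machine code differs.
public class VectorizableLoop {
    static int sum(int[] a) {
        int s = 0;
        for (int i = 0; i < a.length; i++) {
            s += a[i];          // candidate for auto-vectorization by C2
        }
        return s;
    }

    public static void main(String[] args) {
        int[] data = new int[1024];
        for (int i = 0; i < data.length; i++) data[i] = i;
        System.out.println(sum(data));   // prints 523776
    }
}
```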
HKG15-300: Art's Quick Compiler: An unofficial overview - Linaro
HKG15-300: Art's Quick Compiler: An unofficial overview
---------------------------------------------------
Speaker: Matteo Franchin
Date: February 11, 2015
---------------------------------------------------
★ Session Summary ★
One of the important technical novelties introduced with the recent release of Android Lollipop is the replacement of Dalvik, the VM which was used to execute the bytecode produced from Java apps, with ART, a new Android Run-Time. One interesting aspect of this upgrade is that Just-In-Time compilation was abandoned in favour of Ahead-Of-Time compilation. This delivers better performance [1], while leaving a good margin for future improvements.
ART was designed to support multiple compilers. The compiler that shipped with Android Lollipop is called the “Quick Compiler”. It is simple, fast, and derived from Dalvik’s JIT compiler. In 2014 our team at ARM worked in collaboration with Google to extend ART and its Quick Compiler to add support for 64-bit and for the A64 instruction set. These efforts culminated in the recent release of the Nexus 9 tablet, the first 64-bit Android product to hit the market.
Despite Google’s intention of replacing the Quick Compiler with the so-called “Optimizing Compiler”, the job for the Quick Compiler is not yet over. Indeed, the Quick Compiler will remain the only usable compiler in Android Lollipop. Therefore, all competing parties in the Android ecosystem have a huge interest in investigating and improving this component, which will very likely be one of the battlegrounds in the Android benchmark wars of 2015.
This talk aims to give an unofficial overview of ART’s Quick Compiler. It will first focus on the internal organisation of the compiler, adopting the point of view of a developer who is interested in understanding its limitations and strengths. The talk will then move on to exploring the output produced by the compiler, discussing possible strategies for improving the generated code, while keeping in mind that this component may have a limited life-span, and that any long-term work would be better directed towards the Optimizing Compiler.
[1] The ART runtime, B. Carlstrom, A. Ghuloum, and I. Rogers, Google I/O 2014, https://www.youtube.com/watch?v=EBlTzQsUoOw
--------------------------------------------------
★ Resources ★
Pathable: https://hkg15.pathable.com/meetings/250804
Video: https://www.youtube.com/watch?v=iho-e7EPHk0
Etherpad: N/A
---------------------------------------------------
★ Event Details ★
Linaro Connect Hong Kong 2015 - #HKG15
February 9-13th, 2015
Regal Airport Hotel Hong Kong Airport
---------------------------------------------------
http://www.linaro.org
http://connect.linaro.org
We are regularly asked to check various open-source projects with the PVS-Studio analyzer. If you want to offer some project for us to analyze too, please follow this link. Another project we have checked is Dolphin-emu.
How to write clean & testable code without losing your mind - Andreas Czakaj
If you create software that is to be developed continuously over several years you'll need a sustainable approach to code quality.
In our early days of AEM development, however, we used to struggle with code that is rigid, hard to test and full of LOG.debug calls.
In this talk I will share some development best practices we have found that really work in actual AEM based software, e.g. to achieve 100% code coverage and provide high confidence in the code base.
Spoiler alert: no new libraries, frameworks or tools are required - once you know the ideas, plain old TDD and the S.O.L.I.D. principles of Clean Code will do the trick.
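One of those S.O.L.I.D. ideas, dependency inversion, can be sketched without any framework (this toy example and its names are mine, not from the talk): injecting a small interface gives a plain-old-TDD test a seam for substituting a deterministic fake.

```java
// Toy illustration of dependency inversion for testability: Greeter
// depends on the Clock interface, not on System.currentTimeMillis(),
// so a test can inject a fixed fake clock and get deterministic output.
public class TestableCodeDemo {
    interface Clock { long now(); }                  // the seam for tests

    static class Greeter {
        private final Clock clock;
        Greeter(Clock clock) { this.clock = clock; } // injected, not hard-wired
        String greet() {
            // before noon (43,200,000 ms into the day) vs. after
            return clock.now() < 43_200_000L ? "Good morning" : "Good afternoon";
        }
    }

    public static void main(String[] args) {
        Greeter morning = new Greeter(() -> 0L);            // fake: midnight
        Greeter afternoon = new Greeter(() -> 50_000_000L); // fake: afternoon
        System.out.println(morning.greet());
        System.out.println(afternoon.greet());
    }
}
```

The same seam is what makes 100% coverage realistic: no sleeping, no mocking frameworks, just a substituted collaborator.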
by Andreas Czakaj, mensemedia Gesellschaft für Neue Medien mbH
Presented at the adaptTo() 2017 conference in Berlin (https://adapt.to/2017/en/schedule/how-to-write-clean---testable-code-without-losing-your-mind.html).
Presentation video can be found on YouTube (https://www.youtube.com/watch?v=JbJw5oN_zL4)
Browser exploitation techniques and low-level binary exploitation. This presentation was given at the SEC-T cybersecurity conference in Stockholm in September 2019.
The robotics industry promises major operational benefits, although no one is quite sure where robots and the Industrial Internet of Things (IIoT) will take manufacturing. IoT represents a closing of the gap between production and IT and is seen as the next big step for automation.
Let's look at some of the algorithms and safety concepts behind this topic.
Top 10 bugs in C++ open source projects, checked in 2016 - PVS-Studio
While the world is discussing the 89th Oscar awards ceremony and rankings of actors and costumes, we've decided to write a review article about the IT sphere. The article covers the most interesting bugs made in open-source projects in 2016. This year was remarkable for our tool, as PVS-Studio became available on Linux. The errors we present are, hopefully, already fixed, but every reader can see how serious the mistakes made by developers can be.
Massively scalable ETL in real world applications: the hard way - J On The Beach
Big Data examples always give the correct answers. However, in the real world, Big Data might be corrupt, contradictory or consist of so many small files it becomes extremely hard to keep track - let alone scale. A solid architecture will help to overcome many of the difficulties.
Floris will talk about a real-world implementation of a massively scalable ETL architecture. Two years ago, at the time of the implementation, Airflow had just become part of Apache and still left much to be desired. However, the requirements from the start were thousands of ETL tasks per day on average, and on occasion this could become hundreds of thousands. The script-based method that was in place could no longer meet the requirements on a day-to-day basis and needed to be replaced as soon as possible, so this custom framework was rolled out in just 8 weeks of development time.
Traditional big data work is done on data you have: you load the data into a repository and perform MapReduce or other calculations on it. However, certain industries need to perform complex operations on data you might not have. Data you can acquire, data that can be shared with you, and data that you can model are all types of data you may not have but may need to integrate instantly into a complex data analysis. The problem is: you may not even know you need this data until deep into the execution stack at runtime. This talk discusses a new functional language paradigm for dealing naturally with data you don't have, and how to make all data first-class citizens, regardless of whether you have it or not. We will give a demo of a project written in Scala that deals with exactly this issue.
Acoustic Time Series in Industry 4.0: Improved Reliability and Cyber-Security... - J On The Beach
Industry 4.0, aka the "Fourth Industrial Revolution," refers to the computerization of manufacturing. One important aspect of Industry 4.0 is the ability to monitor the health and reliability of a physical manufacturing plant using low-cost IoT sensors. For example, machine learning models can be trained to predict the physical degradation of a manufacturing system as a function of acoustic measurements obtained from strategically placed microphones; however, the same acoustic measurements can be used to reverse engineer proprietary information about the manufacturing process and/or precisely what is being manufactured at the time of recording. Thus, improved reliability and fault tolerance is achieved at the cost of what appears to be an unprecedented new class of security vulnerabilities related to the acoustic side channel.
As a case study, we report a novel acoustic side channel attack against a commercial DNA synthesizer, a commonly used instrument in fields such as synthetic biology. Using a smart phone-quality microphone placed on or in the near vicinity of a DNA synthesizer, we were able to determine with 88.07% accuracy the sequence of DNA being produced; using a database of biologically relevant known-sequences, we increased the accuracy of our model to 100%. An academic or industrial research project may use the synthetic DNA to engineer an organism with desired traits or functions; however, while the organism is still under development, prior to publication, patent, and/or copyright, the research remains vulnerable to academic intellectual property theft and/or industrial espionage. On the other hand, this attack could also be used for benevolent purposes, for example, to determine whether a suspected criminal or terrorist is engineering a harmful pathogen. Thus, it is essential to recognize both the benefits and risks inherent to the cyber-physical systems that will inevitably control Industry 4.0 manufacturing processes and to take steps to mitigate them whenever possible.
Where is the edge in IoT and how much can you do there? Data collection? Analytics? I’ll show you how to build and deploy an embedded IoT edge platform that can do data collection, analytics, dashboarding and much more. All using Open Source.
As IoT deployments move forward, the need to collect, analyze, and respond to data further out on the edge becomes a critical factor in the success – or failure – of any IoT project. Network bandwidth costs may be dropping, and storage is cheaper than ever, but at IoT scale, these costs can still quickly overrun a project’s budget and ultimately doom it to failure.
The more you centralize your data collection and storage, the higher these costs become. Edge data collection and analysis can dramatically lower these costs, plus decrease the time to react to critical sensor data. With most data platforms, it simply isn’t practical, or even possible, to push collection AND analytics to the edge. In this talk I’ll show how I’ve done exactly this with a combination of open source hardware – Pine64 – and open source software – InfluxDB – to build a practical, efficient and scalable data collection and analysis gateway device for IoT deployments. The edge is where the data is, so the edge is where the data collection and analytics needs to be.
Drinking from the firehose, with virtual streams and virtual actors - J On The Beach
Event Stream Processing is a popular paradigm for building robust and performant systems in many different domains, from IoT to fraud detection to high-frequency trading. Because of the wide range of scenarios and requirements, it is difficult to conceptualize a unified programming model that would be equally applicable to all of them. Another tough challenge is how to build streaming systems with cardinalities of topics ranging from hundreds to billions while delivering good performance and scalability.
In this session, Sergey Bykov will talk about the journey of building Orleans Streams that originated in gaming and monitoring scenarios, and quickly expanded beyond them. He will cover the programming model of virtual streams that emerged as a natural extension of the virtual actor model of Orleans, the architecture of the underlying runtime system, the compromises and hard choices made in the process. Sergey will share the lessons learned from the experience of running the system in production, and future ideas and opportunities that remain to be explored.
Over the last twenty years, there has been a paradigm shift in software development: from meticulously planned release cycles to an experimental way of working in which lead times are becoming shorter and shorter.
How can Java ever keep up with this trend when we have Docker containers that are several hundred megabytes in size, with warm-up times of ten minutes or longer? In this talk, I'll demonstrate how we can use Quarkus so that we can create super small, super fast Java containers! This will give us better possibilities for scaling up and down - which can be a game-changer, especially in a serverless environment. It will also provide the shortest possible lead times, as well as a much better use of cloud performance with the added bonus of lower costs.
When Cloud Native meets the Financial Sector - J On The Beach
We live in our own bubble of microservices and endlessly horizontal scaling infrastructure, but there is still critical infrastructure that runs the world of financial systems depending on Windows boxes, FTP servers, and single-threaded protocols. This talk is about how to glue these two worlds together, what works for us and what doesn't.
The advancement of technology in the last decade or so has allowed astronomy to see exponential growth in data volumes. ESA's space telescope Euclid will gather high-resolution images of a third of the sky, ~850GB of data downloaded daily for 6 years, by 2032 ground-based telescope LSST will have generated 500PB of data and the radio telescope SKA will be producing more data per second than the entire internet worldwide. This talk will address the questions of what current techniques exist to address big data volumes, how the astronomical community will prepare for this big data wave, and what other challenges lie ahead?
The world is moving from a model where data sits at rest, waiting for people to make requests of it, to where data is constantly moving and streams of data flow to and from devices with or without human interaction. Decisions need to be made based on these streams of data in real-time, models need to be updated, and intelligence needs to be gathered. In this context, our old-fashioned approach of CRUD REST APIs serving CRUD database calls just doesn't cut it. It's time we moved to a stream-centric view of the world.
The TIPPSS Imperative for IoT - Ensuring Trust, Identity, Privacy, Protection... - J On The Beach
Our increasingly connected world, leveraging the Internet of Things (IoT), creates great value in connected healthcare, smart cities, and more. The increasing use of IoT also creates great risk. We will discuss the challenges and risks we need to address as developers in TIPPSS - Trust, Identity, Privacy, Protection, Safety, and Security - for the devices, systems and solutions we deliver and use. Florence leads IEEE workstreams on clinical IoT and data interoperability with blockchain addressing TIPPSS issues. She is an author of the IEEE articles "Enabling Trust and Security - TIPPSS for IoT" and "Wearables and Medical Interoperability - the Evolving Frontier", of "TIPPSS for Smart Cities" in the 2017 book "Creating, Analysing and Sustaining Smarter Cities: A Systems Perspective", and Editor-in-Chief of an upcoming book on "Women Securing the Future with TIPPSS for IoT."
Pushing AI to the Client with WebAssembly and Blazor - J On The Beach
Want to run your AI algorithms directly in the browser on the client-side? Now you can with WebAssembly and Blazor. Join us as we write code directly in WebAssembly. Then, we’ll look at Blazor and how you can use it, along with WebAssembly to run your tooling client side in the browser.
Want to run your AI algorithms directly in the browser on the client-side without the need for transpilers or browser plug-ins? Well, now you can with WebAssembly and Blazor. WebAssembly (WASM) is the W3C specification that will be used to provide the next generation of development tools for the web and beyond. Blazor is Microsoft’s experiment that allows ASP.Net developers to create web pages that do much of the scripting work in C# using WASM. Come join us as we learn to write code directly in WebAssembly’s human-readable format. Then, we’ll look at the current state of Blazor and how you can use it, along with WebAssembly to run your tooling client side in the browser.
The Raft protocol is a well-known protocol for consensus in distributed systems. Want to learn how consensus is achieved in a system with a large amount of data, such as Axon Server's Event Store? Join this talk to hear all the specifics of data replication in a highly available Event Store!
Axon is a free and open source Java framework for writing Java applications following DDD, event sourcing, and CQRS principles. While especially useful in a microservices context, Axon provides great value in building structured monoliths that can be broken down into microservices when needed.
Axon Server is a messaging platform specifically built to support distributed Axon applications. One of its key benefits is storing events published by Axon applications. In not-so-rare cases, the number of these events exceeds millions, even billions. The availability of Axon Server plays a significant role in the product portfolio. To keep event replication reliable, we chose the Raft protocol as the consensus implementation for our clustering features.
In short, consensus involves multiple servers agreeing on values. Once they reach a decision on a value, that decision is final. Typical consensus algorithms make progress when any majority of their servers is available; for example, a cluster of 5 servers can continue to operate even if 2 servers fail. If more servers fail, they stop making progress (but will never return an incorrect result).
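The majority arithmetic described above can be sketched in a few lines (a toy illustration of the general consensus rule, not Axon Server code):

```java
// A Raft-style cluster of n servers needs a majority, floor(n/2) + 1,
// to commit a value, so it can tolerate floor((n-1)/2) failed servers
// while still making progress.
public class RaftQuorum {
    static int quorum(int servers) { return servers / 2 + 1; }
    static int tolerableFailures(int servers) { return (servers - 1) / 2; }

    public static void main(String[] args) {
        for (int n : new int[] {3, 5, 7}) {
            System.out.println(n + " servers: quorum=" + quorum(n)
                    + ", tolerates " + tolerableFailures(n) + " failures");
        }
    }
}
```

This matches the example in the abstract: a cluster of 5 servers keeps operating with 2 servers down, because the remaining 3 still form a majority.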
Join this talk to learn why we chose Raft, what our findings were during the design, implementation, and testing phases, and what it means to replicate an event store holding billions of events!
The Six Pitfalls of building a Microservices Architecture (and how to avoid t... - J On The Beach
Thinking of moving to Microservices? Watch out! That quest is full of traps, social traps. If you are not able to handle it, you may be blocked by meetings, frustration, endless challenges that will make you miss the monolith. In this talk, I share my experience and mistakes, so you can avoid them.
Creating or migrating to a Microservices architecture might easily become a big mess, not only due to technical challenges but mostly because of human factors: it’s a major change in the software culture of a company. In this talk, I’ll share my past experience as the technical lead of an ambitious Microservices-based product, I’ll go through the parts we struggled with, and give you some advice on how to deal with what I call the Six Pitfalls:
The Common Patterns Phobia
The Book Club Cult
The Never-Decoupled Story
The Buzz Words Syndrome
The Agile Trap
The Conway’s Law Hackers
Instead of randomly injecting faults (i.e. Chaos Monkey), what if we could order our experiments to perform the minimum number of experiments for maximum yield? We present a solution (and results) to the problem of experiment selection, using Lineage-Driven Fault Injection to reduce the search space of faults.
Lineage-Driven Fault Injection (LDFI) is a state-of-the-art technique for chaos engineering experiment selection. Since its inception, LDFI has used a SAT solver under the hood, which presents solutions to the decision problem (which faults to inject) in no particular order. As SREs, we would like to perform the experiments that reveal the bugs customers are most likely to hit first. In this talk, we present new improvements to LDFI that order the experiment suggestions.
In the first half of the talk, we will show that LDFI is a technique that can be widely used within an enterprise. We present the motivation for ordering the chaos experiments, along with some prioritization we utilized while conducting the experiments. We also highlight how ordering is a general-purpose technique that we can use to encode the peculiarities of a heterogeneous microservices architecture. LDFI can work in an enterprise by harnessing the observability infrastructure to model the redundancy of the system.
Next, we present experiments conducted within our organization using ordered LDFI, along with some preliminary results. We show examples of services where we discovered bugs, and how carefully controlling the order of experiments allowed LDFI to avoid running unnecessary ones. We also present an example of an application where we declared the service shippable under the crash-stop model, and a comparison with Chaos Monkey showing how LDFI found the known bugs in a given application using orders of magnitude fewer experiments than a random fault injection tool.
Finally, we discuss how we plan to take LDFI forward: open problems and possible solutions for scalarizing probabilities of failure, latency injection, integration with service mesh technologies like Envoy for fine-grained fault injection, and fault injection for stateful systems.
Key takeaways: 1) Understand how LDFI can be integrated in the enterprise by harnessing the observability infrastructure. 2) Limitations of LDFI w.r.t unordered solutions and why ordering matters for chaos engineering experiments. 3) Preliminary results of prioritized LDFI and a future direction for the community.
Complexity in systems should be defeated if it is possible to do. But the default nature of our computer systems are complex and servers are doomed to fail. In this talk, we will go through new approaches in modern architectures to design and evaluate new computer systems.
Interaction Protocols: It's all about good mannersJ On The Beach
Distributed systems collaborate to achieve collective goals via a system of rules. Rules that affords good hygiene, fault tolerance, effective communication and trusted feedback. These rules form protocols which enable the system to achieve its goals.
Distributed and concurrent systems can be considered a social group that collaborates to achieve collective goals. In order to collaborate a system of rules must be applied, that affords good hygiene, fault tolerance, and effective communication to coordinate, share knowledge, and provide feedback in a polite trusted manner. These rules form a number of protocols which enable the group to act as a system which is greater than the sum of the individual components.
In this talk, we will explore the history of protocols and their application when building distributed systems.
Leadership is easy when you're a manager, or an expert in a field, or a conference speaker! In a Kanban organisation, though, we "encourage acts of leadership at every level". In this talk, we look at what it means to be a leader in the uncertain, changing and high-learning environment of software development. We learn about the importance of safety in encouraging others to lead and follow, and how to get that safety using both technical and human practices; the necessity of a clear, compelling vision and provision of information on how we're achieving it; and the need to be able to ask awkward and difficult questions... especially the ones without easy answers.
Machine Learning: The Bare Math Behind LibrariesJ On The Beach
During this presentation, we will answer how much you’ll need to invest in a superhero costume to be as popular as Superman. We will generate a unique logo which will stand against the ever popular Batman and create new superhero teams. We shall achieve it using linear regression and neural networks.
Machine learning is one of the hottest buzzwords in technology today as well as one of the most innovative fields in computer science – yet people use libraries as black boxes without basic knowledge of the field. In this session, we will strip them to bare math, so next time you use a machine learning library, you’ll have a deeper understanding of what lies underneath.
During this session, we will first provide a short history of machine learning and an overview of two basic teaching techniques: supervised and unsupervised learning.
We will start by defining what machine learning is and equip you with an intuition of how it works. We will then explain the gradient descent algorithm with the use of simple linear regression to give you an even deeper understanding of this learning method. Then we will project it to supervised neural networks training.
Within unsupervised learning, you will become familiar with Hebb’s learning and learning with concurrency (winner takes all and winner takes most algorithms). We will use Octave for examples in this session; however, you can use your favourite technology to implement presented ideas.
Our aim is to show the mathematical basics of neural networks for those who want to start using machine learning in their day-to-day work or use it already but find it difficult to understand the underlying processes. After viewing our presentation, you should find it easier to select parameters for your networks and feel more confident in your selection of network type, as well as be encouraged to dive into more complex and powerful deep learning methods.
Getting started with Deep Reinforcement LearningJ On The Beach
Reinforcement Learning is a hot topic in Artificial Intelligence (AI) at the moment with the most prominent example of AlphaGo Zero. It shifted the boundaries of what was believed to be possible with AI. In this talk, we will have a look into Reinforcement Learning and its implementation.
Reinforcement Learning is a class of algorithms which trains an agent to act optimally in an environment. The most prominent example is AlphaGo Zero, where the agent is trained to place tokens on the board of Go in order to win the game. AlphaGo Zero has won against the world champion which was thought to be impossible at that time. This was enabled by combining Reinforcement Learning with Deep Neural Networks and is today known as Deep Reinforcement Learning. This has shifted the frontier of Artificial Intelligence and enabled multiple complex use cases, among them controlling the cooling devices in the server rooms by google. Applying Deep Reinforcement Learning saved them several million in power costs. In this talk, we will understand the basics of Deep Reinforcement Learning and implement a simple example. We will have a look at OpenAIs gym which is the defacto standard for Reinforcement Learning environments. This will enable the audience to implement both an environment and Reinforcement Learning agent on their own.
Into the Box Keynote Day 2: Unveiling amazing updates and announcements for modern CFML developers! Get ready for exciting releases and updates on Ortus tools and products. Stay tuned for cutting-edge innovations designed to boost your productivity.
Check out the webinar slides to learn more about how XfilesPro transforms Salesforce document management by leveraging its world-class applications. For more details, please connect with sales@xfilespro.com
If you want to watch the on-demand webinar, please click here: https://www.xfilespro.com/webinars/salesforce-document-management-2-0-smarter-faster-better/
Software Engineering, Software Consulting, Tech Lead.
Spring Boot, Spring Cloud, Spring Core, Spring JDBC, Spring Security,
Spring Transaction, Spring MVC,
Log4j, REST/SOAP WEB-SERVICES.
Your Digital Assistant.
Making complex approach simple. Straightforward process saves time. No more waiting to connect with people that matter to you. Safety first is not a cliché - Securely protect information in cloud storage to prevent any third party from accessing data.
Would you rather make your visitors feel burdened by making them wait? Or choose VizMan for a stress-free experience? VizMan is an automated visitor management system that works for any industries not limited to factories, societies, government institutes, and warehouses. A new age contactless way of logging information of visitors, employees, packages, and vehicles. VizMan is a digital logbook so it deters unnecessary use of paper or space since there is no requirement of bundles of registers that is left to collect dust in a corner of a room. Visitor’s essential details, helps in scheduling meetings for visitors and employees, and assists in supervising the attendance of the employees. With VizMan, visitors don’t need to wait for hours in long queues. VizMan handles visitors with the value they deserve because we know time is important to you.
Feasible Features
One Subscription, Four Modules – Admin, Employee, Receptionist, and Gatekeeper ensures confidentiality and prevents data from being manipulated
User Friendly – can be easily used on Android, iOS, and Web Interface
Multiple Accessibility – Log in through any device from any place at any time
One app for all industries – a Visitor Management System that works for any organisation.
Stress-free Sign-up
Visitor is registered and checked-in by the Receptionist
Host gets a notification, where they opt to Approve the meeting
Host notifies the Receptionist of the end of the meeting
Visitor is checked-out by the Receptionist
Host enters notes and remarks of the meeting
Customizable Components
Scheduling Meetings – Host can invite visitors for meetings and also approve, reject and reschedule meetings
Single/Bulk invites – Invitations can be sent individually to a visitor or collectively to many visitors
VIP Visitors – Additional security of data for VIP visitors to avoid misuse of information
Courier Management – Keeps a check on deliveries like commodities being delivered in and out of establishments
Alerts & Notifications – Get notified on SMS, email, and application
Parking Management – Manage availability of parking space
Individual log-in – Every user has their own log-in id
Visitor/Meeting Analytics – Evaluate notes and remarks of the meeting stored in the system
Visitor Management System is a secure and user friendly database manager that records, filters, tracks the visitors to your organization.
"Secure Your Premises with VizMan (VMS) – Get It Now"
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...Globus
Large Language Models (LLMs) are currently the center of attention in the tech world, particularly for their potential to advance research. In this presentation, we'll explore a straightforward and effective method for quickly initiating inference runs on supercomputers using the vLLM tool with Globus Compute, specifically on the Polaris system at ALCF. We'll begin by briefly discussing the popularity and applications of LLMs in various fields. Following this, we will introduce the vLLM tool, and explain how it integrates with Globus Compute to efficiently manage LLM operations on Polaris. Attendees will learn the practical aspects of setting up and remotely triggering LLMs from local machines, focusing on ease of use and efficiency. This talk is ideal for researchers and practitioners looking to leverage the power of LLMs in their work, offering a clear guide to harnessing supercomputing resources for quick and effective LLM inference.
Modern design is crucial in today's digital environment, and this is especially true for SharePoint intranets. The design of these digital hubs is critical to user engagement and productivity enhancement. They are the cornerstone of internal collaboration and interaction within enterprises.
Unleash Unlimited Potential with One-Time Purchase
BoxLang is more than just a language; it's a community. By choosing a Visionary License, you're not just investing in your success, you're actively contributing to the ongoing development and support of BoxLang.
Why React Native as a Strategic Advantage for Startup Innovation.pdfayushiqss
Do you know that React Native is being increasingly adopted by startups as well as big companies in the mobile app development industry? Big names like Facebook, Instagram, and Pinterest have already integrated this robust open-source framework.
In fact, according to a report by Statista, the number of React Native developers has been steadily increasing over the years, reaching an estimated 1.9 million by the end of 2024. This means that the demand for this framework in the job market has been growing making it a valuable skill.
But what makes React Native so popular for mobile application development? It offers excellent cross-platform capabilities among other benefits. This way, with React Native, developers can write code once and run it on both iOS and Android devices thus saving time and resources leading to shorter development cycles hence faster time-to-market for your app.
Let’s take the example of a startup, which wanted to release their app on both iOS and Android at once. Through the use of React Native they managed to create an app and bring it into the market within a very short period. This helped them gain an advantage over their competitors because they had access to a large user base who were able to generate revenue quickly for them.
Strategies for Successful Data Migration Tools.pptxvarshanayak241
Data migration is a complex but essential task for organizations aiming to modernize their IT infrastructure and leverage new technologies. By understanding common challenges and implementing these strategies, businesses can achieve a successful migration with minimal disruption. Data Migration Tool like Ask On Data play a pivotal role in this journey, offering features that streamline the process, ensure data integrity, and maintain security. With the right approach and tools, organizations can turn the challenge of data migration into an opportunity for growth and innovation.
First Steps with Globus Compute Multi-User EndpointsGlobus
In this presentation we will share our experiences around getting started with the Globus Compute multi-user endpoint. Working with the Pharmacology group at the University of Auckland, we have previously written an application using Globus Compute that can offload computationally expensive steps in the researcher's workflows, which they wish to manage from their familiar Windows environments, onto the NeSI (New Zealand eScience Infrastructure) cluster. Some of the challenges we have encountered were that each researcher had to set up and manage their own single-user globus compute endpoint and that the workloads had varying resource requirements (CPUs, memory and wall time) between different runs. We hope that the multi-user endpoint will help to address these challenges and share an update on our progress here.
Listen to the keynote address and hear about the latest developments from Rachana Ananthakrishnan and Ian Foster who review the updates to the Globus Platform and Service, and the relevance of Globus to the scientific community as an automation platform to accelerate scientific discovery.
Providing Globus Services to Users of JASMIN for Environmental Data AnalysisGlobus
JASMIN is the UK’s high-performance data analysis platform for environmental science, operated by STFC on behalf of the UK Natural Environment Research Council (NERC). In addition to its role in hosting the CEDA Archive (NERC’s long-term repository for climate, atmospheric science & Earth observation data in the UK), JASMIN provides a collaborative platform to a community of around 2,000 scientists in the UK and beyond, providing nearly 400 environmental science projects with working space, compute resources and tools to facilitate their work. High-performance data transfer into and out of JASMIN has always been a key feature, with many scientists bringing model outputs from supercomputers elsewhere in the UK, to analyse against observational or other model data in the CEDA Archive. A growing number of JASMIN users are now realising the benefits of using the Globus service to provide reliable and efficient data movement and other tasks in this and other contexts. Further use cases involve long-distance (intercontinental) transfers to and from JASMIN, and collecting results from a mobile atmospheric radar system, pushing data to JASMIN via a lightweight Globus deployment. We provide details of how Globus fits into our current infrastructure, our experience of the recent migration to GCSv5.4, and of our interest in developing use of the wider ecosystem of Globus services for the benefit of our user community.
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...Globus
The Earth System Grid Federation (ESGF) is a global network of data servers that archives and distributes the planet’s largest collection of Earth system model output for thousands of climate and environmental scientists worldwide. Many of these petabyte-scale data archives are located in proximity to large high-performance computing (HPC) or cloud computing resources, but the primary workflow for data users consists of transferring data, and applying computations on a different system. As a part of the ESGF 2.0 US project (funded by the United States Department of Energy Office of Science), we developed pre-defined data workflows, which can be run on-demand, capable of applying many data reduction and data analysis to the large ESGF data archives, transferring only the resultant analysis (ex. visualizations, smaller data files). In this talk, we will showcase a few of these workflows, highlighting how Globus Flows can be used for petabyte-scale climate analysis.
Prosigns: Transforming Business with Tailored Technology SolutionsProsigns
Unlocking Business Potential: Tailored Technology Solutions by Prosigns
Discover how Prosigns, a leading technology solutions provider, partners with businesses to drive innovation and success. Our presentation showcases our comprehensive range of services, including custom software development, web and mobile app development, AI & ML solutions, blockchain integration, DevOps services, and Microsoft Dynamics 365 support.
Custom Software Development: Prosigns specializes in creating bespoke software solutions that cater to your unique business needs. Our team of experts works closely with you to understand your requirements and deliver tailor-made software that enhances efficiency and drives growth.
Web and Mobile App Development: From responsive websites to intuitive mobile applications, Prosigns develops cutting-edge solutions that engage users and deliver seamless experiences across devices.
AI & ML Solutions: Harnessing the power of Artificial Intelligence and Machine Learning, Prosigns provides smart solutions that automate processes, provide valuable insights, and drive informed decision-making.
Blockchain Integration: Prosigns offers comprehensive blockchain solutions, including development, integration, and consulting services, enabling businesses to leverage blockchain technology for enhanced security, transparency, and efficiency.
DevOps Services: Prosigns' DevOps services streamline development and operations processes, ensuring faster and more reliable software delivery through automation and continuous integration.
Microsoft Dynamics 365 Support: Prosigns provides comprehensive support and maintenance services for Microsoft Dynamics 365, ensuring your system is always up-to-date, secure, and running smoothly.
Learn how our collaborative approach and dedication to excellence help businesses achieve their goals and stay ahead in today's digital landscape. From concept to deployment, Prosigns is your trusted partner for transforming ideas into reality and unlocking the full potential of your business.
Join us on a journey of innovation and growth. Let's partner for success with Prosigns.
Globus Connect Server Deep Dive - GlobusWorld 2024Globus
We explore the Globus Connect Server (GCS) architecture and experiment with advanced configuration options and use cases. This content is targeted at system administrators who are familiar with GCS and currently operate—or are planning to operate—broader deployments at their institution.
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoamtakuyayamamoto1800
In this slide, we show the simulation example and the way to compile this solver.
In this solver, the Helmholtz equation can be solved by helmholtzFoam. Also, the Helmholtz equation with uniformly dispersed bubbles can be simulated by helmholtzBubbleFoam.
6. www.ionutbalosin.com
Configuration
CPU: Intel i7-8550U Kaby Lake R
Memory: 16 GB DDR4 2400 MHz
OS / Kernel: Ubuntu 18.10 / 4.18.0-17-generic
Java: version 12, Linux x64
JVM: HotSpot / GraalVM CE
JMH: v1.21 - 5x10s warm-up iterations, 5x10s measurement iterations, 3 JVM forks, single threaded
Note: C2 JIT might need fewer warm-up iterations, being a native (C++) compiler.
9. Fair comparison?
Well … it depends on the perspective:
- Application/end user: performance and (licensing) costs are the leading factors.
- Compiler internals (e.g. implementation, maturity): very different.
10-11. Graal JIT vs C2 JIT
Graal JIT:
✓ Java implementation
✓ still early stage (recently became production ready; e.g. v19.0.0)
✓ targets JVM-based languages (e.g. Java, Scala, Kotlin), dynamic languages (e.g. JavaScript, Ruby, R) but also native languages (e.g. C/C++, Rust)
✓ modular design (JVMCI plug-in architecture)
✓ Ahead-Of-Time flavor
C2 JIT:
✓ C++ implementation
✓ mature / production-grade runtime
✓ written mainly to optimize Java source code
✓ essentially a local maximum of performance, reaching the end of its design lifetime
✓ latest improvements were focused more on intrinsics and vectorization than on adding new optimization heuristics
12. Nevertheless, nowadays the baseline for Graal JIT still remains C2 JIT!
The target is to bring its performance at least on par with C2 JIT, especially on real-world applications / production code.
13. There are 3 main distinguishing factors for Graal JIT:
- Improved inlining
- Partial Escape Analysis
- Guard optimizations for efficient handling of speculative code
14. 1. Partial Escape Analysis (PEA) determines the escapability of objects on individual branches and reduces object allocations even if the object escapes on some paths.
PEA works on the compiler's intermediate representation, not on bytecode.
It is effective in conjunction with other parts of the compiler, such as inlining, global value numbering, and constant folding.
15. 2. The improved inlining strategy explores the call tree and performs late inlining by default, instead of inlining during bytecode parsing.
Graal JIT devises three novel heuristics: adaptive decision thresholds, callsite clustering, and deep inlining trials.
16. 3. Guard optimizations for efficient handling of speculative code have their biggest impact on the JavaScript, Ruby, and R implementations built on Graal's multi-language framework, Truffle.
18. Escape Analysis (EA) analyses the scope of a new object and decides whether it might be allocated on the heap or not.
EA is not an optimization in itself but rather an analysis phase for the optimizer. The result of this analysis allows the compiler to perform optimizations on object allocations, synchronization primitives, and field accesses.
Object scopes:
✓ NoEscape (local to the current method/thread)
✓ ArgEscape (passed as an argument but not visible to other threads)
✓ GlobalEscape (visible outside of the current method/thread)
19. For NoEscape objects, the compiler can remap accesses to the object's fields into accesses to synthetic local operands: the Scalar Replacement optimization.
20-21. CASE I
@Param({"3"}) private int param;
@Param({"128"}) private int size;

@Benchmark
public int no_escape() {
    LightWrapper w = new LightWrapper(param);
    return w.f1 + w.f2;
}

public static class LightWrapper {
    public int f1;
    public int f2;
    //...
}

After Scalar Replacement, the benchmark body is equivalent to:
    return f1 + f2;
22-23. CASE II
@Param({"3"}) private int param;
@Param({"128"}) private int size;

@Benchmark
public int no_escape_object_array() {
    HeavyWrapper w = new HeavyWrapper(param, size);
    return w.f1 + w.f2 + w.z.length;
}

public static class HeavyWrapper {
    public int f1;
    public int f2;
    public byte z[];
    // ...
}

After Scalar Replacement, the benchmark body is equivalent to:
    return f1 + f2 + length;
24-26. CASE III
@Param(value = {"false"}) private boolean objectEscapes;

@Benchmark
public void branch_escape_object(Blackhole bh) {
    HeavyWrapper wrapper = new HeavyWrapper(param, size);
    HeavyWrapper result;
    if (objectEscapes) // false
        result = wrapper;
    else
        result = globalCachedWrapper;
    bh.consume(result);
}

The branch is never taken, so wrapper never escapes; its scope is NoEscape.
Based on Partial Escape Analysis, the wrapper object allocation is eliminated!
27-28. CASE IV
@Benchmark
public boolean arg_escape_object() {
    HeavyWrapper wrapper1 = new HeavyWrapper(param, size);
    HeavyWrapper wrapper2 = new HeavyWrapper(param, size);
    boolean match = false;
    if (wrapper1.equals(wrapper2))
        match = true;
    return match;
}

wrapper1's scope is NoEscape.
wrapper2's scope is:
- NoEscape, if inlining of equals() succeeds
- ArgEscape, if inlining fails or is disabled
31. Conclusions
Graal JIT was able to get rid of the heap allocations, offering a constant response time of around 3 ns/op.
C1/C2 JIT struggles to get rid of the heap allocations if:
- there is a condition which does not make it obvious at compile time whether the object escapes,
- or the object scope becomes local (i.e. NoEscape) only after inlining.
C1/C2 JIT by default does not eliminate array allocations if the array size (i.e. the number of elements) is greater than 64; the limit can be tuned via:
    -XX:EliminateAllocationArraySizeLimit=<arraySize>
50. Conclusions - bimorphic call site
Both compilers (Graal JIT and C2 JIT) optimize bimorphic call sites in a similar way:
- they inline both implementations,
- and add a deoptimization routine (Graal JIT) or an uncommon trap (C2 JIT) for another (unknown) possible target invocation.
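For readers who want to picture what a bimorphic call site looks like, here is a minimal sketch (not from the slides; all names are made up). The virtual call in totalArea only ever observes two receiver types, so a JIT can inline both behind a type check and guard against any third type with a deoptimization/uncommon trap:

```java
// A bimorphic call site: sh.area() only ever sees Circle and Square.
public class BimorphicSite {
    interface Shape { double area(); }

    static final class Circle implements Shape {
        final double r;
        Circle(double r) { this.r = r; }
        public double area() { return Math.PI * r * r; }
    }

    static final class Square implements Shape {
        final double s;
        Square(double s) { this.s = s; }
        public double area() { return s * s; }
    }

    // The call sh.area() below is the bimorphic call site: the profile
    // records exactly two receiver types, so both bodies can be inlined.
    static double totalArea(Shape[] shapes) {
        double sum = 0;
        for (Shape sh : shapes) sum += sh.area();
        return sum;
    }

    public static void main(String[] args) {
        Shape[] shapes = { new Circle(1), new Square(2) };
        System.out.println(totalArea(shapes)); // Math.PI + 4
    }
}
```

Loading a third Shape implementation and calling totalArea with it would invalidate the speculative inlining and trigger the deoptimization path mentioned above.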
58. Conclusions - megamorphic_3 call site
Graal JIT optimizes three target method calls by:
- inlining the target implementations,
- adding a deoptimization routine for another (unknown) possible target invocation.
C2 JIT does not perform any further optimization; there is a virtual call (e.g. a vtable lookup) due to the "polluted profile" context (i.e. the method is used in many different contexts with independent operand types).
Note: megamorphic_4(), megamorphic_5(), and megamorphic_6() are optimized on similar principles by Graal JIT and C2 JIT.
60. Parallel execution
[Diagram: the scalar version computes C[i] = A[i] x B[i], processing a single pair of operands at a time; the vectorized version computes C[i], C[i+1], C[i+2], C[i+3] from A[i..i+3] x B[i..i+3], carrying out the same instruction on multiple pairs of operands at once.]
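The element-wise multiply from the diagram can be written as a plain counted loop; on C2, the superword optimization can turn the loop body into SIMD (e.g. AVX2) instructions. A minimal standalone sketch (class and method names are made up):

```java
// Element-wise array multiply: a candidate for auto-vectorization,
// since it is a countable loop with straight-line code and no calls.
public class MultiplyArrays {
    static void multiply(float[] a, float[] b, float[] c) {
        for (int i = 0; i < c.length; i++) {
            c[i] = a[i] * b[i]; // same operation applied to every element
        }
    }

    public static void main(String[] args) {
        float[] a = {1f, 2f, 3f, 4f};
        float[] b = {5f, 6f, 7f, 8f};
        float[] c = new float[4];
        multiply(a, b, c);
        System.out.println(java.util.Arrays.toString(c)); // [5.0, 12.0, 21.0, 32.0]
    }
}
```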
61. Auto vectorization: modern compilers analyze loops in serial code to identify whether they could be vectorized (i.e. perform loop transformations).
Loop requirements for auto vectorization:
- countable loops → only unrolled loops
- single entry and single exit
- straight-line code (e.g. no switch; an if with masking is allowed)
- only the innermost loop (caution in case of loop interchange or loop collapsing)
- no function calls, except intrinsics
62. Countable loops
1) for (int i = start; i < limit; i += stride) {
       // loop body
   }

2) int i = start;
   while (i < limit) {
       // loop body
       i += stride;
   }

- limit is loop invariant
- stride is constant (compile-time known)
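To make the "countable" and "single exit" requirements concrete, here is a counter-example sketch (not from the slides; names are made up). The first loop's trip count is known on entry; the second exits early on a data-dependent condition, giving it a second exit path and an unknown trip count, so it fails the requirements above:

```java
// Countable vs. non-countable loops.
public class Countable {
    // Countable: trip count is fixed once the loop is entered.
    static int sumCountable(int[] a) {
        int sum = 0;
        for (int i = 0; i < a.length; i++) sum += a[i];
        return sum;
    }

    // Not countable: the break creates a second, data-dependent exit,
    // so the trip count is unknown at loop entry.
    static int sumUntilNegative(int[] a) {
        int sum = 0;
        for (int i = 0; i < a.length; i++) {
            if (a[i] < 0) break;
            sum += a[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3, -1, 5};
        System.out.println(sumCountable(a));     // 10
        System.out.println(sumUntilNegative(a)); // 6
    }
}
```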
74. Conclusions – multiply_2_arrays_elements
C2 JIT uses vectorization for:
- the main loop - YMM AVX2 registers → 4x8 float operations per loop cycle,
- the post loop - YMM AVX2 registers → 8 float operations per loop cycle.
C2 JIT handles the remaining post-loop iterations without unrolling and without vectorization, one element at a time until it reaches the end.
Graal JIT has no vectorization support → this is a core feature missing in GraalVM CE!
Graal JIT does not trigger loop unrolling (in this case).
83. unpredictable_branch
@Param({"4096"}) private int thresholdLimit;
private final int SIZE = 16384;
private int[] array = new int[SIZE];
// Initialize array elements with values between [0, thresholdLimit)
// e.g. array[i] = random.nextInt(thresholdLimit); // i = 0...SIZE

@Benchmark
public int unpredictable_branch() {
    int sum = 0;
    for (final int value : array) {
        if (value <= (thresholdLimit / 2)) {
            sum += value;
        }
    }
    return sum;
}
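As a standalone (non-JMH) sketch of the same benchmark, the following program can be run without a harness; class and method names are made up, and since values are uniform in [0, thresholdLimit), the branch is taken roughly 50% of the time in a data-dependent pattern:

```java
import java.util.Random;

// Standalone version of the unpredictable_branch benchmark logic.
public class UnpredictableBranch {
    static final int SIZE = 16_384;

    static int sumBelowHalf(int[] array, int thresholdLimit) {
        int sum = 0;
        for (final int value : array) {
            if (value <= (thresholdLimit / 2)) { // ~50/50, hard to predict
                sum += value;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        int thresholdLimit = 4096;
        int[] array = new int[SIZE];
        Random random = new Random(42); // fixed seed for repeatability
        for (int i = 0; i < SIZE; i++) {
            array[i] = random.nextInt(thresholdLimit);
        }
        System.out.println(sumBelowHalf(array, thresholdLimit));
    }
}
```

Timings from a plain run like this are only indicative; the JMH setup from the configuration slide (warm-up, forks) is what produces the numbers the conclusions below rely on.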
93. Conclusions – unpredictable_branch
C2 JIT optimizes the loop as follows:
- main loop - with an unrolling factor of 8,
- post loop - one element at a time, until it reaches the end.
Graal JIT does not unroll the loop (the second test case where unrolling is not triggered)!
95. Graal JIT vs C1/C2 JIT - summary
[Table: per-compiler support for Scalar Replacement, Virtual Calls, Vectorization, Loop Unrolling, and Intrinsics.]
Scalar Replacement: nice optimizations in Graal JIT (e.g. Partial Escape Analysis).
Virtual Calls: better optimized by Graal JIT (e.g. megamorphic call sites).
Vectorization: core compiler feature "missing" in Graal JIT (Graal CE).
Loop Unrolling: not found in the current Graal JIT test cases. Missing or just limited!?
Intrinsics: not covered here, but the support is lower in Graal JIT. See reference.