This is 101 introduction of GPORCA for Open Source developers. GPORCA is open source query optimizer for SQL on MPP (massive parallel processing) database system like Greenplum. You can find the overview of GPORCA, as well as how to debug and contribute back to OSS community.
GPORCA is query optimizer used inside Greenplum database, the first open source MPP solution based on PostgreSQL.
These are slides presented at the PGConf Seattle 2017. It introduced the internals of GPORCA, and provide OSS developers context to contribute back to the project.
This document provides an overview and agenda for a presentation on Yocto, an open source project for embedded Linux development. It discusses Yocto components, layers, recipes, classes, and the build system. It also outlines how Yocto can be used by both system developers and application developers.
Build your own embedded linux distributions by yocto projectYen-Chin Lee
The document discusses the Yocto Project, an open-source collaboration project that provides templates, tools, and methods for creating custom Linux-based systems for embedded products. It provides an overview of the key components of Yocto including Poky, BitBake, and metadata. It also summarizes how to get started with Yocto including downloading Poky, setting up the build environment, and building a minimal image that can be run in QEMU for testing purposes.
As a user, when you press the power button of system, you get the login screen on your monitor and after login starts working.
Have you ever think what is going between you press power button and login screen appear?
There is a boot process for any operating system which executed one-by-one, and finally, you get the Operating system’s login screen.
Here we go through the Linux boot process stage-by-stage.
There are six high-level stages for Linux boot process:
BIOS – Basic Input/Output System executes MBR
MBR – Master Boot Record execute GRUB
GRUB – Grand Unified Bootloader executes Kernel
Kernel – Kernel executes /sbin/init
Init – Init executes runlevel programs
Runlevel – Runlevel programs are executed from /etc/rc.d/rc*.d/
Read original article here: https://linuxconcept.com/linux-boot-process-step-by-step-explained/
Full table BGP on VyOS converge time in seconds
Routing on MikroTiks converges near-instantly
BCP38 (customers cannot spoof source address)
IRR filtering (only accept where route/route6 object)
RPKI (will not accept invalid routes from P/T)
Templated configuration (repeatable, automated)
Single source of truth (the docs become the config)
VyOS SaltStack YAML Netbox BGP OSPF FRR RPKI IRR XDP
bgpq3 UTRS RTBH NetFlow
RIPE NCC Update 2019-10-02
Linux Traffic Control allows administrators to control network traffic through mechanisms like shaping, scheduling, classifying, policing, dropping and marking. It uses components like queuing disciplines (qdiscs), classes, filters, and actions. The tc command can be used to configure these components by adding, changing or deleting traffic control settings on network interfaces.
GPORCA is query optimizer used inside Greenplum database, the first open source MPP solution based on PostgreSQL.
These are slides presented at the PGConf Seattle 2017. It introduced the internals of GPORCA, and provide OSS developers context to contribute back to the project.
This document provides an overview and agenda for a presentation on Yocto, an open source project for embedded Linux development. It discusses Yocto components, layers, recipes, classes, and the build system. It also outlines how Yocto can be used by both system developers and application developers.
Build your own embedded linux distributions by yocto projectYen-Chin Lee
The document discusses the Yocto Project, an open-source collaboration project that provides templates, tools, and methods for creating custom Linux-based systems for embedded products. It provides an overview of the key components of Yocto including Poky, BitBake, and metadata. It also summarizes how to get started with Yocto including downloading Poky, setting up the build environment, and building a minimal image that can be run in QEMU for testing purposes.
As a user, when you press the power button of system, you get the login screen on your monitor and after login starts working.
Have you ever think what is going between you press power button and login screen appear?
There is a boot process for any operating system which executed one-by-one, and finally, you get the Operating system’s login screen.
Here we go through the Linux boot process stage-by-stage.
There are six high-level stages for Linux boot process:
BIOS – Basic Input/Output System executes MBR
MBR – Master Boot Record execute GRUB
GRUB – Grand Unified Bootloader executes Kernel
Kernel – Kernel executes /sbin/init
Init – Init executes runlevel programs
Runlevel – Runlevel programs are executed from /etc/rc.d/rc*.d/
Read original article here: https://linuxconcept.com/linux-boot-process-step-by-step-explained/
Full table BGP on VyOS converge time in seconds
Routing on MikroTiks converges near-instantly
BCP38 (customers cannot spoof source address)
IRR filtering (only accept where route/route6 object)
RPKI (will not accept invalid routes from P/T)
Templated configuration (repeatable, automated)
Single source of truth (the docs become the config)
VyOS SaltStack YAML Netbox BGP OSPF FRR RPKI IRR XDP
bgpq3 UTRS RTBH NetFlow
RIPE NCC Update 2019-10-02
Linux Traffic Control allows administrators to control network traffic through mechanisms like shaping, scheduling, classifying, policing, dropping and marking. It uses components like queuing disciplines (qdiscs), classes, filters, and actions. The tc command can be used to configure these components by adding, changing or deleting traffic control settings on network interfaces.
The document discusses using TCP/IP for high-performance computing and describes how TCP performance is impacted by factors like round-trip time, bandwidth limitations, and window size. It provides measurements of bandwidth over TCP for different round-trip times and explains TCP congestion control algorithms and how they influence transmission speed.
Comparing Next-Generation Container Image Building ToolsAkihiro Suda
http://sched.co/EaYe
Until recently, running `docker build` against Dockerfile had been the only way to build container images.
However, lots of opensource software are being proposed as successors/alternatives to `docker build`:
- BuildKit (Moby Project / Docker)
- img (Jessica Frazelle / Microsoft)
- Buildah (Project Atomic / Red Hat)
- umoci & Orca (SUSE)
- Bazel (Google)
- OpenShift S2I (Red Hat)
Akihiro Suda compares these new tools' advantages and disadvantages.
His evaluation basis would include but not be limited to:
- Performance (Cache efficiency, Concurrency, Distributed Execution)
- Secret management, e.g. SSH and AWS keys
- Support for non-Dockerfile
- Non-root execution
- UI & UX
- Governance of the community
He also proposes a unified interface for using these tools with Kubernetes in a vendor-neutral way.
This document discusses Linux kernel debugging. It provides an overview of debugging techniques including collecting system information, handling failures, and using printk(), KGDB, and debuggers. Key points covered are the components of a debugger, how KGDB can be used with gdb to debug interactively, analyzing crash data, and some debugging tricks and print functions.
The beautiful thing about software engineering is that it gives you the warm and fuzzy illusion of total understanding: I control this machine because I know how it operates. This is the result of layers upon layers of successful abstractions, which hide immense sophistication and complexity. As with any abstraction, though, these sometimes leak, and that's when a good grounding in what's under the hood pays off.
The second talk in this series peels a few layers of abstraction and takes a look under the hood of our "car engine", the CPU. While hardly anyone codes in assembly language anymore, your C# or JavaScript (or Scala or...) application still ends up executing machine code instructions on a processor; that is why Java has a memory model, why memory layout still matters at scale, and why you're usually free to ignore these considerations and go about your merry way.
You'll come away knowing a little bit about a lot of different moving parts under the hood; after all, isn't understanding how the machine operates what this is all about?
(From a talk given at BuildStuff 2016 in Vilnius, Lithuania.)
Kernel Recipes 2019 - No NMI? No Problem! – Implementing Arm64 Pseudo-NMIAnne Nicolas
As the name would suggest, a Non-Maskable Interrupt (NMI) is an interrupt-like feature that is unaffected by the disabling of classic interrupts. In Linux, NMIs are involved in some features such as performance event monitoring, hard-lockup detector, on demand state dumping, etc… Their potential to fire when least expected can fill the most seasoned kernel hackers with dread.
AArch64 (aka arm64 in the Linux tree) does not provide architected NMIs, a consequence being that features benefiting from NMIs see their use limited on AArch64. However, the Arm Generic Interrupt Controller (GIC) supports interrupt prioritization and masking, which, among other things, provides a way to control whether or not a set of interrupts can be signaled to a CPU.
This talk will cover how, using the GIC interrupt priorities, we provide a way to configure some interrupts to behave in an NMI-like manner on AArch64. We’ll discuss the implementation, some of the complications that ensued and also some of the benefits obtained from it.
Julien Thierry
The Yocto Project is an open source project that provides tools and methods for creating custom Linux-based systems for embedded products regardless of CPU architecture. It uses a "layer" approach where components like the build system, core packages, and machine-specific files can be mixed and matched. The speaker demonstrates how to download a Yocto Project release, configure a build, and run the build process to generate root filesystem images and packages for target deployment. Potential applications mentioned include virtualization platforms and specialized subsystems in vehicles.
This document introduces Git Flow, a branching model for Git that supports parallel development and release management of projects. It recommends using separate branches for features, releases, hotfixes and support. The key branches are develop, which always holds the complete history and is used for integration, and master, which holds production-ready code. Feature branches are used for new development and merged into develop when ready. Release branches are used to prepare releases and merged into both develop and master. Hotfix branches address issues in master and merged into both. Visual diagrams and step-by-step examples are provided to demonstrate how to set up and use Git Flow for parallel development and releases.
In this video from the 2015 Stanford HPC Conference, Pavel Shamis from ORNL presents: Preparing OpenSHMEM for Exascale.
"OpenSHMEM is a partitioned global address space (PGAS) one-sided communications library that enables remote memory access (RMA) across processing elements (PEs). Its API allows data to be transferred from one PE memory space to another PE’s symmetric memory space; decoupling the data transfers from synchronizations. OpenSHMEM is useful for applications that are latency driven or that have irregular communication patterns, because its one-sided API can be mapped very efficiently to hardware (e.g. RDMA interconnects, etc), and its one-sided programming model helps the overlapping of communication with computation. Summit is Oak Ridge National Laboratory’s next high performance supercomputer system that will be based on a many core/GPU hybrid architecture. In order to prepare OpenSHMEM for future systems, it is important to enhance its programming model to enable efficient utilization of the new hardware capabilities (e.g. massive multithreaded systems, accesses different type memories, next generation of interconnects, etc). This session will present recent advances in the area of OpenSHMEM extensions, implementations, and tools.”
Watch the video: http://insidehpc.com/2015/02/video-preparing-openshmem-for-exascale/
See more talks in the Stanford HPC Conference Video Gallery: http://wp.me/P3RLHQ-dOO
The Linux Block Layer - Built for Fast StorageKernel TLV
The arrival of flash storage introduced a radical change in performance profiles of direct attached devices. At the time, it was obvious that Linux I/O stack needed to be redesigned in order to support devices capable of millions of IOPs, and with extremely low latency.
In this talk we revisit the changes the Linux block layer in the
last decade or so, that made it what it is today - a performant, scalable, robust and NUMA-aware subsystem. In addition, we cover the new NVMe over Fabrics support in Linux.
Sagi Grimberg
Sagi is Principal Architect and co-founder at LightBits Labs.
This document provides an introduction to eBPF and XDP. It discusses the history of BPF and how it evolved into eBPF. Key aspects of eBPF covered include the instruction set, JIT compilation, verifier, helper functions, and maps. XDP is introduced as a way to program the data plane using eBPF programs attached early in the receive path. Example use cases and performance benchmarks for XDP are also mentioned.
GPORCA is newly open source advanced query optimizer that is a subproject of Greenplum Database open source project. GPORCA is the query optimizer used in commercial distributions of both Greenplum and HAWQ. In these distributions GPORCA has achieved 1000x performance improvement across TPC-DS queries by focusing on three distinct areas: Dynamic Partition Elimination, SubQuery Unnesting, and Common Table Expression.
Now that GPORCA is open source, we are looking for collaborators to help us realize the ultimate dream for GPORCA - to work with any database.
The new breed of data management systems in Big Data have to process so much data that optimization mistakes are magnified in traditional optimizers. Furthermore, coding and manual optimization of complex queries has proven to be hard.
In this session, Venkatesh will discuss:
- Overview of GPORCA
- How to add GPORCA to HAWQ with a build option
- How GPORCA could be made to work with any database
- Future vision for GPORCA and more immediate plans
- How to work with GPORCA, and how to contribute to GPORCA
Crude-Oil Scheduling Technology: moving from simulation to optimizationBrenno Menezes
Scheduling technology either commercial or homegrown in today’s crude-oil refining industries relies on a complex simulation of scenarios where the user is solely responsible for making many different decisions manually in the search for feasible solutions over some limited time-horizon i.e., trial-and-error heuristics. As a normal outcome, schedulers abandon these solutions and then return to their simpler spreadsheet simulators due to: (i) time-consuming efforts to configure and manage numerous scheduling scenarios, and (ii) requirements of updating premises and situations that are constantly changing. Moving to solutions based in optimization rather than simulation, the lecture describes the future steps in the refactoring of the scheduling technology in PETROBRAS considering in separate the graphic user interface (GUI) and data communication developments (non-modeling related), and the modeling and process engineering related in an automated decision-making with built-in problem representation facilities and integrated data handling features among other techniques in a smart scheduling frontline.
The document discusses using TCP/IP for high-performance computing and describes how TCP performance is impacted by factors like round-trip time, bandwidth limitations, and window size. It provides measurements of bandwidth over TCP for different round-trip times and explains TCP congestion control algorithms and how they influence transmission speed.
Comparing Next-Generation Container Image Building ToolsAkihiro Suda
http://sched.co/EaYe
Until recently, running `docker build` against Dockerfile had been the only way to build container images.
However, lots of opensource software are being proposed as successors/alternatives to `docker build`:
- BuildKit (Moby Project / Docker)
- img (Jessica Frazelle / Microsoft)
- Buildah (Project Atomic / Red Hat)
- umoci & Orca (SUSE)
- Bazel (Google)
- OpenShift S2I (Red Hat)
Akihiro Suda compares these new tools' advantages and disadvantages.
His evaluation basis would include but not be limited to:
- Performance (Cache efficiency, Concurrency, Distributed Execution)
- Secret management, e.g. SSH and AWS keys
- Support for non-Dockerfile
- Non-root execution
- UI & UX
- Governance of the community
He also proposes a unified interface for using these tools with Kubernetes in a vendor-neutral way.
This document discusses Linux kernel debugging. It provides an overview of debugging techniques including collecting system information, handling failures, and using printk(), KGDB, and debuggers. Key points covered are the components of a debugger, how KGDB can be used with gdb to debug interactively, analyzing crash data, and some debugging tricks and print functions.
The beautiful thing about software engineering is that it gives you the warm and fuzzy illusion of total understanding: I control this machine because I know how it operates. This is the result of layers upon layers of successful abstractions, which hide immense sophistication and complexity. As with any abstraction, though, these sometimes leak, and that's when a good grounding in what's under the hood pays off.
The second talk in this series peels a few layers of abstraction and takes a look under the hood of our "car engine", the CPU. While hardly anyone codes in assembly language anymore, your C# or JavaScript (or Scala or...) application still ends up executing machine code instructions on a processor; that is why Java has a memory model, why memory layout still matters at scale, and why you're usually free to ignore these considerations and go about your merry way.
You'll come away knowing a little bit about a lot of different moving parts under the hood; after all, isn't understanding how the machine operates what this is all about?
(From a talk given at BuildStuff 2016 in Vilnius, Lithuania.)
Kernel Recipes 2019 - No NMI? No Problem! – Implementing Arm64 Pseudo-NMIAnne Nicolas
As the name would suggest, a Non-Maskable Interrupt (NMI) is an interrupt-like feature that is unaffected by the disabling of classic interrupts. In Linux, NMIs are involved in some features such as performance event monitoring, hard-lockup detector, on demand state dumping, etc… Their potential to fire when least expected can fill the most seasoned kernel hackers with dread.
AArch64 (aka arm64 in the Linux tree) does not provide architected NMIs, a consequence being that features benefiting from NMIs see their use limited on AArch64. However, the Arm Generic Interrupt Controller (GIC) supports interrupt prioritization and masking, which, among other things, provides a way to control whether or not a set of interrupts can be signaled to a CPU.
This talk will cover how, using the GIC interrupt priorities, we provide a way to configure some interrupts to behave in an NMI-like manner on AArch64. We’ll discuss the implementation, some of the complications that ensued and also some of the benefits obtained from it.
Julien Thierry
The Yocto Project is an open source project that provides tools and methods for creating custom Linux-based systems for embedded products regardless of CPU architecture. It uses a "layer" approach where components like the build system, core packages, and machine-specific files can be mixed and matched. The speaker demonstrates how to download a Yocto Project release, configure a build, and run the build process to generate root filesystem images and packages for target deployment. Potential applications mentioned include virtualization platforms and specialized subsystems in vehicles.
This document introduces Git Flow, a branching model for Git that supports parallel development and release management of projects. It recommends using separate branches for features, releases, hotfixes and support. The key branches are develop, which always holds the complete history and is used for integration, and master, which holds production-ready code. Feature branches are used for new development and merged into develop when ready. Release branches are used to prepare releases and merged into both develop and master. Hotfix branches address issues in master and merged into both. Visual diagrams and step-by-step examples are provided to demonstrate how to set up and use Git Flow for parallel development and releases.
In this video from the 2015 Stanford HPC Conference, Pavel Shamis from ORNL presents: Preparing OpenSHMEM for Exascale.
"OpenSHMEM is a partitioned global address space (PGAS) one-sided communications library that enables remote memory access (RMA) across processing elements (PEs). Its API allows data to be transferred from one PE memory space to another PE’s symmetric memory space; decoupling the data transfers from synchronizations. OpenSHMEM is useful for applications that are latency driven or that have irregular communication patterns, because its one-sided API can be mapped very efficiently to hardware (e.g. RDMA interconnects, etc), and its one-sided programming model helps the overlapping of communication with computation. Summit is Oak Ridge National Laboratory’s next high performance supercomputer system that will be based on a many core/GPU hybrid architecture. In order to prepare OpenSHMEM for future systems, it is important to enhance its programming model to enable efficient utilization of the new hardware capabilities (e.g. massive multithreaded systems, accesses different type memories, next generation of interconnects, etc). This session will present recent advances in the area of OpenSHMEM extensions, implementations, and tools.”
Watch the video: http://insidehpc.com/2015/02/video-preparing-openshmem-for-exascale/
See more talks in the Stanford HPC Conference Video Gallery: http://wp.me/P3RLHQ-dOO
The Linux Block Layer - Built for Fast StorageKernel TLV
The arrival of flash storage introduced a radical change in performance profiles of direct attached devices. At the time, it was obvious that Linux I/O stack needed to be redesigned in order to support devices capable of millions of IOPs, and with extremely low latency.
In this talk we revisit the changes the Linux block layer in the
last decade or so, that made it what it is today - a performant, scalable, robust and NUMA-aware subsystem. In addition, we cover the new NVMe over Fabrics support in Linux.
Sagi Grimberg
Sagi is Principal Architect and co-founder at LightBits Labs.
This document provides an introduction to eBPF and XDP. It discusses the history of BPF and how it evolved into eBPF. Key aspects of eBPF covered include the instruction set, JIT compilation, verifier, helper functions, and maps. XDP is introduced as a way to program the data plane using eBPF programs attached early in the receive path. Example use cases and performance benchmarks for XDP are also mentioned.
GPORCA is newly open source advanced query optimizer that is a subproject of Greenplum Database open source project. GPORCA is the query optimizer used in commercial distributions of both Greenplum and HAWQ. In these distributions GPORCA has achieved 1000x performance improvement across TPC-DS queries by focusing on three distinct areas: Dynamic Partition Elimination, SubQuery Unnesting, and Common Table Expression.
Now that GPORCA is open source, we are looking for collaborators to help us realize the ultimate dream for GPORCA - to work with any database.
The new breed of data management systems in Big Data have to process so much data that optimization mistakes are magnified in traditional optimizers. Furthermore, coding and manual optimization of complex queries has proven to be hard.
In this session, Venkatesh will discuss:
- Overview of GPORCA
- How to add GPORCA to HAWQ with a build option
- How GPORCA could be made to work with any database
- Future vision for GPORCA and more immediate plans
- How to work with GPORCA, and how to contribute to GPORCA
Crude-Oil Scheduling Technology: moving from simulation to optimizationBrenno Menezes
Scheduling technology either commercial or homegrown in today’s crude-oil refining industries relies on a complex simulation of scenarios where the user is solely responsible for making many different decisions manually in the search for feasible solutions over some limited time-horizon i.e., trial-and-error heuristics. As a normal outcome, schedulers abandon these solutions and then return to their simpler spreadsheet simulators due to: (i) time-consuming efforts to configure and manage numerous scheduling scenarios, and (ii) requirements of updating premises and situations that are constantly changing. Moving to solutions based in optimization rather than simulation, the lecture describes the future steps in the refactoring of the scheduling technology in PETROBRAS considering in separate the graphic user interface (GUI) and data communication developments (non-modeling related), and the modeling and process engineering related in an automated decision-making with built-in problem representation facilities and integrated data handling features among other techniques in a smart scheduling frontline.
Apache Tajo is a big data warehouse system that runs on Hadoop. It supports SQL standards and features powerful distributed processing, advanced query optimization, and the ability to handle long-running queries (hours) and interactive analysis queries (100 milliseconds). Tajo uses a master-slave architecture with a TajoMaster managing metadata and slave TajoWorkers running query tasks in a distributed fashion.
How YugaByte DB Implements Distributed PostgreSQLYugabyte
Building applications on PostgreSQL that require automatic data sharding and replication, fault tolerance, distributed transactions and geographic data distribution has been hard. In this 3 hour workshop, we will look at how to do this using a real-world example running on top of YugaByte DB, a distributed database that is fully wire-compatible with PostgreSQL and NoSQL APIs (Apache Cassandra and Redis). We will look at the design and architecture of YugaByte DB and how it reuses the PostgreSQL codebase to achieve full API compatibility. YugaByte DB support for PostgreSQL includes most data types, queries, stored procedures, etc. We will also take a look at how to build applications that are planet scale (requiring geographic distribution of data) and how to run them in cloud-native environments (for example, Kubernetes, hybrid or multi-cloud deployments).
The document is a presentation about new features in PostgreSQL 9.6. It discusses several major new features including parallel queries, avoiding VACUUM on all-frozen pages using freeze maps, monitoring the progress of VACUUM, phrase full text search, multiple synchronous replication, remote_apply synchronous commit, and improved capabilities of the postgres_fdw extension including pushing down sorts, joins, updates and deletes to remote servers.
“Quantum” Performance Effects: beyond the CoreC4Media
Video and slides synchronized, mp3 and slide download available at URL https://bit.ly/2Sbd5Ws.
Sergey Kuksenko talks about how (and how much) CPU microarchitecture details may have an influence on applications performance. Could it be visible by end-users? How to avoid misjudgment when estimating code performance? CPU is huge (not in size) that is why the talk is limited to those parts which located out of computational core (mostly caches and memory access). Filmed at qconsf.com.
Sergey Kuksenko works as Java Performance Engineer at Oracle. His primary goal is making Oracle JVM faster digging into JVM runtime, JIT compilers, class libraries and etc. His favorite area is an interaction of Java with modern hardware what he is doing since 2005 when he worked at Intel in Apache Harmony Performance team.
The document is a presentation about examining Oracle GoldenGate trail files and using the LogDump utility. It provides information on what trail files are, how to create and manage them, and how to navigate and examine the contents of trail files using the LogDump utility commands. It includes an example of using LogDump commands to open a trail file, view the record header and details, and examine the contents of a record.
One bridge to connect them all. Oracle GoldenGate for Big Data.UKOUG Tech 2018Gleb Otochkin
The presentation explain different use cases and topologies for Oracle GoldenGate Big Data adapters and show how we can offload our data to be analyzed in real time using modern Big Data technologies.
Gruter_TECHDAY_2014_03_ApacheTajo (in Korean)Gruter
Apache Tajo: A Big Data Warehouse System on Hadoop
- presented by Jae-hwaJeong, Apache Tajo committer and Gruter research engineer
at Gruter TECHDAY 2014 (Oct. 29 Seoul, Korea)
MySQL 8.0: What Is New in Optimizer and Executor?Norvald Ryeng
This document summarizes new features in MySQL 8.0 related to the optimizer and executor. It introduces common table expressions (CTEs) which provide better readability, allow references to the same CTE multiple times, and can refer to other CTEs. It also discusses window functions, UTF-8 support improvements, new geospatial features, the SKIP LOCKED and NOWAIT clauses for SELECT statements, expanded JSON functions and data type, index extensions, updates to the cost model, and query hints.
Performance Co-Pilot (PCP) is a system performance analysis framework with a fully distributed, plug-in based architecture. It includes tools like pmcd for collecting metrics, pmda for individual metrics agents, pmlogger for logging metrics, and pmchart for graphical analysis. PCP supports Linux, Mac OS X, Solaris and Windows and is used by companies like RedHat, Netflix, and universities for centralized performance monitoring and analysis.
The document discusses the Z Garbage Collector (ZGC) introduced in JDK 11. ZGC provides scalable low-latency garbage collection for heap sizes in the terabytes by using concurrent, tracing GC techniques. It aims to reduce GC pause times to less than 10ms even for very large heaps, with low overhead of around 15%. The document covers how ZGC works, its implementation details like colored pointers and load barriers, and its performance compared to Parallel GC and G1 GC based on SPECjbb2015 benchmarks.
The document compares various database companies and their products. It provides details on TmaxSoft, including its employee headcount, main products which include databases and middleware, and key technologies. It then discusses Tibero database specifically and how it compares to Oracle, including features, licensing model, compatibility, and security. Examples are also provided showing capabilities of Tibero compared to Oracle.
The document outlines how to develop graphical user interfaces (GUIs) for R. It discusses motivations for building R GUIs and provides examples of existing R GUI projects like IsoGeneGUI and neaGUI that were developed using tcl/tk. It also describes how to build independent and embedded R GUIs, with steps for creating main windows, dialog boxes, and buttons. Plugins like RcmdrPlugin.biclustGUI that extend the R Commander interface are presented as a way to embed additional statistical analyses into an existing GUI framework.
Presto, an open source distributed SQL engine, is widely recognized for its low-latency queries, high concurrency, and native ability to query multiple data sources. Proven at scale in a variety of use cases at Facebook, Airbnb, Netflix, Uber, Twitter, Bloomberg, and FINRA, Presto experienced an unprecedented growth in popularity in both on-premises and cloud deployments in the last few years.
Inspired by the increasingly complex SQL queries run by the Presto user community, engineers at Facebook and Starburst have recently focused on cost-based query optimization. In this talk we will present the initial design and implementation of the CBO, support for connector-provided statistics, estimating selectivity, and choosing efficient query plans. Then, our detailed experimental evaluation will illustrate the performance gains for several classes of queries achieved thanks to the optimizer. Finally, we will discuss our future work enhancing the initial CBO and present the general Presto roadmap for 2018 and beyond.
Speakers
Kamil Bajda-Pawlikowski, Starburst Data, CTO & Co-Founder
Martin Traverso
Managing large (and small) R based solutions with R SuiteWit Jakuczun
The presentation I gave at DataMass Gdańsk Summit in 2017:
R is a great tool for data scientist. Being very dynamic and popular is now one of the most important technology on the market. Unfortunately out-of-the-box R is not suited for large scale applications. I will present R Suite that is an open-source solution developed by us for us to manage R development process.
The document discusses optimizations made to Infinispan to improve performance and consistency for a trading application at the Chicago Board Options Exchange (CBOE). Some key points include:
- Early adoption of Infinispan led to some performance and consistency issues that required troubleshooting logs, cache contents, and test applications.
- Optimizations focused on asynchronous communication, queue flushing, and separating state provider/consumer roles to balance performance and consistency across cache tiers.
- Log analysis was important for detecting issues like out-of-order operations between active and passive nodes. Configuration tweaks like increasing queue size also helped.
In the realm of cybersecurity, offensive security practices act as a critical shield. By simulating real-world attacks in a controlled environment, these techniques expose vulnerabilities before malicious actors can exploit them. This proactive approach allows manufacturers to identify and fix weaknesses, significantly enhancing system security.
This presentation delves into the development of a system designed to mimic Galileo's Open Service signal using software-defined radio (SDR) technology. We'll begin with a foundational overview of both Global Navigation Satellite Systems (GNSS) and the intricacies of digital signal processing.
The presentation culminates in a live demonstration. We'll showcase the manipulation of Galileo's Open Service pilot signal, simulating an attack on various software and hardware systems. This practical demonstration serves to highlight the potential consequences of unaddressed vulnerabilities, emphasizing the importance of offensive security practices in safeguarding critical infrastructure.
LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...DanBrown980551
This LF Energy webinar took place June 20, 2024. It featured:
-Alex Thornton, LF Energy
-Hallie Cramer, Google
-Daniel Roesler, UtilityAPI
-Henry Richardson, WattTime
In response to the urgency and scale required to effectively address climate change, open source solutions offer significant potential for driving innovation and progress. Currently, there is a growing demand for standardization and interoperability in energy data and modeling. Open source standards and specifications within the energy sector can also alleviate challenges associated with data fragmentation, transparency, and accessibility. At the same time, it is crucial to consider privacy and security concerns throughout the development of open source platforms.
This webinar will delve into the motivations behind establishing LF Energy’s Carbon Data Specification Consortium. It will provide an overview of the draft specifications and the ongoing progress made by the respective working groups.
Three primary specifications will be discussed:
-Discovery and client registration, emphasizing transparent processes and secure and private access
-Customer data, centering around customer tariffs, bills, energy usage, and full consumption disclosure
-Power systems data, focusing on grid data, inclusive of transmission and distribution networks, generation, intergrid power flows, and market settlement data
"Scaling RAG Applications to serve millions of users", Kevin GoedeckeFwdays
How we managed to grow and scale a RAG application from zero to thousands of users in 7 months. Lessons from technical challenges around managing high load for LLMs, RAGs and Vector databases.
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...Jason Yip
The typical problem in product engineering is not bad strategy, so much as “no strategy”. This leads to confusion, lack of motivation, and incoherent action. The next time you look for a strategy and find an empty space, instead of waiting for it to be filled, I will show you how to fill it in yourself. If you’re wrong, it forces a correction. If you’re right, it helps create focus. I’ll share how I’ve approached this in the past, both what works and lessons for what didn’t work so well.
GlobalLogic Java Community Webinar #18 “How to Improve Web Application Perfor...GlobalLogic Ukraine
Під час доповіді відповімо на питання, навіщо потрібно підвищувати продуктивність аплікації і які є найефективніші способи для цього. А також поговоримо про те, що таке кеш, які його види бувають та, основне — як знайти performance bottleneck?
Відео та деталі заходу: https://bit.ly/45tILxj
"Choosing proper type of scaling", Olena SyrotaFwdays
Imagine an IoT processing system that is already quite mature and production-ready and for which client coverage is growing and scaling and performance aspects are life and death questions. The system has Redis, MongoDB, and stream processing based on ksqldb. In this talk, firstly, we will analyze scaling approaches and then select the proper ones for our system.
Conversational agents, or chatbots, are increasingly used to access all sorts of services using natural language. While open-domain chatbots - like ChatGPT - can converse on any topic, task-oriented chatbots - the focus of this paper - are designed for specific tasks, like booking a flight, obtaining customer support, or setting an appointment. Like any other software, task-oriented chatbots need to be properly tested, usually by defining and executing test scenarios (i.e., sequences of user-chatbot interactions). However, there is currently a lack of methods to quantify the completeness and strength of such test scenarios, which can lead to low-quality tests, and hence to buggy chatbots.
To fill this gap, we propose adapting mutation testing (MuT) for task-oriented chatbots. To this end, we introduce a set of mutation operators that emulate faults in chatbot designs, an architecture that enables MuT on chatbots built using heterogeneous technologies, and a practical realisation as an Eclipse plugin. Moreover, we evaluate the applicability, effectiveness and efficiency of our approach on open-source chatbots, with promising results.
Must Know Postgres Extension for DBA and Developer during MigrationMydbops
Mydbops Opensource Database Meetup 16
Topic: Must-Know PostgreSQL Extensions for Developers and DBAs During Migration
Speaker: Deepak Mahto, Founder of DataCloudGaze Consulting
Date & Time: 8th June | 10 AM - 1 PM IST
Venue: Bangalore International Centre, Bangalore
Abstract: Discover how PostgreSQL extensions can be your secret weapon! This talk explores how key extensions enhance database capabilities and streamline the migration process for users moving from other relational databases like Oracle.
Key Takeaways:
* Learn about crucial extensions like oracle_fdw, pgtt, and pg_audit that ease migration complexities.
* Gain valuable strategies for implementing these extensions in PostgreSQL to achieve license freedom.
* Discover how these key extensions can empower both developers and DBAs during the migration process.
* Don't miss this chance to gain practical knowledge from an industry expert and stay updated on the latest open-source database trends.
Mydbops Managed Services specializes in taking the pain out of database management while optimizing performance. Since 2015, we have been providing top-notch support and assistance for the top three open-source databases: MySQL, MongoDB, and PostgreSQL.
Our team offers a wide range of services, including assistance, support, consulting, 24/7 operations, and expertise in all relevant technologies. We help organizations improve their database's performance, scalability, efficiency, and availability.
Contact us: info@mydbops.com
Visit: https://www.mydbops.com/
Follow us on LinkedIn: https://in.linkedin.com/company/mydbops
For more details and updates, please follow up the below links.
Meetup Page : https://www.meetup.com/mydbops-databa...
Twitter: https://twitter.com/mydbopsofficial
Blogs: https://www.mydbops.com/blog/
Facebook(Meta): https://www.facebook.com/mydbops/
Northern Engraving | Modern Metal Trim, Nameplates and Appliance PanelsNorthern Engraving
What began over 115 years ago as a supplier of precision gauges to the automotive industry has evolved into being an industry leader in the manufacture of product branding, automotive cockpit trim and decorative appliance trim. Value-added services include in-house Design, Engineering, Program Management, Test Lab and Tool Shops.
Dandelion Hashtable: beyond billion requests per second on a commodity serverAntonios Katsarakis
This slide deck presents DLHT, a concurrent in-memory hashtable. Despite efforts to optimize hashtables, that go as far as sacrificing core functionality, state-of-the-art designs still incur multiple memory accesses per request and block request processing in three cases. First, most hashtables block while waiting for data to be retrieved from memory. Second, open-addressing designs, which represent the current state-of-the-art, either cannot free index slots on deletes or must block all requests to do so. Third, index resizes block every request until all objects are copied to the new index. Defying folklore wisdom, DLHT forgoes open-addressing and adopts a fully-featured and memory-aware closed-addressing design based on bounded cache-line-chaining. This design offers lock-free index operations and deletes that free slots instantly, (2) completes most requests with a single memory access, (3) utilizes software prefetching to hide memory latencies, and (4) employs a novel non-blocking and parallel resizing. In a commodity server and a memory-resident workload, DLHT surpasses 1.6B requests per second and provides 3.5x (12x) the throughput of the state-of-the-art closed-addressing (open-addressing) resizable hashtable on Gets (Deletes).
What is an RPA CoE? Session 1 – CoE VisionDianaGray10
In the first session, we will review the organization's vision and how this has an impact on the COE Structure.
Topics covered:
• The role of a steering committee
• How do the organization’s priorities determine CoE Structure?
Speaker:
Chris Bolin, Senior Intelligent Automation Architect Anika Systems
"$10 thousand per minute of downtime: architecture, queues, streaming and fin...Fwdays
Direct losses from downtime in 1 minute = $5-$10 thousand dollars. Reputation is priceless.
As part of the talk, we will consider the architectural strategies necessary for the development of highly loaded fintech solutions. We will focus on using queues and streaming to efficiently work and manage large amounts of data in real-time and to minimize latency.
We will focus special attention on the architectural patterns used in the design of the fintech system, microservices and event-driven architecture, which ensure scalability, fault tolerance, and consistency of the entire system.
What is an RPA CoE? Session 2 – CoE RolesDianaGray10
In this session, we will review the players involved in the CoE and how each role impacts opportunities.
Topics covered:
• What roles are essential?
• What place in the automation journey does each role play?
Speaker:
Chris Bolin, Senior Intelligent Automation Architect Anika Systems