An overview of new features in the PostgreSQL world, prepared for the Big Data Minsk User Group meetup on April 29, 2016: https://www.facebook.com/events/120784531655479/
4. CREATING AN INDEX
CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ]
[ [ IF NOT EXISTS ] name ] ON table_name [ USING method ]
( { column_name | ( expression ) } [ COLLATE collation ]
[ opclass ] [ ASC | DESC ] [ NULLS { FIRST | LAST } ]
[, ...] ) [ WITH ( storage_parameter = value [, ... ] ) ]
[ TABLESPACE tablespace_name ] [ WHERE predicate ]
5. SELECT WITHOUT AN INDEX
meetup_demo=# EXPLAIN ANALYZE SELECT repo_id FROM github_events
WHERE 3488850707 < event_id AND event_id < 3488880707;
------------------------------------------------------------------
Seq Scan on github_events (cost=0.00..265213.33 rows=13185 width=8) (actual time=0.008..495.324 rows=12982 loops=1)
  Filter: (('3488850707'::bigint < event_id) AND (event_id < '3488880707'::bigint))
  Rows Removed by Filter: 2040200
Planning time: 0.189 ms
Execution time: 504.053 ms
6. SIMPLE INDEX
CREATE UNIQUE INDEX event_id_idx ON github_events(event_id);
meetup_demo=# EXPLAIN ANALYZE SELECT repo_id FROM github_events
WHERE 3488850707 < event_id AND event_id < 3488880707;
------------------------------------------------------------------
Index Scan using event_id_idx on github_events (cost=0.43..1921.28 rows=13187 width=8) (actual time=0.024..12.544 rows=12982 loops=1)
  Index Cond: (('3488850707'::bigint < event_id) AND (event_id < '3488880707'::bigint))
Planning time: 0.190 ms
Execution time: 21.130 ms
7. REGULAR INDEX
CREATE UNIQUE INDEX event_id_idx ON github_events(event_id);
--------------------------------
Index Scan using event_id_idx on github_events (cost=0.43..1921.28 rows=13187 width=8) (actual time=0.037..12.485 rows=12982 loops=1)
  Index Cond: (('3488850707'::bigint < event_id) AND (event_id < '3488880707'::bigint))
Planning time: 0.186 ms
Execution time: 21.222 ms
9. COVERING INDEX
• Smaller index size
• Lower update overhead
• Faster planning and lookup
• Included columns do not need an opclass
• Filtering on included columns
CREATE UNIQUE INDEX event_id_idx2 ON
github_events(event_id) INCLUDING (repo_id);
https://pgconf.ru/media/2016/02/19/4_Lubennikova_B-tree_pgconf.ru_3.0%20(1).pdf
10. COVERING INDEX
meetup_demo=# EXPLAIN ANALYZE SELECT repo_id FROM github_events
WHERE 3488850707 < event_id AND event_id < 3488880707;
---------------------------------------
Index Only Scan using event_id_idx2 on github_events (cost=0.43..23764.29 rows=13187 width=8) (actual time=0.032..12.533 rows=12982 loops=1)
  Index Cond: ((event_id > '3488850707'::bigint) AND (event_id < '3488880707'::bigint))
  Heap Fetches: 12982
Planning time: 0.178 ms
Execution time: 21.147 ms
11. BRIN INDEX
CREATE INDEX event_id_brin_idx ON github_events USING brin (event_id);
--------------------------------
Bitmap Heap Scan on github_events (cost=175.16..42679.52 rows=13187 width=8) (actual time=0.824..15.489 rows=12982 loops=1)
  Recheck Cond: (('3488850707'::bigint < event_id) AND (event_id < '3488880707'::bigint))
  Rows Removed by Index Recheck: 13995
  Heap Blocks: lossy=3072
  -> Bitmap Index Scan on event_id_brin_idx (cost=0.00..171.87 rows=13187 width=0) (actual time=0.698..0.698 rows=30720 loops=1)
       Index Cond: (('3488850707'::bigint < event_id) AND (event_id < '3488880707'::bigint))
Planning time: 0.094 ms
Execution time: 24.421 ms
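The "Rows Removed by Index Recheck" and "Heap Blocks: lossy" lines in the plan above follow from how a BRIN index works: it stores only a min/max summary per range of heap pages, so it can rule ranges out but cannot point at individual rows. A minimal Python sketch of that idea follows. It is a toy model, not PostgreSQL's implementation; the function names, page layout, and the range size of 4 pages are assumptions made for illustration (PostgreSQL's default pages_per_range is 128).

```python
from typing import List, Tuple

def build_summaries(pages: List[List[int]], pages_per_range: int) -> List[Tuple[int, int]]:
    """Return one (min, max) summary per consecutive range of pages."""
    summaries = []
    for start in range(0, len(pages), pages_per_range):
        values = [v for page in pages[start:start + pages_per_range] for v in page]
        summaries.append((min(values), max(values)))
    return summaries

def range_scan(pages, summaries, pages_per_range, low, high):
    """Find values with low < v < high.

    Returns (matches, rows_removed_by_recheck, pages_read).
    """
    matches, removed, pages_read = [], 0, 0
    for i, (range_min, range_max) in enumerate(summaries):
        if range_max <= low or range_min >= high:
            continue  # the summary proves nothing in this range can match
        for page in pages[i * pages_per_range:(i + 1) * pages_per_range]:
            pages_read += 1
            for v in page:  # the summary is lossy, so every row is rechecked
                if low < v < high:
                    matches.append(v)
                else:
                    removed += 1
    return matches, removed, pages_read

# Hypothetical data: 100 pages of 10 sequential ids, like an append-only event table.
pages = [list(range(p * 10, p * 10 + 10)) for p in range(100)]
summaries = build_summaries(pages, pages_per_range=4)
matches, removed, pages_read = range_scan(pages, summaries, 4, low=205, high=245)
print(len(matches), removed, pages_read)
```

In this toy run only 2 of the 25 ranges are read, and roughly half of the rows in them are discarded by the recheck. That is the same trade-off the plan shows: a very small index in exchange for extra rows filtered after the heap fetch.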
13. CSTORE_FDW
• Inspired by Optimized Row Columnar (ORC) format
developed by Hortonworks.
• Compression: Reduces in-memory and on-disk data size
by 2-4x. Can be extended to support different codecs.
• Column projections: Only reads column data relevant to
the query. Improves performance for I/O bound queries.
• Skip indexes: Stores min/max statistics for row groups,
and uses them to skip over unrelated rows.
14. CSTORE_FDW
CREATE FOREIGN TABLE cstored_github_events (
event_id bigint,
event_type text,
event_public boolean,
repo_id bigint,
payload jsonb,
repo jsonb,
actor jsonb,
org jsonb,
created_at timestamp
)
SERVER cstore_server
OPTIONS(compression 'pglz');
INSERT INTO cstored_github_events (SELECT * FROM github_events);
ANALYZE cstored_github_events;
15. TYPICAL QUERY
meetup_demo=# EXPLAIN ANALYZE SELECT repo_id, count(*) FROM cstored_github_events
WHERE created_at BETWEEN timestamp '2016-01-02 01:00:00' AND timestamp '2016-01-02 23:00:00'
GROUP BY repo_id ORDER BY 2 DESC;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
Sort (cost=75153.59..75221.43 rows=27137 width=8) (actual time=950.085..1030.283 rows=106145 loops=1)
  Sort Key: (count(*)) DESC
  Sort Method: quicksort Memory: 8048kB
  -> HashAggregate (cost=72883.86..73155.23 rows=27137 width=8) (actual time=772.445..861.162 rows=106145 loops=1)
       Group Key: repo_id
       -> Foreign Scan on cstored_github_events (cost=0.00..70810.84 rows=414603 width=8) (actual time=4.762..382.302 rows=413081 loops=1)
            Filter: ((created_at >= '2016-01-02 01:00:00'::timestamp without time zone) AND (created_at <= '2016-01-02 23:00:00'::timestamp without time zone))
            Rows Removed by Filter: 46919
            CStore File: /var/lib/pgsql/9.5/data/cstore_fdw/18963/1236161
            CStore File Size: 1475036725
Planning time: 0.126 ms
Execution time: 1109.248 ms
16. NOT ALWAYS AS ADVERTISED
SELECT pg_size_pretty(cstore_table_size('cstored_github_events'));
1407 MB
SELECT pg_size_pretty(pg_table_size('github_events'));
2668 MB
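A quick check of the two sizes above against the 2-4x reduction quoted on the CSTORE_FDW slide. The snippet only does arithmetic on the numbers reported here; the variable names are illustrative.

```python
# Sizes reported on the slide, in MB.
heap_mb = 2668    # pg_table_size('github_events')
cstore_mb = 1407  # cstore_table_size('cstored_github_events')

ratio = heap_mb / cstore_mb        # how many times smaller the columnar copy is
saved = 1 - cstore_mb / heap_mb    # fraction of disk space saved

print(f"compression ratio: {ratio:.2f}x, space saved: {saved:.0%}")
```

For this dataset the ratio comes out at about 1.9x, just under the lower end of the advertised range.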
17. POSTGRESQL 9.5: FOREIGN TABLE INHERITANCE
• Fast INSERT and look-ups into current table.
• Periodically move data to archive table for compression.
• Query both via main table.
• Combined row-based and columnar store.
18. CLUSTERING
SELECT retweet_count FROM contest WHERE "user.id" = 13201312;
Time: 120.743 ms
CREATE INDEX user_id_post_id ON contest("user.id" ASC, "id" DESC);
CLUSTER contest USING user_id_post_id;
VACUUM contest;
Time: 4.128 ms
https://github.com/reorg/pg_repack
There is no CLUSTER statement in the SQL standard.
bloating
19. WHAT ELSE?
• UPSERT: INSERT… ON CONFLICT DO NOTHING/UPDATE (9.5)
• Partial indexes (9.2)
• Materialized views (9.3)
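The UPSERT item above can be tried without a server. The sketch below uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, which is an assumption made only so the example is self-contained; SQLite (3.24 or later) accepts the same INSERT ... ON CONFLICT clauses. The repo_stats table and its columns are hypothetical.

```python
import sqlite3

# SQLite stands in for PostgreSQL here only so the sketch runs without a server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE repo_stats (repo_id INTEGER PRIMARY KEY, events INTEGER NOT NULL)"
)

# DO UPDATE: insert a new counter row, or bump the existing one.
# "excluded" names the row that failed to insert. In PostgreSQL the existing
# column must be qualified as repo_stats.events to avoid an ambiguity error.
bump = """
    INSERT INTO repo_stats (repo_id, events) VALUES (?, 1)
    ON CONFLICT (repo_id) DO UPDATE SET events = events + excluded.events
"""
for repo_id in (42, 42, 7):
    conn.execute(bump, (repo_id,))

# DO NOTHING: silently skip a row that would violate the key.
conn.execute(
    "INSERT INTO repo_stats (repo_id, events) VALUES (7, 100) "
    "ON CONFLICT (repo_id) DO NOTHING"
)

rows = conn.execute(
    "SELECT repo_id, events FROM repo_stats ORDER BY repo_id"
).fetchall()
print(rows)  # [(7, 1), (42, 2)]
```

The second insert for repo 42 updates the existing row instead of failing, and the conflicting insert for repo 7 leaves its counter untouched.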
20. PROFILING AND DBA
• pg_stat_statements, pg_stat_activity, pg_buffercache
• https://github.com/PostgreSQL-Consulting/pg-utils
• https://github.com/ankane/pghero
• Many useful queries on the PostgreSQL wiki
• https://wiki.postgresql.org/wiki/Show_database_bloat
22. JSONB
CREATE INDEX login_idx ON github_events USING btree((org->>'login'));
CREATE INDEX login_idx2 ON github_events USING gin(org jsonb_value_path_ops);
jsonb_path_value_ops
(hash(path_item_1.path_item_2. ... .path_item_n); value)
jsonb_value_path_ops
(value; bloom(path_item_1) | bloom(path_item_2) | ... | bloom(path_item_n))
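The two key layouts above can be illustrated with a toy Python sketch that builds both kinds of key from a nested document. It is only a model of the formulas on the slide: the CRC32 hash, the 32-bit one-bit-per-item Bloom signature, and the sample document are assumptions, not what the jsquery extension does internally.

```python
import zlib
from typing import Any, Iterator, Tuple

Path = Tuple[str, ...]

def leaves(doc: Any, path: Path = ()) -> Iterator[Tuple[Path, Any]]:
    """Yield (path, scalar value) for every leaf of a nested dict."""
    if isinstance(doc, dict):
        for key, child in doc.items():
            yield from leaves(child, path + (key,))
    else:
        yield path, doc

def path_value_key(path: Path, value: Any) -> Tuple[int, Any]:
    """(hash(path_item_1.path_item_2. ... .path_item_n); value)"""
    return (zlib.crc32(".".join(path).encode()), value)

def bloom(item: str, bits: int = 32) -> int:
    """Toy Bloom signature: a single bit chosen by hashing the path item."""
    return 1 << (zlib.crc32(item.encode()) % bits)

def value_path_key(path: Path, value: Any) -> Tuple[Any, int]:
    """(value; bloom(path_item_1) | bloom(path_item_2) | ... | bloom(path_item_n))"""
    signature = 0
    for item in path:
        signature |= bloom(item)
    return (value, signature)

# Hypothetical document shaped like the org column.
doc = {"org": {"login": "postgres", "id": 1}}
pv_keys = [path_value_key(p, v) for p, v in leaves(doc)]
vp_keys = [value_path_key(p, v) for p, v in leaves(doc)]
print(pv_keys)
print(vp_keys)
```

In the first layout the whole path collapses into one hash, so a key can only be matched when the full path is known. In the second, the signature can be tested for any subset of path items with a bitwise AND, at the cost of possible false positives.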
23. JSQUERY
CREATE TABLE js (
  id serial,
  data jsonb,
  CHECK (data @@ '
    name IS STRING AND
    similar_ids.#: IS NUMERIC AND
    points.#:(x IS NUMERIC AND y IS NUMERIC)'::jsquery));