How to Find Patterns in Your Data with SQL - Chris Saxon
The document discusses various SQL techniques for finding patterns in data, including identifying consecutive dates and dates that fall within the same week. It provides examples of using regular expressions, window functions, and Oracle Database 12c's MATCH_RECOGNIZE clause to analyze a sample running log dataset and determine consecutive runs, runs within the same week, and consecutive weeks with a minimum number of runs. The document compares different approaches like MATCH_RECOGNIZE versus the Tabibitosan method.
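As a sketch of the MATCH_RECOGNIZE approach the deck describes, the following query finds streaks of runs on consecutive days. The running_log table and run_date column are assumed names for illustration, not taken from the slides:

```sql
-- Hypothetical running log: one row per day you ran.
-- Each match is a streak of consecutive dates; CONSECUTIVE rows
-- are those exactly one day after the previous row.
SELECT first_run, last_run, runs
FROM   running_log
MATCH_RECOGNIZE (
  ORDER BY run_date
  MEASURES FIRST(run_date) AS first_run,
           LAST(run_date)  AS last_run,
           COUNT(*)        AS runs
  PATTERN  (init consecutive*)
  DEFINE   consecutive AS run_date = PREV(run_date) + 1
);
```

The undefined variable `init` matches any row, so each match starts anywhere and extends for as long as the one-day gap holds.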
18(ish) Things You'll Love About Oracle Database 18c - Chris Saxon
An overview of the latest SQL & PL/SQL features in Oracle Database 18c, including:
- Polymorphic Table Functions
- Inline External Tables
- JSON improvements
Added in Oracle Database 18c, Polymorphic Table Functions (PTFs) allow you to change the shape of a result set at runtime. So you can add or remove columns from your results based on input parameters.
This presentation gives an overview of the why & how of PTFs.
Solving Performance Problems Using MySQL Enterprise Monitor - OracleMySQL
The document discusses using MySQL Enterprise Monitor to diagnose performance problems in a Group Replication cluster. It demonstrates creating a 3-node replication cluster using sandbox instances, loading sample data, and issuing problematic queries. It then shows how the Monitor identifies different types of performance issues from the query analyzer and detected events, including queries with full table scans, sorts, long-running queries, and replication outages.
Perth APAC Groundbreakers tour - SQL Techniques - Connor McDonald
This document discusses using SQL to perform various tasks like summarizing and transforming data, handling errors, and self-documenting queries. It provides examples of using pivot/unpivot clauses to change data orientation, query block naming to add descriptive comments to queries, and DBMS_ERRLOG to log errors from DML statements. It also discusses using partitioned outer joins to report bookings by hour including empty timeslots.
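As an illustration of the DBMS_ERRLOG technique mentioned above (the table and column names here are hypothetical):

```sql
-- Create the ERR$_ORDERS error log table once, then let rows that
-- violate constraints be logged rather than aborting the whole insert.
BEGIN
  dbms_errlog.create_error_log(dml_table_name => 'ORDERS');
END;
/

INSERT INTO orders (order_id, customer_id, amount)
SELECT order_id, customer_id, amount
FROM   staging_orders
LOG ERRORS INTO err$_orders ('nightly load')
REJECT LIMIT UNLIMITED;

-- Inspect the failures afterwards:
SELECT ora_err_number$, ora_err_mesg$, order_id
FROM   err$_orders;
```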
This document provides an overview and introduction to Oracle's Autonomous Database 18c. It begins with a brief history of database automation at Oracle, from early features like automatic undo management and storage management, to more advanced current capabilities like automatic SQL tuning and statistics gathering. It then discusses key aspects of the 18c Autonomous Database, including how it is fully managed as a service, incorporates artificial intelligence and machine learning capabilities, and aims to make databases easy to use with capabilities like one-click provisioning and elastic scaling. The document walks through the basic steps to create an Autonomous Database and load data, emphasizing security and ease of use. It also outlines available database services and how to configure client connectivity and create users.
This document discusses MySQL partitioning, including when and why to use partitioning, different types of partitioning, and how to manage partitions. It covers using partitioning for faster deletion of data by dropping partitions, faster queries through partition pruning, and making some operations like adding indexes faster by performing them on individual partitions rather than entire tables. It provides examples and best practices for using partitioning for both short and long term rolling of data.
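A minimal MySQL sketch of the two wins described above, fast deletion by dropping partitions and faster queries through pruning; the table and partition names are illustrative:

```sql
-- Hypothetical log table range-partitioned by month.
CREATE TABLE access_log (
  id         BIGINT NOT NULL,
  created_at DATE   NOT NULL,
  msg        VARCHAR(255)
)
PARTITION BY RANGE (TO_DAYS(created_at)) (
  PARTITION p2024_01 VALUES LESS THAN (TO_DAYS('2024-02-01')),
  PARTITION p2024_02 VALUES LESS THAN (TO_DAYS('2024-03-01')),
  PARTITION pmax     VALUES LESS THAN MAXVALUE
);

-- Fast deletion: drop a whole partition instead of running DELETE.
ALTER TABLE access_log DROP PARTITION p2024_01;

-- EXPLAIN shows that only the matching partition is scanned.
EXPLAIN SELECT * FROM access_log
WHERE  created_at >= '2024-02-01' AND created_at < '2024-03-01';
```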
Pattern Matching with SQL - APEX World Rotterdam 2019 - Connor McDonald
The document appears to be a presentation on pattern matching in SQL. It includes examples of different techniques for grouping and aggregating employee names by department over time, from older procedural approaches to newer analytic functions. It also discusses challenges that remain and provides a hypothetical example of analyzing customer transaction data to identify periods of fast or slow growth.
This document provides an overview and agenda for a presentation on Python and the MySQL Document Store. The presentation introduces JSON and how MySQL works with JSON documents, the MySQL Document Store and X DevAPI, and provides code examples for interacting with document collections using Connector/Python, including creating a collection, adding, finding, modifying, and removing documents.
Slides from the Oracle ANZ workshop held in Sydney and Melbourne. We look at the killer features that will make 18c and 19c great productivity upgrades for DBAs
This document discusses second-level caching in Java Persistence API (JPA). It begins with an introduction to the speaker's background and experience. The presentation agenda is then outlined, covering why caching is important, the differences between first-level and second-level caches in JPA, JPA configuration parameters and API for caching, and specifics of Hibernate and EclipseLink second-level caching implementations. Examples are provided that demonstrate performance improvements when utilizing caching strategies like fetching relationships, projections, and aggregation queries.
This document discusses restricting and sorting data in Oracle. It covers limiting rows using the WHERE clause, sorting rows using the ORDER BY clause, and using ampersand substitution in iSQL*Plus to restrict and sort output at runtime. Comparison operators, logical operators, NULL conditions, character strings, dates and BETWEEN/IN/LIKE conditions for the WHERE clause are explained. Best practices for ORDER BY are also provided.
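A compact example combining the clauses the deck covers; the classic EMPLOYEES sample schema is assumed:

```sql
SELECT last_name, salary, hire_date
FROM   employees
WHERE  salary BETWEEN 5000 AND 10000
AND    last_name LIKE 'S%'
AND    commission_pct IS NOT NULL
ORDER  BY salary DESC, last_name;

-- With ampersand substitution, iSQL*Plus prompts for the value at runtime:
-- SELECT last_name FROM employees WHERE department_id = &dept_id;
```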
Dependency injection is now part of nearly every Java project. But what is the difference between DI and CDI? How do you decide which fits best, which frameworks are available, and what the differences mean for you as a programmer? What could be an option for an IoT, desktop, or web project?
In this talk we get an overview of different frameworks and how they work. We look not only at the well-known big ones but also at some small, sometimes specialized, implementations.
Everybody knows the proxy pattern, but how can you use it effectively? What kinds of proxy patterns are available, and how can you build patterns more effectively with them? Why is reflection needed for this? Importantly, in most cases we need only the core JDK.
This tutorial starts from the basics and continues on to DynamicProxies, DynamicObjectAdapters and DynamicStaticProxies at runtime, StaticObjectAdapters, and more.
The session, based on the German book Dynamic Proxies by Heinz Kabutz and the session's presenter, takes a deep dive into this pattern group.
How a Big Data platform scaled from zero to billions of records within 6 months at ISCPIF (CNRS).
This talk covers our use of Elasticsearch, MongoDB, Redis, RabbitMQ, and scalable, highly available web services built on a Big Data architecture.
This talk was given at Université Paris-Sud, LAL, Bâtiment 200, at an event organized by ARGOS. https://indico.mathrice.fr/event/2/overview
ISCPIF: http://iscpif.fr
Big Data at ISCPIF: http://bigdata.iscpif.fr
Climate at ISCPIF: http://climate.iscpif.fr
Playground for climate: http://climate.iscpif.fr/playground
Tweetoscope: http://tweetoscope.iscpif.fr
This document discusses storing product and order data as JSON in a database to support an agile development process. It describes creating tables with JSON columns to store this data, and using JSON functions like JSON_VALUE and JSON_TABLE to query and transform the JSON data. Examples are provided of indexing JSON columns for performance and updating product JSON to include unit costs by joining external data. The goal is to enable flexible and rapid evolution of the application through storing data in JSON.
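A sketch of the JSON_VALUE and JSON_TABLE usage described above, in Oracle syntax; the table name, document shape, and JSON paths are assumptions for illustration:

```sql
-- Hypothetical PRODUCTS table with one JSON document per row.
CREATE TABLE products (
  product_id   INTEGER PRIMARY KEY,
  product_json VARCHAR2(4000) CHECK (product_json IS JSON)
);

-- Extract a scalar with JSON_VALUE...
SELECT json_value(product_json, '$.name') AS product_name
FROM   products;

-- ...and flatten an array of variants into relational rows with JSON_TABLE.
SELECT p.product_id, jt.sku, jt.unit_cost
FROM   products p,
       json_table(
         p.product_json, '$.variants[*]'
         COLUMNS (
           sku       VARCHAR2(20) PATH '$.sku',
           unit_cost NUMBER       PATH '$.unit_cost'
         )
       ) jt;
```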
The document discusses strategies for loading data in JPA, specifically lazy vs eager loading. It provides examples of when each strategy would be preferred, noting that eager loading is generally better for loading a small amount of related data to avoid additional database queries, while lazy loading is preferred when the related data is large or its needed fields are unknown. It also discusses using lazy loading for large objects like blobs and clobs.
Flashback is not only for those "Oh No!" moments when we make a mistake. It enables benefits for developers ranging from data consistency to continuous integration and data auditing. Tucked away in Enterprise Edition are six independent and powerful technologies that might just save your career, and they open up a myriad of other benefits as well.
The document discusses locking and concurrency control in databases, demonstrating how table locks, row locks, and multi-version concurrency control work through examples of a database being backed up while concurrent changes are made. It shows how different locking strategies, like those used in MyISAM and InnoDB, allow for concurrent access to data while maintaining consistency and isolation. A live demo then highlights deadlocks and lock waits that can occur with concurrent access and how they are handled.
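The lock-wait and deadlock behavior described above can be reproduced with two client sessions; the accounts table here is a made-up example, not from the deck:

```sql
-- Two-session sketch of an InnoDB row-lock wait.
-- Session 1:
START TRANSACTION;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;  -- holds a row lock

-- Session 2 (blocks until session 1 commits, or hits
-- innodb_lock_wait_timeout):
UPDATE accounts SET balance = balance + 100 WHERE id = 1;

-- Session 1:
COMMIT;  -- releases the lock; session 2 proceeds

-- If the two sessions instead lock rows 1 and 2 in opposite orders,
-- InnoDB detects a deadlock and rolls one transaction back:
-- ERROR 1213 (40001): Deadlock found when trying to get lock
```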
MySQL Goes to 8! FOSDEM 2020 Database Track, January 2nd, 2020 - Geir Høydalsvik
Here are the basic steps to clone a MySQL instance using the CLONE statement (available from MySQL 8.0.17 via the clone plugin) directly from SQL:
1. Install the clone plugin on both the donor (source) and the recipient, and create a clone user on the donor with the BACKUP_ADMIN privilege.
2. On the recipient, add the donor to clone_valid_donor_list, then issue the CLONE statement. For example:
CLONE INSTANCE FROM 'clone_user'@'donor_host':3306 IDENTIFIED BY 'password';
3. The clone operation copies the data files from the donor and restarts the recipient on the cloned data; the recipient keeps its own configuration files.
4. Once complete, the cloned instance is ready for use as a replica or as an independent instance.
By automating the provisioning
The document discusses setting up MySQL high availability using InnoDB Cluster. It provides instructions for installing MySQL 8.0, configuring three MySQL instances for the cluster, creating the cluster, and running the MySQL Router for load balancing and failover. The Router is configured to route reads from two read-only slaves, while writes go to a single read-write master. Connections are tested from each instance to verify the Router is load balancing correctly.
MySQL NoSQL JSON JS Python "Document Store" demo - Keith Hollman
A Demo of the NoSQL JSON schemaless solution that is native in MySQL aka Document Store.
Presented at SPOUG Technical Conference 25-Sep-18, Madrid, Spain.
Top 10 SQL Performance tips & tricks for Java Developers - gvenzl
This slide deck contains some of the most common database performance tips and tricks that developers can use to tune their applications or systems. It also highlights some anti-patterns and shows the impact of these anti-patterns in regard to performance.
This slide deck does not aim to be a complete list of all possibilities and techniques to achieve better performance but just highlights some very commonly seen mistakes and how to avoid them.
The document discusses MySQL 5.7 which integrates both SQL and NoSQL capabilities. It provides instructions for a workshop on using MySQL Shell to interact with MySQL 5.7 and its document store functionality. The workshop covers installing the MySQL X Plugin, loading sample data, querying and modifying collections and tables, and handling errors.
This document summarizes new features in MySQL replication introduced in versions 5.6 and 5.7. Key features discussed include binary log group commit for improved performance, optimized row-based replication with partial binary logging, multi-threaded slave replication, global transaction identifiers for topologies with multiple masters, transactional metadata storage, and binary log event checksums. The document provides examples and explanations of how these features improve high availability, scalability and reliability of MySQL replication deployments.
Performance Schema and Sys Schema in MySQL 5.7 - Mark Leith
MySQL 5.7 now includes the Sys Schema by default, which builds upon the awesome instrumentation framework laid by Performance Schema.
Performance Schema has had 23 worklogs completed in 5.7 alone, such as memory instrumentation, tying transactions and stored programs into the current statement/stage/wait instruments and wait graph, prepared statement instruments, metadata lock information, improved session status and variable reporting, the new structured replication tables, and more.
The Sys schema builds upon this strong foundation with easy reporting views and functions, as well as procedures to help both set up and manage the configuration of Performance Schema, and help diagnose performance issues with your database instances on the whole.
Come along and hear from the original developer of the Sys schema about all of these exciting improvements in MySQL instrumentation for the upcoming MySQL 5.7 release!
This document discusses the Performance Schema in MySQL, which records instrumentation data to help profile and monitor database activity. It provides an overview of the Performance Schema's components and tables, how it has evolved between MySQL versions to include more metrics and functionality, and examples of how to query the tables to analyze wait events, statements, stages and other performance data.
The document discusses NoSQL APIs in MySQL. It provides an overview of the memcached caching system and the history of the HandlerSocket protocol. It then describes the NoSQL interface introduced in MySQL 5.6, which allows for memcached-style operations on MySQL data. It notes that MySQL 5.7 further improved the performance and scalability of this interface.
MySQL Troubleshooting with the Performance Schema - Sveta Smirnova
This document discusses using the Performance Schema in MySQL to troubleshoot performance issues. It provides an overview of the Performance Schema and what information it collects. It then discusses how to use specific Performance Schema tables like events_statements_history_long, events_stages_history_long, and others to identify statements that examine too many rows, issues with index usage, and which internal operations are taking a long time. The document provides examples of queries to run and what to look for in the Performance Schema output to help troubleshoot and optimize SQL statements.
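One way to surface statements that examine far more rows than they return, a common sign of a missing index, is to query the statement digest summary table (available since MySQL 5.6):

```sql
-- Rank statement digests by rows examined per row sent.
SELECT digest_text,
       count_star,
       sum_rows_examined,
       sum_rows_sent
FROM   performance_schema.events_statements_summary_by_digest
WHERE  sum_rows_sent > 0
ORDER  BY sum_rows_examined / sum_rows_sent DESC
LIMIT  10;
```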
This document summarizes new features and improvements in MySQL 8.0. Key highlights include utf8mb4 becoming the default character set to support Unicode 9.0, performance improvements for utf8mb4 of up to 1800%, continued enhancements to JSON support including new functions, expanded GIS functionality including spatial reference system support, and new functions for working with UUIDs and bitwise operations. It also provides a brief history of MySQL and outlines performance improvements seen in benchmarks between MySQL versions.
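As a small example of the new UUID helpers mentioned above (MySQL 8.0; the table is made up):

```sql
-- UUID_TO_BIN packs a UUID into 16 bytes; the optional swap flag
-- reorders the time components so sequential UUIDs index better.
CREATE TABLE devices (
  id   BINARY(16) PRIMARY KEY,
  name VARCHAR(100)
);

INSERT INTO devices VALUES (UUID_TO_BIN(UUID(), true), 'sensor-1');

SELECT BIN_TO_UUID(id, true) AS id, name FROM devices;
```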
Slides from the APEX Connect conference. This session covered the background of parsing a SQL statement, the risks and best practices, and an introduction to the read-consistency feature in the Oracle Database
The document provides an overview of using Python to connect to and query a MySQL database configured for high availability. It discusses MySQL replication, group replication, and connectors that enable multi-host connections. It also demonstrates connecting to MySQL from Python, executing queries, handling errors, and implementing a simple web application using Flask that connects to MySQL to call a stored procedure.
20190615 hkos-mysql-troubleshootingandperformancev2 - Ivan Ma
MySQL Troubleshooting at the Hong Kong Open Source Conference 2019: how to use sys.diagnostics() and the dimitri (http://dimitrik.free.fr/) tools for performance analysis.
If you use scripting languages to power web, mobile, and/or enterprise applications, this session will show you how to use Oracle Database efficiently. It demonstrates how PHP can take full advantage of new performance, scalability, availability, and security features in Oracle Database 12c and Oracle Linux. Many of these features are available to other C-based scripting languages too, like Ruby, Python and Perl. DBAs will also benefit by seeing how applications can be monitored.
Graal is a dynamic meta-circular research compiler for Java that is designed for extensibility and modularity. One of its main distinguishing elements is the handling of optimistic assumptions obtained via profiling feedback and the representation of deoptimization guards in the compiled code. Truffle is a self-optimizing runtime system on top of Graal that uses partial evaluation to derive compiled code from interpreters. Truffle is suitable for creating high-performance implementations for dynamic languages with only moderate effort. The presentation includes a description of the Truffle multi-language API and performance comparisons within the industry of current prototype Truffle language implementations (JavaScript, Ruby, and R). Both Graal and Truffle are open source and are themselves research platforms in the area of virtual machine and programming language implementation (http://openjdk.java.net/projects/graal/).
Ruby on Rails is a web application framework built using the Ruby programming language. It is designed to make web development simpler and more productive. Some key principles of Ruby on Rails include convention over configuration, don't repeat yourself (DRY), and opinionated software. Ruby on Rails integrates with Oracle databases using various Oracle adapters and gems that allow access to Oracle data from Ruby and Rails applications.
This document discusses various topics related to MySQL administration including access control, diagnostic data, log files, and backups. It provides examples of how to configure user accounts and privileges using commands like CREATE USER, ALTER USER, GRANT, and REVOKE. It also explains how to view diagnostic information using SHOW commands, the INFORMATION_SCHEMA, and the SYS schema. Finally, it covers MySQL's different log files and various options for logical and physical backups.
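A sketch of the account-management commands described above; the user, host range, and schema names are made up for illustration:

```sql
-- Create an application account restricted to one subnet.
CREATE USER 'app'@'10.0.0.%' IDENTIFIED BY 'S3cret!';

-- Grant only the privileges the application needs.
GRANT SELECT, INSERT, UPDATE ON shopdb.* TO 'app'@'10.0.0.%';

-- Force a password rotation policy for the account.
ALTER USER 'app'@'10.0.0.%' PASSWORD EXPIRE INTERVAL 90 DAY;

-- Take a privilege away again, and verify the result.
REVOKE INSERT ON shopdb.* FROM 'app'@'10.0.0.%';
SHOW GRANTS FOR 'app'@'10.0.0.%';
```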
MySQL 8 High Availability with InnoDB Clusters - Miguel Araújo
MySQL’s InnoDB cluster provides a high-level, easy-to-use solution for MySQL high availability. Combining MySQL Group Replication with MySQL Router and the MySQL Shell into an integrated solution, InnoDB clusters offer easy setup and management of MySQL instances into a fault-tolerant database service. In this session learn how to set up a basic InnoDB cluster, integrate it with applications, and recognize and react to common failure scenarios that would otherwise lead to a database outage.
- Workshop presentation
Similar to Regular Expressions with full Unicode support (20)
Analysis insight about a Flyball dog competition team's performance - roli9797
Insight of my analysis about a Flyball dog competition team's last year performance. Find more: https://github.com/rolandnagy-ds/flyball_race_analysis/tree/main
Global Situational Awareness of A.I. and where its headedvikram sood
You can see the future first in San Francisco.
Over the past year, the talk of the town has shifted from $10 billion compute clusters to $100 billion clusters to trillion-dollar clusters. Every six months another zero is added to the boardroom plans. Behind the scenes, there’s a fierce scramble to secure every power contract still available for the rest of the decade, every voltage transformer that can possibly be procured. American big business is gearing up to pour trillions of dollars into a long-unseen mobilization of American industrial might. By the end of the decade, American electricity production will have grown tens of percent; from the shale fields of Pennsylvania to the solar farms of Nevada, hundreds of millions of GPUs will hum.
The AGI race has begun. We are building machines that can think and reason. By 2025/26, these machines will outpace college graduates. By the end of the decade, they will be smarter than you or I; we will have superintelligence, in the true sense of the word. Along the way, national security forces not seen in half a century will be un-leashed, and before long, The Project will be on. If we’re lucky, we’ll be in an all-out race with the CCP; if we’re unlucky, an all-out war.
Everyone is now talking about AI, but few have the faintest glimmer of what is about to hit them. Nvidia analysts still think 2024 might be close to the peak. Mainstream pundits are stuck on the wilful blindness of “it’s just predicting the next word”. They see only hype and business-as-usual; at most they entertain another internet-scale technological change.
Before long, the world will wake up. But right now, there are perhaps a few hundred people, most of them in San Francisco and the AI labs, that have situational awareness. Through whatever peculiar forces of fate, I have found myself amongst them. A few years ago, these people were derided as crazy—but they trusted the trendlines, which allowed them to correctly predict the AI advances of the past few years. Whether these people are also right about the next few years remains to be seen. But these are very smart people—the smartest people I have ever met—and they are the ones building this technology. Perhaps they will be an odd footnote in history, or perhaps they will go down in history like Szilard and Oppenheimer and Teller. If they are seeing the future even close to correctly, we are in for a wild ride.
Let me tell you what we see.
Learn SQL from basic queries to Advance queriesmanishkhaire30
Dive into the world of data analysis with our comprehensive guide on mastering SQL! This presentation offers a practical approach to learning SQL, focusing on real-world applications and hands-on practice. Whether you're a beginner or looking to sharpen your skills, this guide provides the tools you need to extract, analyze, and interpret data effectively.
Key Highlights:
Foundations of SQL: Understand the basics of SQL, including data retrieval, filtering, and aggregation.
Advanced Queries: Learn to craft complex queries to uncover deep insights from your data.
Data Trends and Patterns: Discover how to identify and interpret trends and patterns in your datasets.
Practical Examples: Follow step-by-step examples to apply SQL techniques in real-world scenarios.
Actionable Insights: Gain the skills to derive actionable insights that drive informed decision-making.
Join us on this journey to enhance your data analysis capabilities and unlock the full potential of SQL. Perfect for data enthusiasts, analysts, and anyone eager to harness the power of data!
#DataAnalysis #SQL #LearningSQL #DataInsights #DataScience #Analytics
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...Aggregage
This webinar will explore cutting-edge, less familiar but powerful experimentation methodologies which address well-known limitations of standard A/B Testing. Designed for data and product leaders, this session aims to inspire the embrace of innovative approaches and provide insights into the frontiers of experimentation!
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...sameer shah
"Join us for STATATHON, a dynamic 2-day event dedicated to exploring statistical knowledge and its real-world applications. From theory to practice, participants engage in intensive learning sessions, workshops, and challenges, fostering a deeper understanding of statistical methodologies and their significance in various fields."
I am…
I've worked with MySQL since time immemorial, at MySQL AB.
I work from Uppsala, Sweden, the former head office of MySQL. Swedish is my native tongue. That makes a difference, as you will see.
So what’s all this about? We switched our regex library in 8.0.4. At the time I blogged about it here.
The old one was written by Henry Spencer in 1986. It's called regexp, and it's a very good regex library. It has been used widely in the Unix realm and is part of the POSIX standard. It's also called "the book regexp library" because he updated it for the book Software Solutions in C in 1994.
It made its way into Tcl, Postgres and even early Perl. Apparently Postgres still uses it.
Really good, with great performance. But it's ASCII-only, works byte by byte, and lacks many features.
It's not safe: it's easy to put into an infinite loop.
And you can only do a boolean search, not matching: it doesn't have a pattern buffer out of the box, and hence doesn't support search-and-replace.
And this was quite a popular request: four feature-request bugs just for getting the matched substring alone. We had 51 "Affects me" votes in total. CTEs had 59, but that's a really popular feature.
Now we have four functions:
REGEXP_INSTR → position of the match, before or after it
REGEXP_LIKE → boolean match
REGEXP_REPLACE → replaces a match, with capture groups
REGEXP_SUBSTR → the matched substring
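Outside MySQL, the shape of these four operations can be sketched with Python's re module (purely an illustration of what each function computes; the names in the comments are MySQL's, everything else here is plain Python):

```python
import re

text, pat = "barbarbeque", "b.r"
m = re.search(pat, text)

# REGEXP_LIKE -> boolean: does the pattern match anywhere?
assert m is not None

# REGEXP_INSTR -> position of the match (MySQL positions are 1-based)
assert m.start() + 1 == 1

# REGEXP_SUBSTR -> the matched substring itself
assert m.group(0) == "bar"

# REGEXP_REPLACE -> replace matches, with capture-group back references
assert re.sub(r"(b.r)", r"<\1>", text) == "<bar><bar>beque"
```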
So here’s the agenda. On top, we have security, which is why we chose ICU. Perhaps not the obvious choice given the candidates. It also has close ties to Unicode.
What I won't cover here is all the features of regular expressions. These are documented in our manual, and you can always head to the ICU documentation.
My ambition is to teach you how to work efficiently and securely with Unicode, and to give some insight into where common wisdom breaks down.
I presented here 3 years ago and had a really good time, so I wanted to go again. I told my boss: what am I going to talk about? I haven't really added anything new since last time. All I can think of is the new regular expressions. "Tell them about that," he said. "They'll love that." So I submitted this talk as a 20-minute presentation. Not only did it get accepted, it got upgraded to 30 minutes. I couldn't think of much to say, so I asked around: "What do YOU want to know about regular expressions with Unicode?" Nobody had a clue. So that's why I just picked some common pitfalls that I consider tricky.
The way a malicious user can exploit regex matching is by exhausting memory, or by creating an infinite loop that consumes all the CPU time.
Out of the box there's always a cap on runtime.
The runtime limit is specified in "steps of the match engine", which is a bit vacuous. The correspondence with actual processor time depends on the speed of the processor and the details of the specific pattern, but it will typically be on the order of milliseconds.
The engine matches the first A, captures it, then repeats that match. It backtracks, matches the second A, repeats that, and so on. Eventually it fails because of the C.
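The backtracking explosion described above can be reproduced in Python (an illustration only; MySQL's ICU engine counts "steps" rather than wall time, but the shape of the problem is the same):

```python
import re
import time

# (a+)+c can never match a string of a's ending in 'b', but the nested
# quantifiers make the engine try exponentially many ways of splitting
# the run of a's before giving up: each extra 'a' roughly doubles the work.
pattern = re.compile(r"(a+)+c")

for n in (5, 10, 15, 18):
    start = time.perf_counter()
    assert pattern.match("a" * n + "b") is None
    print(f"n={n:2d}: {time.perf_counter() - start:.5f}s")
```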
The limit is set conservatively to 32 (secure by default).
Here I'm trying to run out of memory. You really have to provoke it: otherwise you reach the time limit first.
Match the empty string 120 times, repeat that 11 times, repeat that 11 times, and so on.
This consumes the backtracking stack used by the engine.
The limit is in bytes; here I'm choking it down to 239 bytes.
The default size is 80 MB. I never managed to DoS the server.
So… about the ICU library.
What is the ICU library? It's a set of i18n libraries that provide globalization and Unicode support for software applications. It has an open-source license; from what I gather it's compatible with the GNU licenses, but IANAL.
It's used by Java, Apple, Amazon, IBM…
The Unicode Consortium is mostly known for emoji nowadays. New releases of Unicode typically contain new emoji, and you have to be able to search for them (remember the "haha-papa", a.k.a. "sushi-beer", bug). So regexps have to support them.
💬 5 billion emojis are sent daily on Facebook Messenger
📸 By mid-2015, half of all comments on Instagram included an emoji
🍑 Only 7% of people use the peach emoji as a fruit
The rest mostly use it as a butt or for other non-fruit uses, according to Emojipedia.
In a sense, ICU is Unicode: it supports all of Unicode.
We ship ICU with MySQL, and optionally build with the bundled copy. We ship version 59.1; I notice Ubuntu 18.04 ships 60.
There's the internationalization library, which contains the regexp engine and charsets. That's all we use right now, and all we bundle. The common library contains things like the BreakIterator, which helps you work with grapheme clusters. I won't go into grapheme clusters in this presentation; we don't handle those yet.
The data library is not used currently, and we don't ship it. It's fairly big and not needed for regexp.
Let me tell you a bit about Unicode.
It specifies three encodings.
UTF-32:
+ Constant size
+ Maps 1-to-1 to Unicode code points
- Space-consuming
UTF-8:
+ Optimized for Western/ASCII text
+ Small (for Western text)
+ Self-synchronizing (though what isn't?)
- Variable size
It's the de-facto standard for the web, used by 92.9% of websites.
UTF-16:
Generally regarded as the worst of both worlds:
- Bigger than UTF-8
- Not fixed-size like UTF-32
+ Mostly constant-size (all of the Basic Multilingual Plane fits in one code unit)
+ Also self-synchronizing
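A quick way to see these trade-offs is to encode a few characters from different ranges (ASCII, Latin-1 supplement, BMP, and an emoji outside the BMP) in all three encodings; a Python sketch:

```python
# Size of one character in each encoding: UTF-8 varies from 1 to 4 bytes,
# UTF-16 from 1 to 2 code units, UTF-32 is always exactly 1 code unit.
for ch in "A\u00e5\u20ac\U0001F600":           # A, å, €, 😀
    utf8 = len(ch.encode("utf-8"))             # bytes
    utf16 = len(ch.encode("utf-16-le")) // 2   # 16-bit code units
    utf32 = len(ch.encode("utf-32-le")) // 4   # 32-bit code units
    print(f"U+{ord(ch):05X}  utf8={utf8}  utf16={utf16}  utf32={utf32}")
```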
Surrogate pairs.
They're broken in Java. How?
Alas, UTF-16 is what ICU uses.
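What a surrogate pair looks like, sketched in Python: a code point above U+FFFF is split into two 16-bit code units in the reserved D800-DBFF and DC00-DFFF ranges.

```python
ch = "\U0001F600"                     # 😀, outside the Basic Multilingual Plane
units = ch.encode("utf-16-be")
high, low = units[:2].hex(), units[2:].hex()
print(high, low)                      # d83d de00: the surrogate pair

# Reconstruct the code point from the pair, per the UTF-16 definition:
cp = 0x10000 + ((int(high, 16) - 0xD800) << 10) + (int(low, 16) - 0xDC00)
assert cp == ord(ch) == 0x1F600
```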
So the way we use ICU is: unless the search starts at the first character, we count the code points before the start position, convert the rest to UTF-16, and search with ICU. We use ICU's C API; there is also a C++ API.
So, I have two examples of how to work with Unicode.
You can specify case sensitivity in three ways. Mode modifiers inside the regexp have the highest priority.
If there are no mode modifiers, the match_parameter argument is used. It's a string of modifiers: 'c' means case-sensitive, 'i' means case-insensitive.
If there are none of those, we look at the collation. There are rules for computing which collation should be used in any comparison, and they apply here.
Case insensitivity seems simple at first: text is normalized by transforming it to the same case, then compared.
On the next slide we see how such a case mapping could look.
Totally obvious, right? One character maps to exactly one character. This is called simple case-insensitive matching.
Well there are some trickier cases.
The German Eszett (ß) is generally understood to be equivalent to two s's, so in full case-insensitive matching they should compare equal. Since no other language has an Eszett, this folding is part of the default folding.
I could go on all day about case mapping; it's a 61-page document in the Unicode standard. But these are the essentials.
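The simple-versus-full distinction can be demonstrated with Python's built-in string methods (an analogy, not MySQL's code path: str.lower() is roughly the simple one-to-one mapping, while str.casefold() applies Unicode full case folding):

```python
# Simple mapping: one character maps to one character, so ß stays ß.
assert "\u00df".lower() == "\u00df"        # ß

# Full case folding: the Eszett folds to two s's.
assert "\u00df".casefold() == "ss"

# So under full folding these two spellings compare equal:
assert "STRASSE".casefold() == "stra\u00dfe".casefold()
```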
This example is a little more complicated for me.
Here one letter obviously maps to two letters. Actual letters, not just code points.
If you paste them and press backspace, the little I goes away.
In this case they're different, but it works the same way.
It's all Greek to me.
Full case folding is used when the pattern contains anything that looks like a character string, even just one character.
A match can never start within an expanded character.
The anchors here enforce a match that would 1) start in the middle or 2) end in the middle of an expanded character.
This is consistent with how collations work with the equals predicate.
The collation name is hard to read. It breaks down as: charset, language code, "pb" (I don't know what that is), accent-sensitive, case-sensitive.
Case folding can also be language-dependent. In the default case folding, capital I folds to small i (with a dot). However, in the Turkish case folding, a dotted capital İ folds to dotted lowercase i, and a dotless capital I folds to dotless lowercase ı.
So in a Turkish locale, the default folding is actually wrong.
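Python's case mapping is locale-independent, so it shows exactly the default (non-Turkish) behaviour described above, along with the code points involved:

```python
# Default mapping: dotless capital I lowers to dotted small i.
# This is precisely what a Turkish locale does NOT want (it expects ı).
assert "I".lower() == "i"

dotted_I = "\u0130"                    # İ LATIN CAPITAL LETTER I WITH DOT ABOVE
assert dotted_I.lower() == "i\u0307"   # i + COMBINING DOT ABOVE

dotless_i = "\u0131"                   # ı LATIN SMALL LETTER DOTLESS I
assert dotless_i.upper() == "I"
```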
Another problem with full Unicode and regexps: you need to be careful when you send non-ASCII data from a client. Here is a cautionary tale. I changed the variables character_set_connection, character_set_client and character_set_results, which is what SET NAMES does.
So, I create a table.
I populate it with the Swedish letter å.
Read back.
Check with a regular expression match.
So, everything is fine, right? Let's do a "trust but verify" here: I want to see what's actually in the table. The problem is that the data will always be converted to my character set. I could apply a function to it on the server side, but all functions also convert their arguments. What to do? All functions save one: the HEX() function. It will tell the truth.
So here we have… what? Is this really an a-with-ring? Let's check.
This is not å in any encoding. What is going on?
My terminal is UTF-8, so when I type å on my Swedish keyboard, it sends c3 a5 to the server. Now, when I set character_set_client to latin1, what I really said was "interpret these bytes as latin1". Fine: c3 a5, that's Ã¥ (A-tilde, yen sign). It stores that. But the table column is utf8, so the server converts.
And that becomes
Now, when I do a SELECT, the server reads character_set_results: oh yeah, you speak latin1, let me translate for you.
And so we're back full circle.
This is especially tricky with latin1, since any byte sequence is valid latin1: no conversion check fails.
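The whole round trip can be reproduced outside MySQL with Python's codecs (assumed setup as in the tale above: a UTF-8 terminal, a client character set declared as latin1, and a utf8 table column):

```python
typed = "\u00e5".encode("utf-8")       # å on the keyboard -> c3 a5 on the wire
assert typed.hex() == "c3a5"

# The server was told the client speaks latin1, so it reads the two
# bytes as two latin1 characters: Ã and ¥.
misread = typed.decode("latin-1")
assert misread == "\u00c3\u00a5"

# Those two characters are then converted into the utf8 column;
# this is the double-encoded garbage that HEX() reveals.
stored = misread.encode("utf-8")
assert stored.hex() == "c383c2a5"

# On SELECT, converting utf8 -> latin1 results reverses the damage,
# so the client prints å and everything *looks* fine.
round_trip = stored.decode("utf-8").encode("latin-1")
assert round_trip == typed
```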
So here's a power tip for troubleshooting your multilingual regexps: if you use hex codes and character set introducers, it's totally unambiguous, as you see here.