The document discusses a study comparing the SQL optimizer in Oracle and Hive query execution. It aims to understand how the SQL optimizer works in Oracle by generating query plans using Explain and comparing performance to queries executed on Hive. Various query types including single relations, joins, aggregates, and subqueries are executed on both Oracle and Hive and their plans and performance are analyzed and compared to understand how each system optimizes queries and executes them efficiently.
Exploring Advanced SQL Techniques Using Analytic Functions, by Zohar Elkayam
A session I presented at BGOUG in June 2016.
Even though DBAs and developers write SQL queries every day, advanced SQL techniques such as multidimensional aggregation and analytic functions still remain relatively unknown. In this session, we will explore some common real-world uses for analytic functions and see how to take advantage of this useful tool. We will dive deep into ranking based on values and groups, aggregating over multiple dimensions without a GROUP BY, performing inter-row calculations, and much more.
Together we will see how we can unleash the power of analytics using Oracle 11g best practices and Oracle 12c new features.
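The three techniques named in the abstract (ranking within groups, aggregating without a GROUP BY, and inter-row calculations) can be sketched with standard window functions. This is only an illustration, not material from the session: it uses Python's built-in `sqlite3` module rather than Oracle, assumes SQLite 3.25 or later for window-function support, and the `emp` table and its rows are hypothetical.

```python
import sqlite3

# Hypothetical sample data: (employee, department, salary).
rows = [
    ("Ann", "Sales", 500),
    ("Bob", "Sales", 300),
    ("Cid", "Sales", 300),
    ("Dee", "IT", 800),
    ("Eli", "IT", 400),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany("INSERT INTO emp VALUES (?, ?, ?)", rows)

query = """
SELECT name,
       dept,
       salary,
       -- ranking within each department; ties share a rank
       RANK() OVER (PARTITION BY dept ORDER BY salary DESC) AS dept_rank,
       -- per-department total on every row, with no GROUP BY
       SUM(salary) OVER (PARTITION BY dept) AS dept_total,
       -- inter-row calculation: difference from the previous row
       salary - LAG(salary) OVER (
           PARTITION BY dept ORDER BY salary DESC, name
       ) AS diff_from_prev
FROM emp
ORDER BY dept, salary DESC, name
"""

result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Every input row survives in the output, which is the main difference from a GROUP BY query: the aggregate is attached to each row instead of collapsing the rows.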
Exploring Advanced SQL Techniques Using Analytic Functions, by Zohar Elkayam
A session I presented at ILOUG in May 2016.
Even though DBAs and developers write SQL queries every day, advanced SQL techniques such as multidimensional aggregation and analytic functions still remain relatively unknown. In this session, we will explore some common real-world uses for analytic functions and see how to take advantage of this useful tool. We will dive deep into ranking based on values and groups, aggregating over multiple dimensions without a GROUP BY, performing inter-row calculations, and much more.
Together we will see how we can unleash the power of analytics using Oracle 11g best practices and Oracle 12c new features.
Oracle Week 2015 presentation (Presented on November 15, 2015)
Agenda:
Aggregative and advanced grouping options
Analytic functions, ranking and pagination
Hierarchical and recursive queries
Oracle 12c's new row pattern matching feature
XML and JSON handling with SQL
Regular Expressions
SQLcl – a new replacement tool for SQL*Plus from Oracle
Oracle Advanced SQL and Analytic Functions, by Zohar Elkayam
Even though DBAs and developers are writing SQL queries every day, it seems that advanced SQL techniques such as multidimensional aggregation and analytic functions still remain relatively unknown. In this session, we will explore some of the common real-world uses for analytic functions and understand how to take advantage of this great and useful tool. We will deep dive into ranking based on values and groups, understand aggregation of multiple dimensions without a GROUP BY, see how to do inter-row calculations, and much more.
These are the slides presented at Kscope 17 on June 28, 2017.
This document provides guidelines for conducting practical exercises for an Advanced Database Management Systems course. It includes 11 exercises covering concepts like SQL statements, functions, normalization, joins, views, and PL/SQL programming. Students are expected to complete the exercises over 12 sessions in 7 days under faculty guidance. Exercises are assessed and students must score a minimum of 40% combined on guided and unguided assessments to pass. The document outlines software and hardware requirements and provides instructions for completing the exercises and documenting the work.
This document discusses cube, rollup and materialized views in Oracle databases. It provides an overview of how cube and rollup extend the GROUP BY clause to automatically calculate subtotals and totals. It also discusses how materialized views can store the results of a query to improve performance for frequent or complex queries. The document includes examples demonstrating how to use cube, rollup, and materialized views.
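To make the subtotal behaviour concrete, here is a minimal sketch of the rows a rollup over two columns produces. It is an emulation, not the Oracle feature itself: SQLite (used here through Python's `sqlite3` module) has no ROLLUP, so the same result is built with UNION ALL. The `sales` table and its data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("East", "A", 10), ("East", "B", 20), ("West", "A", 5), ("West", "B", 15)],
)

# In Oracle the three levels below come from a single clause:
#   SELECT region, product, SUM(amount)
#   FROM sales GROUP BY ROLLUP (region, product)
# Here each level is written out by hand; NULL marks a rolled-up column.
rollup_emulation = """
SELECT region, product, SUM(amount) AS total
FROM sales GROUP BY region, product      -- detail level
UNION ALL
SELECT region, NULL, SUM(amount)
FROM sales GROUP BY region               -- subtotal per region
UNION ALL
SELECT NULL, NULL, SUM(amount)
FROM sales                               -- grand total
"""

rollup_rows = conn.execute(rollup_emulation).fetchall()
# Sort for display only; None (SQL NULL) is mapped to "" so rows compare.
for row in sorted(rollup_rows, key=lambda r: (r[0] or "", r[1] or "")):
    print(row)
```

CUBE goes one step further and would also add a subtotal per product across all regions, that is, every combination of the grouping columns.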
An Oracle database consists of objects like tables, views, and programs owned by user accounts. SQL is used to perform operations on database objects like creating, modifying, viewing, and deleting them. There are two main types of SQL commands: DDL for defining objects and DML for manipulating data. Users have privileges like creating tables or inserting data that are assigned by the database administrator. Database objects must follow naming conventions and can be created and modified using SQL commands in tools like SQL*Plus.
A talk given by Julian Hyde at DataCouncil SF on April 18, 2019
How do you organize your data so that your users get the right answers at the right time? That question is a pretty good definition of data engineering — but it also describes the purpose of every DBMS (database management system). And it’s not a coincidence that these are so similar.
This talk looks at the patterns that reoccur throughout data management — such as caching, partitioning, sorting, and derived data sets. As the speaker is the author of Apache Calcite, we first look at these patterns through the lens of Relational Algebra and DBMS architecture. But then we apply these patterns to the modern data pipeline, ETL and analytics. As a case study, we look at how Looker’s “derived tables” blur the line between ETL and caching, and leverage the power of cloud databases.
How to Analyze and Tune MySQL Queries for Better Performance, by oysteing
The document discusses how to analyze and tune queries for better performance in MySQL. It covers topics like cost-based query optimization in MySQL, tools for monitoring, analyzing and tuning queries, data access and index selection, the join optimizer, subqueries, sorting, and influencing the optimizer. The program agenda outlines these topics and their order.
Oracle 12c New Features For Better Performance, by Zohar Elkayam
This document discusses new features in Oracle 12c that improve database performance. It begins with an introduction of the speaker and their company Brillix. The document then covers Oracle Database In-Memory Column Store introduced in 12.1, which allows both row and column format data access. Oracle 12.2 introduced Sharded Database Architecture for horizontal scaling across multiple databases. Additional optimizer changes in 12c such as adaptive query optimization and dynamic statistics are also summarized.
This document discusses various techniques for optimizing SQL queries in SQL Server, including:
1) Using parameterized queries instead of ad-hoc queries to avoid compilation overhead and improve plan caching.
2) Ensuring optimal ordering of predicates in the WHERE clause and creating appropriate indexes to enable index seeks.
3) Understanding how the query optimizer works by estimating cardinality based on statistics and choosing low-cost execution plans.
4) Avoiding parameter sniffing issues and non-deterministic expressions that prevent accurate cardinality estimation.
5) Using features like the Database Tuning Advisor and query profiling tools to identify optimization opportunities.
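The first technique in the list, parameterized queries, can be sketched as follows. This is an illustration under stated assumptions, not SQL Server code: it uses Python's `sqlite3` module and its `?` placeholders, and the `orders` table, its rows, and the `orders_for` helper are hypothetical. The plan-caching benefit described above is specific to the database engine; what carries over everywhere is that the SQL text stays constant while the values are bound separately.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, total) VALUES (?, ?)",
    [("alice", 10.0), ("bob", 25.5), ("alice", 7.25)],
)


def orders_for(customer):
    """Return (id, total) pairs for one customer (hypothetical helper)."""
    # The statement text is identical for every customer; only the bound
    # value changes, so the engine sees one statement, not many ad-hoc ones.
    sql = "SELECT id, total FROM orders WHERE customer = ? ORDER BY id"
    return conn.execute(sql, (customer,)).fetchall()


alice_orders = orders_for("alice")
# A bound value is always treated as data, never as SQL text.
odd_input = orders_for("alice' OR '1'='1")

print(alice_orders)
print(odd_input)
```

The second call returns no rows because the whole string, quotes included, is compared against the `customer` column rather than being spliced into the statement.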
Single-Row Functions in Oracle Database, by Salman Memon
This document provides an overview of single-row functions in SQL. It describes how single-row functions manipulate data on each row returned and can modify data types. The document outlines different categories of single-row functions including character, number, date, and general functions. It provides examples of how to use various single-row functions in SELECT statements.
Spark SQL allows users to perform relational operations on Spark's RDDs using a DataFrame API. It addresses challenges in existing systems like limited optimization and data sources by providing a DataFrame API that can query both external data and RDDs. Spark SQL leverages a highly extensible optimizer called Catalyst to optimize logical query plans into efficient physical query plans using features of Scala. It has been part of the Spark core distribution since version 1.0 in 2014.
The document provides an introduction to the R programming language. It explains that R is an open-source programming language for statistical analysis and graphics that runs on Windows, Unix, and macOS. The document then covers downloading and installing R and RStudio, the R workspace, basics of R syntax like naming conventions and assignments, working with data in R including importing, exporting and creating calculated fields, using R packages and functions, and resources for R help and tutorials.
Dynamic Publishing with Arbortext Data Merge, by Clay Helberg
Dynamic Publishing with Arbortext Data Merge allows authors to insert database queries into documents and automatically update the published results. It provides advantages over manual cut-and-paste by avoiding errors and ensuring updates. The process involves configuring an ODBC data source, defining queries with parameters, and setting preferences to control updating. Queries can output data as tables or through arbitrary XSL formatting.
This document provides an overview of creating and working with various database objects in Oracle including views, sequences, indexes, and synonyms. It describes how to create simple and complex views to restrict data access and present different views of data. It also covers how to generate unique numbers with sequences, create indexes to improve query performance, and use synonyms to provide alternative names for objects. The key goals are to learn how to create, maintain, and use these different database objects to logically represent and retrieve data from tables.
This document provides an overview of Module 5: Optimize query performance in Azure SQL. The module contains 3 lessons that cover analyzing query plans, evaluating potential improvements, and reviewing table and index design. Lesson 1 explores generating and comparing execution plans, understanding how plans are generated, and the benefits of the Query Store. Lesson 2 examines database normalization, data types, index types, and denormalization. Lesson 3 describes wait statistics, tuning indexes, and using query hints. The lessons aim to help administrators optimize query performance in Azure SQL.
This is a presentation from Oracle Week 2016 (Israel). It is an updated version of last year's presentation, with new 12cR2 features and demos.
In the agenda:
Aggregative and advanced grouping options
Analytic functions, ranking and pagination
Hierarchical and recursive queries
Regular Expressions
Oracle 12c's new row pattern matching
XML and JSON handling with SQL
Oracle 12c (12.1 + 12.2) new features
SQL Developer Command Line tool
The document discusses algorithms and techniques for query processing and optimization in relational database management systems. It covers translating SQL queries into relational algebra, algorithms for operations like selection, projection, join and sorting, using heuristics and cost estimates for optimization, and an overview of query optimization in Oracle databases.
The document discusses new features and improvements in the MySQL 8.0 optimizer. Key highlights include:
- New SQL syntax like SELECT...FOR UPDATE SKIP LOCKED and NOWAIT to handle row locking contention.
- Support for common table expressions to improve readability and allow referencing derived tables multiple times.
- Enhancements to the cost model to produce more accurate estimates based on factors like data location.
- Better support for data types like UUID and IPv6, including optimized storage formats and new functions.
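The common table expression point can be illustrated with a query that names a derived table once and references it twice. This sketch runs on SQLite through Python's `sqlite3` module rather than on MySQL 8.0; the WITH syntax shown is standard SQL, and the `sales` table and its data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("East", 10), ("East", 30), ("West", 5), ("North", 55)],
)

# region_totals is defined once and used twice: as the row source of the
# outer query and inside the scalar subquery that computes the average.
cte_query = """
WITH region_totals AS (
    SELECT region, SUM(amount) AS total
    FROM sales
    GROUP BY region
)
SELECT region, total
FROM region_totals
WHERE total > (SELECT AVG(total) FROM region_totals)
ORDER BY region
"""

above_average = conn.execute(cte_query).fetchall()
print(above_average)
```

Without the CTE, the grouped subquery would have to be written out in both places, which is the readability cost the summary refers to.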
This document discusses execution plans in Oracle Database. It begins by explaining what an execution plan is and how it shows the steps needed to execute a SQL statement. It then covers how to generate an execution plan using EXPLAIN PLAN or querying V$SQL_PLAN. The document discusses what the optimizer considers a "good" plan in terms of cost and performance. It also explores key elements of an execution plan like cardinality, access paths, join methods, and join order.
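As a rough analogue of reading an access path from a plan, the sketch below asks SQLite for its plan using `EXPLAIN QUERY PLAN`, via Python's `sqlite3` module. This is not Oracle's `EXPLAIN PLAN` or `V$SQL_PLAN`, and the plan text differs between engines and SQLite versions; the `emp` table, the `idx_emp_dept` index, and the `plan_details` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emp (id INTEGER PRIMARY KEY, name TEXT, dept_id INTEGER)"
)
conn.execute("CREATE INDEX idx_emp_dept ON emp (dept_id)")


def plan_details(sql):
    """Return the human-readable detail column of each plan step."""
    # The detail text is the last column of every EXPLAIN QUERY PLAN row.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


# The predicate on the indexed column lets the engine search the index.
indexed_plan = plan_details("SELECT name FROM emp WHERE dept_id = 10")
# No index covers name, so the engine has to scan the whole table.
full_scan_plan = plan_details("SELECT name FROM emp WHERE name = 'Ann'")

print(indexed_plan)
print(full_scan_plan)
```

The two plans differ only in access path (index search versus full scan), which is one of the plan elements the summary lists alongside cardinality, join methods, and join order.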
This document provides an overview and agenda for an Oracle Database 12c R2 SQL workshop, including objectives, prerequisites, roadmap, tables used, and development environments. It covers topics such as restricting data, sorting data, functions, subqueries, and managing tables using DML statements. The document is intended for internal Oracle and Oracle Academy use only.
The optimizer is the component of the DB2 SQL compiler responsible for selecting an optimal access plan for an SQL statement. The optimizer works by calculating the execution cost of many alternative access plans, and then choosing the one with the minimal estimated cost. Understanding how the optimizer works and knowing how to influence its behaviour can lead to improved query performance and better resource usage.
This presentation was created for the workshop delivered at the CASCON 2011 conference. Its aim is to introduce basic optimizer and related concepts, and to serve as a starting point for further study of the optimizer techniques.
In this first of a series of presentations, we'll review the differences between SQL and PL/SQL and the first steps in optimization, such as understanding RULE vs. COST, and how to cut response time by 90% in data extractions running in SQL*Plus.
This document provides an overview of database security concepts including confidentiality, integrity, and availability. It defines database security as protecting the confidentiality, integrity, and availability of data. Key concepts discussed include authentication, authorization, access control, data encryption, data privacy, auditing, and logging. The document also outlines security problems such as non-fraudulent threats from errors or disasters and fraudulent threats from authorized users abusing privileges or hostile agents attacking the system.
Before migrating from 10g to 11g or 12c, take the following considerations into account. It is not as simple as just switching database engines; changes at the application level also need to be considered.
Smarter Together - Bringing Relational Algebra, Powered by Apache Calcite, in..., by Julian Hyde
What if Looker saw the queries you just executed and could predict your next query? Could it make those queries faster, by smarter caching, or aggregate navigation? Could it read your past SQL queries and help you write your LookML model? Those are some of the reasons to add relational algebra into Looker’s query engine, and why Looker hired Julian Hyde, author of Apache Calcite, to lead the effort. In this talk about the internals of Looker’s query engine, Julian Hyde will describe how the engine works, how Looker queries are described in Calcite’s relational algebra, and some features that it makes possible.
A talk by Julian Hyde at JOIN 2019 in San Francisco.
Web Cloud Computing SQL Server - Ferrara University, by antimo musone
The document provides a summary of an individual's background and experience. It includes the following information in Italian:
1. The individual graduated from the University of Ferrara in 2014 and is an engineer from the University of Naples. They have worked at Avanade since 2006 as a Technical Architect focusing on Cloud and Mobile.
2. They speak at events as a Microsoft Student Partner and are a co-founder of the Fifth Element Project.
3. Their areas of expertise include applications, storage, servers, networking, operating systems, databases, virtualization, runtimes, middleware, and infrastructure as a service, platform as a service and software as a service.
4. They provide a link to
This presentation discusses the following topics:
Introduction to Query Processing
Need for Query processing
Architecture of Query Processing
Query Processing Steps
Phases in a typical query processing
Represented in relational structures
Translating SQL Queries into Relational Algebra
Query Optimization
Importance of Query Optimization
Actions of Query Optimization
Database and application performance vivek sharma, by aioughydchapter
The document provides an overview of database and application design concepts. It discusses the importance of understanding the underlying database, development tools, and application data. Specific concepts covered include the system global area, locking and concurrency, optimizer statistics and transformations, database objects like tables and indexes, and Oracle waits. Examples are provided around query plans, bind peeking, multi-block reads, and optimizer evolution. Testing, inefficient queries, statistics, caching effects, and functions in predicates are identified as potential causes of performance issues.
This presentation features the fundamentals of SQL tunning like SQL Processing, Optimizer and Execution Plan, Accessing Tables, Performance Improvement Consideration Partition Technique. Presented by Alphalogic Inc : https://www.alphalogicinc.com/
Java Database Connectivity with JDBC.pptxtakomatiesucy
This document discusses how Java programs can connect to and query databases using JDBC (Java Database Connectivity). It explains that JDBC provides a standard interface for connecting to different database types and allows programs to send SQL statements to a database. The document also provides an example Java program that uses JDBC to connect to a database, execute a query to retrieve all records from the authors table, and display the results.
The art of querying – newest and advanced SQL techniquesZohar Elkayam
Presentation from Oracle Week 2017.
Agenda:
Aggregative and advanced grouping options
Analytic functions, ranking and pagination
Hierarchical and recursive queries
Regular Expressions
Oracle 12c new rows pattern matching
XML and JSON handling with SQL
Oracle 12c (12.1 + 12.2) new features
SQL Developer Command Line tool (if time allows)
Oracle 18c
Performance Stability, Tips and Tricks and UnderscoresJitendra Singh
This document provides an overview of upgrading to Oracle Database 19c and ensuring performance stability after the upgrade. It discusses gathering statistics before the upgrade to speed up the process, using AutoUpgrade for upgrades, and various testing tools like AWR Diff Reports and SQL Performance Analyzer to check for performance regressions after the upgrade. Maintaining good statistics and thoroughly testing upgrades are emphasized as best practices for a successful upgrade.
The presentation helps to introduce the key aspects of the Oracle Optimizer and how you find out what it's up to and how you can influence its decisions.
Using Query Store in Azure PostgreSQL to Understand Query PerformanceGrant Fritchey
Microsoft has added an excellent new extension in PostgreSQL on their Azure Platform. This session, presented at Posette 2024, covers what Query Store is and the types of information you can get out of it.
Neo4j - Product Vision and Knowledge Graphs - GraphSummit ParisNeo4j
Dr. Jesús Barrasa, Head of Solutions Architecture for EMEA, Neo4j
Découvrez les dernières innovations de Neo4j, et notamment les dernières intégrations cloud et les améliorations produits qui font de Neo4j un choix essentiel pour les développeurs qui créent des applications avec des données interconnectées et de l’IA générative.
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian CompaniesQuickdice ERP
Explore the seamless transition to e-invoicing with this comprehensive guide tailored for Saudi Arabian businesses. Navigate the process effortlessly with step-by-step instructions designed to streamline implementation and enhance efficiency.
OpenMetadata Community Meeting - 5th June 2024OpenMetadata
The OpenMetadata Community Meeting was held on June 5th, 2024. In this meeting, we discussed about the data quality capabilities that are integrated with the Incident Manager, providing a complete solution to handle your data observability needs. Watch the end-to-end demo of the data quality features.
* How to run your own data quality framework
* What is the performance impact of running data quality frameworks
* How to run the test cases in your own ETL pipelines
* How the Incident Manager is integrated
* Get notified with alerts when test cases fail
Watch the meeting recording here - https://www.youtube.com/watch?v=UbNOje0kf6E
Unveiling the Advantages of Agile Software Development.pdfbrainerhub1
Learn about Agile Software Development's advantages. Simplify your workflow to spur quicker innovation. Jump right in! We have also discussed the advantages.
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...Crescat
Crescat is industry-trusted event management software, built by event professionals for event professionals. Founded in 2017, we have three key products tailored for the live event industry.
Crescat Event for concert promoters and event agencies. Crescat Venue for music venues, conference centers, wedding venues, concert halls and more. And Crescat Festival for festivals, conferences and complex events.
With a wide range of popular features such as event scheduling, shift management, volunteer and crew coordination, artist booking and much more, Crescat is designed for customisation and ease-of-use.
Over 125,000 events have been planned in Crescat and with hundreds of customers of all shapes and sizes, from boutique event agencies through to international concert promoters, Crescat is rigged for success. What's more, we highly value feedback from our users and we are constantly improving our software with updates, new features and improvements.
If you plan events, run a venue or produce festivals and you're looking for ways to make your life easier, then we have a solution for you. Try our software for free or schedule a no-obligation demo with one of our product specialists today at crescat.io
WWDC 2024 Keynote Review: For CocoaCoders AustinPatrick Weigel
Overview of WWDC 2024 Keynote Address.
Covers: Apple Intelligence, iOS18, macOS Sequoia, iPadOS, watchOS, visionOS, and Apple TV+.
Understandable dialogue on Apple TV+
On-device app controlling AI.
Access to ChatGPT with a guest appearance by Chief Data Thief Sam Altman!
App Locking! iPhone Mirroring! And a Calculator!!
A Study of Variable-Role-based Feature Enrichment in Neural Models of CodeAftab Hussain
Understanding variable roles in code has been found to be helpful by students
in learning programming -- could variable roles help deep neural models in
performing coding tasks? We do an exploratory study.
- These are slides of the talk given at InteNSE'23: The 1st International Workshop on Interpretability and Robustness in Neural Software Engineering, co-located with the 45th International Conference on Software Engineering, ICSE 2023, Melbourne Australia
Artificia Intellicence and XPath Extension FunctionsOctavian Nadolu
The purpose of this presentation is to provide an overview of how you can use AI from XSLT, XQuery, Schematron, or XML Refactoring operations, the potential benefits of using AI, and some of the challenges we face.
Neo4j - Product Vision and Knowledge Graphs - GraphSummit ParisNeo4j
Dr. Jesús Barrasa, Head of Solutions Architecture for EMEA, Neo4j
Découvrez les dernières innovations de Neo4j, et notamment les dernières intégrations cloud et les améliorations produits qui font de Neo4j un choix essentiel pour les développeurs qui créent des applications avec des données interconnectées et de l’IA générative.
Most important New features of Oracle 23c for DBAs and Developers. You can get more idea from my youtube channel video from https://youtu.be/XvL5WtaC20A
Mobile App Development Company In Noida | Drona InfotechDrona Infotech
Drona Infotech is a premier mobile app development company in Noida, providing cutting-edge solutions for businesses.
Visit Us For : https://www.dronainfotech.com/mobile-application-development/
E-commerce Development Services- Hornet DynamicsHornet Dynamics
For any business hoping to succeed in the digital age, having a strong online presence is crucial. We offer Ecommerce Development Services that are customized according to your business requirements and client preferences, enabling you to create a dynamic, safe, and user-friendly online store.
3. Project Goal
● Understand how SQL Optimizer works
● Generate query plans using Oracle Explain
● Understand the basic principles of Hive
● Execute queries on Hive
● Compare query execution using Oracle and Hive
4. The SQL Optimizer
● Why do we need the optimizer?
SELECT * FROM Books WHERE author = 'Ernest Hemingway';
Two ways to execute it:
• Full table scan
• Index on author
Is there a difference?
• 10 rows
• 10 million rows
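The two access paths can be observed in a small sketch. It uses SQLite through Python's sqlite3 module as a stand-in, not Oracle, so the plan wording differs from Oracle's Explain output. The Books columns, the sample rows, and the index name idx_books_author are assumptions made for illustration.

```python
import sqlite3


def plan_for(conn, sql):
    """Return the human-readable plan steps SQLite reports for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[3] for row in rows]  # the fourth column holds the step description


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Books (title TEXT, author TEXT)")  # assumed columns
conn.executemany(
    "INSERT INTO Books VALUES (?, ?)",
    [
        ("The Old Man and the Sea", "Ernest Hemingway"),
        ("The Sun Also Rises", "Ernest Hemingway"),
        ("Dubliners", "James Joyce"),
    ],
)

query = "SELECT * FROM Books WHERE author = 'Ernest Hemingway'"

# Without an index the only option is to read every row.
plan_without_index = plan_for(conn, query)

# With an index on author the engine can look up matching rows directly.
conn.execute("CREATE INDEX idx_books_author ON Books(author)")
plan_with_index = plan_for(conn, query)

print(plan_without_index)
print(plan_with_index)
```

With 10 rows the two plans cost about the same; with 10 million rows the scan touches every row while the index lookup touches only the matches, which is why the choice is left to an optimizer.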
5. The SQL Optimizer
● SQL is a declarative language
○ Query specifies what, the SQL engine decides how
○ How does understanding SQL optimizer help?
7. Data Set Up
● Queries
○ Single relation
○ Join (2-way and 3-way)
○ Aggregate function
○ Aggregates with grouping
○ Set operations: Union, Except
○ Subqueries
○ Subqueries using the WITH clause
○ Update and Delete
8. Project Execution
● Set up Oracle database
● Generate query optimizer plans using Oracle Explain
● Set up tables and insert data in Hive
● Execute queries on Hive
9. Oracle Query Plan Results - 1
Query using single relation
SELECT title FROM course WHERE dept_name = 'Comp. Sci.' AND credits = 3;
11. Oracle Query Plan Results - 2
Query using 2-way join
SELECT DISTINCT ID FROM takes WHERE (takes.course_id , takes.sec_id, takes.semester,
takes.year) IN (SELECT course_id, sec_id,semester, year FROM teaches NATURAL JOIN
instructor WHERE name = 'Einstein');
12. Oracle Query Plan Results - 2
Query using 2-way join
SELECT DISTINCT ID FROM takes WHERE (takes.course_id , takes.sec_id, takes.semester,
takes.year) IN (SELECT course_id, sec_id,semester, year FROM teaches NATURAL JOIN
instructor WHERE name = 'Einstein');
Relational Algebra Expression: [Figure: plan tree based on the Oracle-generated query plan, beside a self-created query plan]
13. Oracle Query Plan Results - 3
Query using 3-way join
SELECT name, title FROM (instructor NATURAL JOIN teaches) JOIN course USING
(course_id);
14. Oracle Query Plan Results - 3
Query using 3-way join
SELECT name, title FROM (instructor NATURAL JOIN teaches) JOIN course USING
(course_id);
Relational Algebra Expression and Equivalent Expression: [Figure: plan based on the Oracle-generated query plan, beside a self-created query plan]
instructor(ID, name, dept_name, salary)
teaches(ID, course_id, sec_id, semester, year)
course(course_id, title, dept_name, credits)
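The idea of an equivalent expression can be checked with a small sketch: the same 3-way join written in a different join order returns the same rows, which is what lets an optimizer pick whichever order is cheapest. SQLite (via Python's sqlite3) stands in for Oracle here, the sample rows are assumptions, and the rewritten query is one possible equivalent form, not necessarily the one shown on the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE instructor (ID TEXT, name TEXT, dept_name TEXT, salary REAL);
    CREATE TABLE teaches (ID TEXT, course_id TEXT, sec_id TEXT, semester TEXT, year INTEGER);
    CREATE TABLE course (course_id TEXT, title TEXT, dept_name TEXT, credits INTEGER);

    -- Assumed sample rows, for illustration only.
    INSERT INTO instructor VALUES ('10101', 'Srinivasan', 'Comp. Sci.', 65000);
    INSERT INTO instructor VALUES ('22222', 'Einstein', 'Physics', 95000);
    INSERT INTO teaches VALUES ('10101', 'CS-101', '1', 'Fall', 2009);
    INSERT INTO teaches VALUES ('22222', 'PHY-101', '1', 'Fall', 2009);
    INSERT INTO course VALUES ('CS-101', 'Intro. to Computer Science', 'Comp. Sci.', 4);
    INSERT INTO course VALUES ('PHY-101', 'Physical Principles', 'Physics', 4);
    """
)

# The query from the slide: instructor joined to teaches, then to course.
# The parentheses are dropped because joins associate left to right anyway.
slide_query = """
    SELECT name, title
    FROM instructor NATURAL JOIN teaches JOIN course USING (course_id)
"""

# An equivalent form that starts from course and joins in the opposite order.
rewritten_query = """
    SELECT i.name, c.title
    FROM course c
    JOIN teaches t ON t.course_id = c.course_id
    JOIN instructor i ON i.ID = t.ID
"""

slide_rows = sorted(conn.execute(slide_query).fetchall())
rewritten_rows = sorted(conn.execute(rewritten_query).fetchall())
print(slide_rows == rewritten_rows, slide_rows)
```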
15. Oracle Query Plan Results - 4
Query using aggregate function
SELECT MAX(salary) FROM instructor;
Relational Algebra Expression:
16. Oracle Query Plan Results - 5
Query for aggregate with grouping
SELECT COUNT(ID), course_id, sec_id FROM section NATURAL JOIN takes
WHERE semester='Fall' AND year=2009 GROUP BY course_id, sec_id;
18. Oracle Query Plan Results - 6
Query using union operation
(SELECT course_id FROM section WHERE semester = 'Fall' AND year = 2009) UNION
(SELECT course_id FROM section WHERE semester='Spring' AND year=2010);
19. Oracle Query Plan Results - 6
Query using union operation (Expected plan)
(SELECT course_id FROM section WHERE semester = 'Fall' AND year = 2009) UNION
(SELECT course_id FROM section WHERE semester='Spring' AND year=2010);
20. Oracle Query Plan Results - 6
Query using union operation (Oracle plan)
(SELECT course_id FROM section WHERE semester = 'Fall' AND year = 2009) UNION
(SELECT course_id FROM section WHERE semester='Spring' AND year=2010);
21. Oracle Query Plan Results - 7
Query using intersect operation
(SELECT course_id FROM section WHERE semester = 'Fall' AND year = 2009)
INTERSECT (SELECT course_id FROM section WHERE semester='Spring' AND
year=2010);
22. Oracle Query Plan Results - 7
Query using intersect operation (Expected Plan)
(SELECT course_id FROM section WHERE semester = 'Fall' AND year = 2009)
INTERSECT (SELECT course_id FROM section WHERE semester='Spring' AND
year=2010);
23. Oracle Query Plan Results - 7
Query using intersect operation (Oracle Plan)
(SELECT course_id FROM section WHERE semester = 'Fall' AND year = 2009)
INTERSECT (SELECT course_id FROM section WHERE semester='Spring' AND
year=2010);
24. Oracle Query Plan Results - 8
Query using a subquery
SELECT name FROM instructor WHERE salary = (SELECT MAX(salary) FROM
instructor);
25. Oracle Query Plan Results - 8
Query using a subquery
SELECT name FROM instructor WHERE salary = (SELECT MAX(salary) FROM
instructor);
[Figure: expected plan shown beside Oracle's plan]
26. Oracle Query Plan Results - 9
Query using subquery and rename operation
SELECT MAX(enrollment), course_id FROM (SELECT Count(ID) as enrollment, sec_id, course_id
FROM takes WHERE year=2009 and semester='Fall' GROUP BY sec_id, course_id) GROUP BY
course_id;
27. Query Plan # 9 - using subquery
SELECT MAX(enrollment), course_id
FROM (SELECT COUNT(ID) AS enrollment, sec_id, course_id
FROM takes
WHERE year=2009 AND semester='Fall'
GROUP BY sec_id, course_id)
GROUP BY course_id;
Matches Oracle's plan
28. Oracle Query Plan Results - 10
Find the maximum enrollment across all sections in Fall 2009
WITH enrollment(course_id, sec_id, total) AS (SELECT course_id, sec_id, COUNT(ID) FROM
section NATURAL JOIN takes WHERE semester='Fall' AND year=2009 GROUP BY course_id,
sec_id) SELECT MAX(total) FROM enrollment;
29. Query # 10 subquery and aggregation
Inner step: SELECT COUNT(ID) AS id FROM section NATURAL JOIN takes
WHERE semester='Fall' AND year=2009 GROUP BY course_id, sec_id
Outer step: SELECT MAX(id)
Matches Oracle's plan
30. Oracle Query Plan Results - 11
Increase the salary of each instructor in the Comp. Sci. department by 10%
UPDATE instructor SET salary = salary * 1.10 WHERE dept_name = 'Comp. Sci.';
32. Oracle Query Plan Results - 12
Delete all courses that have never been offered
DELETE FROM course
WHERE course_id IN (SELECT course_id FROM course MINUS SELECT course_id FROM course
NATURAL JOIN section);
34. Oracle Optimizer - Summary
The purpose of the Oracle Optimizer is to determine the most efficient
execution plan for each query
Explain Plan shows the chosen plan, with the cost and cardinality estimates behind the choice
The optimizer chooses the plan by weighing four key elements of a query:
cardinality, access methods, join methods, and join order
35. Hive
● Why Hive?
Rapidly increasing size of datasets - 700TB data set
Warehouse built using RDBMS failed to scale
Need for scalable analysis on large data sets
Hadoop was not easy for the end users
Need for improved querying capability
Need for diverse applications and users
36. Hive is NOT
A relational database
A design for OnLine Transaction Processing (OLTP)
A language for real-time queries and row-level updates
37. Hive - Features
● Features of Hive
○ It stores the schema in a database and the processed data in HDFS.
○ It is designed for OLAP.
○ It provides an SQL-like query language called HiveQL or HQL.
○ It is familiar, fast, scalable, and extensible.
38. HiveQL - Query Language
Query Language (HiveQL)
Subset of SQL queries (SQL-like language)
Metadata browsing capabilities
Explain plan capabilities (naive rule-based optimizer)
Seamless plugging in of map-reduce programs, e.g.
FROM (
MAP doctext USING 'python wc_mapper.py' AS (word, cnt)
FROM docs
CLUSTER BY word
) a
REDUCE word, cnt USING 'python wc_reduce.py';
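The slide names wc_mapper.py and wc_reduce.py without showing them. The sketch below is a guess at the logic such scripts would carry for a word count, run in memory so the three stages (MAP, CLUSTER BY word, REDUCE) can be followed end to end. All function names and the sample documents are hypothetical. In Hive the scripts would read tab-separated rows on standard input and write rows to standard output instead of passing Python lists around.

```python
from collections import defaultdict


def wc_map(doctext):
    """MAP stage: emit one (word, 1) pair for every word in a document."""
    return [(word.lower(), 1) for word in doctext.split()]


def cluster_by_word(pairs):
    """Stand-in for CLUSTER BY word: bring all pairs for a word together."""
    groups = defaultdict(list)
    for word, cnt in pairs:
        groups[word].append(cnt)
    return groups


def wc_reduce(word, counts):
    """REDUCE stage: add up the counts collected for one word."""
    return word, sum(counts)


# Assumed sample contents of the docs table, one string per row.
docs = ["the old man and the sea", "the sun also rises"]

mapped = [pair for doctext in docs for pair in wc_map(doctext)]
clustered = cluster_by_word(mapped)
word_counts = dict(wc_reduce(word, counts) for word, counts in clustered.items())
print(word_counts)
```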
39. Data Model and Query Language
HiveQL - Limitations
No support for WHERE-clause subqueries (in the initial version)
Only equality predicates supported for joins
No support for inserting into an existing table (UPDATE, DELETE, and INSERT INTO are not supported)
Why is this not a problem at Facebook (FB)?
Almost all queries can be expressed using equi-joins
Data is loaded in separate partitions
No complex locking protocol required
40. Hive Query Execution
Parse the query
Type Checking and Semantic Analysis
Optimization
Performs a chain of transformations
Walks the operator DAG, checks whether each rule's condition is met, and executes the rule
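The walk-check-execute loop can be sketched on a toy plan. This is not Hive's optimizer code: the Node class, the single rule, and the three operator names are hypothetical, and the plan is a simple chain where Hive works on a DAG. The rule moves a filter below a projection, a simple case of predicate pushdown.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    op: str                          # "scan", "filter", or "project"
    arg: str                         # table name, predicate, or column list
    child: Optional["Node"] = None


def push_filter_below_project(node):
    """Rule: Filter(Project(x)) becomes Project(Filter(x))."""
    condition_met = (
        node.op == "filter"
        and node.child is not None
        and node.child.op == "project"
    )
    if not condition_met:
        return node                  # rule does not apply; leave the node alone
    project = node.child
    return Node("project", project.arg, Node("filter", node.arg, project.child))


def optimize(node, rules):
    """Walk the plan top-down, applying every rule whose condition holds."""
    if node is None:
        return None
    for rule in rules:
        node = rule(node)
    return Node(node.op, node.arg, optimize(node.child, rules))


def describe(node):
    """Render the operator chain from the root down to the scan."""
    ops = []
    while node is not None:
        ops.append(node.op)
        node = node.child
    return " -> ".join(ops)


plan = Node(
    "filter", "dept_name = 'Comp. Sci.'",
    Node("project", "title, dept_name", Node("scan", "course")),
)
optimized = optimize(plan, [push_filter_below_project])
print(describe(plan))
print(describe(optimized))
```

Running the filter directly above the scan means fewer rows reach the projection, which is the point of pushing predicates down.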
41. Hive - Query Optimizer
Query Optimizer - Transformations
Column Pruning
Predicate Pushdown
Partition pruning
Map side joins
small tables kept in every mapper's memory
minimizes cost of sorting and merging
Join Reordering
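A map-side join can be sketched as a hash lookup: the small table is loaded into memory once, and each row of the large table is matched against it as it streams past, so no sort or merge of the large table is needed. The function name and the sample rows below are assumptions for illustration, and a single loop stands in for the many mappers Hive would run in parallel.

```python
def map_side_join(large_rows, small_rows, key):
    """Join two row lists by probing an in-memory table built from the small one."""
    lookup = {}
    for row in small_rows:                       # build phase: small table only
        lookup.setdefault(row[key], []).append(row)

    joined = []
    for row in large_rows:                       # probe phase: stream the large table
        for match in lookup.get(row[key], []):
            joined.append({**match, **row})
    return joined


# Assumed sample data: course is the small table, takes is the large one.
course = [
    {"course_id": "CS-101", "title": "Intro. to Computer Science"},
    {"course_id": "PHY-101", "title": "Physical Principles"},
]
takes = [
    {"ID": "00128", "course_id": "CS-101"},
    {"ID": "12345", "course_id": "CS-101"},
    {"ID": "44553", "course_id": "PHY-101"},
    {"ID": "99999", "course_id": "BIO-301"},     # no matching course, so dropped
]

result = map_side_join(takes, course, "course_id")
for row in result:
    print(row["ID"], row["title"])
```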
42. Hive: Comparison with RDBMS
● Hive
Designed for analytics performed on static data
Lacks record-level update/delete functionality
Write once, read many times
Processes massive amounts of data
Supports a subset of SQL queries
● RDBMS
Designed for transaction processing and analytics on dynamic data
Supports record-level update/delete
Read and write many times
43. Hive Query Execution Results
(Simple Select Query)
SELECT title FROM course WHERE dept_name = 'Comp. Sci.' AND credits = 3
46. Hive Query Execution Results
(subquery)
SELECT name,salary FROM instructor i WHERE salary = (SELECT MAX(salary) FROM
instructor)
47. Hive Query Execution Inference
● Queries that include subqueries in the WHERE or HAVING clause fail, e.g.
SELECT t.sec_id, t.course_id FROM takes t WHERE t.year=2009 AND
t.semester='Fall' HAVING count(t.ID) IN (SELECT MAX(enrollment) FROM
(SELECT COUNT(tin.ID) AS enrollment, tin.sec_id, tin.course_id FROM takes
tin WHERE tin.year=2009 AND tin.semester='Fall' GROUP BY
tin.sec_id,tin.course_id))
● Queries that include subqueries in the FROM clause execute, e.g.
SELECT MAX(enrollment), s.course_id FROM (SELECT COUNT(t.ID) AS
enrollment, t.sec_id, t.course_id FROM takes t WHERE t.year=2009 AND
t.semester='Fall' GROUP BY t.sec_id,t.course_id) s GROUP BY s.course_id
48. Hive - Use cases
● Hive should be used for analytical querying of data collected over a period of
time, for instance to calculate trends or analyze website logs.
● Hive should not be used for real-time querying.
● It provides data warehousing facilities on top of an existing Hadoop
cluster, along with an SQL-like interface that makes the work
easier.
● Tables can be created in Hive and data stored there. Existing HBase
tables can also be mapped to Hive and operated on.
49. Hive Query execution inference
Data Size: 20MB
Hardware configuration
HADOOP:
● Environment: Cloudera CDH-5.6 - YARN (MapReduce v2) and Spark (1.5)
● Worker Nodes: 24
● Cores: 96 (4 cores per node)
● Threads: 192
● RAM: 768GB
ORACLE:
● AMD A8-4555M APU with Radeon HD Graphics 1.60 GHz
● 4 cores
● 8GB RAM
● 64-bit operating system
Average execution time of queries
HADOOP: 31.85 seconds
ORACLE: 1 second
50. Hive Query execution inference
Executed Queries
● Simple SELECT queries
● Join
● Subqueries within FROM clause
● Union
● Intersection (subqueries within FROM clause)
● Aggregation with grouping
Failed Queries
● Update
● Delete
● Queries with WITH clause
● Subqueries within WHERE clause
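Since subqueries in the FROM clause executed while subqueries in the WHERE clause failed, one possible workaround is to move the subquery into the FROM clause and join against it. The slides do not show this rewrite; it is an assumption sketched here for the maximum-salary query. SQLite (via Python's sqlite3) is used only to check that the two forms return the same rows, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instructor (ID TEXT, name TEXT, dept_name TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO instructor VALUES (?, ?, ?, ?)",
    [
        ("10101", "Srinivasan", "Comp. Sci.", 65000),
        ("22222", "Einstein", "Physics", 95000),
        ("83821", "Brandt", "Comp. Sci.", 92000),
    ],
)

# Original form: scalar subquery in the WHERE clause.
where_form = """
    SELECT name, salary FROM instructor i
    WHERE salary = (SELECT MAX(salary) FROM instructor)
"""

# Rewritten form: the subquery becomes a derived table in the FROM clause,
# and the comparison becomes an equality join condition.
from_form = """
    SELECT i.name, i.salary
    FROM instructor i
    JOIN (SELECT MAX(salary) AS max_salary FROM instructor) m
      ON i.salary = m.max_salary
"""

where_rows = conn.execute(where_form).fetchall()
from_rows = conn.execute(from_form).fetchall()
print(where_rows == from_rows, from_rows)
```

The rewritten form uses only a FROM-clause subquery and an equality join, both of which appear in the executed column above.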