The document introduces database tuning and various performance tuning techniques. It discusses tuning the database at different levels, including the schema, queries, indexes, materialized views, statistics, concurrency control, memory, data partitioning, and hardware. Specific examples are provided for query tuning, index access methods, and materialized view tuning. References are also included for additional reading on database system architectures and memory tuning.
2. Database Tuning
◮ Make a database application run more quickly
  ◮ higher throughput
  ◮ lower response time
◮ Auto-tuning / self-tuning
  ◮ Better performance
  ◮ Easier manageability
◮ Query workload
  ◮ Online transaction processing (OLTP)
  ◮ Decision support systems (DSS) / Online analytical processing (OLAP) / Data warehousing
CS5226: Sem 2, 2012/13 Introduction 2
3. Anatomy of DBMS
(Hellerstein, Stonebraker, Hamilton, 2007)
4. Query Optimization
Query:
select R.X, S.Y, T.Z
from R, S, T
where R.A = S.A
and R.B = T.B
Internal query representation:
πX,Y,Z ((R ⋈R.A=S.A S) ⋈R.B=T.B T)
Physical query plan:
project X,Y,Z
  sort-merge join R.B = T.B
    hash join R.A = S.A
      scan table for R
      scan index on (A,Y) for S
    scan table for T
5. Performance Tuning Knobs
◮ Schema tuning
◮ Query tuning
◮ Index & materialized view selection
◮ Statistics tuning
◮ Concurrency control tuning
◮ Data partitioning
◮ Memory tuning
◮ Hardware tuning
6. Schema Tuning
CourseInfo
Module  Prof    Room  Building  Time
CS101   Turing  LT 1  CS        0800
CS400   Turing  LT 1  CS        1400
MU300   Bach    LT 2  Math      1400
MA200   Newton  LT 2  Math      1000
CS101   Turing  LT 2  Math      1200
Schedule
Room  Time  Module
LT 1  0800  CS101
LT 1  1400  CS400
LT 2  1400  MU300
LT 2  1000  MA200
LT 2  1200  CS101
Facility
Room  Building
LT 1  CS
LT 2  Math
Course
Module  Prof
CS101   Turing
CS400   Turing
MU300   Bach
MA200   Newton
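The decomposition on this slide can be checked mechanically. The sketch below is only an illustration, not part of the original slides: it loads CourseInfo into an in-memory SQLite database, derives Schedule, Facility and Course by projection, and joins them back. It assumes that each room is in one building and each module has one professor, which is what makes the join lossless for this data.

```python
import sqlite3

# Rows of CourseInfo as shown on the slide.
course_info = [
    ("CS101", "Turing", "LT 1", "CS", "0800"),
    ("CS400", "Turing", "LT 1", "CS", "1400"),
    ("MU300", "Bach", "LT 2", "Math", "1400"),
    ("MA200", "Newton", "LT 2", "Math", "1000"),
    ("CS101", "Turing", "LT 2", "Math", "1200"),
]

conn = sqlite3.connect(":memory:")
conn.execute("create table CourseInfo (Module, Prof, Room, Building, Time)")
conn.executemany("insert into CourseInfo values (?, ?, ?, ?, ?)", course_info)

# Decompose by projection; 'distinct' removes the repeated facts.
conn.execute("create table Schedule as select distinct Room, Time, Module from CourseInfo")
conn.execute("create table Facility as select distinct Room, Building from CourseInfo")
conn.execute("create table Course as select distinct Module, Prof from CourseInfo")

table_sizes = {
    name: conn.execute(f"select count(*) from {name}").fetchone()[0]
    for name in ("Schedule", "Facility", "Course")
}

# Joining the three tables back should reproduce CourseInfo exactly.
rejoined = conn.execute(
    """
    select c.Module, c.Prof, s.Room, f.Building, s.Time
    from Schedule s
    join Facility f on f.Room = s.Room
    join Course c on c.Module = s.Module
    """
).fetchall()

print(table_sizes)
print(sorted(rejoined) == sorted(course_info))
```

The building of each room and the professor of each module are now stored once instead of once per scheduled class, at the price of joins when the full row is needed.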
7. Query Tuning
Q1: select c.cname
from Customer c
where 1000 < (select sum(o.totalprice)
from Order o
where o.cust# = c.cust#)
Q2: select c.cname
from Customer c join Order o
on c.cust# = o.cust#
group by c.cust#, c.cname
having 1000 < sum(o.totalprice)
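To see that Q1 and Q2 ask for the same customers, the following sketch runs both on a tiny SQLite database. The table contents are made up for illustration, and two names are changed as an assumption to keep the SQL valid in SQLite: the table is called Orders (Order is a reserved word) and the column cust# is written cust_no.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table Customer (cust_no integer primary key, cname text);
    create table Orders (order_no integer primary key, cust_no integer, totalprice integer);
    -- Hypothetical data: Ann spends 1100, Bob 900, Cy has no orders.
    insert into Customer values (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    insert into Orders values (10, 1, 600), (11, 1, 500), (12, 2, 900);
    """
)

# Q1: correlated subquery, evaluated conceptually once per customer.
q1 = conn.execute(
    """
    select c.cname
    from Customer c
    where 1000 < (select sum(o.totalprice)
                  from Orders o
                  where o.cust_no = c.cust_no)
    """
).fetchall()

# Q2: the same question as a join followed by group by / having.
q2 = conn.execute(
    """
    select c.cname
    from Customer c join Orders o
      on c.cust_no = o.cust_no
    group by c.cust_no, c.cname
    having 1000 < sum(o.totalprice)
    """
).fetchall()

print(sorted(q1), sorted(q2))
```

Customers without orders drop out of both forms: in Q1 the sum is NULL so the comparison is not true, and in Q2 the inner join produces no row for them.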
8. Query Tuning (cont.)
Q3: select c.cust#, c.cname, sum(o.totalprice) as T
from Customer c join Order o
on c.cust# = o.cust#
group by c.cust#, cname
Q4: select c.cust#, cname, T
from customer c,
(select cust#, sum(totalprice) as T
from Order
group by cust#) as o
where c.cust# = o.cust#
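Q3 joins first and aggregates afterwards, while Q4 aggregates Order first and joins the much smaller result. A minimal sketch on made-up data, again assuming the SQLite-friendly names Orders and cust_no in place of Order and cust#, shows that both return the same rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table Customer (cust_no integer primary key, cname text);
    create table Orders (order_no integer primary key, cust_no integer, totalprice integer);
    -- Hypothetical data for illustration only.
    insert into Customer values (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    insert into Orders values (10, 1, 600), (11, 1, 500), (12, 2, 900);
    """
)

# Q3: join, then group the joined rows.
q3 = conn.execute(
    """
    select c.cust_no, c.cname, sum(o.totalprice) as T
    from Customer c join Orders o
      on c.cust_no = o.cust_no
    group by c.cust_no, c.cname
    """
).fetchall()

# Q4: group Orders on its own, then join one row per customer.
q4 = conn.execute(
    """
    select c.cust_no, c.cname, o.T
    from Customer c,
         (select cust_no, sum(totalprice) as T
          from Orders
          group by cust_no) as o
    where c.cust_no = o.cust_no
    """
).fetchall()

print(sorted(q3))
print(sorted(q3) == sorted(q4))
```

Which form runs faster depends on the optimizer and the data; the sketch only demonstrates that the rewrite preserves the result.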
9. Query Tuning (cont.)
Q5: select distinct R.A, S.X
from R, S
where R.B = S.Y
Q6: select R.A, S.X
from R, S
where R.B = S.Y
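The slide does not state when Q5 can be replaced by Q6. One reading, taken as an assumption here, is that DISTINCT is redundant when the result cannot contain duplicates, for example when A is a key of R and X is a key of S. The sketch below runs both queries on hypothetical rows, once with unique A values and once with a repeated A value.

```python
import sqlite3

def run_q5_q6(r_rows, s_rows):
    """Run Q5 (with distinct) and Q6 (without) on the given rows of R(A, B) and S(X, Y)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table R (A, B)")
    conn.execute("create table S (X, Y)")
    conn.executemany("insert into R values (?, ?)", r_rows)
    conn.executemany("insert into S values (?, ?)", s_rows)
    q5 = conn.execute("select distinct R.A, S.X from R, S where R.B = S.Y").fetchall()
    q6 = conn.execute("select R.A, S.X from R, S where R.B = S.Y").fetchall()
    conn.close()
    return sorted(q5), sorted(q6)

# A is unique in R and X is unique in S: both queries agree.
keyed_q5, keyed_q6 = run_q5_q6([(1, 10), (2, 10)], [(7, 10), (8, 20)])

# A repeats in R: Q6 returns a duplicate row that Q5 removes.
dup_q5, dup_q6 = run_q5_q6([(1, 10), (1, 10)], [(7, 10), (8, 20)])

print(keyed_q5, keyed_q6)
print(dup_q5, dup_q6)
```

When the keys guarantee uniqueness, dropping DISTINCT saves the sort or hash step needed for duplicate elimination.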
10. Index Tuning
select A, B, C
from R
where 10 < A and A < 20
Access methods for selection queries:
◮ Table scan
◮ Use one or more indexes
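The two access methods can be observed in a real engine. The sketch below uses SQLite's `explain query plan` on a hypothetical table R with 100 generated rows; the index name idx_r_a is made up, the range predicate is written as two comparisons, and the exact wording of the plan text differs between SQLite versions, so only the plan lines are printed rather than compared against a fixed string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table R (A integer, B integer, C integer)")
# Hypothetical data: 100 rows with A = 0..99.
conn.executemany("insert into R values (?, ?, ?)", [(i, 2 * i, 3 * i) for i in range(100)])

query = "select A, B, C from R where A > 10 and A < 20"

def plan(sql):
    """Return the human-readable plan steps; the detail text is the last column of each row."""
    rows = conn.execute("explain query plan " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

plan_without_index = plan(query)   # expected to be a full table scan

conn.execute("create index idx_r_a on R (A)")
plan_with_index = plan(query)      # expected to search the index on A

result = conn.execute(query).fetchall()

print(plan_without_index)
print(plan_with_index)
print(len(result))
```

The query result is the same in both cases; only the access path chosen by the optimizer changes.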
11. B+-tree index
Relation R
name     age  weight  height
Moe      10   55      180
Curly    10   65      171
Larry    12   70      175
Bob      15   60      178
Alice    17   48      175
Lucy     17   45      170
John     18   59      182
Charlie  20   69      173
Marcie   22   50      165
Linus    23   60      166
Sally    24   48      169
Tom      25   56      176
Clustered index on (age)
Root key: 18
Internal keys: 12 17 | 22 24
Leaf pages (data records in age order):
(Moe,10,55,180) (Curly,10,65,171)
(Larry,12,70,175) (Bob,15,60,178)
(Alice,17,48,175) (Lucy,17,45,170)
(John,18,59,182) (Charlie,20,69,173)
(Marcie,22,50,165) (Linus,23,60,166)
(Sally,24,48,169) (Tom,25,56,176)
Unclustered index on (weight)
Root key: 59
Internal keys: 50 | 65
Leaf pages (weight, RID):
(45, RID32) (48, RID31) (48, RID61)
(50, RID51) (55, RID11) (56, RID62)
(59, RID41) (60, RID22) (60, RID52)
(65, RID12) (69, RID42) (70, RID21)
12. Index access methods
◮ Index scan
◮ Index seek [+ RID lookup ]
◮ Index intersection [+ RID lookup ]
13. Index scan
select height
from Student
Index on (height,weight)
Root key: (175,48)
Internal keys: (170,45) | (178,60)
Leaf pages (height, weight, RID):
(165,50, RID51) (166,60, RID52) (169,48, RID61)
(170,45, RID32) (171,65, RID12) (173,69, RID42)
(175,48, RID31) (175,70, RID21) (176,56, RID62)
(178,60, RID22) (180,55, RID11) (182,59, RID41)
14. Index seek
select weight
from Student
where weight between 55 and 65
Index on (weight)
Root key: 59
Internal keys: 50 | 65
Leaf pages (weight, RID):
(45, RID32) (48, RID31) (48, RID61)
(50, RID51) (55, RID11) (56, RID62)
(59, RID41) (60, RID22) (60, RID52)
(65, RID12) (69, RID42) (70, RID21)
15. Index seek + RID lookups
select weight
from Student
where weight between 55 and 59
and age ≥ 20

select name
from Student
where weight between 55 and 59

Index on (weight)
Root key: 59
Internal keys: 50 | 65
Leaf pages (weight, RID):
(45, RID32) (48, RID31) (48, RID61)
(50, RID51) (55, RID11) (56, RID62)
(59, RID41) (60, RID22) (60, RID52)
(65, RID12) (69, RID42) (70, RID21)
Data pages:
Page 1: (Moe,10,55,180) (Curly,10,65,171)
Page 2: (Larry,12,70,175) (Bob,15,60,178)
Page 3: (Alice,17,48,175) (Lucy,17,45,170)
Page 4: (John,18,59,182) (Charlie,20,69,173)
Page 5: (Marcie,22,50,165) (Linus,23,60,166)
Page 6: (Sally,24,48,169) (Tom,25,56,176)
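The two queries on this slide cannot be answered from the index on (weight) alone, because name and age are stored only in the data records. The sketch below mimics the access path in plain Python using the entries from the slide. It is an illustration, not how a DBMS implements it: a sorted list with binary search stands in for the B+-tree, and the dictionary keyed by RID stands in for the data pages (reading RIDxy as page x, slot y is an assumption).

```python
from bisect import bisect_left, bisect_right

# Leaf entries of the unclustered index on (weight), in key order.
weight_index = [
    (45, "RID32"), (48, "RID31"), (48, "RID61"), (50, "RID51"),
    (55, "RID11"), (56, "RID62"), (59, "RID41"), (60, "RID22"),
    (60, "RID52"), (65, "RID12"), (69, "RID42"), (70, "RID21"),
]

# Data records keyed by RID: (name, age, weight, height).
records = {
    "RID11": ("Moe", 10, 55, 180), "RID12": ("Curly", 10, 65, 171),
    "RID21": ("Larry", 12, 70, 175), "RID22": ("Bob", 15, 60, 178),
    "RID31": ("Alice", 17, 48, 175), "RID32": ("Lucy", 17, 45, 170),
    "RID41": ("John", 18, 59, 182), "RID42": ("Charlie", 20, 69, 173),
    "RID51": ("Marcie", 22, 50, 165), "RID52": ("Linus", 23, 60, 166),
    "RID61": ("Sally", 24, 48, 169), "RID62": ("Tom", 25, 56, 176),
}

def index_seek(index, low, high):
    """Return the index entries whose key lies in [low, high]."""
    keys = [key for key, _ in index]
    return index[bisect_left(keys, low):bisect_right(keys, high)]

# Index seek: only the matching part of the index is read.
matches = index_seek(weight_index, 55, 59)

# RID lookup to fetch a column that is not in the index (name).
names = [records[rid][0] for _, rid in matches]

# RID lookup to evaluate a predicate on a column that is not in the index (age).
weights_age_20_plus = [weight for weight, rid in matches if records[rid][1] >= 20]

print(matches)
print(names)
print(weights_age_20_plus)
```

Each RID lookup may touch a different data page, which is why an unclustered index pays off mainly when few entries match.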
16. Index intersection
select height, weight from Student
where height between 164 and 170
and weight between 50 and 59
Index on (height)
Root key: 175
Internal keys: 170 | 178
Leaf pages (height, RID):
(165, RID51) (166, RID52) (169, RID61)
(170, RID32) (171, RID12) (173, RID42)
(175, RID31) (175, RID21) (176, RID62)
(178, RID22) (180, RID11) (182, RID41)
Index on (weight)
Root key: 59
Internal keys: 50 | 65
Leaf pages (weight, RID):
(45, RID32) (48, RID31) (48, RID61)
(50, RID51) (55, RID11) (56, RID62)
(59, RID41) (60, RID22) (60, RID52)
(65, RID12) (69, RID42) (70, RID21)
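The intersection on this slide can be traced by hand or with a few lines of Python. The sketch below uses the leaf entries of both indexes as given on the slide; a linear filter stands in for the two index seeks, which is a simplification for illustration.

```python
# Leaf entries of the two single-column indexes from the slide.
height_index = [
    (165, "RID51"), (166, "RID52"), (169, "RID61"), (170, "RID32"),
    (171, "RID12"), (173, "RID42"), (175, "RID31"), (175, "RID21"),
    (176, "RID62"), (178, "RID22"), (180, "RID11"), (182, "RID41"),
]
weight_index = [
    (45, "RID32"), (48, "RID31"), (48, "RID61"), (50, "RID51"),
    (55, "RID11"), (56, "RID62"), (59, "RID41"), (60, "RID22"),
    (60, "RID52"), (65, "RID12"), (69, "RID42"), (70, "RID21"),
]

def seek(index, low, high):
    """Return {rid: key} for entries with low <= key <= high (stand-in for an index seek)."""
    return {rid: key for key, rid in index if low <= key <= high}

by_height = seek(height_index, 164, 170)   # RIDs satisfying the height predicate
by_weight = seek(weight_index, 50, 59)     # RIDs satisfying the weight predicate

# Intersect the two RID sets; only records in both satisfy the conjunction.
common = sorted(by_height.keys() & by_weight.keys())

# Both selected columns are available from the index entries, so no RID lookup is needed.
result = [(by_height[rid], by_weight[rid]) for rid in common]

print(common)
print(result)
```

Had the query also selected name, each surviving RID would need a lookup in the data pages, which is the "[+ RID lookup]" case.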
17. Index Tuning
Q1: select A, B, C
from R
where 10 < A and A < 20
and 20 < B and B < 100
Q2: select B, C, D
from R
where 50 < B and B < 100
and 60 < D and D < 80
18. Materialized View Tuning
Q1: select R.B
from R, S
where R.A = S.X
and S.Y > 100
MV1: select R.A, R.B, S.X, S.Y
from R, S
where R.A = S.X
Q1’: select B
from MV1
where Y > 100
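The rewrite of Q1 into Q1' can be tried out as follows. SQLite has no materialized views, so in this sketch an ordinary table created with `create table ... as select` stands in for MV1; the rows of R and S are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table R (A integer, B text);
    create table S (X integer, Y integer);
    -- Hypothetical data for illustration only.
    insert into R values (1, 'b1'), (2, 'b2'), (3, 'b3');
    insert into S values (1, 50), (2, 150), (3, 200), (4, 300);
    -- Stand-in for the materialized view MV1: the join result is stored.
    create table MV1 as
        select R.A, R.B, S.X, S.Y
        from R, S
        where R.A = S.X;
    """
)

# Q1 computes the join at query time.
q1 = sorted(conn.execute(
    "select R.B from R, S where R.A = S.X and S.Y > 100"
).fetchall())

# Q1' reads the precomputed join and only applies the remaining filter.
q1_rewritten = sorted(conn.execute(
    "select B from MV1 where Y > 100"
).fetchall())

print(q1, q1_rewritten)
```

Unlike a real materialized view, the stand-in table is not refreshed when R or S change; keeping the stored result up to date is part of the cost that view selection has to weigh.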
19. Tuning of indexes & materialized views
Given a query workload and a disk space constraint, which
configuration of indexes & materialized views gives the best
performance for the workload?
20. Tuning of statistics
Examples of statistics:
◮ table cardinality
◮ statistics for each column:
◮ number of distinct values
◮ highest & lowest values
◮ frequent values
◮ data distribution statistics
◮ multi-column statistics
Issues
◮ What statistics to collect?
◮ When to collect/refresh statistics?
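As a concrete illustration of the statistics listed above, the sketch below computes a few of them for Relation R from the B+-tree index slide. The function name and the choice to report every value that occurs more than once as "frequent" are assumptions made for this example; real systems typically sample the data and build histograms instead.

```python
from collections import Counter

COLUMNS = ("name", "age", "weight", "height")

# Relation R as shown on the B+-tree index slide.
R = [
    ("Moe", 10, 55, 180), ("Curly", 10, 65, 171), ("Larry", 12, 70, 175),
    ("Bob", 15, 60, 178), ("Alice", 17, 48, 175), ("Lucy", 17, 45, 170),
    ("John", 18, 59, 182), ("Charlie", 20, 69, 173), ("Marcie", 22, 50, 165),
    ("Linus", 23, 60, 166), ("Sally", 24, 48, 169), ("Tom", 25, 56, 176),
]

def column_stats(rows, column):
    """Single-column statistics: distinct count, lowest, highest, repeated values."""
    values = [row[COLUMNS.index(column)] for row in rows]
    counts = Counter(values)
    return {
        "distinct": len(counts),
        "lowest": min(values),
        "highest": max(values),
        "frequent": sorted(v for v, c in counts.items() if c > 1),
    }

cardinality = len(R)                       # table cardinality
age_stats = column_stats(R, "age")
weight_stats = column_stats(R, "weight")

# A simple multi-column statistic: number of distinct (age, weight) pairs.
distinct_age_weight = len({(row[1], row[2]) for row in R})

print(cardinality, age_stats, weight_stats, distinct_age_weight)
```

An optimizer would use such numbers to estimate, for example, how many rows a predicate like age = 17 returns (here roughly cardinality divided by the distinct count).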
21. Tuning of concurrency control
◮ Concurrency control protocols
◮ Two-phase locking
◮ Snapshot isolation
◮ Consistency vs concurrency tradeoff
◮ ANSI SQL isolation levels
Isolation Level   Dirty Read    Unrepeatable Read  Phantom Read
READ UNCOMMITTED  possible      possible           possible
READ COMMITTED    not possible  possible           possible
REPEATABLE READ   not possible  not possible       possible
SERIALIZABLE      not possible  not possible       not possible
22. Tuning of memory
◮ How to optimize memory allocation?
Oracle Memory Model (Dageville & Zait, VLDB 2002)
23. Data partitioning
◮ Increase data availability
◮ Decrease administrative cost
◮ Improve query performance
Shared-nothing parallel DBMS (Hellerstein, et al., 2007)
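How partitioning can improve query performance is easiest to see with a toy example. The sketch below hash-partitions Relation R from the B+-tree index slide across three nodes by age. The number of nodes, the partitioning column and the modulo hash are all assumptions chosen for illustration; the slides do not prescribe a scheme.

```python
NUM_NODES = 3  # hypothetical cluster size

# Relation R as shown on the B+-tree index slide: (name, age, weight, height).
R = [
    ("Moe", 10, 55, 180), ("Curly", 10, 65, 171), ("Larry", 12, 70, 175),
    ("Bob", 15, 60, 178), ("Alice", 17, 48, 175), ("Lucy", 17, 45, 170),
    ("John", 18, 59, 182), ("Charlie", 20, 69, 173), ("Marcie", 22, 50, 165),
    ("Linus", 23, 60, 166), ("Sally", 24, 48, 169), ("Tom", 25, 56, 176),
]

def node_for(age):
    """Map a partitioning key to a node; modulo is a stand-in for a hash function."""
    return age % NUM_NODES

# Each node stores only its own share of the rows (shared-nothing).
partitions = {node: [] for node in range(NUM_NODES)}
for row in R:
    partitions[node_for(row[1])].append(row)

def names_with_age(age):
    """An equality query on the partitioning key only has to visit one node."""
    node = node_for(age)
    return node, [row[0] for row in partitions[node] if row[1] == age]

partition_sizes = [len(partitions[node]) for node in range(NUM_NODES)]
lookup = names_with_age(17)

print(partition_sizes)
print(lookup)
```

A query that filters on another column, such as weight, would still have to visit every node, which is why the choice of partitioning key is itself a tuning decision.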
24. References
Additional Readings:
◮ J.M. Hellerstein, M. Stonebraker, J. Hamilton, Architecture of a Database System, Foundations and Trends in Databases, 1(2), 2007, 141-259.