MySQL Basics


Published in: Technology, Sports

  1. 1. Normalization and auto_increment <ul><li>Avoid storing duplicate data: normalize. </li></ul><ul><li>An AUTO_INCREMENT column (MySQL's syntax for an automatically generated id) helps here: the auto-incremented id distinguishes otherwise identical rows from each other. </li></ul>
  2. 2. Multiple-column AUTO_INCREMENT index <ul><li>PRIMARY KEY(name, surname, city, id) OR </li></ul><ul><li>UNIQUE KEY(name, surname, city, id) </li></ul><ul><li>The AUTO_INCREMENT column must be named last in the index, or the multiple-sequence mechanism will not work. This per-group numbering is a MyISAM feature. </li></ul><ul><li>A PRIMARY KEY cannot contain NULL values, but a unique index can. </li></ul><ul><li>You may need to drop and recreate the auto_increment column to get the expected results. </li></ul>
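The multiple-sequence behaviour described above can be sketched as follows. This is a minimal illustration, assuming the MyISAM engine; the table name `contacts_seq` and the sample rows are hypothetical.

```sql
-- Hypothetical table: one id sequence per (name, surname, city) group.
-- Per-group numbering of this kind relies on the MyISAM engine.
CREATE TABLE contacts_seq (
  name    VARCHAR(30) NOT NULL,
  surname VARCHAR(30) NOT NULL,
  city    VARCHAR(30) NOT NULL,
  id      INT UNSIGNED NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (name, surname, city, id)   -- AUTO_INCREMENT column named last
) ENGINE = MyISAM;

INSERT INTO contacts_seq (name, surname, city) VALUES
  ('Shantanu', 'Oak', 'Mumbai'),
  ('Shantanu', 'Oak', 'Mumbai'),
  ('Shantanu', 'Sen', 'Delhi');

-- Each (name, surname, city) group should get its own sequence starting at 1.
SELECT name, surname, city, id
FROM contacts_seq
ORDER BY name, surname, city, id;
```

If `id` were listed first in the key, MySQL would fall back to a single table-wide sequence.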
  3. 3. Second auto_increment column <ul><li>Only one auto_increment column is allowed per table. If you need a second sequence column, generate it with a user variable. </li></ul><ul><li>SET @t = 0; </li></ul><ul><li>UPDATE mytable SET second_id = (@t := @t + 1); </li></ul>
  4. 4. Data Types <ul><li>INT[(SIZE)] [UNSIGNED] [ZEROFILL] An integer between -2147483648 and +2147483647 or, if the UNSIGNED attribute is provided, between 0 and 4294967295 (about 429 crore). The ZEROFILL attribute pads the displayed number with leading zeros until it is SIZE digits long. </li></ul><ul><li>DECIMAL[(M[,D])] [UNSIGNED] [ZEROFILL] M is the total number of digits and D is the number of decimals. If D is omitted, the default is 0. If M is omitted, the default is 10. DECIMAL(4,2) can store numbers up to 99.99 (and NOT 9999.99 as you may expect): four digits in total, with the last 2 reserved for decimals. </li></ul><ul><li>VARCHAR[(SIZE)] [BINARY] A variable-length string of at most SIZE characters (SIZE cannot exceed 255 before MySQL 5.0.3; later versions allow longer columns, subject to the row-size limit). Unless the BINARY attribute is provided, comparisons are case-insensitive and the column is not suited to binary data. </li></ul><ul><li>TEXT A string of up to 65,535 bytes. Comparisons follow the column's collation, so they are case-insensitive by default; use BLOB for case-sensitive binary data. </li></ul><ul><li>DATE Similar to the DATETIME data type but without the time part; values are in YYYY-MM-DD format. </li></ul>
  5. 5. Data Types II <ul><li>INT(n) The value n is only a display width (used with ZEROFILL); it has no effect on the range or the storage size. Regardless of n, the maximum unsigned value stored is 4294967295 (about 429 crore). By default all numeric datatypes are SIGNED (allow negative values). The keyword UNSIGNED disallows negative values. If you attempt to store a negative value in an UNSIGNED column, MySQL stores zero instead (in strict SQL mode the statement fails with an error). </li></ul><ul><li>If you don't specify a DEFAULT value for a column, MySQL chooses a default for you. The value is NULL if the column may contain NULL; otherwise, the value depends on the column type. For numeric columns, the default is zero. For string columns other than ENUM, the default is the empty string. For ENUM columns, the default is the first enumeration member. </li></ul><ul><li>A BLOB or TEXT column has a maximum length of 65,535 bytes. </li></ul><ul><li>A SET datatype can hold any number of strings from a predefined list of strings specified during table creation. SET is similar to ENUM in that both work with predefined lists of strings, but where ENUM restricts you to a single member of the list, SET allows you to store any combination of the members, from none to all of them. </li></ul>
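The difference between ENUM and SET can be sketched with a small table. The table `shops`, its columns and its sample row are hypothetical and serve only as an illustration.

```sql
-- Hypothetical table contrasting ENUM (one member) with SET (any combination).
CREATE TABLE shops (
  name     VARCHAR(50) NOT NULL,
  size     ENUM('small', 'medium', 'large') NOT NULL,              -- exactly one member
  services SET('pool hall', 'dry cleaner', 'magazines') NOT NULL   -- zero or more members
);

-- A SET value is written as a comma-separated list of members.
INSERT INTO shops (name, size, services)
VALUES ('Corner Shop', 'small', 'pool hall,dry cleaner');

-- FIND_IN_SET() tests whether one member is present in the stored set.
SELECT name
FROM shops
WHERE FIND_IN_SET('dry cleaner', services) > 0;
```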
  6. 6. Data Types III <ul><li>A VARCHAR(10) column can hold a string with a maximum length of 10 characters. The actual storage required is the length of the string (L), plus 1 byte to record the length of the string. For the string 'abcd', L is 4 and the storage requirement is 5 bytes. </li></ul><ul><li>A VARCHAR usually takes up less disk space than a CHAR. A CHAR is padded to fill its declared length, so an index on a CHAR column can be much larger than one on a VARCHAR column, depending on content. </li></ul><ul><li>VARCHAR(n) is shorthand for CHARACTER VARYING(n), where n is the maximum column length (up to 255 characters in older MySQL versions). CHAR(n) is similar, with the only difference that CHAR always occupies a fixed length in the database, whereas VARCHAR needs only the space to store the actual text. </li></ul>
  7. 7. Date and Time <ul><li>Of the three types DATE, DATETIME, and TIMESTAMP, use DATE when you need only a date value, without a time part. MySQL retrieves and displays DATE values in 'YYYY-MM-DD' format. </li></ul><ul><li>Use DATETIME when you need values that contain both date and time information. </li></ul><ul><li>Defaults must be constants, not functions. If you want a DATETIME which defaults to NOW(), then you probably really need a TIMESTAMP (MySQL 5.6.5 and later also accept DEFAULT CURRENT_TIMESTAMP on DATETIME columns). </li></ul><ul><li>STR_TO_DATE() is available as of MySQL 4.1.1. </li></ul><ul><li>As of 4.1.2 you have better control over TIMESTAMP columns. </li></ul><ul><li>The WEEK interval unit is only supported from MySQL 5.0.0 onwards. SELECT * FROM table2 WHERE compdate > DATE_SUB(NOW(), INTERVAL 12 WEEK) ORDER BY compdate </li></ul>
  8. 8. Date and Time II <ul><li>DATETIME needs 8 bytes per record </li></ul><ul><li>TIMESTAMP uses only 4 </li></ul><ul><li>DATE uses only 3, so does TIME </li></ul><ul><li>YEAR is the smallest: 1 byte. </li></ul><ul><li>If you make the column a TIMESTAMP data type and leave the default as NULL, it will automatically use the current date/time if no value is entered. Please note that this only works this way for the first TIMESTAMP column in the table. Also, whenever you update any column of a row, the first TIMESTAMP column is automatically updated to the new current date/time. </li></ul><ul><li>Another issue is the range: a TIMESTAMP column can store values from 1970 to January 2038, while a DATETIME column can store values in the range from '1000-01-01 00:00:00' to '9999-12-31 23:59:59'. </li></ul>
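The automatic TIMESTAMP behaviour can also be spelled out explicitly, which avoids relying on the "first TIMESTAMP column" rule. A minimal sketch, assuming MySQL 4.1.2 or later; the table `audit_demo` and its columns are hypothetical.

```sql
-- Hypothetical table with an explicitly self-maintaining TIMESTAMP column.
CREATE TABLE audit_demo (
  id       INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  note     VARCHAR(100),
  modified TIMESTAMP NOT NULL
           DEFAULT CURRENT_TIMESTAMP       -- filled in on INSERT
           ON UPDATE CURRENT_TIMESTAMP     -- refreshed when the row changes
);

INSERT INTO audit_demo (note) VALUES ('first version');

-- Changing any column value refreshes `modified` as well.
UPDATE audit_demo SET note = 'second version' WHERE id = 1;

SELECT id, note, modified FROM audit_demo;
```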
  9. 9. Date - Time examples <ul><li>SELECT * FROM mytable WHERE datetimecol >= (CURDATE() - INTERVAL 1 YEAR) AND datetimecol < (CURDATE() - INTERVAL 1 YEAR) + INTERVAL 1 DAY; </li></ul><ul><li>SELECT IF(DAYOFMONTH(CURDATE()) <= 15, DATE_FORMAT(CURDATE(), '%Y-%m-15'), DATE_FORMAT(CURDATE() + INTERVAL 1 MONTH, '%Y-%m-15')) AS next15; </li></ul><ul><li>SELECT YEAR('2002-05-10'), MONTH('2002-05-10'), DAYOFMONTH('2002-05-10') </li></ul><ul><li>SELECT PurchaseDate FROM mytable WHERE YEAR(PurchaseDate) <= YEAR(CURDATE()) </li></ul><ul><li>SELECT columns FROM mytable WHERE start_time >= '2004-06-01 10:00:00' AND end_time <= '2004-06-03 18:00:00' </li></ul><ul><li>SELECT * FROM t1 WHERE DATE_FORMAT(datetime_column, '%T') BETWEEN 'HH:MM:SS' AND 'HH:MM:SS' </li></ul><ul><li>SELECT Start_time, End_time FROM mytable WHERE Start_time >= NOW() - INTERVAL 4 HOUR </li></ul><ul><li>SELECT NOW() + INTERVAL 60 SECOND </li></ul>
  10. 10. Alias - basics <ul><li>A select expression may be given an alias using AS. The alias is used as the expression's column name and can be used in GROUP BY, ORDER BY or HAVING clauses. For example: </li></ul><ul><li>SELECT CONCAT(last_name, ', ', first_name) AS full_name FROM mytable ORDER BY full_name </li></ul><ul><li>SQL doesn't allow you to refer to a column alias in a WHERE clause. This is because when the WHERE clause is evaluated, the column value may not yet be determined. </li></ul>
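The WHERE restriction on aliases can be sketched as follows, reusing the `mytable`, `last_name` and `first_name` names from the slide. The table definition and the `'Oak%'` filter are hypothetical additions.

```sql
-- Hypothetical definition of the table used on the slide.
CREATE TABLE mytable (
  first_name VARCHAR(30) NOT NULL,
  last_name  VARCHAR(30) NOT NULL
);

-- Not allowed: the alias is not visible in WHERE.
-- SELECT CONCAT(last_name, ', ', first_name) AS full_name
-- FROM mytable WHERE full_name LIKE 'Oak%';

-- Option 1: repeat the expression in WHERE; the alias still works in ORDER BY.
SELECT CONCAT(last_name, ', ', first_name) AS full_name
FROM mytable
WHERE CONCAT(last_name, ', ', first_name) LIKE 'Oak%'
ORDER BY full_name;

-- Option 2: filter on the alias in HAVING, which MySQL evaluates after the
-- select expressions have been computed.
SELECT CONCAT(last_name, ', ', first_name) AS full_name
FROM mytable
HAVING full_name LIKE 'Oak%'
ORDER BY full_name;
```

Option 1 is usually preferable for large tables, because HAVING filters rows only after they have been read.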
  11. 11. Alias - advanced <ul><li>A table can be given a shorter alias using AS; the AS keyword itself is optional. For example: SELECT COUNT(B.Booking_ID), U.User_Location FROM Users U LEFT OUTER JOIN Bookings B ON U.User_ID = B.Rep_ID AND B.Project_ID = '10' GROUP BY U.User_Location </li></ul><ul><li>Aliases play a crucial role in self joins. For example, here the people table is referred to by the aliases p and c: SELECT … AS parent, … AS child, MIN((TO_DAYS(NOW())-TO_DAYS(c.dob))/365) AS minage FROM people AS p LEFT JOIN people AS c ON … WHERE … IS NOT NULL GROUP BY parent HAVING minage > 50 ORDER BY p.dob; </li></ul>
  12. 13. alter <ul><li>The ALTER TABLE command is used to add or delete columns, or to change a column's data type. </li></ul><ul><li>ALTER [IGNORE] TABLE <tbl_name> <action_list> Where action_list is… </li></ul><ul><li>ADD COLUMN [column name and definition] / DROP COLUMN [column name] </li></ul><ul><li>ADD PRIMARY KEY (index_columns) / DROP PRIMARY KEY </li></ul><ul><li>ADD INDEX [index_name] (index_columns) / ADD UNIQUE [index_name] (index_columns) / DROP INDEX index_name </li></ul><ul><li>ALTER COLUMN col_name {set default value | drop default } </li></ul><ul><li>CHANGE COLUMN old_col_name new_col_name_and_declaration </li></ul><ul><li>MODIFY COLUMN [column declaration] </li></ul><ul><li>RENAME AS new_tbl_name </li></ul><ul><li>TABLE_OPTIONS e.g. ALTER TABLE score ENGINE = InnoDB (TYPE = InnoDB in old MySQL versions) </li></ul>
  13. 14. alter table examples <ul><li>ALTER TABLE awards ADD COLUMN AwardCode INT(2) </li></ul><ul><li>ALTER TABLE awards MODIFY COLUMN AwardCode VARCHAR(2) NOT NULL </li></ul><ul><li>ALTER TABLE awards DROP COLUMN AwardCode </li></ul>
  14. 15. create table Part I <ul><li>Create table syntax is: CREATE TABLE tablename ( FieldName1 DataType, FieldName2 DataType) </li></ul><ul><li>The rows returned by the “select” query can be saved as a new table. The datatype will be the same as the old table. For e.g. </li></ul><ul><li>CREATE TABLE LearnHindi SELECT english.tag, english.Inenglish AS english, hindi.Inhindi AS hindi FROM english, hindi WHERE english.tag = hindi.tag </li></ul><ul><li>The only drawback is that Indexes are not copied from the old table to the new table. </li></ul>
  15. 16. rename table <ul><li>Rename Table syntax is: RENAME TABLE tbl_name TO new_name </li></ul><ul><li>If you want to swap two table names, you can do so like this: </li></ul><ul><li>RENAME TABLE old_table TO tmp_table , new_table TO old_table , tmp_table TO new_table ; </li></ul>
  16. 17. "create table" using selected rows OR "insert into" the "select"ed records <ul><li>The rows returned by the "select" query can be saved as a new table. For e.g. CREATE TABLE LearnHindi SELECT english.tag, english.Inenglish AS english, hindi.Inhindi AS hindi FROM english, hindi WHERE english.tag = hindi.tag </li></ul><ul><li>If you want an empty table, change the select so that it returns no rows: </li></ul><ul><li>CREATE TABLE emps2 AS SELECT * FROM emps WHERE 1=2; </li></ul><ul><li>If the table is already present, you can use INSERT INTO table1 (field1, field2) SELECT field1, field2 FROM table2 </li></ul><ul><li>INSERT INTO Persons_backup SELECT * FROM Persons </li></ul><ul><li>INSERT INTO Persons_backup SELECT LastName, Firstname FROM Persons WHERE City='Delhi' </li></ul><ul><li>INSERT INTO Empl_Ord_backup SELECT Employees.Name, Orders.Product FROM Employees INNER JOIN Orders ON Employees.Employee_ID=Orders.Employee_ID </li></ul>
  17. 18. Duplicating table with or without keys <ul><li>CREATE TABLE emps2 AS SELECT emp_id, emp_name FROM emps; </li></ul><ul><li>This method copies the column types and lengths but not the key declarations. You will have to use ALTER TABLE or CREATE INDEX to add the keys. You can, however, duplicate the table structure with keys in a single command like this: </li></ul><ul><li>CREATE TABLE tblnumbers2 LIKE tblnumbers </li></ul><ul><li>But what if you want only a few columns from the original table? </li></ul><ul><li>CREATE TEMPORARY TABLE tmpRatings(KEY(map)) SELECT map , avg(rating) as rating , count(id) as votes FROM maps_rating GROUP BY map </li></ul><ul><li>CREATE TABLE support_cost ( serial INT NOT NULL PRIMARY KEY, sumcost INT, KEY sumcostix(sumcost) ) SELECT serial, SUM(cost) AS sumcost FROM conveyor GROUP BY serial </li></ul><ul><li>You can change the column definition if you do not want the same column types as the selected table. </li></ul><ul><li>CREATE TABLE tblphone3(fname VARCHAR(30) DEFAULT NULL, lname VARCHAR(30) NOT NULL DEFAULT '', KEY fname(fname)) SELECT fname, lname FROM tblphone; </li></ul>
  18. 19. auto_increment <ul><li>To number rows like an auto_increment column while copying a table, increment a user variable in the SELECT: </li></ul><ul><li>SET @counter = 0; CREATE TABLE mytimetest4 SELECT @counter := @counter + 1 AS c, mytime, mynumber FROM mytimetest3; </li></ul>
  19. 20. Merge Table <ul><li>A MERGE table is a collection of identical MyISAM tables that can be used as one. </li></ul><ul><li>Identical means that all tables have identical column and index information. </li></ul><ul><li>CREATE TABLE mumbai (first_name VARCHAR(30), amount INT(10)) ENGINE=MyISAM </li></ul><ul><li>CREATE TABLE delhi (first_name VARCHAR(30), amount INT(10)) ENGINE=MyISAM </li></ul><ul><li>CREATE TABLE total (first_name VARCHAR(30), amount INT(10)) ENGINE=MERGE UNION=(mumbai,delhi) INSERT_METHOD=LAST </li></ul><ul><li>From MySQL 4.0 onwards you can insert rows into a MERGE table. The INSERT_METHOD can be NO, FIRST or LAST. </li></ul>
  20. 21. Merge Records <ul><li>drop table if exists `hindi`; </li></ul><ul><li>CREATE TABLE `hindi` ( `tag` int(99) default NULL, `name` varchar(99) default NULL ) ENGINE=MyISAM; </li></ul><ul><li>insert into hindi (tag, name) values (2, 'do'); insert into hindi values (3, 'teen'); insert into hindi values (4, 'char'), (5, 'paanch'); </li></ul><ul><li>drop table if exists `english`; </li></ul><ul><li>CREATE TABLE `english` ( `tag` int(99) default NULL, `name` varchar(99) default NULL ) ENGINE=MyISAM; </li></ul><ul><li>insert into english (tag, name) values (1, 'one'); insert into english values (2, 'two'), (3, 'three'); </li></ul><ul><li>CREATE TABLE `learn` ( `tag` int(99) default NULL, `name` varchar(99) default NULL ) ENGINE=MRG_MyISAM UNION=(hindi,english); </li></ul>
  21. 22. Views <ul><li>A method for saving an SQL query so that it appears as a table </li></ul><ul><li>Part of the SQL 92 standard </li></ul><ul><li>Available in every other DBMS </li></ul><ul><li>Part of MySQL 5 </li></ul><ul><li>View data can be pre-cached (i.e. maintained similar to an index). </li></ul><ul><ul><li>Currently this is not the case in MySQL. </li></ul></ul><ul><li>Some view definitions are impractical for everyday use (sub selects are often inefficient in MySQL). </li></ul><ul><li>Views on a single table can be READ/WRITE. </li></ul><ul><li>Views on multiple tables are often READ only. </li></ul>
  22. 23. What's really cool with Views? <ul><li>They can be used to cache common queries. </li></ul><ul><ul><li>ANSI SQL but not in MySQL :-) </li></ul></ul><ul><li>They can be used to apply access privileges. </li></ul><ul><li>They can simplify or break up complex problems. </li></ul><ul><ul><li>e.g. SELECT * FROM view_customers instead of SELECT * FROM customers WHERE permission_to_contact = TRUE; </li></ul></ul><ul><li>You can use VIEWs of VIEWs to simplify problems. </li></ul><ul><li>You can do a lot of stuff by joining a table on itself. </li></ul><ul><li>Storing derivable data in the database </li></ul><ul><ul><li>i.e. total_cos </li></ul></ul>
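The `view_customers` idea above can be sketched as follows, assuming MySQL 5.0 or later. The `customers` table definition is hypothetical; only the view name and the `permission_to_contact` column come from the slide.

```sql
-- Hypothetical base table.
CREATE TABLE customers (
  customer_id           INT NOT NULL PRIMARY KEY,
  name                  VARCHAR(50) NOT NULL,
  permission_to_contact BOOLEAN NOT NULL DEFAULT FALSE
);

-- The view hides the filter (and any columns you choose not to expose).
CREATE VIEW view_customers AS
  SELECT customer_id, name
  FROM customers
  WHERE permission_to_contact = TRUE;

-- Callers query the view like a table.
SELECT * FROM view_customers;
```

Granting SELECT on the view but not on the base table is one way to use views for access control.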
  23. 24. Views - chance to correct poor design <ul><li>Views can be used to correct poor table design. </li></ul><ul><li>Non atomic values. </li></ul><ul><ul><li>e.g. a field called name which holds "Amar Patil". In an ideal database, we would have two fields, Name and Surname. </li></ul></ul><ul><ul><li>SELECT RTRIM(CONCAT('Morgan Tocker',' ','')) as name; </li></ul></ul><ul><ul><li>SELECT RTRIM(CONCAT('Morgan', ' ', 'Tocker')) as name; </li></ul></ul><ul><li>No indexes on tables </li></ul><ul><li>A lack of primary keys </li></ul><ul><li>Certain fields had either been deprecated, never used, or had remained on the end of one table, when a new table had been created for that specific purpose. </li></ul>
  24. 25. Poor table design (cont’d) <ul><li>Inconsistent use of naming conventions: </li></ul><ul><ul><li>There are two common approaches for field naming: </li></ul></ul><ul><ul><ul><li>TheFirstLetterCapital </li></ul></ul></ul><ul><ul><ul><li>all_in_lowercase_use_underscore </li></ul></ul></ul><ul><li>Poor use of the relational model </li></ul><ul><ul><li>contact_person_1_name, contact_person_2_name, contact_person_1_phone, contact_person_2_phone, contact_person_1_email, contact_person_2_email </li></ul></ul><ul><ul><li>Should use a table called customer_contacts with a 1 to N relationship </li></ul></ul>
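The 1 to N redesign suggested above might look like the following sketch. Only the table name `customer_contacts` comes from the slide; the `customers` parent table and all column definitions are assumptions.

```sql
-- Hypothetical parent table (skipped if a customers table already exists).
CREATE TABLE IF NOT EXISTS customers (
  customer_id INT NOT NULL PRIMARY KEY,
  name        VARCHAR(50) NOT NULL
);

-- One row per contact person instead of contact_person_1_*, contact_person_2_* columns.
CREATE TABLE customer_contacts (
  contact_id  INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  customer_id INT NOT NULL,              -- points back to customers.customer_id
  name        VARCHAR(50) NOT NULL,
  phone       VARCHAR(20),
  email       VARCHAR(100),
  KEY (customer_id)
);

-- A customer can now have any number of contacts.
SELECT c.name, cc.name AS contact_name, cc.phone, cc.email
FROM customers c
LEFT JOIN customer_contacts cc ON cc.customer_id = c.customer_id;
```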
  25. 26. Using an updatable VIEW as a constraint <ul><li>CREATE TABLE con (i INT); </li></ul><ul><li>CREATE VIEW viewcon AS SELECT * FROM con WHERE i BETWEEN 10 AND 20 WITH CASCADED CHECK OPTION; </li></ul><ul><li>INSERT INTO con VALUES (89); - Success (a direct insert into the table bypasses the view's check) </li></ul><ul><li>INSERT INTO viewcon VALUES (14); - Success </li></ul><ul><li>INSERT INTO viewcon VALUES (45); - Fail </li></ul><ul><li>SELECT * FROM con returns 14 and 89; SELECT * FROM viewcon returns only 14. </li></ul>
  26. 27. Table Types: MyISAM and InnoDB <ul><li>MyISAM does table level locking, while InnoDB does row level locking. In addition to foreign keys, InnoDB offers transaction support, which is absolutely critical when dealing with larger applications. Speed can suffer, though, because foreign key checks and transactions add overhead. </li></ul>
  27. 28. Default table type <ul><li>The default table type is MyISAM (InnoDB since MySQL 5.5). </li></ul><ul><li>Starting from MySQL 4.1.5, the new Windows installer makes InnoDB the MySQL default table type on Windows </li></ul><ul><li>You can always specify the engine explicitly: create table mytest1 (col1 int, col2 int) ENGINE = InnoDB; create table mytest2 (col1 int, col2 int) ENGINE = MyISAM; </li></ul><ul><li>Or you can alter the table later: ALTER TABLE yourtable ENGINE = MyISAM; </li></ul><ul><li>You can influence the default storage engine at runtime via the storage_engine variable (named default_storage_engine in newer versions): set storage_engine = 'MyISAM'; </li></ul><ul><li>To determine the current default, do: select @@storage_engine; To see the engine of an existing table, do: SHOW CREATE TABLE yourtable; </li></ul><ul><li>You can also set the default at startup using the --default-storage-engine option </li></ul>
  28. 29. ACID <ul><li>If you are using InnoDB tables (or BerkeleyDB tables), MySQL is ACID compliant. Using the transaction syntax gives you atomicity. Transactions and foreign key constraints give you consistency. You can choose the level of isolation that transactions have from one another. The binary log and repair tools provide durability. (Using replication, you can have a highly durable system without any single point of failure.) </li></ul><ul><li>ACID stands for Atomicity, Consistency, Isolation, and Durability. </li></ul>
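Atomicity and isolation can be sketched with a small transfer between two rows. The `accounts` table and its data are hypothetical; the statements assume an InnoDB table.

```sql
-- Hypothetical table; transactions need a transactional engine such as InnoDB.
CREATE TABLE accounts (
  account_id INT NOT NULL PRIMARY KEY,
  balance    DECIMAL(10,2) NOT NULL
) ENGINE = InnoDB;

INSERT INTO accounts VALUES (1, 500.00), (2, 100.00);

-- Choose the isolation level for the next transaction.
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;

-- Both updates take effect together or not at all.
START TRANSACTION;
UPDATE accounts SET balance = balance - 50.00 WHERE account_id = 1;
UPDATE accounts SET balance = balance + 50.00 WHERE account_id = 2;
COMMIT;   -- issue ROLLBACK instead to undo both updates
```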
  29. 30. 1st, 2nd and 3rd normal forms <ul><li>1st: Each column must contain only one value (atomic or indivisible) </li></ul><ul><li>2nd: All columns whose values are the same across multiple rows must be turned into their own table and related back. (Primary and Foreign Keys) </li></ul><ul><li>3rd: Every non-key column is dependent upon the Primary Key. </li></ul>
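  30. 31. 1st Normal form: each column must contain only one value
     Before:
     +----------+---------+---------+----------+
     | name     | surname | ordered | city     |
     +----------+---------+---------+----------+
     | Shantanu | Oak     | 10, 90  | Mumbai   |
     | Shantanu | Oak     | 20      | Calcutta |
     | Shantanu | Sen     | 30      | Calcutta |
     | Shantanu | Sen     | 120     | Delhi    |
     +----------+---------+---------+----------+
     After:
     +----------+---------+---------+----------+
     | name     | surname | ordered | city     |
     +----------+---------+---------+----------+
     | Shantanu | Oak     | 10      | Mumbai   |
     | Shantanu | Oak     | 20      | Calcutta |
     | Shantanu | Sen     | 30      | Calcutta |
     | Shantanu | Oak     | 90      | Mumbai   |
     | Shantanu | Sen     | 120     | Delhi    |
     +----------+---------+---------+----------+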
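  31. 32. 2nd Normal form: all columns whose values are the same across multiple rows must be turned into their own table
     Before:
     +----------+---------+---------+----------+
     | name     | surname | ordered | city     |
     +----------+---------+---------+----------+
     | Shantanu | Oak     | 10      | Mumbai   |
     | Shantanu | Oak     | 20      | Calcutta |
     | Shantanu | Sen     | 30      | Calcutta |
     | Shantanu | Oak     | 90      | Mumbai   |
     | Shantanu | Sen     | 120     | Delhi    |
     +----------+---------+---------+----------+
     After (two tables related by ID):
     +----+----------+---------+----------+
     | ID | name     | surname | city     |
     +----+----------+---------+----------+
     | 1  | Shantanu | Oak     | Mumbai   |
     | 2  | Shantanu | Oak     | Calcutta |
     | 3  | Shantanu | Sen     | Calcutta |
     | 4  | Shantanu | Sen     | Delhi    |
     +----+----------+---------+----------+
     +----+---------+
     | ID | Ordered |
     +----+---------+
     | 1  | 10      |
     | 2  | 20      |
     | 3  | 30      |
     | 1  | 90      |
     | 4  | 120     |
     +----+---------+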
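The split into two tables can be written out in SQL as a sketch. The table names `customers_nf` and `orders_nf` are hypothetical; the rows are the ones shown on the slide.

```sql
-- Hypothetical names for the two normalized tables.
CREATE TABLE customers_nf (
  id      INT NOT NULL PRIMARY KEY,
  name    VARCHAR(30) NOT NULL,
  surname VARCHAR(30) NOT NULL,
  city    VARCHAR(30) NOT NULL
);

CREATE TABLE orders_nf (
  id      INT NOT NULL,        -- refers to customers_nf.id
  ordered INT NOT NULL,
  KEY (id)
);

INSERT INTO customers_nf VALUES
  (1, 'Shantanu', 'Oak', 'Mumbai'),
  (2, 'Shantanu', 'Oak', 'Calcutta'),
  (3, 'Shantanu', 'Sen', 'Calcutta'),
  (4, 'Shantanu', 'Sen', 'Delhi');

INSERT INTO orders_nf VALUES (1, 10), (2, 20), (3, 30), (1, 90), (4, 120);

-- Joining the tables rebuilds the original five rows.
SELECT c.name, c.surname, o.ordered, c.city
FROM orders_nf o
JOIN customers_nf c ON c.id = o.id
ORDER BY o.ordered;
```

The name, surname and city of customer 1 are now stored once, even though that customer has two orders.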
  32. 33. Normalization <ul><li>Normalization is the process of removing redundant data from your tables in order to improve storage efficiency, data integrity and scalability. </li></ul><ul><li>This improvement is balanced against an increase in complexity and potential performance losses from the joining of the normalized tables at query-time. </li></ul><ul><li>Need for Normalization of data </li></ul><ul><ul><li>Avoid repetition </li></ul></ul><ul><ul><li>Save space </li></ul></ul><ul><ul><li>Easy to maintain data </li></ul></ul><ul><li>"Divide and Rule" </li></ul><ul><ul><li>Divide the data in its logical order </li></ul></ul><ul><ul><li>Scalability should be considered </li></ul></ul><ul><ul><li>Portable data is better while transferring or backing up the data </li></ul></ul>
  33. 34. Normalization Example <ul><li>Download the non-normalized Excel spreadsheet mentioned below and answer the questions… </li></ul><ul><li> </li></ul><ul><li>How do I alter column Value from CHAR to INT? </li></ul><ul><li>How do I update the Value column if the value is entered as the text 'NULL' instead of a NULL value? </li></ul><ul><li>What is the grand total of Primary sales of all products? </li></ul><ul><li>What is the total Primary sales of all products for the months of April, May and June? </li></ul><ul><li>What is the total sales of products whose name starts with A? </li></ul><ul><li>How many values are left blank? </li></ul><ul><li>What Products have blank as their primary values? </li></ul><ul><li>How many primary, secondary and closing values are left blank? </li></ul><ul><li>What is the product wise monthly total? (i.e. Pri + Sec + Closing for each month per product) </li></ul>
  34. 35. Normalization Answers <ul><li>ALTER TABLE sales MODIFY Value INT; </li></ul><ul><li>UPDATE sales SET Value = NULL WHERE Value = 'NULL'; UPDATE sales SET Value = NULL WHERE Value = ''; </li></ul><ul><li>SELECT SUM(Value) FROM sales WHERE Type = 'Pri'; </li></ul><ul><li>SELECT SUM(Value) FROM sales WHERE Type = 'Pri' AND `Month` IN ('April','May','June'); </li></ul><ul><li>Start from SELECT SUM(Value) FROM sales WHERE sales.Code = ?, find the codes with SELECT `master`.Code FROM `master` WHERE `master`.Product_Name LIKE 'A%'; then combine the two: SELECT SUM(Value) FROM sales WHERE sales.Code IN (SELECT `master`.Code FROM `master` WHERE `master`.Product_Name LIKE 'A%'); </li></ul><ul><li>SELECT COUNT(*) FROM sales WHERE Value IS NULL; </li></ul><ul><li>SELECT COUNT(*) FROM sales WHERE Type = 'Pri' AND Value IS NULL; </li></ul><ul><li>SELECT (SELECT COUNT(*) FROM sales WHERE Type = 'Pri' AND Value IS NULL) AS PRI_BLANK, (SELECT COUNT(*) FROM sales WHERE Type = 'Sec' AND Value IS NULL) AS SEC_BLANK, (SELECT COUNT(*) FROM sales WHERE Type = 'Closing' AND Value IS NULL) AS CLO_BLANK; </li></ul><ul><li>SELECT `master`.Product_Name, sales.`Month`, SUM(sales.Value) FROM sales LEFT JOIN `master` ON sales.Code = `master`.Code GROUP BY sales.Code, `master`.Product_Name, sales.`Month`; </li></ul>
  35. 36. explain <ul><li>When you precede a SELECT statement with the keyword EXPLAIN, MySQL explains how it would process the SELECT, providing information about how tables are joined and in which order. EXPLAIN is a tool for query optimization analysis, e.g. EXPLAIN SELECT * FROM students </li></ul><ul><li>The output from EXPLAIN shows ALL in the type column when MySQL uses a table scan to resolve a query. The possible types are, from best to worst: system, const, eq_ref, ref, range, index and ALL. </li></ul><ul><li>"Only index" in the "Extra" column indicates that information will be retrieved from the index file without using the data file. </li></ul><ul><li>SELECT * FROM t1, t2 FORCE INDEX (index_for_column) WHERE t1.col_name=t2.col_name; </li></ul>
  36. 37. PROCEDURE ANALYSE() <ul><li>SELECT * FROM mytable PROCEDURE ANALYSE() </li></ul><ul><li>EXPLAIN gives more information about indexes and keys, but PROCEDURE ANALYSE() gives you more information on the data returned. </li></ul><ul><li>Sample output for one column: Min_value 34, Max_value 232, Empties_or_zeros 0, Nulls 0, Avg_value 133, Optimal_fieldtype ENUM('34','232') NOT NULL </li></ul>
  37. 38. Performance Tips Part I <ul><li>Optimize WHERE clauses by using the rule of "column operator constant" </li></ul><ul><li>Slow query: `birthdate` + INTERVAL 16 YEAR < NOW() </li></ul><ul><li>Fast query: `birthdate` < NOW() - INTERVAL 16 YEAR </li></ul><ul><li>The following 3 conditions are progressively better, in this order: WHERE TO_DAYS(date_col) - TO_DAYS(CURRENT_DATE) < 30 WHERE TO_DAYS(date_col) < 30 + TO_DAYS(CURRENT_DATE) WHERE date_col < DATE_ADD(CURRENT_DATE, INTERVAL 30 DAY) </li></ul><ul><li>The following query will use an index on the date column. SELECT * FROM bills WHERE due_date BETWEEN CONCAT(YEAR(CURDATE()),'-',MONTH(CURDATE())-1,'-1') AND LAST_DAY(CURDATE()) </li></ul><ul><li>Keep in mind that EXISTS/NOT EXISTS are for SQL (Parent) heavy queries; otherwise you should use IN/NOT IN </li></ul><ul><li>Running OPTIMIZE TABLE on a regular basis helps keep performance on the table from degrading. </li></ul>
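The "column operator constant" rule can be checked with EXPLAIN. This is a sketch only: the table `people_demo` and its index are hypothetical, and the exact EXPLAIN output depends on the MySQL version and the data in the table.

```sql
-- Hypothetical table with an index on the date column.
CREATE TABLE people_demo (
  id        INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  birthdate DATE NOT NULL,
  KEY idx_birthdate (birthdate)
);

-- Column wrapped in an expression: the condition cannot be turned into a
-- range lookup on idx_birthdate, so every row has to be examined.
EXPLAIN SELECT id FROM people_demo
WHERE birthdate + INTERVAL 16 YEAR < NOW();

-- Bare column compared with a constant expression: idx_birthdate can be
-- used for a range scan.
EXPLAIN SELECT id FROM people_demo
WHERE birthdate < NOW() - INTERVAL 16 YEAR;
```

Compare the `type` and `key` columns of the two EXPLAIN results on a table with realistic data.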
  38. 39. Performance Tips Part II <ul><li>SELECT price FROM fedex_zones z INNER JOIN fedex_rates r ON … AND … 94947 WHERE r.weight = 25; </li></ul><ul><li>Although correct, many people consider this bad style: the ON clause of the JOIN should contain only the join condition(s) and the comparison with the joined table. </li></ul><ul><li>So it would be better to say </li></ul><ul><li>SELECT price FROM fedex_zones z JOIN fedex_rates r ON … = … AND r.weight = 25 WHERE … = 94947 </li></ul>
  39. 40. Performance Tips Part III <ul><li>An index will not be used if the % sign is on both sides of the string: WHERE col_name LIKE '%Mac%' </li></ul><ul><li>An index will be used if the % sign appears only at the end of the string: WHERE col_name LIKE 'Mac%' </li></ul><ul><li>You may try using STRAIGHT_JOIN to force a join to be done using tables in a particular order. </li></ul><ul><li>The SQL_CALC_FOUND_ROWS keyword tells MySQL to calculate the total number of rows matching the query, ignoring LIMIT. This total can then be retrieved via a call to the FOUND_ROWS() function. </li></ul><ul><li>To retrieve all records from the specified offset to the end of the table, old MySQL versions accepted -1 as the number of rows to return, e.g. SELECT * FROM tbl1 LIMIT 18, -1. Later versions reject -1; use a very large number instead, e.g. SELECT * FROM tbl1 LIMIT 18, 18446744073709551615 </li></ul>
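A typical use of SQL_CALC_FOUND_ROWS is paging, sketched below with the `tbl1` and `col_name` names from the slide. The table definition and the page size of 10 are assumptions. Newer MySQL versions (8.0.17 and later) deprecate this feature in favour of a separate COUNT(*) query.

```sql
-- Hypothetical definition of the table used on the slide.
CREATE TABLE tbl1 (
  id       INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  col_name VARCHAR(50) NOT NULL,
  KEY (col_name)
);

-- Fetch the first page of 10 rows, asking MySQL to also count all matches.
SELECT SQL_CALC_FOUND_ROWS id, col_name
FROM tbl1
WHERE col_name LIKE 'Mac%'
ORDER BY col_name
LIMIT 0, 10;

-- Total number of rows the previous query would have returned without LIMIT.
SELECT FOUND_ROWS();
```

FOUND_ROWS() must be called on the same connection, right after the SELECT it refers to.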
  40. 41. Choose the right Data Type <ul><li>Suppose day is in the range 1 to 31. You could then save 3 bytes per row by changing day from INT (4 bytes) to TINYINT (1 byte). Similarly, you could save 1 byte per row by changing yearmonth from INT to MEDIUMINT (3 bytes). 4 bytes per row * 38 million rows = about 150 MB saved. Smaller rows make disk reads faster, and require less memory to process and cache. Also, smaller columns make for smaller indexes. </li></ul><ul><li>Consider making all character columns CHAR rather than VARCHAR. The tradeoff is that your table will use more space, but if you can afford the extra space, fixed-length rows can be processed more quickly than variable-length rows. </li></ul><ul><li>If you are only storing positive numbers, make the column UNSIGNED: you essentially double your capacity to store positive numbers without changing the column type. </li></ul><ul><li>Declare columns to be NOT NULL so that queries can be faster, since they need not check for NULL as a special case. </li></ul><ul><li>Consider using ENUM columns since ENUM values are represented as numeric values internally. </li></ul>
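The column shrinking described above could be applied as in the sketch below. The column names `day` and `yearmonth` come from the slide; the table name `daily_stats` and the `hits` column are hypothetical.

```sql
-- Hypothetical table with oversized integer columns.
CREATE TABLE daily_stats (
  `day`     INT NOT NULL,   -- only ever holds 1 to 31
  yearmonth INT NOT NULL,   -- e.g. 200406
  hits      INT NOT NULL
);

-- TINYINT takes 1 byte and MEDIUMINT 3 bytes, against 4 bytes for INT.
ALTER TABLE daily_stats
  MODIFY `day`     TINYINT UNSIGNED NOT NULL,
  MODIFY yearmonth MEDIUMINT UNSIGNED NOT NULL;

-- Bytes saved for 38 million rows: (3 + 1) bytes per row.
SELECT (3 + 1) * 38000000 AS bytes_saved;
```

The product is 152,000,000 bytes, which matches the "about 150 MB" quoted above.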
  41. 42. Constraints: Primary, Unique and Index keys <ul><li>A table can have only 1 primary key, but multiple unique constraints. </li></ul><ul><li>Columns in primary keys must be NOT NULL. </li></ul><ul><li>Columns in unique keys can be NULL (if they are NOT NULL, then the unique key is functionally the same as a primary key). </li></ul><ul><li>Simply create a UNIQUE index on the fields which you wish to be unique. </li></ul><ul><li>Add the line UNIQUE (firstname, lastname) to your table creation statement </li></ul><ul><li>OR alter the table once created using: ALTER TABLE tablename ADD UNIQUE (Column1, Column2) </li></ul>
  42. 43. Define keys <ul><li>Before: name VARCHAR(99) NOT NULL UNIQUE, surname VARCHAR(99) NOT NULL UNIQUE </li></ul><ul><li>After: UNIQUE KEY (name, surname) </li></ul><ul><li>ALTER TABLE contacts ADD UNIQUE KEY (name, surname); </li></ul><ul><li>CREATE UNIQUE INDEX myindex ON contacts (name, surname); </li></ul><ul><li>DROP INDEX myindex ON contacts; </li></ul>
  43. 44. Keys explained <ul><li>Maximum key length is 500 bytes </li></ul><ul><li>There can only be one AUTO_INCREMENT column and it must be defined as a key. </li></ul><ul><li>A single column can be part of multiple keys. </li></ul><ul><li>Use fulltext index to avoid 500 bytes limitation or to search words that are less than 3 characters </li></ul><ul><li>A UNIQUE index that does not allow NULL is functionally equivalent to a PRIMARY KEY. </li></ul><ul><li>A key made up of more than one column is a composite key. </li></ul><ul><li>The keyword INDEX may be used instead of KEY. </li></ul><ul><li>You can name an index by including the name just before the column list. </li></ul><ul><li>For a PRIMARY KEY, you don't specify a name because its name is always PRIMARY. </li></ul>
  44. 45. Using Full Text Index Part I <ul><li>CREATE TABLE `search_me` ( </li></ul><ul><li>`id` int(11) NOT NULL auto_increment PRIMARY KEY, `search` varchar(255) NOT NULL default '', </li></ul><ul><li>`descr` text NOT NULL, </li></ul><ul><li>FULLTEXT KEY `search` (`search`,`descr`) </li></ul><ul><li>) ENGINE=MyISAM; </li></ul><ul><li>SELECT `search` FROM `search_me` WHERE MATCH(`search`,`descr`) AGAINST('disney'); </li></ul><ul><li>You can also alter the table to add the full text key: ALTER TABLE <table> ADD FULLTEXT (fields), e.g. ALTER TABLE cds ADD FULLTEXT (title); </li></ul>
  45. 46. Select relevance Part II <ul><li>You can ask MySQL to display the relevance of each result. Simply repeat the MATCH() function in the select field list, as follows: </li></ul><ul><li>mysql> SELECT copy, MATCH(copy) AGAINST('good,alert') AS relevance FROM fulltext_sample WHERE MATCH(copy) AGAINST('good,alert'); </li></ul><ul><li>+---------------------------+------------------+ </li></ul><ul><li>| copy | relevance | </li></ul><ul><li>| A good alert | 1.3551264824316 | </li></ul><ul><li>| An all-out alert | 0.68526663197496 | </li></ul><ul><li>| It appears good from hear | 0.67003110026735 | </li></ul>
  46. 47. Full text Part III <ul><li>You can define any character-based field (CHAR, VARCHAR, or TEXT) as a FULLTEXT index. </li></ul><ul><li>It allows for complex text searching against data stored in those fields. This feature is not to be confused with the LIKE operator in MySQL. LIKE works more along the lines of a regular expression and hence may or may not take advantage of indexes. On the other hand, FULLTEXT indexes are fully indexed fields which support stopwords, boolean searches, and relevancy ratings. </li></ul><ul><li>MATCH (E.title,E.entry) AGAINST ('+vacation -washington' IN BOOLEAN MODE) </li></ul>
  47. 48. Full text Part IV <ul><li>Rows are returned in order of relevance, descending </li></ul><ul><li>It does not index any words that appear in more than 50% of the rows. This means that if your table contains 2 or fewer rows, a search on a FULLTEXT index will never return anything. </li></ul><ul><li>MySQL does not index any words less than or equal to 3 characters in length. </li></ul><ul><li>By default, your search query must be at least four characters long and may not exceed 254 characters. </li></ul><ul><li>MySQL has a default stopwords file that has a list of common words (e.g., the, that, has) which are not returned in your search. In other words, searching for 'the' will return zero rows. </li></ul>
  48. 49. SQL Log files <ul><li>Binary logging records in a binary file the SQL statements that change data on the server. By using the mysqlbinlog utility, the contents of the binary log file can be extracted so that the SQL statements may easily be rerun. To enable binary logging, add the following line to your server's options file (e.g., /etc/my.cnf or c:\my.ini, depending on your system) in the [mysqld] group: </li></ul><ul><li>log-bin = /var/log/mysql/bin.log </li></ul><ul><li>The exact path to use will depend on your filesystem and your preferences. </li></ul>
  49. 50. Constraints: Foreign Key Part I <ul><li>An important aspect of foreign keys is the referential action (ON DELETE CASCADE, ON UPDATE SET NULL, etc.), which lets the database take care of cascading changes when a parent row is deleted or updated. You no longer depend on an application programmer to do it, who might make a coding error or forget and leave orphaned rows. </li></ul><ul><li>So if you have a student parent table and an enrollment child table, you can set it to delete rows in the enrollment table for student 123 if student 123 is deleted from the student table (no orphaned rows in enrollment when a student is deleted). You can also set up referential actions to prevent deleting rows from a parent if there are rows in a child table (ON DELETE RESTRICT). It all depends on your situation. </li></ul><ul><li>Foreign keys also require that any row inserted into the child table MUST have a value that matches a row in the parent table. </li></ul><ul><li>So for the student/enrollment tables, if you attempt to insert a row into enrollment for studentId 342, the only way that query will work is if there is indeed a student with studentId 342 in the student table. This is the "referential integrity" part of foreign keys. </li></ul>
  50. 51. Foreign Keys Part II <ul><li>Both tables must be InnoDB type. In the referencing table, there must be an index where the foreign key columns are listed as the first columns in the same order. In the referenced table, there must be an index where the referenced columns are listed as the first columns in the same order. Index prefixes on foreign key columns are not supported. </li></ul><ul><li>For example, no score record could be entered for a student that does not exist in the student table. In addition, we could allow cascading deletion such that if a student were deleted from the student table, any score records for the student would automatically be deleted from the score table. </li></ul>
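The student/enrollment relationship described in the foreign key slides could be declared roughly as follows. This is a sketch under stated assumptions: the column names other than studentId, the index name and the constraint name are all hypothetical.

```sql
-- Parent table
CREATE TABLE student (
  studentId INT UNSIGNED NOT NULL,
  name      VARCHAR(100) NOT NULL,      -- hypothetical column
  PRIMARY KEY (studentId)               -- index on the referenced column
) ENGINE=InnoDB;

-- Child table
CREATE TABLE enrollment (
  enrollmentId INT UNSIGNED NOT NULL AUTO_INCREMENT,
  studentId    INT UNSIGNED NOT NULL,   -- same size and sign as the parent column
  course       VARCHAR(100) NOT NULL,   -- hypothetical column
  PRIMARY KEY (enrollmentId),
  INDEX idx_enrollment_student (studentId),   -- index on the foreign key column
  CONSTRAINT fk_enrollment_student
    FOREIGN KEY (studentId) REFERENCES student (studentId)
    ON DELETE CASCADE
) ENGINE=InnoDB;

INSERT INTO student (studentId, name) VALUES (123, 'Example Student');
INSERT INTO enrollment (studentId, course) VALUES (123, 'Databases');

-- Rejected with a foreign key error: there is no student 342
INSERT INTO enrollment (studentId, course) VALUES (342, 'Databases');

-- Deleting the parent row also removes the enrollment rows for student 123
DELETE FROM student WHERE studentId = 123;
```

Replacing ON DELETE CASCADE with ON DELETE RESTRICT would instead make the final DELETE fail while enrollment rows for that student still exist.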
  51. 52. Foreign keys and sets Part III <ul><li>Remember, the idea of SET is that you could store up to 64 values in the same column for a given row, so that you could, in theory, say that Smitty's is a combination pool hall and dry cleaner (plus up to 62 other things if the owner has some additional sidelines, like selling gardening supplies or magazines). </li></ul><ul><li>E.F. Codd, the guy who came up with the theory upon which all relational databases, like DB2 and MySQL, are based, would object strongly to this because his theory said that you should NEVER have more than one value in a column of a given row. </li></ul><ul><li>Now, a quick discussion of primary keys. Codd said that every table needs a primary key: a value that makes each row of a table unique from all of the other rows in the table. </li></ul>
  52. 53. Foreign keys Part IV <ul><li>You could think of the Business_Type_Lookup table as a sort of spell-checker; if it is turned on, then the only values which can go in the business_type column in the Business_Types table are values that are in the business_type column of the lookup table. Therefore, if the lookup table listed only the values 'pool hall', 'dry cleaner' and 'restaurant', those are the only values that could ever be put in the business_type column in the Business_Types table. </li></ul><ul><li>A foreign key is a column in one table that gets its value from a column in ANOTHER table. In this case, our foreign key is the business_type column in the Business_Types table. That's why it contains a FOREIGN KEY clause that refers back to the lookup table's primary key. If the lookup table contains only the values 'foo', 'bar' and 'squeak', then only those three business types can appear in the Business_Types table. This, in effect, is the activation of the 'spell-checker'. </li></ul><ul><li>The beauty of this is that the enforcement of these values is done entirely by the database; you don't have to write any application code to do it. If you try to insert a value in the Business_Types table that isn't in the lookup table, the DATABASE detects this and refuses with a message that tells you what you tried to do wrong. That's why I think lookup tables are a better approach than the 'SET' datatype. </li></ul><ul><li>Oh, one small thing. Foreign keys only get enforced between tables that use the InnoDB engine; therefore, each of our tables should have 'ENGINE=InnoDB' in the definition (written 'TYPE=InnoDB' in very old MySQL versions). </li></ul>
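The lookup-table "spell-checker" might look like the sketch below. The table names and the business_type column come from the slide; the business_name column, the column sizes and the index name are assumptions made for illustration.

```sql
-- Lookup table: the list of permitted business types
CREATE TABLE Business_Type_Lookup (
  business_type VARCHAR(30) NOT NULL,
  PRIMARY KEY (business_type)
) ENGINE=InnoDB;

-- One row per (business, type) pair, instead of a SET column
CREATE TABLE Business_Types (
  business_name VARCHAR(60) NOT NULL,   -- hypothetical column
  business_type VARCHAR(30) NOT NULL,
  PRIMARY KEY (business_name, business_type),
  INDEX idx_business_type (business_type),
  FOREIGN KEY (business_type)
    REFERENCES Business_Type_Lookup (business_type)
) ENGINE=InnoDB;

INSERT INTO Business_Type_Lookup (business_type)
VALUES ('pool hall'), ('dry cleaner'), ('restaurant');

-- Smitty's is both a pool hall and a dry cleaner: two rows, one value each
INSERT INTO Business_Types (business_name, business_type)
VALUES ('Smitty''s', 'pool hall'), ('Smitty''s', 'dry cleaner');

-- Rejected by the database: 'florist' is not in the lookup table
INSERT INTO Business_Types (business_name, business_type)
VALUES ('Smitty''s', 'florist');
```

Storing one value per row keeps each column single-valued, which is the point of the objection to SET raised earlier.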
  53. 54. Foreign keys – 3 rules part V <ul><li>All the tables in the relationship must be InnoDB tables. </li></ul><ul><li>The fields used in the foreign key relationship must be explicitly indexed in all referenced tables. </li></ul><ul><li>The data types of all the fields in the foreign key relationship should be similar. This is especially true of integer types, which must match in both size and sign. </li></ul>
  54. 55. Indexes Part I <ul><li>An index is a separate data object in the database that lists the table rows so as to allow rapid lookup. </li></ul><ul><li>Each index for each table is a separate object. </li></ul><ul><li>Primary keys, unique keys and foreign keys are automatically indexed. </li></ul><ul><li>Disadvantages of indexes: Each index may be updated when a row is updated, so indexes slow updates, insertions and deletes. </li></ul><ul><li>Disadvantages of indexes: Index file takes up disk space. </li></ul><ul><li>Practical maximum of 3 or 4 indexes per table. If others are needed on occasion, add and drop them as needed. </li></ul><ul><li>If a database is mostly read, use many indexes to speed performance. </li></ul><ul><li>If a database is mostly updated, use as few indexes as possible. </li></ul><ul><li>Clustered indexes: Physically rearrange rows by that single index to maximize disk access speed. </li></ul><ul><li>An index on a number column should be faster than the same sized char or varchar column. </li></ul><ul><li>When you use indexed columns in comparisons, compare columns that are of the same type. </li></ul><ul><li>Make sure your column will accommodate your needs, both current and future. </li></ul><ul><li>Basic rule: everything after ON or in a WHERE clause should either be a primary key or indexed, at least when there are many records in the table. </li></ul>
  55. 56. Indexes Part II <ul><li>MySQL generally uses only one index per table in a query, so having more indexes doesn’t always help. </li></ul><ul><li>Creating a key will make the query execute very fast, but if that is the only reason for the key you are going to be trading quite a lot of space for the speed of one query. How often are you going to run this query? If you have 324 million rows, then that index is going to consume somewhere in the order of 2 GB or more of disk space. Is it worth using all that space to make one query faster? </li></ul><ul><li>If a table has 1,000 rows, an index lookup is at least 100 times faster than reading sequentially. Note that if you need to access almost all 1,000 rows, it is faster to read sequentially, because that minimizes disk seeks. </li></ul><ul><li>MySQL uses multiple-column indexes in such a way that queries are fast when you specify a known quantity for the first column of the index in a WHERE clause, even if you don't specify values for the other columns. </li></ul><ul><li>/* create table syntax should have fulltext(title,body) defined */ SELECT * FROM articles WHERE MATCH (title,body) AGAINST ('india'); </li></ul><ul><li>When specifying an index for TEXT and BLOB types, you must specify a prefix length. For example: CREATE TABLE test (sValue TINYTEXT NOT NULL, UNIQUE KEY(sValue(90))); CREATE INDEX part_of_name ON customer (name(10)); </li></ul>
  56. 57. Possible indexes on a 3-column table: an index on (a, b, c), in that order, also covers the single-column index on ‘a’ as well as the index on ‘a,b’.
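The leftmost-prefix behaviour of a three-column index can be checked with EXPLAIN. This is a sketch only: the table name t, the extra column d and the index name idx_abc are hypothetical, and the plans MySQL actually chooses depend on the version and on the data in the table.

```sql
CREATE TABLE t (
  a INT NOT NULL,
  b INT NOT NULL,
  c INT NOT NULL,
  d INT NOT NULL,               -- not part of the index
  INDEX idx_abc (a, b, c)
);

-- These conditions start with the leftmost column(s),
-- so idx_abc can be used for the lookup
EXPLAIN SELECT * FROM t WHERE a = 1;
EXPLAIN SELECT * FROM t WHERE a = 1 AND b = 2;
EXPLAIN SELECT * FROM t WHERE a = 1 AND b = 2 AND c = 3;

-- These conditions skip column a, so idx_abc typically cannot be used
-- to look rows up; a separate index starting with b would be needed
EXPLAIN SELECT * FROM t WHERE b = 2;
EXPLAIN SELECT * FROM t WHERE b = 2 AND c = 3;
```

Compare the key column of the EXPLAIN output between the two groups to see whether idx_abc was chosen.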
  57. 58. Indexes (Example) Part III <ul><li>Customer ID is primary key. </li></ul><ul><li>We also want to search by: customer name (last, first); city, state; postal (zip) code; address. </li></ul><ul><li>Index the name, city/state, zip and address. Four indexes: slow insert, update, delete, but fast lookup. If customer database is fairly stable, this is fine. Similar logic for parts catalog, bill of materials, etc. </li></ul><ul><li>Index every word in the entire database; count occurrences and rank matches. </li></ul><ul><li>Recent advances (frequency of links, usage) enhance this. </li></ul>
  58. 59. Indexes Part IV <ul><li>Avoid single column indexes whenever practical. Most useful indexes contain from 2 to 5 fields. </li></ul><ul><li>Don't forget that PRIMARY keys and UNIQUE constraints are also indexes. </li></ul><ul><li>Design your indexes after your most common or frequently used query patterns. Analyze your WHERE clauses first, then look at speeding up certain queries by considering values in your ORDER BY clauses. </li></ul><ul><li>Learn how to use EXPLAIN. It will give you excellent advice on how to help your queries. </li></ul><ul><li>If the query doesn't use the index, then you could use FORCE INDEX to ensure it does. FORCE INDEX is only available as of 4.0.9; if it can't be used, then try USE INDEX. </li></ul><ul><li>A function call or an arithmetic expression on a column prevents MySQL from using an index on that column. In short, indexes are not used if you are using functions like lower(col_name) while comparing the text. You will need to reorganize the query, if possible, to take advantage of indexes. </li></ul>
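The advice about EXPLAIN, index hints and functions on columns can be illustrated with a few queries. This is a hedged sketch: the customer table layout and the index name idx_name are assumptions, and the rewrite of the LOWER() query assumes the column uses a case-insensitive collation, which is the usual MySQL default.

```sql
CREATE TABLE customer (
  id   INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(60) NOT NULL,
  city VARCHAR(60) NOT NULL,
  INDEX idx_name (name)
);

-- Function applied to the column: idx_name cannot be used for the lookup
EXPLAIN SELECT * FROM customer WHERE LOWER(name) = 'smith';

-- Reorganized: compare the bare column so idx_name can be used
EXPLAIN SELECT * FROM customer WHERE name = 'smith';

-- Arithmetic on the column defeats the primary key index ...
EXPLAIN SELECT * FROM customer WHERE id + 1 = 100;

-- ... while moving the arithmetic to the constant side does not
EXPLAIN SELECT * FROM customer WHERE id = 100 - 1;

-- Index hint, for cases where the optimizer does not pick the index itself
EXPLAIN SELECT * FROM customer FORCE INDEX (idx_name)
WHERE name LIKE 'smi%';
```

In each case the key column of the EXPLAIN output shows which index, if any, was chosen.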
  59. 60. Indexes Part V <ul><li>A column that has ‘yes’ or ‘no’ for content won’t be improved by indexing. On the other hand, a column where the values are unique (for example, Social Security Number) can benefit greatly from indexing. </li></ul><ul><li>The smallest or largest value for an indexed column can be found quickly without examining every row when you use the MIN() and MAX() functions. </li></ul><ul><li>MySQL can often use indexes to perform sorting operations quickly for an ORDER BY clause. </li></ul><ul><li>Sometimes MySQL can avoid reading the data file entirely. Suppose you’re selecting values from an indexed numeric column and you’re not selecting other columns from the table. In this case, by reading an index value, you’ve already got the value you’d get by reading the data file. There’s no reason to read values twice, so the data file need not even be consulted. </li></ul>
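The MIN()/MAX(), ORDER BY and index-only cases above can each be inspected with EXPLAIN. This is a sketch under assumptions: the orders table and its columns are hypothetical, and the Extra values mentioned in the comments are what MySQL commonly reports on a populated table rather than guaranteed output.

```sql
CREATE TABLE orders (
  id     INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  amount INT NOT NULL,
  note   VARCHAR(200),
  INDEX idx_amount (amount)
);

-- Smallest and largest indexed values are read from the ends of the index
-- (Extra commonly reports "Select tables optimized away")
EXPLAIN SELECT MIN(amount), MAX(amount) FROM orders;

-- The index is already sorted by amount, so no separate sort pass is needed
EXPLAIN SELECT amount FROM orders ORDER BY amount LIMIT 10;

-- Only the indexed column is selected, so the rows themselves need not be read
-- (Extra commonly reports "Using index")
EXPLAIN SELECT amount FROM orders WHERE amount > 100;

-- Selecting a non-indexed column as well forces MySQL to read the rows
EXPLAIN SELECT amount, note FROM orders WHERE amount > 100;
```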