Monday, April 18, 2011

SQL crash course

1 Database
~~~~~~~~~~~~~~~~~~~~~~~

Before you start to create database you're going to need to have better idea of
  what kinds of data you're going to want to store and some ways to categorizing it.

  Database is a container that holds tables and other SQL structures related to
  those tables. Tables are connected in some way.

2  Tables
~~~~~~~~~~~~~~~~~~~~~~~
What is a table anyway?

A column is a piece of data stored by your table. A row is a single set of columns that describe attributes of single thing.
Columns and rows together make up a table.
 
  You can identify categories for the type of data you're collecting. Your categories then become your
  columns. Type could be person, employer etc. Type is your row.

  You need a database for your tables:
CREATE DATABASE zlatozar_test;
You have to tell RDBMS which database you are going to use:
USE zlatozar_test;
And remember SQL is case insensitive! Capitalization and underscores help you to program in SQL.
CREATE TABLE doughnut_list
  (
  doughnut_name VARCHAR(10),
  doughnut_type VARCHAR(6)
  );
Choosing the best matching data type for each column in your table will reduce the size and make operations on your data faster. You can't recreate an existing table or database. Here is some basic rules: 1. Break you data up in categories before you create your table. Pay special attention to the type of data for each column. 2. TIMESTAMP is usually used to capture the current time. DATETIME is best used to store a future event. 3. Check your work with:
DESC doughnut_list;
How to delete a table?
DROP TABLE doughnut_list;
DROP TABLE deletes your table and any data in it. Now it is possible to recreate table and put new categories. How to insert in a table?
INSERT INTO your_table (col_name1, col_name2, ....)
  VALUES ('val1', 'val2', ....);
Values has to be in the same order as the column names. Any value that goes into a VARCHAR, CHAR, DATE, or BLOB column has single quotes around it. DEC and INT don't use quotes. You can insert few columns and leave some out. NULL will be inserted for missing. NULL is not zero, it is "nothing" - undefined value. A column with a NULL value IS NULL, but does not EQUAL NULL! A good practice is to avoid it, but how? Check what is inserted:
SELECT * FROM doughnut_list;
- You can control NULL!
CREATE TABLE doughnut_list
  (
      doughnut_name VARCHAR(10) NOT NULL,
      doughnut_type VARCHAR(6) NOT NULL
  );
- You can add defaults.
CREATE TABLE doughnut_list
  (
      doughnut_name VARCHAR(10) NOT NULL,
      doughnut_type VARCHAR(6) NOT NULL,
      doughnut_cost DEC(3, 2) NOT NULL DEFAULT 1.00
  );
Using DEFAULT value fills the empty columns with a specified value. 3 SELECT statement ~~~~~~~~~~~~~~~~~~~~~~~ 3.1 WHERE ========== We use WHERE clause that five the RDBMS something specific to search for. SELECT well return the rows that match the condition. The VARCHAR, CHAR, BLOB, DATE and TIME data types needs single quotes. DEC and INT, no not. RDBMS will ignore the quotes and treat your DEC and INT values as numbers, even though the quotes indicate they are text values! When inserting quotes have to be escaped with *\* or double *'*. You can specify which columns wish to see. Use '*' to see all. Use AND, OR for better WHERE condition. We can have more than one AND, OR. = equal <> not equal < less than > greater than <= and => Exception is NULL!
SELECT drink_name FROM drink_info WHERE calories IS NULL;
3.2 LIKE ========= % stand-in for any number of unknown character. LIKE '%CA', any end with CA _ stand-in for only one unknown character. 3.3 Selecting ranges ===================== 1st .... WHERE calories > 30 AND calories < 50; 2nd ... WHERE calories BETWEEN 30 AND 50; (including 30 and 50); 3.4 IN ======= ... rating IN ('innovative', 'fabulous'); 3.5 NOT ======== NOT lets you negate your results and get the opposite values. 4 DELETE and UPDATE ~~~~~~~~~~~~~~~~~~~~~~~~ You can't relay on the rows in the table being in chronological order. You can't use DELETE to delete the value from a single column or tableful of column. You can delete every row from the table with:
DELETE FROM your_table;
Be careful with DELETE, use SELECT first to be sure you will delete the right data. You can use UPDATE tho update a single row or multiple rows, depending on the WHERE clause. Update can replace DELETE/INSERT combination. UPDATE statements can be used on multiple records in your table. Use them with basic math operations to manipulate your numeric values. 5 Importance to be Normal ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The simpler query the better! How you're going to use your data will affect how you set up your table. Relational database means how the columns relate to each other to describe the thing. The challenge is to describe the thing using column in a way that makes getting the information out of it easy. Here is a simple plan: 1. Pick your thing, the one thing you want your table to describe. (What's the main thing you want your table to be about?) 2. Make a list of the information you need to know about your one thing when you're know about your one thing when you're using the table. (How will you use this table?) 3. Using the list, break down the information about your thing into pieces you can use for organizing your table. (How can you most easily query this table?) Atomic data: Data is atomic if it's been broken down into smallest piece of data that can't or shouldn't be divided. Simple rules to have atomic data: 1. What is the ONE THING your table describes? 2. How will you use the table to get at the ONE THING? (Design your table to be easy to query) 3. Do your columns contain atomic data to make your queries short and to the point? TIP: Making your data atomic means breaking it down into the smallest pieces that you need to create an efficient table, not just the smallest possible pieces you can. Here is the official rules for atomic data: Rule 1: A column with atomic data can't have several values of the same type of data in a column. (You don't have to look for cell information using LAKE) Rule 2: A table with atomic data can't have multiple columns with the same type of data. teacher student_1 student_2 student_3 ----------+------------+------------+------------ Zlatozar Eli Rosi Monika Too many student columns! Making your table NORMAL, means they follow some standard rules, so making your data atomic is the first step in creating a NORMAL table! Benefits of NORMAL tables - Normal tables won't have duplicate data, which will reduce the size of your database. - Faster queries. 5.1 1NF ======== Each row of data must contain atomic values. Each row of data must have a unique identifier, known as Primary Key (no repeating groups of data) Hm, and what is Primary Key? A primary key is a column in you table that makes each record unique. Primary Key rules: - A primary key can't be NULL. - The primary key must be given a value when the record is inserted. - The primary key must be compact - The primary key values can't be changed TIP: The best primary key may be a new primary key. A key made of two or more columns is known as a COMPOSITE KEY. COMPOSITE KEY is a primary key composed of multiple columns, creating a unique key. TIP: SHOW CREATE TABLE my_table_name; gives you the SQL for table creation and you can use copy paste. Try also:
SHOW COLUMNS FROM table_name;
  SHOW CREATE DATABASE databse_name;
  SHOW INDEX FROM table_name;
Example for table creation that has primary key:
CREATE TABLE my_contacts
  (
      contact_id INT NOT NULL AUTO_INCREMENT,
      ............
      PRIMARY KEY (contact_id)
  )
You can add primary key with ALTER. It will add keys to all rows. 6 ALTER ~~~~~~~~~~~~ The ALTER command allows you to change almost everything in your table without having to reinsert data. But if you change type of the column to different one, you risk losing your data. ALTER TABLE can help you only to improve your table design, nothing more. - Use RENAME
ALTER TABLE projects RENAME TO project_list;
- Use CHANGE - modify both the name and data type of an existing column
ALTER TABLE project_list
  CHANGE COLUMN number proj_id INT NOT NULL AUTO_INCREMENT,
  ADD PRIMARY KEY ('proj_id');
You can change more than one column in a single statement. - Use MODIFY - modify the data type or position of an existing column
ALTER TABLE project_list
  MODIFY COLUMN proj_desc VARCHAR(120);
- Use ADD - add a column to your table - you pick the data type
ALTER TABLE my_contacts
  ADD COLUMN contact_id INT NOT NULL AUTO_INCREMENT FIRST,
  ADD PRIMARY KEY (contact_id);
FIRST says that the column will be first in the table. Also you can use: AFTER your_column, BEFORE your_column, SECOND, THIRD and you get the idea ;) - Use DROP - drops a column from your table
ALTER TABLE project_table
  DROP COLUMN start_date;
Once you've dropped a column, everything that was stored in it is removed too! TIP: Use SELECT first to see data that you intend to drop. 7 Advanced SELECT ~~~~~~~~~~~~~~~~~~~~~~ 7.1 CASE ========= The CASE expression combines all the UPDATE statements by checking an existing column's value against a condition.
UPDATE my_table
  SET new_column =
  CASE
  WHEN column1 = somevalue1
  THEN newvalue1
  WHEN column2 = somevalue2
  THEN newvalue2
  ELSE newvalue3
  END;
Order matters! Also you can use all ADD and OR combinations in WHEN clause. 7.2 ORDER BY ============= If you want to order your query then try ORDER BY. ORDER BY allows you to alphabetically order any column.
SELECT
  ....
  ORDER BY column_name;
It is possible to order by two columns. Better, you can sort by as many columns as you need. Example:
SELECT * FROM movie_table
  ORDER BY category, purchased, title;
Order by category, then for a particular category order by purchased, and then for a particular purchased film order by title. (category (purchased (title))) Use keyword DESC after your column name in ORDER BY clause to reverse the order of you results. 7.3 GROUP BY =============
SELECT first_name, SUM(sales)
  FROM cookie_sales
  GROUP BY first_name
  ORDER BY SUM(sales) DESC;
Group together all the first_name values. Imagine that GROUP BY divide the table for every first_name and then pass every table to the SUM function(see below). In this way for every name we have the total sales (sum of all sales) from high-to-low. TIP: Using GROUP BY combines the duplicates into one single value for each group. But be careful this will not work for tables like this Interests -------------- books, sport pets, books *(Hint: Use joins)* 7.4 SQL functions ================== SQL language has some special keywords called functions. Each function is a bit a code that preform an operation on a value or values. 7.4.1 SUM ----------
SELECT SUM(sales)
  FROM cookie_sales
  WHERE first_name = 'Nicole';
SUM function totals the values in the sales column. 7.4.2 AVG, MIN, MAX, COUNT --------------------------- COUNT will return the number of rows in a column - single number
SELECT COUNT(*)
  FROM Student;
Return the number of rows in a table. 7.5 SELECT DISTINCT ====================
SELECT COUNT(DISTINCT sale_data)
  FROM cookie_sales;
Result will be one row because of the COUNT. DISTINCT is a keyword. 7.6 LIMIT ========== Limits the number of results. Also possible
LIMIT 2
  LIMIT 0,4
  LIMIT 1,1
8 Multi-table database design ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ TIP: Your table design should do the heavy lifting for you. Don't write convoluted queries to "get around" a badly designed table. Ignoring the problem isn't the answer. We need to thing outside of a single table. How to add another table? TIP: We need to move the non-atomic columns in our table into new tables! A description of the data (the columns and tables) in your database, along with any other related objects and the way they all connected in known as a SCHEMA. TIP: Creating a diagram of your table lets you keep the design of the table separate from the data that's inside of it. How to go from one table to two? Here is the algorithm: - Remove the column and put it in its own table - Add columns that will let us identify it. Example: my_contacts interests ----------------- ----------------- contact_id(pk) int_id(pk) last_name last_name first_name first_name interests How to connect tables? 'first_name' and 'last_name' combination is not unique! my_contacts interests ----------------- ----------------- contact_id(pk) int_id(pk) last_name interests first_name contact_id(fk) contact_id in interest table is foreign key! The FOREIGN KEY is a column in a table that references the PRIMARY KEY of another table! 'contact_id' is unique so we know the reference. Foreign key facts: 1. A foreign key can have a different name than the primary key it comes from. 2. The primary key used by a foreign key is also known as a PARENT KEY. The table where the primary key is from is known as a parent table. 3. The foreign key can be used to make sure that the rows in one table have corresponding rows in another table. 4. Foreign key values can be null, even though primary key values can't. 5. Foreign keys don't have to be unique - in fact, they often aren't. A NULL foreign key means that there's no matching primary key in the parent table! But we can make sure that a foreign key contains a meaningful value, one that exist in the parent table, by using a CONSTRAINT. Constraint defines rules for relation. Creating a foreign key as a constraint in your table gives you definite advantages. You'll get errors if you violate the rules, which will stop you accidentally doing anything to break the table. Referential integrity: You will only be able to insert values into your foreign key that exist in the table the key came from, the parent table. TIP: You can use a foreign key to reference a unique value in the parent table. It doesn't have to be the primary key of the parent table, but it must be unique. TIP: Foreign key could be added with ALTER TABLE. Example:
CREATE TABLE interests
  (
      int_id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
      interest VARCHAR(50) NOT NULL,
      contact_id INT NOT NULL,
      CONSTRAINT my_contacts_contact_id_FK
      FOREIGN KEY (contact_id)
      REFERENCES my_contacts (contact_id);
  )
TIP: How to form constraint names? parent_table_foreign_key_name>_FK There is many types of relation. 8.1 ONE-TO-ONE =============== One-To-One: exactly one row of a parent table is related to one row of a child table. We used if for example we would like to isolate some data from the parent table. Advantages: - Pulling the data out may allow you to write faster queries. - If you have a column containing values you don't yet know, you can isolate it and avoid NULL values in your main table. - You may wish to make some of your data less accessible. - If you have a large piece of data, a BLOB type for example, you may want that large data in a separate table. 8.2 ONE-TO-MANY ================ One-to-many: a record in Table A can have MANY matching records in Table B, but a record in Table B can only match ONE record in Table A. 8.3 MANY-TO-MANY ================= Unacceptable! How to fix that? We need a table to step in between these two many-to-many tables and simplify the relationship to one-to-many. We need what is called a JUNCTION TABLE, which contain the primary key columns of the two tables we want to relate. In this way we have tow one-to-many relationships. When a column's data must change when another column's data is modified, the first column is functionally dependent on the second. For example if we have table Person with columns name and initials. Initials column is functional dependent on column name. We say that column initials is dependent column. A partial functional dependency means that a non-key column is dependent on some, but not all, of the columns in a composite primary key. If in table Person, primary key is SSN and name, initials is partially dependent on name. If changing any of the non-key columns might cause any of the other columns to change, you have a transitive dependency. You can say: when any non-key column is related to any of the other non-key columns. But how to avoid partial dependency? One simple way is to use ID for a primary key. Adding primary key columns to our tables is helping us achieve 2NF, because the second normal form focuses on how the primary key in a table relates to data in it. Your 1NF table is also 2NF if all the columns in table are part of the primary key OR it has a single column primary key. More strictly here is the rules: Rule 1: Be in 1NF Rule 2: Have no partial functional dependencies. Tip: Any table with an artificial primary key and no composite primary key is always 2NF. Third Normal Form or 3NF: Rule 1: Be in 2NF Rule 2: Have no transitive dependencies Practical rule for decomposition to 3NF: If A is in functional dependency with B then A is a key in decomposed tables! Example: table Apply(SSN, Student_Name, Collage_Name) SSN is in functional dependency with Student_Name => Create two tables with key SSN, table Student(SSN, Student_Name) and table Apply(SSN, Collage_Name). Nirvana, or what is 4NF Forth Normal Form or 4NF: Rule 1: Be in 3NF Rule 2: Have no multivalued dependencies Let's explain this with example. Imagen that you have table Apply(SSN, Student_Name, High_School). It is in 3NF because there is no functional dependencies, but SSN is in multivalued dependencies with Student_Name and High_School because it is a key for every Student_Name and High_School combination (Student_Name*High_School - number of conbinations). Practical rule for decomposition to 4NF: If A is in multivalued dependency with B and C then A is a key in decomposed tables! Example: table Apply(SSN, Student_Name, High_School) SSN is in multivalued dependency with Student_Name and High_School => Create two tables with key SSN, table Apply(SSN, Student_Name) and table HighSchool(SSN, High_School). 9 Start with joins ~~~~~~~~~~~~~~~~~~~~~~~ Because now we have many table and all the information is spread. Using joins we can collect data easily. 9.1 AS ======= Variant 1:
CREATE TABLE profession
  (
  id INT(11) NOT NULL AUTO_INCREMENT PRIMARY KEY,
  profession VARCHAR(20)
  ) AS
  SELECT profession FROM my_contacts
  GROUP BY profession
  ORDER BY profession;
Variant 2:
CREATE TABLE profession AS
  SELECT profession FROM my_contacts
  GROUP BY profession
  ORDER BY profession;
ALTER TABLE profession
  ADD COLUMN  id INT(11) NOT NULL AUTO_INCREMENT FIRST,
  ADD PRIMARY KEY (id);
In both variants AS is used to references result of a query. It is used also for name aliases.
CREATE TABLE profession
  (
  id INT(11) NOT NULL AUTO_INCREMENT PRIMARY KEY,
  profession VARCHAR(20)
  ) AS
  SELECT profession AS mc_prof FROM my_contacts
  GROUP BY mc_prof
  ORDER BY mc_prof;
The alias change the name of the column in result but doesn't change the original column name in any way. An alias is temporary! Table aliases are also called correlation names. They are very useful in long queries. TIP: Some time we can alias without AS:
SELECT profession mc_prof
  FROM my_contacts mc
  GROUP BY mc_prof
  ORDER BY mc_prof;
Before we start with joins, let's define simplest join called - Cartesian join, Cartesian product, cross product. It means join all! Toys table Boys table toy_id toy boy_id boy ---------+---------------- ---------+---------------- 1 hula hoop 1 Zlatko 2 balsa glider 2 Mitko 5 baseball cards The CROSS JOIN returns every row from one table crossed with every row from second
SELECT t.toy, b.boy
  FROM toys AS t
  CROSS JOIN boys AS b;
(or you can just write: SELECT toys.toy, boys.boy FROM toys, boys;) This join get number of toys multiplied by numbers of boys to account for every possible combination. TIP: Using CROSS JOIN by mistake you can hanging your machine. But why we need CROSS JOIN? Answer is simple, to understand better INNER JOIN. An INNER JOIN is a CROSS JOIN with some result rows removed by a condition in a query. Here is the syntax:
SELECT some_columns
  FROM table1
  INNER JOIN
  table2
  ON some_condition;
An INNER JOIN combines the records from two tables using comparison operators in a condition. 9.2 EQUIJOIN ============= This is a inner joins test for equality.
SELECT t.toy, b.boy
  FROM boys
  INNER JOIN
  toys
  ON boys.toy_id = toys.toy_id;
9.3 NON-EQUIJOIN =================
SELECT t.toy, b.boy
  FROM boys
  INNER JOIN
  toys
  ON boys.toy_id <> toys.toy_id
  ORDER BY boys.boy;
9.4 NATURAL JOIN ================= Natural joins only work if the column you're joining by has the same name in both tables. Toys table Boys table toy_id toy toy_id boy ---------+---------------- ---------+---------------- 1 hula hoop 1 Zlatko 2 balsa glider 2 Mitko 5 baseball cards
SELECT boys.boy, toys.toy,
  FROM boys
  NATURAL JOIN
  toys;
This two tables are joined on 'toy_id' column. Natural join inner joins identify matching column names. TIP: You can join more than one table The same effect you can achieve using query inside a query - subqueries! 10 SUB-QUERIES ~~~~~~~~~~~~~~~~~~~ Simple data base to illustrate the queries: [Simple database]
A subquery is a query that is wrapped within another query. It's also called an INNER query. Subquery is nothing more than a query inside another query. The outside query is known as containing query, or outer query. The query on the inside is the inner query, or subquery. Example 1:
SELECT some_column, another_column
  FROM table
  WHERE column = (SELECT column FROM table);
Because it use the = operator, this subquery will return a single value, one row from one column (sometimes called a cell, but in SQL known as a scalar value), which is compared to the columns in the WHERE clause. If it returns more you'll get an error. Example 2: JOINS <=> SUBQUERYS
SELECT last_name, first_name
  FROM my_contacts
  WHERE zip_code = (SELECT zip_code FROM
  zip_code WHERE city = 'Memphis' AND state = 'TN');
is equivalent to:
SELECT last_name, first_name
  FROM my_contacts mc
  NATURAL JOIN zip_code zc
  WHERE zc.city = 'Memphis' AND zc.state = 'TN';
TIP: Joins are faster, subquerys are more readable. Because subquerys are very verbose it is a good idea to use aliases - column aliases. Example:
SELECT mc.first_name AS firstname, mc.last_name AS lastname,
  mc.phone AS phone, jc.title AS jobtitle
  FROM job_current AS jc
  NATURAL JOIN my_contacts AS mc
  WHERE jobtitle IN (SELECT title FROM job_listings);
The tricky part about subqueries isn't the structure; it's figuring out what part of the query needs to be the subquery. Or even if you need one at all. Example: Who make the most money out of all of my contacts? - Identify a query that answers part of the question.
SELECT MAX(salary) FROM job_current;
- Continue dissecting your query. We need to select names.
SELECT mc.first_name, mc.last_name
  FROM my_contacts AS mc;
- Finally, figure out how to link the two.
SELECT mc.first_name, mc.last_name, jc.salary
  FROM my_contacts AS mc
  NATURAL JOIN job_current AS jc;
- And now add the WHERE clause to link the two
SELECT mc.first_name, mc.last_name, jc.salary FROM my_contacts AS mc
  NATURAL JOIN job_current AS jc
  WHERE jc.salary = (SELECT MAX(jc.salary) FROM job_current jc);
If a subquery is used as a column expression in a SELECT statement, it can only return one value from one column.
SELECT mc.first_name, mc.last_name,
  (SELECT state FROM zip_code WHERE mc.zip_code = zip_code) AS state
  FROM my_contacts mc;
If the subquery stands alone and doesn't reference anything form the outer query, it is a noncorrelated subquery.
SELECT mc.first_name, mc.last_name, jc.salary
  FROM my_contacts AS mc
  NATURAL JOIN job_current AS jc
  WHERE jc.salary > (SELECT jc.salary
                  FROM my_contacts mc NATURAL JOIN job_current jc
                  WHERE email = 'some@somedomain.com");
In noncorrelated subquery you can use IN or NOT IN to test if the values returned in the subquery are members of a set (or not). This is possible because noncorrelated subqueries could return more than one column.
SELECT mc.first_name, mc.last_name, mc.phone, jc.title
  FROM job_current jc
  NATURAL JOIN my_contacts mc
  WHERE jc.title NOT IN (SELECT title FROM job_listings);
A very common use for correlated subqueries is to find all the rows in the outer query for which no rows exist in a related table:
SELECT mc.first_name firstname, mc.last_name lastname, mc.email email
  FROM my_contacts mc
  WHERE NOT EXIST
  (SELECT * FROM job_current jc
  WHERE mc.contact_id = jc.contact_id);
NOT EXIST finds the first and last names and so on form 'my_contacts' table who are not currently listed in the 'job_current' table. TIP: You can use just EXIST What is the best approach when trying to construct a subquery inside a subquery? Your best bet is to write little queries for the various parts of the question. Then look at them and see how you need to combine them. Also correlated subqueries exist! A correlated subquery means that the inner query relies on the outer query before it can be resolved. Example: People how have three interests.
SELECT mc.first_name, mc.last_name
  FROM my_contacts AS mc
  WHERE 3 = (
           SELECT COUNT(*) FROM contact_interest
           WHERE contact_id = mc.contact_id
           );
The outer query has to be executed before we know what the value of 'mc.contact_id' is! Inner query uses the same aliases or correlation name for 'my_contacts'. mc that was created in the outer query. 11 Outer joins, self-joins, and unions ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ With an inner join, you're comparing rows from two table, but the order of those two tables doesn't matter. Outer joins have more to do with the relationship between two tables than the inner joins. The LEFT OUTER JOIN takes all the rows in the left table and maches them to rows in the left table and matches them to rows in the right table. It is useful when the left table and the right table have a one-to-many relationship. The left outer join matches EVERY ROW in the LEFT table with a row from the right table. The big secret to understanding an outer join is to know which table is on the left and which is on the right.
SELECT t1.column, t2.column
  FROM table1 t1
  LEFT OUTER JOIN table2 t2
  ON t1.table_id = t2.table_id;
table1 is the LEFT, table2 is RIGHT. So LEFT table will "say" how many columns result table will have. The big difference is that an outer join gives you a row whether there's a match with the other table or not! So then, a NULL value in the results of a left outer join means that the right table ha NO VALUES that corresponding to the left tables. The same with the RIGHT OUTER JOIN. The right outer join evaluates the right table against the left table. TIP: It will be easier to always stick with one, say the left outer join. You can use the same table as both the right and the left table in an outer join - self-join! But when it will be handy? A SELF-REFERENCING foreign key is the primary key of a table used in that same table for another purpose. The self-referencing part means that it is a key that is referencing another field in the same table. In tables where we have self-referencing we can use self-join. In this way we simulate having two tables. The self-join allows you to query a single table as though there were two tables with exactly the same information in them. 11.1 Unions ============ Except JOINS and SUBQUERIES there is another way to get multi-table information - UNIONS. UNION combines the results of two or more queries into one table, based on what you specify in the list of the SELECT. Example:
SELECT title FROM job_current
  UNION
  SELECT title FROM job_desired
  UNION
  SELECT title FROM job_available;
UNION can take one ORDER BY at the end of the statement. This is because UNION concatenates and groups the results from the multiple SELECT statements.
SELECT title FROM job_current
  UNION
  SELECT title FROM job_desired
  ORDER BY title;
Remember that the number of columns in each SELECT statement must match. Also UNION suppresses all duplicate values from results. If you DO want to see duplicate, you can use operator UNION ALL!
SELECT title FROM job_current
  UNION ALL
  SELECT title FROM job_desired
  ORDER BY title;
More useful example:
CREATE TABLE my_union AS
  SELECT title FROM job_current
  UNION
  SELECT title FROM job_desired;
11.2 INTERSECT and EXCEPT ========================== INTERSECT and EXCEPT are used in much the same way as UNION - to find parts of queries that overlap. INTERSECT returns only those columns that are in the first query and also in the second query. EXCEPT returns only those columns that are in the first query, but NOT in the second query. TIP: This two operations DO NOT EXIST in MySQL. TIP: It is a good practice to be consistent. Subqueries and joins are equivalent and if you have started using joins, used all the time. The same is for subqueries usage. Using joins and subqueries for self-join - Join
SELECT c1.name, c2.name AS boss
  FROM person_info c1
  INNER JOIN person_info c2
  ON c1.boss_id = c2.id;
- Subqueries
SELECT c1.name,
  (SELECT name FROM person_info
  WHERE c1.boss_id = id) AS boss
  FROM person_info c1;
11.3 A Visual Explanation of SQL Joins ======================================= Assume we have the following two tables. Table A is on the left, and Table B is on the right. We'll populate them with four records each. id name id name -- ---- -- ---- 1 Pirate 1 Rutabaga 2 Monkey 2 Pirate 3 Ninja 3 Darth Vader 4 Spaghetti 4 Ninja Let's join these tables by the name field in a few different ways and see if we can get a conceptual match to those nifty Venn diagrams.
SELECT * FROM TableA
  INNER JOIN TableB
  ON TableA.name = TableB.name
[Inner Join]
id name id name -- ---- -- ---- 1 Pirate 2 Pirate 3 Ninja 4 Ninja Inner join produces only the set of records that match in both Table A and Table B.
SELECT * FROM TableA
  FULL OUTER JOIN TableB
  ON TableA.name = TableB.name
id name id name -- ---- -- ---- 1 Pirate 2 Pirate 2 Monkey null null 3 Ninja 4 Ninja 4 Spaghetti null null null null 1 Rutabaga null null 3 Darth Vader [Full Outer Join]
Full outer join produces the set of all records in Table A and Table B, with matching records from both sides where available. If there is no match, the missing side will contain null.
SELECT * FROM TableA
  LEFT OUTER JOIN TableB
  ON TableA.name = TableB.name
id name id name -- ---- -- ---- 1 Pirate 2 Pirate 2 Monkey null null 3 Ninja 4 Ninja 4 Spaghetti null null [Left Outer Join]
Left outer join produces a complete set of records from Table A, with the matching records (where available) in Table B. If there is no match, the right side will contain null.
SELECT * FROM TableA
  LEFT OUTER JOIN TableB
  ON TableA.name = TableB.name
  WHERE TableB.id IS null
id name id name -- ---- -- ---- 2 Monkey null null 4 Spaghetti null null [Exclude Right Side]
To produce the set of records only in Table A, but not in Table B, we perform the same left outer join, then exclude the records we don't want from the right side via a where clause.
SELECT * FROM TableA
  FULL OUTER JOIN TableB
  ON TableA.name = TableB.name
  WHERE TableA.id IS null 
  OR TableB.id IS null
id name id name -- ---- -- ---- 2 Monkey null null 4 Spaghetti null null null null 1 Rutabaga null null 3 Darth Vader [Exclude Both Sides]
To produce the set of records unique to Table A and Table B, we perform the same full outer join, then exclude the records we don't want from both sides via a where clause. There's also a cartesian product or cross join, which as far as I can tell, can't be expressed as a Venn diagram:
SELECT * FROM TableA
  CROSS JOIN TableB
This joins "everything to everything", resulting in 4 x 4 = 16 rows, far more than we had in the original sets. If you do the math, you can see why this is a very dangerous join to run against large tables. 12 Constraints, views, and transactions ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 12.1 Adding CHECK CONSTRAINT ============================= A constraint is a restriction on what you can insert into a column. Constraints are added when we create a table. Some of the constraints we've already seen include NOT NULL, PRIMARY KEY, FOREIGN KEY, and UNIQUE. There's another sort of column constraint, called a CHECK. A CHECK constraint restrict what values you can insert into a column. It uses the same conditionals as WHERE clause. Example:
CREATE TABLE piggy_bank
  (
      id INT AUTO_INCREMENT NOT NULL PRIMARY KEY,
      coin CHAR(1) CHECK (coin IN ('P', 'N', 'D', 'Q'))
  )
If the value you're trying to insert fails the CHECK condition, you get an error. TIP: CHECK doesn't enforce data integrity in MySQL. In MySQL you can work around it with triggers. 12.2 Views =========== If you have a long an complex query that you have to run constantly you can crate a view. Creating view is really simple:
CREATE VIEW web_designers AS
  SELECT mc.first_name, mc_last_name, mc_phone
  FROM my_contacts mc
  NATURAL JOIN job_desired jd
  WHERE jd.title = 'Web Designer';
Instead of CREATE TABLE you write CREATE VIEW. Then you treat is as table:
SELECT * FROM web_designers;
TIP: When you actually use your view in a query, it's behaving as though it were a subquery. Let's dissect SELECT query. What it says is "Select everything from the subquery that returns the first/last name and phone of all the people from 'my_contacts' who are looking for a job as web designer". In code this looks like this:
SELECT * FROM (SELECT mc.first_name, mc.last_name, mc.phone
                FROM my_contacts mc
                NATURAL JOIN job_desired jd
                WHERE jd.title = 'Web Designer') AS web_designers;
(We gave a name to our subquery - web_designers) A VIEW is basically a table that only exist when you use the view in a query. It's considered a virtual table because it acts like a table, and some operations that can be performed on a table can be preformed on a view. TIP: Virtual table doesn't stay in the database. It gets created when we use the view and then deleted. Why views are good for your database? 1. You can keep changes to your database structure from breaking applications that depend on your tables. 2. Views make your life easier bey simplifying you complex query into a simple command. 3. You can create views that hide information that isn't needed by the user. Is it possible to inserting, updating, and deleting with views? Yes, but it depends. If your view used aggregate values (like SUM, COUNT, and AVG), you won't be able to use it to change data. Also, if your view contains GROUP BY, DISTINCT, or HAVING, it won't change data either. Most of the time it might be easier to INSERT, UPDATE, and DELETE the old-fashioned way. TIP: CHECK OPERATION checks each query you try to INSERT or UPDATE to see if it's allowed according to the WHERE clause in your view An updatable view is a view that allows you to change the underlying tables. The important point here is that an updatable view includes all the NOT NULL columns form the tables it references. That way, when you INSERT using view, you can be certain that you will have a value for every column you required to have a value in. The opposite for non-updatable views, it includes all the NOT NULL columns. When you're finished with your view, just drop it.
DROP VIEW view_name;
TIP: CHECK constraints and views both help maintain control when you have multiple users. 12.3 Transaction ================= A transaction is a set of SQL statements that accomplish a single unit of work. During a transaction, if all the steps can't be completed without interference, none of them should be completed. What is ACID? Atomicity: All of the pieces of the transaction must be completed, or none of them will be completed. Consistency: A complete transaction leaves the database in a consistent state at the end of the transaction. Isolation: It means that every transaction has a consistent view of the database regardless of other transactions taking place at the same time. Durability: After the transaction, the database needs to save the data correctly and protect it from power outages or other threads. Example:
START TRANSACTION;
  ........
  /* START TRANSACTION keeps track of all the SQL that follows until you enter either
  COMMIT or ROLLBACK */
  .......
  COMMIT;
TIP: No changes will occur to the database until you COMMIT. What is storage engine? Storage engine is the behind-the-scenes structure that stores all your database data and structures. Some types allows transaction; some types do not, but they are faster. TIP: For MySQL you need make sure your storage engine is either BDB of InnoDB, the two choices that support transactions. RDBMS keeps a record of everything that has been done when you are inside a transaction. It's called transaction log, and it keeps getting bigger and bigger the more you do in transaction. It's best to save using transactions for when you really need to be able to undo what you're doing to avoid wasting space and making your RDBMS faster. 13 Security ~~~~~~~~~~~~~~~~~ See your database documentation. TIP: You can control exactly what users can do to tables and columns with the GRANT statement. Example:
GRANT SELECT ON my_contacts TO programmer;
If you decide to remove privilege you need REVOKE. Example:
REVOKE SELECT ON my_contacts FROM programmer;
There is a lot details here but they are more for database administrators, that's why we just skip them.

No comments:

algorithms (1) cpp (3) cv (1) daily (4) emacs (2) freebsd (4) java (3) javascript (1) JSON (1) linux (2) Lisp (7) misc (8) programming (16) Python (4) SICP (1) source control (4) sql (1) думи (8)