От дума на дума: SQL crash course

1 Database
~~~~~~~~~~~~~~~~~~~~~~~

Before you start to create database you're going to need to have better idea of
  what kinds of data you're going to want to store and some ways to categorizing it.

  Database is a container that holds tables and other SQL structures related to
  those tables. Tables are connected in some way.

2  Tables
~~~~~~~~~~~~~~~~~~~~~~~
What is a table anyway?

A column is a piece of data stored by your table. A row is a single set of columns that describe attributes of single thing.
Columns and rows together make up a table.
 
  You can identify categories for the type of data you're collecting. Your categories then become your
  columns. Type could be person, employer etc. Type is your row.

  You need a database for your tables:
CREATE DATABASE zlatozar_test;
You have to tell RDBMS which database you are going to use:

USE zlatozar_test;
And remember SQL is case insensitive!

  Capitalization and underscores help you to program in SQL.
CREATE TABLE doughnut_list
  (
  doughnut_name VARCHAR(10),
  doughnut_type VARCHAR(6)
  );
Choosing the best matching data type for each column in your table will reduce the size and make
  operations on your data faster. You can't recreate an existing table or
  database. Here is some basic rules:

  1. Break you data up in categories before you create your table. Pay special
     attention to the type of data for each column.

  2. TIMESTAMP is usually used to capture the current time. DATETIME is best used
     to store a future event.

  3. Check your work with:

DESC doughnut_list;
How to delete a table?

DROP TABLE doughnut_list;
DROP TABLE deletes your table and any data in it. Now it is possible to recreate table and put new
  categories.

  How to insert in a table?
INSERT INTO your_table (col_name1, col_name2, ....)
  VALUES ('val1', 'val2', ....);
Values has to be in the same order as the column names. Any value that goes into a VARCHAR, CHAR, DATE,
  or BLOB column has single quotes around it. DEC and INT don't use quotes.
  You can insert few columns and leave some out.
  NULL will be inserted for missing. NULL is not zero, it is "nothing" - undefined value.

  A column with a NULL value IS NULL, but does not EQUAL NULL!

  A good practice is to avoid it, but how?

  Check what is inserted:
SELECT * FROM doughnut_list;
- You can control NULL!
CREATE TABLE doughnut_list
  (
      doughnut_name VARCHAR(10) NOT NULL,
      doughnut_type VARCHAR(6) NOT NULL
  );
- You can add defaults.
CREATE TABLE doughnut_list
  (
      doughnut_name VARCHAR(10) NOT NULL,
      doughnut_type VARCHAR(6) NOT NULL,
      doughnut_cost DEC(3, 2) NOT NULL DEFAULT 1.00
  );
Using DEFAULT value fills the empty columns with a specified value.

3 SELECT statement 
~~~~~~~~~~~~~~~~~~~~~~~

3.1 WHERE 
==========

   We use WHERE clause that five the RDBMS something specific to search for. SELECT well return the rows that match the condition.

   The VARCHAR, CHAR, BLOB, DATE and TIME data types needs single quotes.
   DEC and INT, no not.

   RDBMS will ignore the quotes and treat your DEC and INT values as numbers,
   even though the quotes indicate they are text values!
   When inserting quotes have to be escaped with *\* or double *'*.

   You can specify which columns wish to see. Use '*' to see all.

   Use AND, OR for better WHERE condition. We can have more than one AND, OR.
  =  equal
  <> not equal
  <  less than
  >  greater than
  <= and =>

   Exception is NULL!
SELECT drink_name FROM drink_info WHERE calories IS NULL;
3.2 LIKE 
=========

   %  stand-in for any number of unknown character.

   LIKE '%CA', any end with CA

   _  stand-in for only one unknown character.
   

3.3 Selecting ranges 
=====================

   1st
  .... WHERE calories > 30 AND calories < 50;


   2nd
  ... WHERE calories BETWEEN 30 AND 50; (including 30 and 50);


3.4 IN 
=======
  ... rating IN ('innovative', 'fabulous');

3.5 NOT 
========

   NOT lets you negate your results and get the opposite values.

4 DELETE and UPDATE 
~~~~~~~~~~~~~~~~~~~~~~~~

  You can't relay on the rows in the table being in chronological order.
  You can't use DELETE to delete the value from a single column or tableful of column.
  You can delete every row from the table with:
DELETE FROM your_table;
Be careful with DELETE, use SELECT first to be sure you will delete the right data.

  You can use UPDATE tho update a single row or multiple rows, depending on the WHERE clause.

  Update can replace DELETE/INSERT combination.
  UPDATE statements can be used on multiple records in your table. Use them with basic math operations to manipulate your numeric values.

5 Importance to be Normal
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  The simpler query the better!
  How you're going to use your data will affect how you set up your table.
  Relational database means how the columns relate to each other to describe the thing.
  The challenge is to describe the thing using column in a way that makes getting the information out of it easy.

  Here is a simple plan:
  1. Pick your thing, the one thing you want your table to describe.
     (What's the main thing you want your table to be about?)
  
  2. Make a list of the information you need to know about your one thing when you're
     know about your one thing when you're using the table.
     (How will you use this table?)
  
  3. Using the list, break down the information about your thing into pieces you can use
     for organizing your table.
     (How can you most easily query this table?)


Atomic data: Data is atomic if it's been broken down into smallest piece of data that
can't or shouldn't be divided.

Simple rules to have atomic data:

1. What is the ONE THING your table describes?
2. How will you use the table to get at the ONE THING?
   (Design your table to be easy to query)
3. Do your columns contain atomic data to make your queries short and to the point?

TIP: Making your data atomic means breaking it down into the smallest pieces that you need
to create an efficient table, not just the smallest possible pieces you can.

Here is the official rules for atomic data:

  Rule 1: A column with atomic data can't have several values of the same type of data in a column.
  (You don't have to look for cell information using LAKE)
  Rule 2: A table with atomic data can't have multiple columns with the same
  type of data.


          teacher    student_1   student_2   student_3  
         ----------+------------+------------+------------
          Zlatozar   Eli          Rosi         Monika      

Too many student columns!

Making your table NORMAL, means they follow some standard rules, so making your data atomic
is the first step in creating a NORMAL table!

Benefits of NORMAL tables

- Normal tables won't have duplicate data, which will reduce the size of your database.
- Faster queries.

5.1 1NF 
========
   Each row of data must contain atomic values.
   Each row of data must have a unique identifier, known as Primary Key (no repeating groups of data)

   Hm, and what is Primary Key?
   A primary key is a column in you table that makes each record unique.

   Primary Key rules:

   - A primary key can't be NULL.

   - The primary key must be given a value when the record is inserted.

   - The primary key must be compact

   - The primary key values can't be changed


   TIP: The best primary key may be a new primary key.

   A key made of two or more columns is known as a COMPOSITE KEY.
   COMPOSITE KEY is a primary key composed of multiple columns, creating a unique key.

   TIP:
   SHOW CREATE TABLE my_table_name;

   gives you the SQL for table creation and you can use copy paste.

   Try also:
SHOW COLUMNS FROM table_name;
  SHOW CREATE DATABASE databse_name;
  SHOW INDEX FROM table_name;
Example for table creation that has primary key:
CREATE TABLE my_contacts
  (
      contact_id INT NOT NULL AUTO_INCREMENT,
      ............
      PRIMARY KEY (contact_id)
  )
You can add primary key with ALTER. It will add keys to all rows.
   

6 ALTER 
~~~~~~~~~~~~

  The ALTER command allows you to change almost everything in your table without having
  to reinsert data. But if you change type of the column to different one, you risk losing
  your data.

  ALTER TABLE can help you only to improve your table design, nothing more.

- Use RENAME
ALTER TABLE projects RENAME TO project_list;
- Use CHANGE - modify both the name and data type of an existing column
ALTER TABLE project_list
  CHANGE COLUMN number proj_id INT NOT NULL AUTO_INCREMENT,
  ADD PRIMARY KEY ('proj_id');
You can change more than one column in a single statement.

- Use MODIFY - modify the data type or position of an existing column
ALTER TABLE project_list
  MODIFY COLUMN proj_desc VARCHAR(120);
- Use ADD - add a column to your table - you pick the data type
ALTER TABLE my_contacts
  ADD COLUMN contact_id INT NOT NULL AUTO_INCREMENT FIRST,
  ADD PRIMARY KEY (contact_id);
FIRST says that the column will be first in the table. Also you can use: AFTER your_column,
  BEFORE your_column, SECOND, THIRD and you get the idea ;)

- Use DROP - drops a column from your table
ALTER TABLE project_table
  DROP COLUMN start_date;
Once you've dropped a column, everything that was stored in it is removed too!
  
  TIP: Use SELECT first to see data that you intend to drop.

7 Advanced SELECT 
~~~~~~~~~~~~~~~~~~~~~~

7.1 CASE 
=========

   The CASE expression combines all the UPDATE statements by checking an existing column's
   value against a condition.
UPDATE my_table
  SET new_column =
  CASE
  WHEN column1 = somevalue1
  THEN newvalue1
  WHEN column2 = somevalue2
  THEN newvalue2
  ELSE newvalue3
  END;
Order matters! Also you can use all ADD and OR combinations in WHEN clause.

7.2 ORDER BY 
=============

   If you want to order your query then try ORDER BY. ORDER BY allows you to alphabetically
   order any column.
SELECT
  ....
  ORDER BY column_name;
It is possible to order by two columns. Better, you can sort by as many columns as you need.

   Example:
SELECT * FROM movie_table
  ORDER BY category, purchased, title;
Order by category, then for a particular category order by purchased, and then for a particular purchased film order by title.
   (category (purchased (title)))

   Use keyword DESC after your column name in ORDER BY clause to reverse the order of you results.

7.3 GROUP BY 
=============
SELECT first_name, SUM(sales)
  FROM cookie_sales
  GROUP BY first_name
  ORDER BY SUM(sales) DESC;
Group together all the first_name values. Imagine that GROUP BY divide the table for every
   first_name and then pass every table to the SUM function(see below).
   In this way for every name we have the total sales (sum of all sales) from high-to-low.

   TIP: Using GROUP BY combines the duplicates into one single value for each group. But be careful this will not work for
   tables like this

     Interests     
    --------------
     books, sport  
     pets, books   

   *(Hint: Use joins)*

7.4 SQL functions 
==================

   SQL language has some special keywords called functions. Each function is a bit a code
   that preform an operation on a value or values.

7.4.1 SUM 
----------
SELECT SUM(sales)
  FROM cookie_sales
  WHERE first_name = 'Nicole';
SUM function totals the values in the sales column.

7.4.2 AVG, MIN, MAX, COUNT 
---------------------------

    COUNT will return the number of rows in a column - single number
SELECT COUNT(*)
  FROM Student;
Return the number of rows in a table.

7.5 SELECT DISTINCT 
====================
SELECT COUNT(DISTINCT sale_data)
  FROM cookie_sales;
Result will be one row because of the COUNT.

   DISTINCT is a keyword.

7.6 LIMIT 
==========

   Limits the number of results. Also possible
LIMIT 2
  LIMIT 0,4
  LIMIT 1,1
8 Multi-table database design 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  TIP: Your table design should do the heavy lifting for you. Don't write convoluted queries
  to "get around" a badly designed table. Ignoring the problem isn't the answer.

  We need to thing outside of a single table. How to add another table?

  TIP: We need to move the non-atomic columns in our table into new tables!

  A description of the data (the columns and tables) in your database, along with any other related
  objects and the way they all connected in known as a SCHEMA.

  TIP: Creating a diagram of your table lets you keep the design of the table separate from
  the data that's inside of it.

  How to go from one table to two? Here is the algorithm:

  - Remove the column and put it in its own table
  - Add columns that will let us identify it.

    Example:

      my_contacts              interests
     -----------------       -----------------
      contact_id(pk)           int_id(pk)
      last_name                last_name
      first_name               first_name
                               interests


    How to connect tables? 'first_name' and 'last_name' combination is not unique!
    
      my_contacts              interests
     -----------------       -----------------
      contact_id(pk)           int_id(pk)
      last_name                interests
      first_name               contact_id(fk)


    contact_id in interest table is foreign key!

    The FOREIGN KEY is a column in a table that references the PRIMARY KEY of another table!
    'contact_id' is unique so we know the reference.

    Foreign key facts:

  1. A foreign key can have a different name than the primary key it comes from.
  2. The primary key used by a foreign key is also known as a PARENT KEY. The table where the
     primary key is from is known as a parent table.
  
  3. The foreign key can be used to make sure that the rows in one table have corresponding rows
     in another table.
  
  4. Foreign key values can be null, even though primary key values can't.
  
  5. Foreign keys don't have to be unique - in fact, they often aren't.

A NULL foreign key means that there's no matching primary key in the parent table!
But we can make sure that a foreign key contains a meaningful value, one that exist in the parent table,
by using a CONSTRAINT.

Constraint defines rules for relation. Creating a foreign key as a constraint in your table gives you
definite advantages. You'll get errors if you violate the rules, which will stop you accidentally
doing anything to break the table.

Referential integrity: You will only be able to insert values into your foreign key that exist
in the table the key came from, the parent table.

TIP: You can use a foreign key to reference a unique value in the parent table. It doesn't have
to be the primary key of the parent table, but it must be unique.

TIP: Foreign key could be added with ALTER TABLE.

Example:
CREATE TABLE interests
  (
      int_id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
      interest VARCHAR(50) NOT NULL,
      contact_id INT NOT NULL,
      CONSTRAINT my_contacts_contact_id_FK
      FOREIGN KEY (contact_id)
      REFERENCES my_contacts (contact_id);
  )
TIP: How to form constraint names? parent_table_foreign_key_name>_FK

There is many types of relation.

8.1 ONE-TO-ONE 
===============
   One-To-One: exactly one row of a parent table is related to one row of a child table.

   We used if for example we would like to isolate some data from the parent table.

   Advantages:

   - Pulling the data out may allow you to write faster queries.

   - If you have a column containing values you don't yet know,
   you can isolate it and avoid NULL values in your main table.

   - You may wish to make some of your data less accessible.

   - If you have a large piece of data, a BLOB type for example,
   you may want that large data in a separate table.


8.2 ONE-TO-MANY 
================
   One-to-many: a record in Table A can have MANY matching records in Table B, but a record in Table B
   can only match ONE record in Table A.

8.3 MANY-TO-MANY 
=================
   Unacceptable! How to fix that?

   We need a table to step in between these two many-to-many tables and simplify the relationship to one-to-many.
   We need what is called a JUNCTION TABLE, which contain the primary key columns of the two tables we want to relate.
   In this way we have tow one-to-many relationships.

   When a column's data must change when another column's data is modified, the first column
   is functionally dependent on the second.

   For example if we have table Person with columns name and initials. Initials column is functional dependent
   on column name. We say that column initials is dependent column.

   A partial functional dependency means that a non-key column is dependent on some, but not all,
   of the columns in a composite primary key.

   If in table Person, primary key is SSN and name, initials is partially dependent on name.

   If changing any of the non-key columns might cause any of the other columns to change,
   you have a transitive dependency.
   You can say: when any non-key column is related to any of the other non-key columns.

   But how to avoid partial dependency?
   One simple way is to use ID for a primary key. Adding primary key columns to our tables is
   helping us achieve 2NF, because the second normal form focuses on how the primary key in a
   table relates to data in it.

   Your 1NF table is also 2NF if all the columns in table are part of the primary key OR it has
   a single column primary key. More strictly here is the rules:


   Rule 1: Be in 1NF

   Rule 2: Have no partial functional dependencies.


   Tip: Any table with an artificial primary key and no composite primary key is always 2NF.


   Third Normal Form or 3NF:

   Rule 1: Be in 2NF

   Rule 2: Have no transitive dependencies

   
Practical rule for decomposition to 3NF:
If A is in functional dependency with B then A is a key in decomposed tables!
Example:
table Apply(SSN, Student_Name, Collage_Name)

SSN is in functional dependency with Student_Name => Create two tables with key SSN, table Student(SSN, Student_Name)
and table Apply(SSN, Collage_Name).

Nirvana, or what is 4NF


   Forth Normal Form or 4NF:

   Rule 1: Be in 3NF

   Rule 2: Have no multivalued dependencies


Let's explain this with example. Imagen that you have table Apply(SSN, Student_Name, High_School).
It is in 3NF because there is no functional dependencies, but SSN is in multivalued dependencies with Student_Name
and High_School because it is a key for every Student_Name and High_School combination (Student_Name*High_School - number of conbinations).

Practical rule for decomposition to 4NF:
If A is in multivalued dependency with B and C then A is a key in decomposed tables!
Example:
table Apply(SSN, Student_Name, High_School)

SSN is in multivalued dependency with Student_Name and High_School => Create two tables with key SSN, table Apply(SSN, Student_Name)
and table HighSchool(SSN, High_School).

9 Start with joins 
~~~~~~~~~~~~~~~~~~~~~~~

  Because now we have many table and all the information is spread. Using joins we can collect data
  easily.

9.1 AS 
=======

  Variant 1:
CREATE TABLE profession
  (
  id INT(11) NOT NULL AUTO_INCREMENT PRIMARY KEY,
  profession VARCHAR(20)
  ) AS
  SELECT profession FROM my_contacts
  GROUP BY profession
  ORDER BY profession;
Variant 2:
CREATE TABLE profession AS
  SELECT profession FROM my_contacts
  GROUP BY profession
  ORDER BY profession;
ALTER TABLE profession
  ADD COLUMN  id INT(11) NOT NULL AUTO_INCREMENT FIRST,
  ADD PRIMARY KEY (id);
In both variants AS is used to references result of a query. It is used also for
   name aliases.
CREATE TABLE profession
  (
  id INT(11) NOT NULL AUTO_INCREMENT PRIMARY KEY,
  profession VARCHAR(20)
  ) AS
  SELECT profession AS mc_prof FROM my_contacts
  GROUP BY mc_prof
  ORDER BY mc_prof;
The alias change the name of the column in result but doesn't change the original
   column name in any way. An alias is temporary! Table aliases are also called correlation names.
   They are very useful in long queries.

  TIP: Some time we can alias without AS:
SELECT profession mc_prof
  FROM my_contacts mc
  GROUP BY mc_prof
  ORDER BY mc_prof;
Before we start with joins, let's define simplest join called - Cartesian join, Cartesian product,
   cross product. It means join all!

   Toys table                             Boys table

     toy_id      toy                  boy_id    boy
    ---------+----------------      ---------+----------------
           1   hula hoop                   1   Zlatko
           2   balsa glider                2   Mitko
           5   baseball cards  

   The CROSS JOIN returns every row from one table crossed with every row from second

SELECT t.toy, b.boy
  FROM toys AS t
  CROSS JOIN boys AS b;
(or you can just write: SELECT toys.toy, boys.boy FROM toys, boys;)

   This join get number of toys multiplied by numbers of boys to account for every possible combination.

   TIP: Using CROSS JOIN by mistake you can hanging your machine.

   But why we need CROSS JOIN? Answer is simple, to understand better INNER JOIN.

   An INNER JOIN is a CROSS JOIN with some result rows removed by a condition in a query.
   Here is the syntax:
SELECT some_columns
  FROM table1
  INNER JOIN
  table2
  ON some_condition;
An INNER JOIN combines the records from two tables using comparison operators in a condition.

9.2 EQUIJOIN 
=============

   This is a inner joins test for equality.
SELECT t.toy, b.boy
  FROM boys
  INNER JOIN
  toys
  ON boys.toy_id = toys.toy_id;
9.3 NON-EQUIJOIN 
=================
SELECT t.toy, b.boy
  FROM boys
  INNER JOIN
  toys
  ON boys.toy_id <> toys.toy_id
  ORDER BY boys.boy;
9.4 NATURAL JOIN 
=================

   Natural joins only work if the column you're joining by has the same name in both tables.

   Toys table                                     Boys table

     toy_id      toy                            toy_id       boy
    ---------+----------------                ---------+----------------
           1   hula hoop                             1   Zlatko
           2   balsa glider                          2   Mitko
           5   baseball cards  

SELECT boys.boy, toys.toy,
  FROM boys
  NATURAL JOIN
  toys;
This two tables are joined on 'toy_id' column.
   Natural join inner joins identify matching column names.

   TIP: You can join more than one table

   The same effect you can achieve using query inside a query - subqueries!

10 SUB-QUERIES 
~~~~~~~~~~~~~~~~~~~

  Simple data base to illustrate the queries:

  [Simple database]
 A subquery is a query that is wrapped within another query. It's also called an INNER query.

  Subquery is nothing more than a query inside another query. The outside query is known as
  containing query, or outer query. The query on the inside is the inner query, or subquery.

  Example 1:
SELECT some_column, another_column
  FROM table
  WHERE column = (SELECT column FROM table);
Because it use the = operator, this subquery will return a single value, one row from one column
  (sometimes called a cell, but in SQL known as a scalar value), which is compared to the
  columns in the WHERE clause. If it returns more you'll get an error.

  Example 2: JOINS <=> SUBQUERYS
SELECT last_name, first_name
  FROM my_contacts
  WHERE zip_code = (SELECT zip_code FROM
  zip_code WHERE city = 'Memphis' AND state = 'TN');
is equivalent to:
SELECT last_name, first_name
  FROM my_contacts mc
  NATURAL JOIN zip_code zc
  WHERE zc.city = 'Memphis' AND zc.state = 'TN';
TIP: Joins are faster, subquerys are more readable.

  Because subquerys are very verbose it is a good idea to use aliases - column aliases.

  Example:
SELECT mc.first_name AS firstname, mc.last_name AS lastname,
  mc.phone AS phone, jc.title AS jobtitle
  FROM job_current AS jc
  NATURAL JOIN my_contacts AS mc
  WHERE jobtitle IN (SELECT title FROM job_listings);
The tricky part about subqueries isn't the structure; it's figuring out what part of
  the query needs to be the subquery. Or even if you need one at all.

  Example:
  Who make the most money out of all of my contacts?

- Identify a query that answers part of the question.
SELECT MAX(salary) FROM job_current;
- Continue dissecting your query. We need to select names.
SELECT mc.first_name, mc.last_name
  FROM my_contacts AS mc;
- Finally, figure out how to link the two.
SELECT mc.first_name, mc.last_name, jc.salary
  FROM my_contacts AS mc
  NATURAL JOIN job_current AS jc;
- And now add the WHERE clause to link the two
SELECT mc.first_name, mc.last_name, jc.salary FROM my_contacts AS mc
  NATURAL JOIN job_current AS jc
  WHERE jc.salary = (SELECT MAX(jc.salary) FROM job_current jc);
If a subquery is used as a column expression in a SELECT statement, it can only return one value from one column.
SELECT mc.first_name, mc.last_name,
  (SELECT state FROM zip_code WHERE mc.zip_code = zip_code) AS state
  FROM my_contacts mc;
If the subquery stands alone and doesn't reference anything form the outer
query, it is a noncorrelated subquery.
SELECT mc.first_name, mc.last_name, jc.salary
  FROM my_contacts AS mc
  NATURAL JOIN job_current AS jc
  WHERE jc.salary > (SELECT jc.salary
                  FROM my_contacts mc NATURAL JOIN job_current jc
                  WHERE email = 'some@somedomain.com");
In noncorrelated subquery you can use IN or NOT IN to test if the values returned in the
subquery are members of a set (or not). This is possible because noncorrelated subqueries could return more than one column.
SELECT mc.first_name, mc.last_name, mc.phone, jc.title
  FROM job_current jc
  NATURAL JOIN my_contacts mc
  WHERE jc.title NOT IN (SELECT title FROM job_listings);
A very common use for correlated subqueries is to find all the rows in the outer query
for which no rows exist in a related table:
SELECT mc.first_name firstname, mc.last_name lastname, mc.email email
  FROM my_contacts mc
  WHERE NOT EXIST
  (SELECT * FROM job_current jc
  WHERE mc.contact_id = jc.contact_id);
NOT EXIST finds the first and last names and so on form 'my_contacts' table who are not
currently listed in the 'job_current' table.

TIP: You can use just EXIST

What is the best approach when trying to construct a subquery inside a subquery?
Your best bet is to write little queries for the various parts of the question. Then
look at them and see how you need to combine them.

Also correlated subqueries exist! A correlated subquery means that the inner query relies
on the outer query before it can be resolved.
Example:  People how have three interests.
SELECT mc.first_name, mc.last_name
  FROM my_contacts AS mc
  WHERE 3 = (
           SELECT COUNT(*) FROM contact_interest
           WHERE contact_id = mc.contact_id
           );
The outer query has to be executed before we know what the value of 'mc.contact_id' is!
Inner query uses the same aliases or correlation name for 'my_contacts'. mc that was created in the outer query.


11 Outer joins, self-joins, and unions 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  With an inner join, you're comparing rows from two table, but the order of those
  two tables doesn't matter. Outer joins have more to do with the relationship between
  two tables than the inner joins.

  The LEFT OUTER JOIN takes all the rows in the left table and maches them to rows
  in the left table and matches them to rows in the right table. It is useful when the
  left table and the right table have a one-to-many relationship.

  The left outer join matches EVERY ROW in the LEFT table with a row from the right table.
  The big secret to understanding an outer join is to know which table is on the left and which is
  on the right.
SELECT t1.column, t2.column
  FROM table1 t1
  LEFT OUTER JOIN table2 t2
  ON t1.table_id = t2.table_id;
table1 is the LEFT, table2 is RIGHT. So LEFT table will "say" how many columns result table will have.

  The big difference is that an outer join gives you a row whether there's a match with the other
  table or not! So then, a NULL value in the results of a left outer join means that the right table
  ha NO VALUES that corresponding to the left tables.

  The same with the RIGHT OUTER JOIN. The right outer join evaluates the right table against the left
  table.

  TIP: It will be easier to always stick with one, say the left outer join.

  You can use the same table as both the right and the left table in an outer join - self-join!
  But when it will be handy?

  A SELF-REFERENCING foreign key is the primary key of a table used in that same table for another purpose.
  The self-referencing part means that it is a key that is referencing another field in the same table.
  In tables where we have self-referencing we can use self-join. In this way we simulate having two tables.

  The self-join allows you to query a single table as though there were two tables with exactly the
  same information in them.

11.1 Unions 
============

   Except JOINS and SUBQUERIES there is another way to get multi-table information - UNIONS.
   UNION combines the results of two or more queries into one table, based on what you specify in the
   list of the SELECT.
   Example:
SELECT title FROM job_current
  UNION
  SELECT title FROM job_desired
  UNION
  SELECT title FROM job_available;
UNION can take one ORDER BY at the end of the statement. This is because UNION concatenates and
   groups the results from the multiple SELECT statements.
SELECT title FROM job_current
  UNION
  SELECT title FROM job_desired
  ORDER BY title;
Remember that the number of columns in each SELECT statement must match. Also UNION suppresses all
   duplicate values from results. If you DO want to see duplicate, you can use operator UNION ALL!
SELECT title FROM job_current
  UNION ALL
  SELECT title FROM job_desired
  ORDER BY title;
More useful example:
CREATE TABLE my_union AS
  SELECT title FROM job_current
  UNION
  SELECT title FROM job_desired;
11.2 INTERSECT and EXCEPT 
==========================
   INTERSECT and EXCEPT are used in much the same way as UNION - to find parts of queries that overlap.

   INTERSECT returns only those columns that are in the first query and also in the second query.
   EXCEPT returns only those columns that are in the first query, but NOT in the second query.

   TIP: This two operations DO NOT EXIST in MySQL.

   TIP: It is a good practice to be consistent. 
Subqueries and joins are equivalent and if you have started using joins, used all the time. The same is for subqueries usage.

   Using joins and subqueries for self-join

   - Join
SELECT c1.name, c2.name AS boss
  FROM person_info c1
  INNER JOIN person_info c2
  ON c1.boss_id = c2.id;
- Subqueries
SELECT c1.name,
  (SELECT name FROM person_info
  WHERE c1.boss_id = id) AS boss
  FROM person_info c1;
11.3 A Visual Explanation of SQL Joins 
=======================================
Assume we have the following two tables. Table A is on the left, and Table B is on the right. We'll populate them with four records each.

  id name               id  name
  -- ----               --  ----
  1  Pirate      1   Rutabaga
  2  Monkey             2   Pirate
  3  Ninja        3   Darth Vader
  4  Spaghetti          4   Ninja


Let's join these tables by the name field in a few different ways and see if we can get a conceptual match to 
those nifty Venn diagrams. 
SELECT * FROM TableA
  INNER JOIN TableB
  ON TableA.name = TableB.name
[Inner Join]
id  name       id   name
  --  ----       --   ----
  1   Pirate     2    Pirate
  3   Ninja      4    Ninja


Inner join produces only the set of records that match in both Table A and Table B.
SELECT * FROM TableA
  FULL OUTER JOIN TableB
  ON TableA.name = TableB.name
id    name            id    name
  --    ----            --    ----
  1     Pirate          2     Pirate
  2     Monkey          null  null
  3     Ninja           4     Ninja
  4     Spaghetti       null  null
  null  null            1     Rutabaga       
  null  null            3     Darth Vader


[Full Outer Join]
Full outer join produces the set of all records in Table A and Table B, with matching records from both sides where available.
If there is no match, the missing side will contain null.
SELECT * FROM TableA
  LEFT OUTER JOIN TableB
  ON TableA.name = TableB.name
id  name             id    name
  --  ----             --    ----
  1   Pirate           2     Pirate
  2   Monkey           null  null
  3   Ninja            4     Ninja
  4   Spaghetti        null  null


[Left Outer Join]
Left outer join produces a complete set of records from Table A, with the matching records (where available) in Table B.
If there is no match, the right side will contain null.
SELECT * FROM TableA
  LEFT OUTER JOIN TableB
  ON TableA.name = TableB.name
  WHERE TableB.id IS null
id  name            id     name
  --  ----            --     ----
  2   Monkey          null   null
  4   Spaghetti       null   null


[Exclude Right Side]
To produce the set of records only in Table A, but not in Table B, we perform the same left outer join, then exclude the records we don't want 
from the right side via a where clause.
SELECT * FROM TableA
  FULL OUTER JOIN TableB
  ON TableA.name = TableB.name
  WHERE TableA.id IS null 
  OR TableB.id IS null
id    name          id    name
  --    ----          --    ----
  2     Monkey        null  null
  4     Spaghetti     null  null
  null  null          1     Rutabaga
  null  null          3     Darth Vader


[Exclude Both Sides]
To produce the set of records unique to Table A and Table B, we perform the same full outer join, then exclude the records we don't want from both sides via a where clause.

There's also a cartesian product or cross join, which as far as I can tell, can't be expressed as a Venn diagram:
SELECT * FROM TableA
  CROSS JOIN TableB
This joins "everything to everything", resulting in 4 x 4 = 16 rows, far more than we had in the original sets. 
If you do the math, you can see why this is a very dangerous join to run against large tables.




12 Constraints, views, and transactions 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

12.1 Adding CHECK CONSTRAINT 
=============================

   A constraint is a restriction on what you can insert into a column. Constraints are added when we
   create a table. Some of the constraints we've already seen include NOT
   NULL, PRIMARY KEY, FOREIGN KEY, and UNIQUE.
   
   There's another sort of column constraint, called a CHECK.
   
   A CHECK constraint restrict what values you can insert into a column. It uses the same conditionals as WHERE clause.
   Example:
CREATE TABLE piggy_bank
  (
      id INT AUTO_INCREMENT NOT NULL PRIMARY KEY,
      coin CHAR(1) CHECK (coin IN ('P', 'N', 'D', 'Q'))
  )
If the value you're trying to insert fails the CHECK condition, you get an error.

   TIP: CHECK doesn't enforce data integrity in MySQL. In MySQL you can work around it with triggers.

12.2 Views 
===========

   If you have a long an complex query that you have to run constantly you can crate a view. Creating
   view is really simple:
CREATE VIEW web_designers AS
  SELECT mc.first_name, mc_last_name, mc_phone
  FROM my_contacts mc
  NATURAL JOIN job_desired jd
  WHERE jd.title = 'Web Designer';
Instead of CREATE TABLE you write CREATE VIEW. Then you treat is as table:
SELECT * FROM web_designers;
TIP: When you actually use your view in a query, it's behaving as though it were a subquery.

   Let's dissect SELECT query. What it says is "Select everything from the subquery that returns the
   first/last name and phone of all the people from 'my_contacts' who are looking for a job as web designer".

 In code this looks like this:
SELECT * FROM (SELECT mc.first_name, mc.last_name, mc.phone
                FROM my_contacts mc
                NATURAL JOIN job_desired jd
                WHERE jd.title = 'Web Designer') AS web_designers;
(We gave a name to our subquery - web_designers)

   A VIEW is basically a table that only exist when you use the view in a query. It's considered a
   virtual table because it acts like a table, and some operations that can be performed on a table
   can be preformed on a view.

   TIP: Virtual table doesn't stay in the database. It gets created when we use the view and then deleted.

   Why views are good for your database?

   1. You can keep changes to your database structure from breaking applications that depend on your tables.
   2. Views make your life easier bey simplifying you complex query into a simple command.
   3. You can create views that hide information that isn't needed by the user.

Is it possible to inserting, updating, and deleting with views?

Yes, but it depends. If your view used aggregate values (like SUM, COUNT, and AVG), you won't be
able to use it to change data. Also, if your view contains GROUP BY, DISTINCT, or HAVING, it won't
change data either. Most of the time it might be easier to INSERT, UPDATE, and DELETE the old-fashioned way.

TIP: CHECK OPERATION checks each query you try to INSERT or UPDATE to see if it's allowed according to the
   WHERE clause in your view

An updatable view is a view that allows you to change the underlying tables. The important point
here is that an updatable view includes all the NOT NULL columns form the tables it references.
That way, when you INSERT using view, you can be certain that you will have a value for every column
you required to have a value in. The opposite for non-updatable views, it includes all the NOT NULL
columns.

When you're finished with your view, just drop it.
DROP VIEW view_name;
TIP: CHECK constraints and views both help maintain control when you have multiple users.


12.3 Transaction 
=================

   A transaction is a set of SQL statements that accomplish a single unit of work.

   During a transaction, if all the steps can't be completed without interference, none of them
   should be completed.

   What is ACID?

   Atomicity: All of the pieces of the transaction must be completed, or none of them will be completed.
   Consistency: A complete transaction leaves the database in a consistent state at the end of the transaction.
   Isolation: It means that every transaction has a consistent view of the database regardless of other
   transactions taking place at the same time.
   Durability: After the transaction, the database needs to save the data correctly and protect it
   from power outages or other threads.

   Example:
START TRANSACTION;
  ........
  /* START TRANSACTION keeps track of all the SQL that follows until you enter either
  COMMIT or ROLLBACK */
  .......
  COMMIT;
TIP: No changes will occur to the database until you COMMIT.

   What is storage engine?
   Storage engine is the behind-the-scenes structure that stores all your database data and structures.
   Some types allows transaction; some types do not, but they are faster.
   
   TIP: For MySQL you need make sure your storage engine is either BDB of InnoDB, the two choices
      that support transactions.
   
   RDBMS keeps a record of everything that has been done when you are inside a transaction. It's
   called transaction log, and it keeps getting bigger and bigger the more you do in transaction.
   It's best to save using transactions for when you really need to be able to undo what you're
   doing to avoid wasting space and making your RDBMS faster.

13 Security 
~~~~~~~~~~~~~~~~~

  See your database documentation.

  TIP: You can control exactly what users can do to tables and columns with the GRANT statement.
  Example:
GRANT SELECT ON my_contacts TO programmer;
If you decide to remove privilege you need REVOKE.
  Example:
REVOKE SELECT ON my_contacts FROM programmer;
There is a lot details here but they are more for database administrators, that's why we just skip them.
От дума на дума

Monday, April 18, 2011

SQL crash course

No comments:

About Me

Reading

Links

Думи

FreeBSD

Programming

Exploring Lisp

Listmania and Movies

Misc