Every RDBMS vendor uses their own variation of SQL. MS implemented T-SQL (Transactional Structured Query Language) for their MS SQL Server offering.
SQL is strongly typed.
- char(n) - fixed length - good when we know the length or when length is less than 3 (more optimized than varchar(n)) (non-Unicode)
- varchar(n) - variable length (up to n) (non-Unicode)
- nchar(n) - fixed length (Unicode)
- nvarchar(n) - variable length (up to n) (Unicode)
- tinyint (1 byte - 0-255)
- smallint (2 bytes)
- int (4 bytes)
- bigint (8 bytes)
- decimal/numeric (5-17 bytes) - optionally, we can provide precision and scale
- money (8 bytes) - 4 decimal places
- smallmoney (4 bytes)
- date (3 bytes)
- time - time precision might be adjusted
- datetime (8 bytes) - legacy type
- smalldatetime (4 bytes)
- datetime2 (6-8 bytes) - adjustable precision, it’s a good choice in general
- datetimeoffset (10 bytes) - with timezone
We can cast types using the
CAST function. E.g.:
SQL can also cast by itself trying to guess the intention:
SQL Server comes with a bunch of buit-in functions (like
etc.), which we can use in our queries. We can also create our own functions.
Creating a DB
Creating a table
TotalPriceis a calculated colum (the values can be persisted with the
Ordersis the schema,
OrdreTrackingis table’s name
Order of Execution
SELECT statement is executed in the following order:
FROM- dataset is prepared
WHEREfilters data using predicates
GROUP BY- rows combined into groups
HAVING- another filter - for groups this time
SELECT- evaluates a provided list of expressions on every row
ORDER BY- ordering
OFFSET - FETCH- limit the number of rows
CROSS JOIN - A Cartesian Product of two tables - every entity from table A is matched with every element from table B - it’s rarely useful
Customer Country OrderID OrderDate Customer Bob NULL 1 2019-01-01 Jack Chen China 1 2019-01-01 Jack Jack USA 1 2019-01-01 Jack Kelly USA 1 2019-01-01 Jack Sunil India 1 2019-01-01 Jack Bob NULL 2 2019-01-01 Bob Chen China 2 2019-01-01 Bob Jack USA 2 2019-01-01 Bob Kelly USA 2 2019-01-01 Bob Sunil India 2 2019-01-01 Bob Bob NULL 3 2019-01-15 Jack Chen China 3 2019-01-15 Jack Jack USA 3 2019-01-15 Jack Kelly USA 3 2019-01-15 Jack Sunil India 3 2019-01-15 Jack Bob NULL 4 2019-01-16 Chen Chen China 4 2019-01-16 Chen Jack USA 4 2019-01-16 Chen Kelly USA 4 2019-01-16 Chen Sunil India 4 2019-01-16 Chen
We have 5 Customers, and 4 Orders. We see every customer combined with every order using cartesian product - 20 rows. It doesn’t really make any sense.
INNER JOIN - It starts with a CROSS JOIN, and then uses a Join Predicate (like
ON A.Name = B.Name) to extract only the rows where the predicate evaluates to true. E.g. if predicate was
ON 1=1, we’d get the same result as for the CROSS JOIN.
Customer Country OrderID OrderDate Customer Jack USA 1 2019-01-01 Jack Bob NULL 2 2019-01-01 Bob Jack USA 3 2019-01-15 Jack Chen China 4 2019-01-16 Chen
Jack did two orders. We do not see the customers that didn’t do any orders.
LEFT/RIGHT OUTER JOIN - Works like an INNER JOIN, however, we can specify that entities of one of the sets (tables) will be takes even if predicate is not satisfied. The choice of the set is made with
RIGHT. The selected set is a Reserved Set.
Customer Country OrderID OrderDate Customer Bob NULL 2 2019-01-01 Bob Chen China 4 2019-01-16 Chen Jack USA 1 2019-01-01 Jack Jack USA 3 2019-01-15 Jack Kelly USA NULL NULL NULL Sunil India NULL NULL NULL
The customers which did not make any orders are present as well. The columns of the Orders table are NULLed for them.
Due to the fact that SQL includes NULLs, we have to deal with Ternary Logic.
On top of
false there is a possibllity of an
unknown result. To test
unknown we use the
IS NOT NULL operators.
GROUP BY is useful when we do not care about individual entities, but rather
about some aggreagation of them (“How many people…”, “What’s the average…”).
We get a single answer for the entire group instead of getting answers for every
After grouping we can apply filtering on top of it - using the
The difference from
WHERE is the fact that
WHERE is applied to individual
HAVING is applied to groups. Basically, the order of when they are
We cannot create expressions based on aliases used in different expressions. For example,
SELECT (Quantity * Price) AS Total, 0.9 * Total FROM Items;will not work, because
Totalwill be unrecognized.
DISTINCTeliminates duplicates (NULLs are treated as equal). It looks at all the
SELECTed columns and removes rows that are exactly the same.
Dealing with NULLs:
ISNULL(X, Y)function replaces column X (if it’s null) with Y. It’s a simplified version of a more general
COALESCEfunction (which is available in other RDBMSs as well).
- NULLs have always the lowest ordering value (Postgres uses the opposite logic, but allows to change that)
- we can refer to aliases defined in
- ascending order is the default
- the order of rows that have the same values for the column(s) we’re ordering by is undeterministic
TOPis an SQL Server-only feature
OFFSET - FETCHis a more standard way, it also simplifies paging