Skip to content

SQL key points

Tansu Dasli edited this page Oct 23, 2023 · 49 revisions
  • REGEXP vs SUBSTR

REGEXP is a where clause (filter operators), not a SELECT clause. So, You can not parse a string, as in programming languages.
Use SUBSTR in SELECT clause to parse a string

Select SUBSTR(name, 1, 3) AS nam
  From students
 where name REGEXP '^to'
  • NOT vs != or Checking emptiness

Filtering w/ REGEXP, LIKE, =/<>/!=, IS NULL, BETWEEN/ NOT BETWEEN, IN/NOT IN
Where NOT name = "xyz" or Where name != "xyz" are the same

  • Subquery in SELECT clause vs in WHERE clause

Common usage is inner sql, and used after FROM/IN !
Usage in SELECT clause must return one value, and retrieves a calculation with aggregate functions. So you avoid aggregation in main query also !

  • Subquery optimization

runs first
use WITH clause to run one time, If you have many and same subqueries !

with 
 exp1 AS (SELECT ....),
 exp2 AS (SELECT ...)
SELECT *
  FROM exp1

  • stored function vs procedure

stored function takes param, do calculations and returns a single value. apply to columns!
stored procedures manipulates data, no param, no return.
stored procedures are secure, you can limit to access underlying tables, but bad practice (increases db dependency, expensive calculation on db side etc...) !

  • DELETE vs DROP

DELETE is record level!
DROP DATABASE/TABLE/TEMPRORARY TABLE/TRIGGER ... for the deletion of all kind of db objects

  • table creation
create table table_name 
( 
  column_name TYPE,
  column_name ENUM('', '', '')
)
  • BLOB vs TEXT

blob is binary string,
text is non-binary string

  • HAVING vs WHERE

todo: !!!!!

  • Cloning a table

CREATE TABLE xxx AS SELECT * from T to clone w/ data
CREATE TABLE xxx LIKE T to clone w/o data
SHOW CREATE TABLE T to get sql DDL for the source table
CREATE TABLE xxx LIKE T then INSERT INTO xxx SELECT * FROM T to clone then insert the data
there is no CLONE TABLE clause !

  • STRING TYPES

CHAR is fixed + adds empty string where needed vs VARCHAR is variable
CHAR, VARCHAR, TEXT, BLOB are all string type

  • SQL Functions
  • scalar func.: takes 1 or 2 arg. and returns 1 output
  • aggregation func.: takes a column (many rows) as arg. and returns 1 output
    can be used in SELECT or HAVING clause!
  • window functions as analytical func. (DENSE_RANK, RANK, NTILE, ROW_NUMBER etc...) or can also be used w/ aggregate functions w/ OVER clause
    can be used in SELECT clause!
    requires optional PARTITION and mandatory ORDER BY clause (orders partitioned rows, not the result!)
    ORDER BY can be used to order the results
    windowing aggregate functions does not support ORDER BY clause in windowing! (only OVER and PARTITION)
employees ranked in a department based on salary
SELECT empno, deptno, salary,
       DENSE_RANK() OVER (PARTITION BY deptno ORDER BY salary DESC) AS rank
FROM   employees;
no group by :)
SELECT year, country, product, profit,
       SUM(profit) OVER() AS total_profit,
       SUM(profit) OVER(PARTITION BY country) AS country_profit
       FROM sales
       ORDER BY country, year, product, profit;
  • Aggregations
  • max, sum, avg are some of aggregations
  • window functions can be used with OVER clause in aggregation
  • GROUP BY column_name WITH ROOLUP to add extra aggregation lines !
    Use grouping(column_name) in SELECT clause to generate meaningful texts
    Use grouping(column_name) in HAVING clause to filter only added extra lines !
SELECT year, country, product, SUM(profit) AS profit
       FROM sales
       GROUP BY year, country, product WITH ROLLUP
       HAVING GROUPING(year, country, product) <> 0;
  • Encoding

CASE , IF can be used in SELECT clause to encode (i.e. Male <-> M)

  • Isolation Levels (multiple sessions accessing data)
  • READ-COMMITTED, other sessions only read committed data (or old data) !
  • READ_UNCOMMITTED,
  • REPEATABLE-READ,
  • SERIALIZABLE, other sessions wait the commit, no read old data !
  • Locks
  • table lock,
  • Intention locks (Shared or eXclusive): someone is going to lock a row ! (more granular locking)
  • record lock,
  • page lock: not supported in mysql
  • dead lock: same session tries to set lock at the same row ! both roll-backs.
  • Indexes
  • Primary Key index (clustered): unique, PK + clustered created default!
    B-Tree index + all rows stored together (one data structure), no pointer at leaf nodes (refers to actual rows), based on PK, unique and non-null columns, sorted
  • Full-text index: searching text in a string column (TEXT, VARCHAR, CHAR)
  • Spatial index: GEOSpatial data types, R-Tree (GEOMETRY, POINT, LINESTRING, POLYGON w/ MULTIxxxx types...)
  • Indexes increase perf. for SELECT statements, but an negative impact on UPDATE, DELETE statements, and storage.
  • composite indexes contains more than 1-non-null column
  • Relations
  • it is about the primary key (PK) and foreign keys (FK)
  • one-to-one,
  • one-to-many,
  • many-to-many: products, order (header), order_details (order has many same product, product can be in many different orders)
  • Some of Administration concepts
  • to start/stop/restart mysql service mysql start|stop|restart
  • to recover root password use service mysql --skip-grant-tables start
  • mysql_upgrade is deprecated in mysql8. generally gets a snapshot, and after upgrade checks incompatibilities, attempts repair if necessary, do some changes of new capabilities.
  • 3 type of files available. configuration .conf, binary files, and data files (innodb vs myisam) .ibd or .myd -data and .frm -schema
  • If disk size is about to full, then use LVM in linux to increase or zfs pool management, or move data & index files to a new drive and create a symbol link (not a best practice)
  • backups are physical (moving conf and data files), logical (getting sql statements + data w/ mysqldump), and replication