Physical Database Design, Data Types, and SQL Query Optimization
Objectives when selecting column data types
Minimize storage and IO, match business semantics, enforce valid values, support indexing and joins, and ensure sufficient precision/range.
Integrity controls / constraints
NOT NULL: disallow NULL; DEFAULT: supply default when no value provided; CHECK: enforce range or condition on column values.
Primary key index, unique index, secondary (non-unique) index
Primary key index: enforces PK, usually clustered; Unique index: enforces uniqueness for non-PK columns.
Referential integrity options (ON UPDATE / ON DELETE)
RESTRICT / NO ACTION: disallow operation if it breaks referential integrity; CASCADE: propagate update/delete to child rows.
Table scan vs index scan
Table scan: read entire table; Index scan (or seek): use index to find rows.
Foreign key constraint (FK)
Enforces that child-table FK values match parent-table PK/unique values (referential integrity).
Purpose of physical database design
Translate logical model into implementation for performance, storage, integrity, and maintainability.
NULL handling
Use IS NULL / IS NOT NULL; comparisons with = NULL will not work.
Best data types for primary/foreign keys
Use compact, fixed-size numeric types (INT, SMALLINT, BIGINT as needed) or GUIDs only when required.
Index purpose, advantages, disadvantages
Advantages: faster SELECTs, ORDER BY, JOINs; Disadvantages: extra storage, slower INSERT/UPDATE/DELETE.
Balanced vs imbalanced index
Balanced index: B-tree properly balanced → predictable depth/lookup cost; Imbalanced index: skewed tree depth → poor performance.
CHAR / NCHAR / VARCHAR / NVARCHAR
CHAR / NCHAR: fixed-length; VARCHAR / NVARCHAR: variable-length (saves space for varying lengths).
Storage media types & relative speed
CPU caches/registers (fastest) → RAM (main memory) → persistent storage (SSD/HDD) (slowest).
SQL commands (high-level forms)
CREATE TABLE table (col datatype [constraints], ...); ALTER TABLE table ADD|DROP|ALTER COLUMN ...; INSERT INTO table(columns) VALUES(...).
UNION
Combine results of multiple SELECTs with same column structure; removes duplicates unless UNION ALL used.
DATETIME / SMALLDATETIME / DATE
DATE: date only (no time); SMALLDATETIME: date + time with lower precision and smaller storage; DATETIME: full date + time, higher precision and larger range/storage.
JOINs and multi-table queries
INNER JOIN: rows matching both tables; LEFT JOIN: all left rows; NULLs for missing right.
Row-oriented vs column-oriented storage
Row-oriented: stores rows together (optimized for OLTP); Column-oriented: stores columns together (optimized for analytical queries).
Aggregate functions, GROUP BY, HAVING
SUM, AVG, COUNT, MAX, MIN used with GROUP BY for grouped results.
BIGINT / INT / SMALLINT / TINYINT (approx. ranges)
TINYINT: smallest (0-255); SMALLINT: small range (approx. ±32k); INT: typical 32-bit (approx. ±2 billion); BIGINT: large 64-bit (very large range).
Partitioning: vertical vs horizontal
Vertical partitioning: split columns into different tables; Horizontal partitioning: split rows into partitions.
