Data Warehousing & Analytics
Master data warehousing — OLTP vs OLAP, star/snowflake schemas, ETL pipelines, and analytical SQL techniques.
55 min•By Priygop Team•Last updated: Feb 2026
Data Warehouse Concepts
- OLTP vs OLAP: OLTP (Online Transaction Processing) — many small read/write transactions (your app database). OLAP (Online Analytical Processing) — complex queries on large datasets (data warehouse for reporting)
- Star Schema: Central fact table (sales_facts) surrounded by dimension tables (dim_product, dim_customer, dim_date). Denormalized for fast analytical queries. JOINs are simple and predictable
- Snowflake Schema: Dimensions are normalized further — dim_product → dim_category → dim_department. More storage-efficient but more JOINs. Star schema preferred for performance
- ETL Pipeline: Extract (pull from source databases), Transform (clean, aggregate, compute), Load (insert into warehouse). Tools: dbt, Apache Airflow, AWS Glue. Run nightly or in real-time
- Window Functions: ROW_NUMBER(), RANK(), LAG(), LEAD(), SUM() OVER(PARTITION BY ... ORDER BY ...) — analytical computations without GROUP BY. Running totals, rankings, moving averages
- Common Table Expressions (CTEs): WITH ranked_users AS (SELECT *, RANK() OVER(...) AS rank FROM users) SELECT * FROM ranked_users WHERE rank <= 10 — readable, composable subqueries. Recursive CTEs for hierarchical data