DE Interview: Pivot / Unpivot
complete interview-ready guide for
🔹 PATTERN 13: Pivot / Unpivot
🎯 Core skill: Transform rows ↔ columns
Designed for 6–8 years experience, with real interview questions + solutions in:
-
✅ SQL Basic
-
✅ SQL Advanced
-
✅ PySpark Basic
-
✅ PySpark Advanced / Optimized
🔹 Assumed Interview Dataset
sales
1️⃣ Pivot Product Rows into Columns (Total Revenue)
🟢 SQL – Basic (CASE WHEN)
📌 Most common interview pivot
🔵 SQL – Advanced (PIVOT)
🟢 PySpark – Basic
🔵 PySpark – Advanced (Null-safe + rename)
2️⃣ Pivot Multiple Metrics (Revenue & Quantity)
🟢 SQL – Basic
🔵 SQL – Advanced (Double Pivot Pattern)
📌 Very common BI interview question
🟢 PySpark – Basic
🔵 PySpark – Advanced
3️⃣ Monthly Pivot (Rows → Month Columns)
🟢 SQL – Basic
🔵 SQL – Advanced
🟢 PySpark – Basic
🔵 PySpark – Advanced
4️⃣ Unpivot Columns Back to Rows (Very Common!)
🟢 SQL – Basic (UNION ALL)
🔵 SQL – Advanced (UNPIVOT)
📌 Senior-level question
🟢 PySpark – Basic
🔵 PySpark – Advanced (Explode)
5️⃣ Dynamic Pivot (Unknown Values) – Tricky Interview Question
🔵 SQL – Conceptual
SQL requires dynamic SQL for unknown pivot columns.
🔵 PySpark – Advanced (Dynamic)
📌 Very strong Spark interview answer
🔥 Interview Traps & Talking Points
| Topic | Interview Insight |
|---|---|
| CASE vs PIVOT | CASE works everywhere |
| Dynamic pivot | Easy in Spark, hard in SQL |
| NULL handling | Always COALESCE / fillna |
| Performance | Pivot = shuffle-heavy |
| Wide tables | Bad for analytics |
| Unpivot | Needed before ML/BI |
🧠 When to Pivot vs Not Pivot
-
❌ Avoid pivot in raw / bronze layers
-
✅ Use pivot in gold / reporting layers
-
✅ Unpivot before ML / analytics
Comments
Post a Comment