30-Day SQL and Azure SQL for Data Engineering Master Guide

 

For 4–10 Years Experienced Data Engineers


Objective

This guide is designed to:

  • Build strong SQL fundamentals

  • Develop advanced query writing skills

  • Build real-time problem-solving capability

  • Understand Azure SQL deeply

  • Learn production-level SQL optimization

  • Prepare for senior Data Engineering interviews

Target Audience:

  • Data Engineers

  • SQL Developers

  • Azure Data Engineers

  • ETL Developers

  • BI Engineers

Daily Time Commitment:

  • 3 Hours Per Day

  • 30 Days Total

Learning Strategy:

  • 20% Theory

  • 80% Hands-On SQL Coding

Goal:

  • Write optimized SQL queries

  • Solve real-time business problems

  • Handle production SQL incidents

  • Build scalable ETL logic

  • Understand warehouse design

  • Optimize millions/billions of rows


DAILY LEARNING STRUCTURE

Hour 1 – Learn Concepts

Focus on:

  • WHY SQL concepts exist

  • Query execution understanding

  • Real-time usage

  • Optimization mindset

Avoid:

  • Blind memorization

  • Watching endless tutorials


Hour 2 – Coding Practice

Focus on:

  • Writing queries manually

  • Solving analytical problems

  • Window function practice

  • ETL query development


Hour 3 – Real-Time Scenarios

Focus on:

  • Performance tuning

  • Deadlock troubleshooting

  • Incremental loading

  • Query optimization

  • Warehouse design

  • Production debugging


SECTION 1 – SQL BASICS

Topics:

  • SELECT

  • WHERE

  • DISTINCT

  • TOP

  • ORDER BY

  • GROUP BY

  • HAVING

  • CASE WHEN

  • NULL handling


SELECT

Purpose:
Retrieve data.

Practice:

  • Employee records

  • Sales reports

  • Customer details


WHERE

Purpose:
Filter records.

Practice:

  • Salary filtering

  • Date filtering

  • Region filtering


GROUP BY

Purpose:
Aggregation.

Practice:

  • Revenue by region

  • Department-wise employee count

  • Monthly sales trends


HAVING

Purpose:
Filter aggregated results.


CASE WHEN

Purpose:
Conditional logic.

Use Cases:

  • Risk categorization

  • Status mapping

  • Dynamic calculations


NULL HANDLING

Topics:

  • IS NULL

  • ISNULL

  • COALESCE

Real-Time Usage:

  • Missing data handling

  • ETL validations


SECTION 2 – JOINS

Topics:

  • INNER JOIN

  • LEFT JOIN

  • RIGHT JOIN

  • FULL OUTER JOIN

  • SELF JOIN

  • CROSS JOIN

  • OUTER APPLY


INNER JOIN

Purpose:
Match records between tables.


LEFT JOIN

Purpose:
Find unmatched records.

Practice:

  • Customers without orders

  • Employees without managers


SELF JOIN

Purpose:
Compare same table data.

Use Cases:

  • Employee-manager hierarchy

  • Duplicate detection


OUTER APPLY

Critical Senior-Level Topic.

Use Cases:

  • Top N per group

  • Dynamic row processing


SECTION 3 – ADVANCED SQL

Topics:

  • CTEs

  • Subqueries

  • Window functions

  • Temp tables

  • Dynamic SQL


CTE (COMMON TABLE EXPRESSIONS)

Purpose:
Readable modular queries.

Practice:

  • Recursive hierarchy

  • Deduplication logic

  • Multi-step calculations


WINDOW FUNCTIONS

Most Important SQL Topic.

Functions:

  • ROW_NUMBER

  • RANK

  • DENSE_RANK

  • LAG

  • LEAD

  • SUM OVER

Use Cases:

  • Ranking

  • Running totals

  • Latest records

  • Trend analysis


TEMP TABLES

Purpose:
Intermediate processing.

Real-Time Usage:

  • ETL staging

  • Complex transformations


DYNAMIC SQL

Purpose:
Dynamic query generation.

Use Cases:

  • Metadata-driven ETL

  • Dynamic reporting


SECTION 4 – STORED PROCEDURES AND PROGRAMMABILITY

Topics:

  • Stored procedures

  • Functions

  • Views

  • Triggers

  • Transactions

  • Error handling


STORED PROCEDURES

Purpose:
Reusable SQL logic.

Use Cases:

  • ETL processing

  • Reporting logic

  • Incremental loads


UDFS

Types:

  • Scalar functions

  • Table-valued functions


TRIGGERS

Purpose:
Automatic actions.

Use Cases:

  • Audit tracking

  • Change logging


TRANSACTIONS

Critical Topic.

Topics:

  • BEGIN TRAN

  • COMMIT

  • ROLLBACK

Understand:

  • ACID properties

  • Consistency

  • Rollback behavior


ERROR HANDLING

Topics:

  • TRY CATCH

  • THROW

  • RAISERROR


SECTION 5 – PERFORMANCE TUNING

Topics:

  • Execution plans

  • Indexing

  • Query optimization

  • Statistics

  • Partitioning


EXECUTION PLANS

Critical Senior-Level Topic.

Understand:

  • Table scans

  • Index seeks

  • Key lookups

  • Hash match

  • Nested loops


INDEXING

Types:

  • Clustered

  • Nonclustered

  • Composite

  • Filtered indexes

Practice:

  • Create indexes

  • Analyze fragmentation

  • Optimize slow queries


QUERY OPTIMIZATION

Topics:

  • SARGability

  • Predicate pushdown

  • Join optimization

  • Reducing scans


STATISTICS

Purpose:
Query optimizer decisions.


PARTITIONING

Purpose:
Handle huge datasets.

Use Cases:

  • Historical tables

  • Archival strategies


SECTION 6 – SQL SERVER DMVS

Topics:

  • sys.dm_exec_requests

  • sys.dm_exec_query_stats

  • sys.dm_tran_locks

  • sys.dm_os_wait_stats


DMVS

Purpose:
Production monitoring.

Use Cases:

  • Blocking analysis

  • Query tuning

  • Resource troubleshooting


SECTION 7 – DATA WAREHOUSING CONCEPTS

Topics:

  • Fact tables

  • Dimension tables

  • Star schema

  • Snowflake schema

  • Surrogate keys

  • SCD


FACT TABLES

Purpose:
Store measurable metrics.

Examples:

  • Sales

  • Transactions

  • Calls


DIMENSION TABLES

Purpose:
Store descriptive attributes.

Examples:

  • Customer

  • Employee

  • Product


STAR SCHEMA

Critical Interview Topic.

Benefits:

  • Better performance

  • Simpler joins


SCD TYPES

Topics:

  • SCD Type 1

  • SCD Type 2

Practice:

  • Historical tracking

  • Merge logic


SECTION 8 – ETL CONCEPTS

Topics:

  • Incremental loads

  • CDC

  • Watermark logic

  • Upsert and merge


INCREMENTAL LOADS

Purpose:
Load changed data only.


CDC

Purpose:
Track inserts/updates/deletes.


WATERMARK LOGIC

Purpose:
Track latest successful load.


MERGE

Critical Topic.

Use Cases:

  • Upserts

  • SCD processing


SECTION 9 – AZURE SQL

Topics:

  • Azure SQL Database

  • Managed Instance

  • Elastic Pools

  • DTU vs vCore

  • Geo-replication

  • Query Store


AZURE SQL DATABASE

Purpose:
Managed cloud SQL service.

Benefits:

  • Automatic backups

  • HA/DR

  • Managed patching


MANAGED INSTANCE

Purpose:
Near full SQL Server compatibility.


ELASTIC POOLS

Purpose:
Resource sharing across databases.


DTU VS VCORE

Critical Interview Topic.

DTU:

  • Simpler model

vCore:

  • Better scalability control


GEO-REPLICATION

Purpose:
Disaster recovery.


QUERY STORE

Purpose:
Track query history and regressions.


SECTION 10 – REAL-TIME PRODUCTION SCENARIOS

Topics:

  • Deadlocks

  • Blocking

  • TempDB issues

  • CPU spikes

  • Long-running queries

  • Transaction log full


DEADLOCKS

Critical Senior-Level Topic.

Understand:

  • Deadlock graphs

  • Locking behavior

  • Isolation levels


BLOCKING

Practice:

  • Identify blocking sessions

  • Analyze waits

  • Resolve locks

Commands:

  • sp_who2

  • sys.dm_exec_requests


TEMPDB ISSUES

Topics:

  • Temp spills

  • Version store growth

  • Temp table overuse


TRANSACTION LOG ISSUES

Topics:

  • Log growth

  • Backup strategies

  • Recovery models


SECTION 11 – SQL SERVER SECURITY

Topics:

  • Authentication

  • Authorization

  • Roles

  • Permissions

  • TDE

  • Always Encrypted

  • Row-level security


TDE

Purpose:
Transparent Data Encryption.


ALWAYS ENCRYPTED

Purpose:
Protect sensitive columns.


ROW-LEVEL SECURITY

Purpose:
Restrict row visibility.


SECTION 12 – JSON AND XML SUPPORT

Topics:

  • OPENJSON

  • JSON_VALUE

  • XML parsing

  • XPath


JSON SUPPORT

Use Cases:

  • API ingestion

  • Semi-structured data


XML PARSING

Use Cases:

  • Legacy integrations

  • Enterprise systems


SECTION 13 – REAL-TIME ARCHITECTURE

Typical Flow:

Source Systems

ADF / APIs / Kafka

Landing Tables

Staging Layer

Transformation Layer

Warehouse Layer

Power BI Reporting


SECTION 14 – ENTERPRISE ETL DESIGN PATTERNS

Topics:

  • Audit tables

  • Error tables

  • Watermark tables

  • Control tables


AUDIT TABLES

Purpose:
Track ETL execution.


ERROR TABLES

Purpose:
Capture failed records.


WATERMARK TABLES

Purpose:
Track incremental processing.


SECTION 15 – DOMAIN-WISE SQL CODING QUESTIONS

SALES DOMAIN QUESTIONS

  1. Find highest sales by region.

  2. Find top 5 customers by revenue.

  3. Calculate month-over-month growth.

  4. Find products with declining sales.

  5. Detect duplicate transactions.

  6. Find customers with no purchases.

  7. Calculate running revenue totals.

  8. Find top-selling products monthly.

  9. Generate quarterly sales trends.

  10. Find average order value.


EMPLOYEE DOMAIN QUESTIONS

  1. Find nth highest salary.

  2. Find employees without managers.

  3. Find department-wise salary ranking.

  4. Find employees with maximum overtime.

  5. Calculate attendance percentage.

  6. Find duplicate employee records.

  7. Generate payroll summaries.

  8. Find employees hired in last 30 days.

  9. Find longest-serving employees.

  10. Calculate promotion trends.


CALL CENTER DOMAIN QUESTIONS

  1. Find average handling time.

  2. Detect SLA violations.

  3. Find top-performing agents.

  4. Find repeated escalations.

  5. Generate call resolution trends.

  6. Find abandoned call percentages.

  7. Find peak call hours.

  8. Calculate customer satisfaction metrics.

  9. Find agents with lowest closure rate.

  10. Generate weekly performance reports.


HEALTHCARE DOMAIN QUESTIONS

  1. Detect duplicate claims.

  2. Find patients with repeated admissions.

  3. Calculate average treatment costs.

  4. Find top diagnoses by region.

  5. Detect insurance fraud patterns.

  6. Find doctor-wise patient counts.

  7. Generate hospital occupancy reports.

  8. Find delayed claims.

  9. Calculate medicine utilization.

  10. Generate patient recovery trends.


STOCK MARKET DOMAIN QUESTIONS

  1. Find highest stock growth.

  2. Calculate moving averages.

  3. Detect unusual trading activity.

  4. Find top-performing stocks.

  5. Calculate daily volatility.

  6. Generate trend analysis.

  7. Detect price spikes.

  8. Find stock correlation.

  9. Calculate cumulative returns.

  10. Generate sector performance reports.


SECTION 16 – MID-LEVEL SQL PROJECTS

PROJECT 1 – SALES DATA WAREHOUSE

Requirements:

  • Create fact/dimension tables

  • Build incremental loads

  • Create reporting views

Concepts Used:

  • Star schema

  • MERGE

  • Window functions


PROJECT 2 – EMPLOYEE ANALYTICS SYSTEM

Requirements:

  • Track employee history

  • Build payroll reports

  • Detect anomalies

Concepts Used:

  • SCD Type 2

  • Aggregations

  • Stored procedures


PROJECT 3 – CALL CENTER REPORTING

Requirements:

  • Process call records

  • Generate SLA dashboards

  • Detect escalations

Concepts Used:

  • Window functions

  • Performance tuning


PROJECT 4 – INCREMENTAL ETL FRAMEWORK

Requirements:

  • Build CDC pipeline

  • Maintain audit logs

  • Implement watermarking

Concepts Used:

  • Transactions

  • MERGE

  • Error handling


PROJECT 5 – HEALTHCARE CLAIM ANALYTICS

Requirements:

  • Process millions of claims

  • Detect duplicates

  • Optimize queries

Concepts Used:

  • Partitioning

  • Indexing

  • Advanced joins


SECTION 17 – 30-DAY EXECUTION PLAN

WEEK 1 – CORE SQL

Day 1

  • SELECT

  • WHERE

  • GROUP BY

  • HAVING


Day 2

  • CASE WHEN

  • NULL handling

  • DISTINCT

  • TOP


Day 3

  • INNER JOIN

  • LEFT JOIN

  • RIGHT JOIN


Day 4

  • FULL JOIN

  • SELF JOIN

  • CROSS JOIN


Day 5

  • Subqueries

  • CTEs


Day 6

  • Window functions


Day 7

  • Coding practice + mock interview


WEEK 2 – ADVANCED SQL

Day 8

  • Temp tables

  • Dynamic SQL


Day 9

  • Stored procedures

  • Functions


Day 10

  • Transactions

  • Error handling


Day 11

  • Execution plans


Day 12

  • Indexing

  • Statistics


Day 13

  • Query optimization


Day 14

  • Advanced coding practice


WEEK 3 – DATA ENGINEERING + SQL

Day 15

  • Fact tables

  • Dimension tables


Day 16

  • Star schema

  • Snowflake schema


Day 17

  • Incremental loads

  • CDC


Day 18

  • SCD Type 1 & 2


Day 19

  • MERGE

  • Watermark logic


Day 20

  • Audit frameworks


Day 21

  • Real-time ETL scenarios


WEEK 4 – AZURE SQL + PRODUCTION SCENARIOS

Day 22

  • Azure SQL basics

  • Managed Instance


Day 23

  • Geo-replication

  • Query Store


Day 24

  • Deadlocks

  • Blocking


Day 25

  • TempDB issues

  • CPU troubleshooting


Day 26

  • Security

  • TDE

  • Always Encrypted


Day 27

  • JSON support

  • XML parsing


Day 28

  • SQL Agent

  • Maintenance jobs


Day 29

  • Domain-wise mock interview


Day 30
FINAL MOCK INTERVIEW + REVISION


REAL-TIME BEST PRACTICES

Always Follow:

  • Avoid SELECT *

  • Use proper indexing

  • Optimize joins

  • Use explicit transactions carefully

  • Update statistics regularly

  • Use partitioning for huge tables

  • Monitor execution plans

  • Use incremental processing

  • Build modular ETL logic

  • Implement audit logging


MOST IMPORTANT SKILLS FOR SENIOR ENGINEERS

You must become strong in:

  • Query optimization

  • Warehouse design

  • Incremental ETL design

  • Performance tuning

  • Deadlock troubleshooting

  • Production debugging

  • Scalability thinking

  • SQL + Azure integration

  • Real-time analytics support

  • Enterprise architecture understanding


FINAL INTERVIEW EXPECTATIONS

At 4–10 years experience, interviewers expect:

  • Strong SQL fundamentals

  • Advanced analytical query writing

  • Performance tuning capability

  • Real-time troubleshooting ability

  • Warehouse modeling understanding

  • ETL design capability

  • Azure SQL understanding

  • Production support mindset

  • Scalability thinking

They do NOT expect only SELECT query knowledge.

They expect:

  • Engineering mindset

  • Optimization capability

  • Production troubleshooting

  • Business problem solving

  • Enterprise SQL architecture understanding


END OF DOCUMENT

Comments

Popular posts from this blog

SCD TYPE 2 – INTERVIEW QUESTIONS + MERGE CODE

TIME-SERIES SQL

TIME-BASED SQL QUERIES