The Databricks Data Engineer
Execute Like Seniors
Code Reviews at Scale: What Fortune 500 Companies Actually Do (And Why Your Quick LGTM Is Fine)
The 80/20 framework that focuses your attention on schema changes, resource sizing, and failure paths - while safely speeding up the rest
Feb 11
•
Jakub Lasak
How Liquid Clustering Actually Works in Databricks
Why automatic data organization replaces weekly OPTIMIZE jobs, eliminates partition explosions, and lets you change clustering keys without rewriting…
Jan 28
•
Jakub Lasak
The Databricks Compute Selection Guide: Jobs, All-Purpose, SQL Warehouses, and Serverless
Why scheduled jobs on All-Purpose clusters are bleeding thousands of dollars a month - and how to fix it in 15 minutes
Jan 14
•
Jakub Lasak
The Databricks Debugging Maturity Ladder: Junior to Principal
How engineers progress through trial-and-error, analysis paralysis, and over-engineering before earning the wisdom to find the one bottleneck that…
Jan 7
•
Jakub Lasak
The Spark Cluster Parallelism Guide: Why 32 Nodes Can Be Slower Than 4
The task-to-core ratio that separates $600/month waste from systematic diagnosis - with the complete decision framework for repartition vs. scale
Dec 31, 2025
•
Jakub Lasak
Understanding Spark Shuffle: The Complete Architecture Guide
Why your 10-minute job became 2 hours, how data skew causes 100GB → 500GB inflation, and the three-phase mechanism behind every GROUP BY
Dec 23, 2025
•
Jakub Lasak
Inside the Delta Lake Transaction Log: From Write to Time Travel
How 2KB JSON files control your 10TB table, and why OPTIMIZE temporarily doubles your storage before making queries faster
Dec 18, 2025
•
Jakub Lasak
Real-Time Pricing Optimization with Databricks: An E-Commerce Case Study
How streaming data pipelines, Delta Lake, and MLflow turn customer purchases into data-driven pricing decisions in under 3 hours
Dec 17, 2025
•
Jakub Lasak
Right-Sizing Your Databricks Cluster: A 500 GB Case Study
A Practical Guide to Shuffle Partitions, Executor Sizing, and Handling Data Skew.
Nov 19, 2025
•
Jakub Lasak
The Databricks Schema Management Guide: When to Evolve and When to Enforce
The strategic pattern that turns schema evolution from a 40-hour debugging disaster into controlled change management
Nov 15, 2025
•
Jakub Lasak
Data Modeling in Databricks: The Complete Guide
From stakeholder interviews to production deployment - the 20% of knowledge that delivers 80% of results
Nov 13, 2025
•
Jakub Lasak
The Delta Table Optimization Guide: Partitioning, Z-Ordering, and Liquid Clustering
The 20% of knowledge that solves 80% of optimization decisions
Nov 5, 2025
•
Jakub Lasak