About
Welcome to Unpacking Data
Hey there! Welcome to the Unpacking Data journey. My name is Dan, and I'm a senior software engineer writing about data engineering, AI engineering, and building robust data systems at scale.
Background
I specialize in big data processing, data pipelines, and analytics using tools like Apache Spark, Databricks, and modern Python frameworks. My work focuses on building reliable, scalable systems that process large volumes of data efficiently while maintaining high data quality. I'm also expanding into AI engineering, exploring how to build and deploy AI systems effectively.
Topics I Cover
- Data Engineering: Best practices, PySpark optimization, data pipelines, and big data processing
- Testing & Quality: Testing strategies for data pipelines, property-based testing, data quality and validation
- AI Engineering: Building and deploying AI systems, LLM applications, and ML infrastructure (coming soon!)
- Performance: Optimization techniques and performance tuning for big data applications
Whether you're working with petabytes of data or building the next generation of AI applications, you'll find practical insights and real-world experience here.