What if a 9B model could beat Sonnet at SQL?
It started as a thought experiment: multi-turn benchmarks expose a surprising weakness in frontier models. How close could a fine-tuned local model get?
Posts by Dan on Unpacking Data.
Everyone's declaring RAG dead. They're right about the term and wrong about what it means. I built a code search system to figure out what comes after it, and the answer surprised me: retrieval was the easy part.
Data quality is a critical part of any production data pipeline. To report accurate SLA metrics and ensure that the data is correct, you need a way to validate the data and publish the resulting metrics for further analysis.
Have you ever stumbled upon a Spark ETL job and wondered how simply loading a dataset can take hours, even though the filtered dataset you are specifying is relatively small?
In this blog post we continue our journey of testing data pipelines. If you haven't read the first post, make sure you do.
Unit testing is often regarded as a main pillar of software testing. It usually involves testing a single unit or component and ensuring that the tests cover all the edge cases the developer can think of.