DuckDB is an embedded database, similar to SQLite, but designed for OLAP-style analytics. It is crazy fast and allows you to read and write data stored in CSV, JSON, and Parquet files directly, ...
Every data engineering team right now is being asked the same question: "How do we build a chatbot that talks to our data?" The prototypes are deceptively simple. A developer connects GPT-5.1 to a ...
For years, dbt has cemented its reputation as the premier SQL-based transformation tool, becoming a cornerstone of the modern data stack. Its model-centric workflow, testing capabilities, and ...
Why do I need something like this? dbt-coverage is to dbt what coverage.py and interrogate are to Python. It is a single CLI tool which checks your dbt project for missing documentation and tests.
Python, R, or SQL: Which reigns supreme in 2025's data science landscape? Compare trends and use cases to choose the best language for your data science projects. The data science industry is booming, ...
Since its launch in 2013, Databricks has relied on its ecosystem of partners, such as Fivetran, Rudderstack, and dbt, to provide tools for data preparation and loading. But now, at its annual Data + ...
Apache Airflow is a platform for managing data pipelines, written in Python and used for creating and scheduling tasks. Being entirely based on code, it is extensively used in data engineering for ...
Earlier this year, I had the privilege of serving on the organizing committee for the DataTune conference in my hometown of Nashville, Tenn. Unlike many database-specific or platform-specific ...
As data analytics professionals, we all know that ETL (Extract, Transform, Load) is a critical process for data analytics projects. When it comes to choosing the right tools for ETL, DBT (Data Build ...