Tag: Data Engineering
All the articles with the tag "Data Engineering".
Single-Node Data Engineering: DuckDB, DataFusion, Polars, and LakeSail
Published: at 02:00 PM
Optimize single-node data engineering with DuckDB, DataFusion, Polars, and LakeSail. Compare architectures and learn when to transition to Dremio MPP.
Apache Iceberg SCD Type 2 and CDC Patterns: Building Historical Lakehouse Tables
Published: at 11:00 AM
A deep dive into implementing Slowly Changing Dimension Type 2 (SCD Type 2) patterns and Change Data Capture (CDC) pipelines on Apache Iceberg, using PySpark and Dremio.
Apache Iceberg Catalogs Explained: REST, Glue, Hive Metastore, Polaris, Nessie, and Snowflake
Published: at 10:00 AM
A deep dive into Apache Iceberg catalog architecture, comparing REST catalogs, AWS Glue, Project Nessie, Polaris, and Snowflake. Learn about the catalog's role, credential vending, and cross-engine configurations.