Modern Data Tech Stack tools
CI/CD Pipelines
-automated workflows that define the steps involved in building, testing, and deploying code changes. Example tools: Jenkins, CircleCI, GitLab CI/CD
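
The build, test, and deploy steps can be sketched as an ordered list of stages that stops at the first failure. This is only an illustration in plain Python, not how Jenkins, CircleCI, or GitLab are configured; the stage functions and run_pipeline are hypothetical names, and a real pipeline would invoke actual build and test tools.

```python
from typing import Callable, List, Tuple


# Hypothetical stages: each returns True on success, False on failure.
def build() -> bool:
    return True


def test() -> bool:
    return True


def deploy() -> bool:
    return True


def run_pipeline(stages: List[Tuple[str, Callable[[], bool]]]) -> List[str]:
    """Run stages in order and return the names of the stages that succeeded.

    Execution stops at the first failing stage, so later stages
    (for example deploy) never run on code that failed its tests.
    """
    completed: List[str] = []
    for name, stage in stages:
        if not stage():
            break
        completed.append(name)
    return completed


completed = run_pipeline([("build", build), ("test", test), ("deploy", deploy)])
print(completed)
```

If the test stage returned False, only the build stage would be reported as completed and deploy would be skipped.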
Great Expectations
-open-source data validation and testing framework that helps data engineers and other data professionals ensure the quality, integrity, and reliability of their data pipelines.
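
The idea of declaring checks against data can be sketched in plain Python. This does not use the Great Expectations API (which differs between versions); check_not_null, check_between, and the sample rows are hypothetical and only illustrate what a data validation rule looks like.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def check_not_null(rows: List[Row], column: str) -> bool:
    """Pass only if every row has a non-null value in the column."""
    return all(row.get(column) is not None for row in rows)


def check_between(rows: List[Row], column: str, low: float, high: float) -> bool:
    """Pass only if every non-null value in the column lies in [low, high]."""
    return all(
        low <= row[column] <= high
        for row in rows
        if row.get(column) is not None
    )


# Made-up sample data: the third row is missing its order_id.
rows: List[Row] = [
    {"order_id": 1, "amount": 19.99},
    {"order_id": 2, "amount": 250.0},
    {"order_id": None, "amount": 12.5},
]

results = {
    "order_id_not_null": check_not_null(rows, "order_id"),
    "amount_between_0_and_1000": check_between(rows, "amount", 0, 1000),
}
print(results)
```

A pipeline could run checks like these after each load step and stop when any result is False.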
Kafka
-open-source distributed event streaming platform
Confluent
-managed Kafka: a commercial platform and cloud service built around Apache Kafka
Apache Iceberg
-open-source, high-performance table format for huge analytics tables. Enables the use of SQL tables for big data, while making it possible for engines like Spark, Trino, Flink, Presto, Hive, Impala, Doris, and Pig to safely work with the same tables at the same time.