Typically data is processed, extracted, and transformed in steps. Therefore, a sequence of data processing stages can be referred to as a data pipeline.
What is a data pipeline?
Which design pattern to choose?
There are lots of things to consider, i.e. Which data stack to use? What tools to consider? How to design a data pipeline conceptually? ETL or ELT? Maybe ETLT? What is Change Data Capture?
I will try to cover these questions.
Read it on Medium: https://towardsdatascience.com/unit-tests-for-sql-scripts-with-dependencies-in-dataform-847133b803b7