Convergent Evolution and its Patterns

Now that we've covered what convergent evolution is and how it relates to data engineering, let's look at concrete examples and at how these different evolutions arose.

Convergent Evolution -> Pattern -> Design Pattern

We'll navigate from the bottom up, where the bottom is the terms we use or hear in our day-to-day work: the convergent evolutions (CEs). We'll group them and analyze their history and the similarities that lead to data engineering (DE) patterns and, later in chapter 5, to data engineering design patterns.

Below is a first look at the convergent evolutions, their patterns, and the data engineering design patterns we will explore in this book. The visualization also shows how the different terms flow into patterns and then into design patterns. In the diagram, CE marks a convergent evolution, P a pattern, and DP a design pattern.


Bear in mind that this diagram is a work in progress and will be updated continually.

graph LR

    CE_MV[CE: MV]
    P_CachingDisk[P: Cache]
    DP_DynamicQuerying[DP: Dynamic-Querying]
    CE_BITool[CE: BI Tool]
    CE_SemanticLayer[CE: Semantic Layer]
    CE_ModernOLAP[CE: Modern OLAP System]
	CE_DataVirtualization[CE: Data Virtualization]
    P_InMemory[P: In-Memory / Ad-Hoc Querying]
    CE_MessageQueue[CE: Message-Queue]

    CE_ETL_Tools[CE: Traditional ETL Tools]

    P_ELT[P: ELT]
	P_Transformation_ETL["P: Business Transformation (ETL)"]

    CE_DataLake[CE: Data Lake]
    P_ShortTerm[P: Short-term storage]
    CE_TraditionalOLAP[CE: Traditional OLAP System]
    DP_OpenData[DP: Open Data Platform - Lakehouse]
    P_TableFormats[P: Open Table Format]

    P_DataSharing[P: Data Sharing]
    CE_SchemaEvolution[CE: Schema Evolution]
    CE_DataContract[CE: Data Contracts]
    P_ChangeMgmt[P: Change Management]
    P_DataAsset[P: Data Asset]
    DP_DeclarativeGovernance[DP: Asset-based Enterprise Governance]
    DP_DeclarativePipeline[DP: Declarative Orchestration]
    P_Streaming[P: Streaming]

    CE_StoredProcedures[CE: Stored Procedures]
    CE_BashCron[CE: Bash / Cron]
    CE_PythonScript[CE: Python Script]
    P_ImplicitOrchestration[P: Implicit Orchestration]

	CE_dbt[CE: dbt table]

    P_ModernDataStack[P: Modern Data Stack Approach]
    CE_Microservices[CE: Microservices]
    CE_Monolith[CE: Monolith]
    CE_DataMesh[CE: Data Mesh]

    P_Reusability[P: Reusability]
    P_Orchestration["P: Data-Flow Modeling (Orchestration)"]
    CE_ExportingCSVs[CE: Exporting CSVs]
    CE_DataWarehouse[CE/DP: Data Warehouse]
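    DP_RealTime[DP: Real-Time Platform]

    %% Nodes referenced by the links below but not declared above
    CE_OBT[CE: OBT]
    CE_ODS[CE: ODS]
    CE_NoSQL[CE: NoSQL]
    CE_REVERSE_ETL[CE: Reverse ETL]
    CE_MDM[CE: MDM]
    CE_CDP[CE: CDP]
    CE_DWA[CE: DWA]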

    %% Linking Nodes
    CE_MV --> P_CachingDisk
    CE_MV --> P_Transformation_ETL
    CE_dbt --> P_Transformation_ETL
    CE_OBT --> P_Transformation_ETL
    CE_OBT --> P_CachingDisk

    CE_TraditionalOLAP --> P_CachingDisk

    %%CE_OBT --> P_TableFormats
    P_TableFormats --> DP_OpenData

	CE_dbt --> P_CachingDisk

    CE_BITool --> P_InMemory
	CE_DataVirtualization --> P_InMemory

    P_CachingDisk --> DP_DynamicQuerying
	CE_ModernOLAP --> P_CachingDisk
    CE_ODS --> P_CachingDisk

	P_Transformation_ETL --> DP_OpenData
	%% P_Transformation_ETL --> DP_DeclarativePipeline

	CE_SemanticLayer --> P_CachingDisk

    CE_SemanticLayer --> P_InMemory
    CE_ModernOLAP --> P_InMemory

    CE_ODS --> P_ShortTerm
    CE_MessageQueue --> P_ShortTerm
    CE_MessageQueue --> P_ImplicitOrchestration

    CE_SchemaEvolution --> P_ChangeMgmt
    CE_DataContract --> P_ChangeMgmt

    CE_NoSQL --> P_ChangeMgmt

    CE_DataContract --> P_DataAsset

    CE_Monolith --> P_ModernDataStack
    CE_DataMesh --> P_ModernDataStack
    CE_DataMesh --> P_ImplicitOrchestration
    CE_Microservices --> P_ModernDataStack
    CE_Microservices --> P_ImplicitOrchestration
	CE_Microservices --> P_Reusability

    CE_DataWarehouse --> P_Transformation_ETL
	CE_DataWarehouse --> P_CachingDisk
	CE_DataWarehouse --> P_InMemory
	CE_DataLake --> P_TableFormats
	CE_DataLake --> P_ELT
	CE_DataLake --> P_Transformation_ETL
    CE_REVERSE_ETL --> P_DataSharing
    %% CE_REVERSE_ETL --> P_ImplicitOrchestration
    CE_MDM --> P_Reusability
    CE_MDM --> P_DataSharing
    CE_CDP --> P_Transformation_ETL
    CE_CDP --> P_InMemory

    CE_ETL_Tools --> P_Transformation_ETL
    CE_ETL_Tools --> P_Orchestration
    CE_StoredProcedures --> P_Orchestration
    CE_StoredProcedures --> P_Transformation_ETL
    CE_BashCron --> P_Orchestration
    CE_PythonScript --> P_Orchestration
    CE_PythonScript --> P_Reusability
    CE_PythonScript --> P_ImplicitOrchestration
    CE_PythonScript --> P_Transformation_ETL

	P_ImplicitOrchestration --> DP_RealTime
	P_Streaming --> DP_RealTime

	CE_DWA --> P_Reusability
	CE_dbt --> P_Reusability

    P_InMemory --> DP_DynamicQuerying

    %% These are assumed based on the diagram structure
    P_DataAsset --> DP_OpenData
    P_DataAsset --> DP_DeclarativePipeline
    P_ChangeMgmt --> DP_DeclarativeGovernance
    P_ShortTerm --> DP_DynamicQuerying
    P_DataSharing --> DP_OpenData
    P_ModernDataStack --> DP_DeclarativePipeline
    P_ModernDataStack --> DP_OpenData
    P_Reusability --> DP_DeclarativePipeline
	P_Orchestration --> DP_DeclarativePipeline
	P_Orchestration --> DP_OpenData

    P_Streaming --> DP_OpenData
    CE_ExportingCSVs --> P_DataSharing
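
The links in the diagram above can also be read as a small directed graph. As a rough illustration only, the Python sketch below copies a subset of those edges and walks from a convergent evolution to the design patterns it eventually reaches. The names `EDGES`, `build_graph`, and `design_patterns_for` are hypothetical helpers introduced for this example and are not part of the book's material; the node identifiers are taken from the diagram as-is.

```python
from collections import defaultdict

# A subset of the diagram's links, copied as (source, target) pairs.
# The full diagram contains many more edges; this is only a sample.
EDGES = [
    ("CE_MV", "P_CachingDisk"),
    ("CE_MV", "P_Transformation_ETL"),
    ("CE_BITool", "P_InMemory"),
    ("CE_MessageQueue", "P_ShortTerm"),
    ("CE_MessageQueue", "P_ImplicitOrchestration"),
    ("CE_BashCron", "P_Orchestration"),
    ("P_CachingDisk", "DP_DynamicQuerying"),
    ("P_InMemory", "DP_DynamicQuerying"),
    ("P_ShortTerm", "DP_DynamicQuerying"),
    ("P_Transformation_ETL", "DP_OpenData"),
    ("P_ImplicitOrchestration", "DP_RealTime"),
    ("P_Orchestration", "DP_DeclarativePipeline"),
    ("P_Orchestration", "DP_OpenData"),
]


def build_graph(edges):
    """Turn (source, target) pairs into an adjacency mapping."""
    graph = defaultdict(set)
    for source, target in edges:
        graph[source].add(target)
    return graph


def design_patterns_for(node, graph):
    """Return the sorted design patterns (DP_ nodes) reachable from `node`."""
    found, seen, stack = set(), set(), [node]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        if current.startswith("DP_"):
            found.add(current)
        # .get() avoids creating empty entries for leaf nodes.
        stack.extend(graph.get(current, ()))
    return sorted(found)


if __name__ == "__main__":
    graph = build_graph(EDGES)
    for ce in ("CE_MV", "CE_MessageQueue", "CE_BashCron"):
        print(ce, "->", design_patterns_for(ce, graph))
```

Reading the diagram this way makes the bottom-up direction explicit: each term maps to one or more patterns, and each pattern maps to one or more design patterns.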

In the following chapters, we'll dive deeper into these convergent evolutions.
