⚙️ elevata Architecture Overview

A high-level view of how elevata transforms metadata into executable SQL - from ingestion to lineage, from logical plans to dialect-aware rendering.

This overview connects the core concepts behind Generation Logic, Incremental Load, Load SQL Architecture, Lineage & Logical Plan, and the Dialect System into one visual narrative.


🔧 1. Core Architecture at a Glance

Source Metadata (DB reflection, APIs)
  ↓
Metadata Model (Datasets, Columns, Lineage)
  ↓
Generation Logic (TargetDataset & Columns)
  ↓
Lineage Model (Dataset + Column Lineage)
  ↓
Logical Plan Builder (Structured Query Representation)
  ↓
SQL Renderer (Deterministic SQL Formatting)
  ↓
get_active_dialect() (Dialect Adapter)
  ↓
Load SQL (Full · Merge · Delete Detection)
  ↓
Target Warehouse (Raw · Stage · Rawcore)
  ↓
Schema Evolution (MigrationPlan)
  ↓
DDL Applier (safe DDL only)

This flow represents the central principle of elevata:

Metadata → Logical Plan → Dialect-aware SQL → Warehouse

Architecture Control provides review, approval, controlled execution, and audit artifacts around the same architecture state:

Architecture State
  ↓
Architecture Diff
  ↓
MigrationPlan
  ↓
Policy Decisions
  ↓
Architecture Change Report
  ↓
Architecture Review Briefing
  ↓
Architecture Approval Artifact
  ↓
Execution Preview
  ↓
Controlled Execution
  ↓
Architecture Execution Record

🔧 2. Architecture Layers

🧩 2.1 Metadata Ingestion Layer

  • Reads schema, columns, keys from source systems
  • Normalizes metadata into elevata’s internal models
  • Reports created, changed, unchanged and removed source metadata outcomes
  • No SQL generation occurs here

🔎 2.1.1 Source Metadata Import Review

Source Metadata Import Review makes source onboarding inspectable immediately after import.

It reports what elevata discovered and how the SourceColumn metadata changed:

  • created columns
  • changed columns
  • unchanged columns
  • removed columns
  • detected primary key columns
  • skipped datasets
  • datasets that need manual review

Changed and unchanged are intentionally separated. A changed column means the stored technical source metadata now differs from the previous state. An unchanged column means the source was checked and still matches the previous metadata state.

The review result is a transient, read-only report. It does not persist import history, introduce a new workflow, execute loads, or generate target architecture. It only makes the existing metadata import outcome transparent before downstream generation and control steps.
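
The created / changed / unchanged / removed split can be illustrated with a small comparison of two metadata snapshots. This is a minimal sketch, not elevata's implementation: the `ImportReview` class, the `review_import` function, and the dictionary shape of the column metadata are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ImportReview:
    """Transient, read-only summary of one metadata import (hypothetical shape)."""
    created: list[str] = field(default_factory=list)
    changed: list[str] = field(default_factory=list)
    unchanged: list[str] = field(default_factory=list)
    removed: list[str] = field(default_factory=list)


def review_import(previous: dict[str, dict], discovered: dict[str, dict]) -> ImportReview:
    """Classify columns by comparing stored metadata with freshly read metadata."""
    review = ImportReview()
    for name in sorted(discovered):
        if name not in previous:
            review.created.append(name)
        elif discovered[name] != previous[name]:
            # Stored technical metadata differs from the previous state.
            review.changed.append(name)
        else:
            # Checked against the source and still identical.
            review.unchanged.append(name)
    review.removed = sorted(set(previous) - set(discovered))
    return review


# Illustrative snapshots; the attribute names are assumptions.
previous = {
    "id": {"datatype": "int", "nullable": False},
    "name": {"datatype": "varchar(50)", "nullable": True},
    "legacy_code": {"datatype": "char(3)", "nullable": True},
}
discovered = {
    "id": {"datatype": "int", "nullable": False},
    "name": {"datatype": "varchar(100)", "nullable": True},
    "email": {"datatype": "varchar(255)", "nullable": True},
}

review = review_import(previous, discovered)
print(review)
```

Nothing is written back in this sketch: the function only returns a report object, which mirrors the read-only nature of the review.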

🧩 2.2 Generation Layer

  • Creates TargetDatasets in Raw, Stage, Rawcore
  • Injects surrogate keys where required
  • Produces column mappings based entirely on lineage

Incremental scoping and ingestion behavior are derived from SourceDataset metadata and consistently applied across ingestion, merge, and delete detection.

Raw datasets may be ingested via native ingestion or skipped entirely in federated setups.

🧩 2.3 Lineage Layer

  • Establishes dataset-level and column-level lineage
  • Feeds the Logical Plan Builder
  • Ensures traceability from source to Rawcore

🧩 2.4 Logical Plan Layer

  • Builds structured plans (not SQL!)
  • Vendor-neutral representation of SELECT, JOIN, UNION logic
  • Used by Raw → Stage → Rawcore previews and loads

🧩 2.5 SQL Rendering Layer

  • Applies formatting rules (indentation, aliasing, column order)
  • Hands off dialect-specific tasks to the dialect adapter
  • Deterministic output for UI and CI
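
To make the split between a structured plan and rendered SQL concrete, the following sketch models a tiny SELECT plan as immutable dataclasses and renders it with fixed formatting rules. The class names (`ColumnRef`, `Select`), the `render` function, and the table and column names are hypothetical; the real plan model covers far more (JOIN, UNION, expressions) and delegates quoting to the dialect adapter rather than hard-coding double quotes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnRef:
    table_alias: str
    name: str
    alias: str


@dataclass(frozen=True)
class Select:
    """A structured, vendor-neutral description of a query (not SQL text)."""
    source_schema: str
    source_table: str
    source_alias: str
    columns: tuple[ColumnRef, ...]


def render(plan: Select) -> str:
    """Render with fixed indentation, aliasing and column order.

    Double-quoted identifiers are a simplification for this sketch.
    """
    cols = ",\n".join(
        f'  {c.table_alias}."{c.name}" AS "{c.alias}"' for c in plan.columns
    )
    return (
        f"SELECT\n{cols}\n"
        f'FROM "{plan.source_schema}"."{plan.source_table}" AS {plan.source_alias}'
    )


plan = Select(
    source_schema="stage",
    source_table="stg_customer",
    source_alias="s",
    columns=(
        ColumnRef("s", "customer_id", "customer_id"),
        ColumnRef("s", "cust_name", "customer_name"),
    ),
)
sql = render(plan)
print(sql)
```

Because the plan is plain data and the renderer has no hidden state, rendering the same plan twice yields byte-identical SQL, which is what makes diffing in CI practical.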

🧩 2.6 Dialect Adapter Layer

  • Implements quoting, merge syntax, hashing, concatenation
  • Ensures the generated SQL behaves consistently across supported platforms (BigQuery, Databricks, DuckDB, Fabric Warehouse, MSSQL, Postgres, Snowflake)
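
A dialect adapter can be pictured as a small interface whose implementations differ only in syntax. The sketch below is illustrative: the `Dialect` base class, its method names, and the two example classes are assumptions for this example and not elevata's actual classes. The quoting and concatenation syntax shown follows the common conventions of Postgres-style and SQL Server-style engines.

```python
from abc import ABC, abstractmethod


class Dialect(ABC):
    """Hypothetical adapter interface: syntax decisions live here only."""

    @abstractmethod
    def quote_ident(self, name: str) -> str: ...

    @abstractmethod
    def concat(self, parts: list[str]) -> str: ...


class PostgresLikeDialect(Dialect):
    def quote_ident(self, name: str) -> str:
        # Double quotes, with embedded quotes doubled.
        return '"' + name.replace('"', '""') + '"'

    def concat(self, parts: list[str]) -> str:
        return " || ".join(parts)


class MssqlLikeDialect(Dialect):
    def quote_ident(self, name: str) -> str:
        # Square brackets, with a closing bracket doubled.
        return "[" + name.replace("]", "]]") + "]"

    def concat(self, parts: list[str]) -> str:
        return "CONCAT(" + ", ".join(parts) + ")"


def business_key_expr(dialect: Dialect, columns: list[str]) -> str:
    """Same logical expression, rendered through whichever dialect is active."""
    return dialect.concat([dialect.quote_ident(c) for c in columns])


print(business_key_expr(PostgresLikeDialect(), ["customer_id", "region"]))
print(business_key_expr(MssqlLikeDialect(), ["customer_id", "region"]))
```

The caller never branches on the platform; adding a platform means adding one more subclass.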

🧩 2.7 Load SQL Layer

  • Full load: INSERT INTO ... SELECT
  • Incremental merge: upsert logic based on natural key lineage
  • Delete detection: anti-join removal of missing rows
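
The three statement shapes can be sketched as plain string builders. This is generic, unquoted SQL for illustration only; the function names and the `rawcore.customer` / `stage.customer` tables are hypothetical, and the SQL elevata actually emits is produced by the dialect adapter and may differ per platform.

```python
def full_load_sql(target: str, source: str, columns: list[str]) -> str:
    """Full load: INSERT INTO ... SELECT."""
    cols = ", ".join(columns)
    return f"INSERT INTO {target} ({cols})\nSELECT {cols}\nFROM {source}"


def merge_sql(target: str, source: str, keys: list[str], columns: list[str]) -> str:
    """Incremental merge: update matching keys, insert new ones."""
    on = " AND ".join(f"t.{k} = s.{k}" for k in keys)
    updates = ", ".join(f"{c} = s.{c}" for c in columns if c not in keys)
    insert_cols = ", ".join(columns)
    insert_vals = ", ".join(f"s.{c}" for c in columns)
    return (
        f"MERGE INTO {target} AS t\n"
        f"USING {source} AS s\n"
        f"  ON {on}\n"
        f"WHEN MATCHED THEN UPDATE SET {updates}\n"
        f"WHEN NOT MATCHED THEN INSERT ({insert_cols}) VALUES ({insert_vals})"
    )


def delete_detection_sql(target: str, source: str, keys: list[str]) -> str:
    """Delete detection: remove target rows with no matching source row (anti-join)."""
    match = " AND ".join(f"s.{k} = {target}.{k}" for k in keys)
    return (
        f"DELETE FROM {target}\n"
        f"WHERE NOT EXISTS (\n"
        f"  SELECT 1 FROM {source} AS s WHERE {match}\n"
        f")"
    )


columns = ["customer_id", "name"]
print(full_load_sql("rawcore.customer", "stage.customer", columns))
print(merge_sql("rawcore.customer", "stage.customer", ["customer_id"], columns))
print(delete_detection_sql("rawcore.customer", "stage.customer", ["customer_id"]))
```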

🧩 2.7.1 Schema Evolution (MigrationPlan + Applier)

Before executing load SQL, elevata derives a MigrationPlan from the Architecture Diff and translates it into deterministic schema evolution steps:

  • Dataset renames are expressed as RENAME TABLE
  • Column renames are expressed as RENAME COLUMN
  • Missing columns may be added (ADD COLUMN) when supported
  • Column drops are policy-gated and disabled by default
    • Base tables: ELEVATA_ALLOW_AUTO_DROP_COLUMNS=true enables physical DROP COLUMN
    • _hist tables: physical drops require ELEVATA_ALLOW_AUTO_DROP_HIST_COLUMNS=true
    • Without the hist flag, removed business columns in _hist are retired (inactive + detached lineage)

Important design principle:
Schema evolution does not provision missing tables. Table provisioning is handled centrally by the load runner (ensure_target_table(...)) and executed via the target execution engine.

Preflight validation includes schema introspection and dialect-aware semantic equivalence rules to suppress non-actionable type differences.
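
One way to picture semantic equivalence rules is as a normalization step applied before types are compared, so that two spellings of the same type do not register as drift. The sketch below is an assumption about how such a rule could look, not elevata's rule set; the alias table uses well-known PostgreSQL type aliases, and the function names are hypothetical.

```python
# Well-known PostgreSQL aliases mapped to their canonical names.
POSTGRES_TYPE_ALIASES = {
    "int": "integer",
    "int4": "integer",
    "int8": "bigint",
    "bool": "boolean",
    "varchar": "character varying",
}


def normalize_type(type_name: str, aliases: dict[str, str]) -> str:
    """Lower-case, collapse whitespace and resolve aliases, keeping any (length)."""
    base = " ".join(type_name.strip().lower().split())
    head, sep, tail = base.partition("(")
    head = head.strip()
    return aliases.get(head, head) + (sep + tail if sep else "")


def types_equivalent(expected: str, actual: str, aliases: dict[str, str]) -> bool:
    """True when two type spellings mean the same thing under the alias table."""
    return normalize_type(expected, aliases) == normalize_type(actual, aliases)


print(types_equivalent("VARCHAR(50)", "character varying(50)", POSTGRES_TYPE_ALIASES))
print(types_equivalent("varchar(50)", "varchar(100)", POSTGRES_TYPE_ALIASES))
```

A spelling difference is suppressed, while a real difference such as a changed length still surfaces as actionable.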

🧩 2.7.2 Architecture Catalog

Architecture Catalog provides the read-only discovery layer for metadata-defined executable architecture.

It helps users inspect:

  • dataset inventory
  • schema / layer placement
  • materialization semantics
  • incremental strategy
  • ownership
  • metadata health
  • query logic
  • upstream and downstream relationships
  • column contract signals
  • serving-layer Data Product readiness
  • layer maps and dependency matrices
  • latest execution evidence references
  • architecture quality and governance insights
  • Architecture Control review status summaries

The Catalog links to dedicated pages for:

  • dataset details
  • lineage
  • query contracts
  • Catalog Data Products
  • Architecture Control
  • execution history

Architecture Catalog does not edit metadata and does not execute loads.

Catalog Data Products provide a read-only consumer-readiness perspective for serving-layer datasets. They combine ownership, metadata health, query contracts, lineage, review state and execution evidence into transparent readiness groups: Consumption-ready, Review recommended and Not consumption-ready.

Catalog Insights provide read-only signals for ownership gaps, metadata health findings, custom query logic, downstream consumer visibility, inactive datasets with consumers, and missing execution evidence. Dataset-specific insight signals are also shown on Catalog detail pages.

Catalog Maps provide a read-only architecture lens across populated schemas and direct TargetDataset dependencies. Layer cards, layer flow overview, dependency matrix and transition examples make architecture structure visible without introducing graph editing, execution controls or metadata mutation.

🧩 2.7.3 Architecture Control

Architecture Control makes metadata-defined architecture reviewable, approvable, executable through controlled scopes, and auditable.

It provides deterministic artifacts for:

  • Architecture State
  • Architecture Change Reports
  • Architecture Promotion Reports
  • Architecture Approval Artifacts
  • Architecture Execution Records
  • policy decisions
  • report fingerprints

Controlled execution is delegated to the load runner. Architecture Control does not bypass preflight validation, materialization policy checks, Architecture Guard enforcement, or dialect-owned SQL rendering.

Command responsibilities:

| Command | Responsibility |
|---|---|
| elevata_state | Render the metadata-defined architecture state |
| elevata_plan | Render architecture change intent and policy decisions |
| elevata_promote | Compare two architecture state artifacts |
| elevata_approve | Create architecture approval artifacts |
| elevata_approval_check | Verify approval artifacts |
| elevata_load | Execute loads with preflight and guard checks |

Architecture Control uses the same semantic path as execution:

Architecture State → Architecture Diff → MigrationPlan → Policy Decisions
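
The path from state to policy decisions can be sketched for the column level. The example assumes that columns carry a stable identifier so that a rename can be told apart from a drop plus an add; that assumption, the function names, the step tuples, and the `allow_drop` parameter are all hypothetical and only illustrate the shape of the flow.

```python
def diff_columns(old: dict[int, str], new: dict[int, str]) -> list[tuple[str, ...]]:
    """Architecture Diff to MigrationPlan steps, keyed by a stable column id."""
    steps: list[tuple[str, ...]] = []
    for col_id in sorted(old.keys() | new.keys()):
        before, after = old.get(col_id), new.get(col_id)
        if before is None:
            steps.append(("ADD COLUMN", after))
        elif after is None:
            steps.append(("DROP COLUMN", before))
        elif before != after:
            steps.append(("RENAME COLUMN", before, after))
    return steps


def decide(steps: list[tuple[str, ...]], allow_drop: bool = False) -> list[tuple[tuple[str, ...], str]]:
    """Policy Decisions: drops are blocked unless explicitly allowed."""
    decisions = []
    for step in steps:
        blocked = step[0] == "DROP COLUMN" and not allow_drop
        decisions.append((step, "blocked" if blocked else "allowed"))
    return decisions


old_state = {1: "id", 2: "cust_name", 3: "legacy_code"}
new_state = {1: "id", 2: "customer_name", 4: "email"}

plan = diff_columns(old_state, new_state)
for step, decision in decide(plan):
    print(decision, step)
```

Iterating over sorted identifiers keeps the step order stable, so the same pair of states always yields the same plan.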

The Architecture Control UI adds a constrained operational layer. Before approval or execution, the Architecture Review Briefing summarizes what needs reviewer attention, based on the current scoped report, review status and execution preview. The UI provides:

  • scope-aware report and review status
  • compact Architecture Review Briefing
  • approval artifact creation and verification
  • execution preview
  • controlled load execution
  • target-only execution for TargetDataset scopes
  • captured execution output
  • persisted Architecture Execution Records

Execution scopes are explicit:

| Scope | Execution behavior |
|---|---|
| All datasets | Executes all active target datasets with dependency ordering |
| Schema | Executes selected schema roots with dependency ordering |
| TargetDataset | Executes the selected TargetDataset with dependency ordering |
| TargetDataset, target-only | Executes only the selected TargetDataset |

The default execution path remains lineage-aware. Target-only execution is available only for TargetDataset scopes and is intended for focused iteration when upstream data is already available.
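
The difference between the lineage-aware default and target-only execution can be sketched with the standard library's `graphlib`. The dependency graph, the dataset names, and both functions are hypothetical; the sketch also assumes that "with dependency ordering" means the selected dataset's upstream datasets are included and run first.

```python
from graphlib import TopologicalSorter

# Hypothetical graph: dataset -> set of direct upstream datasets.
UPSTREAM = {
    "raw.customer": set(),
    "stage.customer": {"raw.customer"},
    "rawcore.customer": {"stage.customer"},
    "raw.order": set(),
    "stage.order": {"raw.order"},
    "rawcore.order": {"stage.order", "rawcore.customer"},
}


def with_upstreams(roots: list[str], upstream: dict[str, set[str]]) -> set[str]:
    """Collect the selected datasets plus everything they depend on."""
    selected: set[str] = set()
    stack = list(roots)
    while stack:
        node = stack.pop()
        if node not in selected:
            selected.add(node)
            stack.extend(upstream.get(node, ()))
    return selected


def execution_order(roots: list[str], upstream: dict[str, set[str]],
                    target_only: bool = False) -> list[str]:
    if target_only:
        # Focused iteration: upstream data is assumed to be available already.
        return sorted(roots)
    selected = with_upstreams(roots, upstream)
    graph = {n: upstream.get(n, set()) & selected for n in selected}
    # static_order() yields every dataset after all of its upstreams.
    return list(TopologicalSorter(graph).static_order())


print(execution_order(["rawcore.customer"], UPSTREAM))
print(execution_order(["rawcore.order"], UPSTREAM, target_only=True))
```

`TopologicalSorter` guarantees that upstreams come first but leaves the relative order of independent branches open; a production runner that needs fully reproducible ordering would add a tie-break.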

Architecture Execution Records capture the audit context of controlled execution:

  • execution identifier
  • operator
  • timestamps and duration
  • status and message
  • Architecture Control scope
  • dependency mode
  • report fingerprint
  • approval identifier
  • preview fingerprint
  • command invocation metadata
  • output and error tails
  • deterministic record fingerprint
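
A deterministic fingerprint is commonly obtained by hashing a canonical serialization of the record. The sketch below uses sorted-key JSON and SHA-256; this is one plausible approach and not necessarily the one elevata uses, and the `fingerprint` function and the record fields shown are illustrative.

```python
import hashlib
import json


def fingerprint(record: dict) -> str:
    """Hash a canonical JSON form so that key order and spacing do not matter."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"), ensure_ascii=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical record fields, loosely following the list above.
record_a = {"execution_id": "exec-001", "scope": "schema:rawcore", "status": "success"}
record_b = {"status": "success", "scope": "schema:rawcore", "execution_id": "exec-001"}
record_c = {**record_a, "status": "failed"}

print(fingerprint(record_a) == fingerprint(record_b))
print(fingerprint(record_a) == fingerprint(record_c))
```

Canonicalizing before hashing is what makes the fingerprint reproducible across processes and machines; any change to a field value produces a different digest.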

🔧 3. Bizcore - Business Semantics as Metadata

elevata introduces a dedicated Bizcore layer for modeling business meaning, rules, and calculations as first-class metadata.

Bizcore sits explicitly between Core and Serving:

RAW → STAGE → CORE → BIZCORE → SERVING

🧩 What Bizcore is

  • A business semantics layer, not a technical projection
  • Explicitly modeled datasets and columns
  • Deterministically executed like all other datasets
  • Fully lineage-aware and explainable

Bizcore datasets express:

  • business concepts (e.g. Customer, Contract, Revenue)
  • business rules and classifications
  • derived business identifiers
  • KPIs and domain logic as dataset fields

🧩 What Bizcore is not

  • No BI semantic layer
  • No metric store
  • No query-time metric resolution
  • No tool-specific abstraction

Bizcore logic is compiled into the same logical plans and SQL as technical datasets, preserving elevata’s guarantees around determinism, transparency, and reproducibility.

🧩 Serving - Presentation Logic & Consumer Hand-off

Serving is the presentation-facing layer. Serving datasets typically expose Bizcore datasets 1:1 (often as views), while allowing consumer-specific shaping such as naming, ordering, and lightweight joins where required. Serving is intended as the hand-off layer to BI tools / semantic layers / frontend use cases - without moving business logic out of Bizcore.

🧩 Custom Query Logic (Query Tree)

For most datasets, elevata generates SQL automatically from metadata. In semantic layers (bizcore, serving), elevata additionally supports Custom Query Logic via an explicit Query Tree.

The Query Tree defines the shape of a query (e.g. windowing, aggregation steps, union composition) while remaining fully metadata-native.

If enabled, the Query Tree is compiled into the same Logical Plan and Expression AST used by the default generation pipeline. If disabled, elevata falls back to fully automatic SQL generation.

This ensures advanced query shaping without introducing manual SQL or breaking determinism, lineage, or governance guarantees.


🔧 4. Incremental Processing Path

Stage Dataset
  ↓  (Lineage Mapping)
Merge SQL
  ↓
Delete Detection
  ↓
Rawcore Dataset

Two load strategies are currently implemented:

  • full
  • merge

Both operate exclusively on the Stage → Rawcore transition.


🔧 5. Dialect Resolution Overview

ELEVATA_SQL_DIALECT env var  →  Dialect Adapter (override)
Active Profile (elevata_profiles.yaml)  →  Dialect Adapter
DuckDBDialect (fallback)  →  Dialect Adapter

The resolution order is:

  1. Environment override
  2. Profile definition
  3. DuckDB fallback

🔧 6. Unified SQL Generation Pipeline

Metadata Model
  → Logical Plan Builder
  → SQL Renderer
  → Dialect Adapter
  → Load SQL (full, merge, delete)

🔧 7. Why This Architecture Matters

  • Vendor neutrality via dialect adapters
  • Determinism via SQL rendering rules
  • Traceability via lineage-driven logic
  • Extensibility (new dialects, strategies, materializations)
  • Incremental-ready via merge and delete detection
  • Safe for CI/CD - predictable SQL for diffing and testing
  • Execution & Logging are part of the system


© 2025-2026 elevata - Technical Documentation