⚙️ Determinism & Execution Semantics

This document defines elevata’s rules for deterministic SQL generation and execution.
It applies to both standard generation and custom query logic (Query Trees).


🔧 1. Why determinism matters in elevata

elevata is built for reproducibility:

  • SQL previews must match executed SQL
  • CI checks must be stable
  • the same metadata must produce the same output across runs
  • multi-dialect rendering must not introduce semantic drift

Determinism is therefore treated as a correctness requirement, not a “best practice”.


🔧 2. Determinism model: errors vs warnings

elevata classifies findings as:

  • ERROR (blocking): execution is ambiguous or unsafe
  • WARNING (advisory): execution is valid, but quality or semantics may be degraded

The Query Builder UI surfaces this via:

  • “deterministic” / “needs ordering” badges
  • error/warning counts
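
The two-level classification can be modelled in a few lines. The sketch below is illustrative only: `Severity`, `Finding`, and `summarize` are hypothetical names, not elevata's actual API. It shows how error/warning counts and a blocking flag could be derived from a list of findings.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    ERROR = "error"      # blocking: execution is ambiguous or unsafe
    WARNING = "warning"  # advisory: valid, but quality or semantics may be degraded


@dataclass(frozen=True)
class Finding:
    severity: Severity
    message: str


def summarize(findings):
    """Return (error_count, warning_count, is_blocked) for a list of findings."""
    errors = sum(1 for f in findings if f.severity is Severity.ERROR)
    warnings = sum(1 for f in findings if f.severity is Severity.WARNING)
    # A single ERROR is enough to block execution; warnings never block.
    return errors, warnings, errors > 0
```

In this sketch, only the error count decides whether execution is blocked; warnings are reported but never block.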

🔧 3. Preflight Validation Phase

elevata includes a preflight validation phase executed before any DDL or DML statements are applied.

The preflight phase makes execution behavior predictable: a run either passes every check and proceeds, or fails before any statement is applied.

🧩 Responsibilities

The preflight phase performs:

  • schema introspection
  • materialization planning
  • type drift detection
  • validation of blocking conditions
  • execution safety checks

No SQL affecting data or schema is executed before preflight completes successfully.

🧩 Deterministic Failure Modes

Execution may fail during preflight when:

  • unsafe schema evolution is required
  • narrowing or incompatible type drift is detected
  • required dialect capabilities are missing
  • metadata inconsistencies are found

Failures always occur before execution starts.

This guarantees:

  • no partially applied schema changes
  • no partial data loads
  • reproducible execution behavior
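
The "fail before execution starts" guarantee amounts to running every check to completion before the first statement is applied. A minimal sketch, assuming each check is a callable that returns a list of blocking problems; `PreflightError` and `run_with_preflight` are hypothetical names used for illustration:

```python
class PreflightError(Exception):
    """Raised when at least one blocking check fails."""


def run_with_preflight(checks, statements, execute):
    # Run every check first; each one returns a list of blocking problems.
    problems = [problem for check in checks for problem in check()]
    if problems:
        # Fail before the first statement is applied, so nothing is
        # partially executed.
        raise PreflightError("; ".join(problems))
    # Only reached when all checks passed.
    for statement in statements:
        execute(statement)
```

Because the checks are collected in full before the loop starts, a failure reports all blocking problems at once rather than only the first one.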

🧩 Full Refresh Exception

Datasets using full refresh materialization are exempt from type drift blocking
because the table is recreated during execution.

Type drift warnings may still be emitted for visibility.
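
The exemption can be read as a severity downgrade keyed on the materialization mode. The sketch below assumes the mode is available as a string and uses `"full_refresh"` and `"incremental"` as hypothetical labels; `Drift` and `classify_drifts` are illustrative names, and every drift passed in is assumed to be narrowing or incompatible.

```python
from typing import NamedTuple


class Drift(NamedTuple):
    dataset: str
    column: str
    materialization: str  # assumed labels, e.g. "full_refresh" or "incremental"


def classify_drifts(drifts):
    """Split narrowing/incompatible drifts into (errors, warnings)."""
    errors, warnings = [], []
    for drift in drifts:
        message = f"{drift.dataset}.{drift.column}: narrowing or incompatible type drift"
        if drift.materialization == "full_refresh":
            # The table is recreated during execution, so the drift is only reported.
            warnings.append(message)
        else:
            errors.append(message)
    return errors, warnings
```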


🔧 4. Window functions

Some window functions are inherently nondeterministic without ordering.

Rule:

  • If a window function requires ordering, an ORDER BY clause is mandatory. Missing ORDER BY → ERROR

Examples of functions requiring ORDER BY:

  • ROW_NUMBER, RANK, DENSE_RANK
  • LAG, LEAD
  • FIRST_VALUE, LAST_VALUE, NTH_VALUE
  • NTILE

Windowed aggregates (SUM/AVG/…) do not strictly require ORDER BY:

  • missing ORDER BY is usually acceptable → no finding, or an optional WARNING depending on policy
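
The window rule reduces to a lookup against the functions listed above. A minimal sketch, assuming the validator sees the function name and whether the window has an ORDER BY; `check_window` and the `warn_on_unordered_aggregate` policy flag are hypothetical names:

```python
# Functions from the list above that are ambiguous without ORDER BY.
ORDER_REQUIRED = frozenset({
    "ROW_NUMBER", "RANK", "DENSE_RANK",
    "LAG", "LEAD",
    "FIRST_VALUE", "LAST_VALUE", "NTH_VALUE",
    "NTILE",
})


def check_window(function_name, has_order_by, warn_on_unordered_aggregate=False):
    """Return (severity, message) for one window expression, or None if it is fine."""
    name = function_name.upper()
    if has_order_by:
        return None
    if name in ORDER_REQUIRED:
        return ("ERROR", f"{name} requires ORDER BY in its window")
    if warn_on_unordered_aggregate:
        # Policy-dependent advisory for windowed aggregates such as SUM or AVG.
        return ("WARNING", f"{name} has no ORDER BY in its window")
    return None
```

For example, `check_window("row_number", False)` yields an ERROR, while `check_window("SUM", False)` yields nothing unless the policy flag is set.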

🔧 5. Aggregation determinism

An aggregation is nondeterministic when its result depends on the order in which input rows are processed and that order is not specified.

Rule patterns:

  • Ordered aggregates (e.g. STRING_AGG) require explicit ORDER BY inside the function. Missing ordering → ERROR (or strict WARNING, depending on policy)

Other aggregates (SUM, COUNT, MIN, MAX, AVG) are deterministic without ordering.
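
The difference can be illustrated outside SQL. The sketch below imitates an ordered aggregate in plain Python; `string_agg` here is a stand-in written for this example, not a database function. The same rows arriving in a different physical order change the concatenated result unless an explicit ordering is applied, while SUM is unaffected.

```python
def string_agg(values, sep=",", order_key=None):
    # Without order_key, the result follows whatever order the rows arrive in.
    if order_key is not None:
        values = sorted(values, key=order_key)
    return sep.join(values)


scan_a = ["b", "a", "c"]
scan_b = ["c", "b", "a"]  # same rows, different physical order

unordered_a = string_agg(scan_a)               # "b,a,c"
unordered_b = string_agg(scan_b)               # "c,b,a"
ordered_a = string_agg(scan_a, order_key=str)  # "a,b,c"
ordered_b = string_agg(scan_b, order_key=str)  # "a,b,c"

# Order-insensitive aggregates give the same answer for both scans.
total_a = sum(len(v) for v in scan_a)
total_b = sum(len(v) for v in scan_b)
```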


🔧 6. Contract stability and collisions

The output contract must be stable and unambiguous.

Rules:

  • Output column name collisions → ERROR
  • Missing inputs / disconnected tree → ERROR
  • Cycles in the Query Tree → ERROR
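
These three rules are structural checks on the output column list and on the tree itself. The sketch below assumes the Query Tree can be represented as a mapping from each node to the nodes it reads from; `find_collisions` and `check_tree` are hypothetical helpers, and the case-insensitive name comparison is an assumption, since dialects differ on identifier case.

```python
from collections import Counter


def find_collisions(columns):
    # Compare names case-insensitively (assumption; dialects differ).
    counts = Counter(name.lower() for name in columns)
    return sorted(name for name, n in counts.items() if n > 1)


def check_tree(inputs, root):
    """inputs maps each node to the list of nodes it reads from."""
    errors = []
    state = {}  # node -> "visiting" or "done"

    def visit(node):
        if node not in inputs:
            errors.append(f"missing input: {node}")
            return
        if state.get(node) == "done":
            return
        if state.get(node) == "visiting":
            # Reached a node that is still on the current path.
            errors.append(f"cycle at: {node}")
            return
        state[node] = "visiting"
        for child in inputs[node]:
            visit(child)
        state[node] = "done"

    visit(root)
    # Anything never reached from the root is disconnected.
    for node in inputs:
        if node not in state:
            errors.append(f"disconnected node: {node}")
    return errors
```

A non-empty result from either helper would map to an ERROR under the rules above.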

🔧 7. Why elevata is not a semantic layer

elevata does not implement query-time semantics (like BI semantic layers or metric stores).
Instead, elevata materializes semantics into datasets deterministically:

  • business logic belongs in bizcore
  • consumer shaping belongs in serving
  • execution is metadata-native and explainable via lineage + query contract

This avoids tool-specific logic and ensures reproducible pipelines.


© 2025-2026 elevata Labs — Internal Technical Documentation