Unified Table Schema#
Purpose#
The unified table schema is the canonical input contract across all analysis families in this package. It enables repeatable pipelines while still allowing loose real-world data.
If your input originated in design-research-experiments, see
Experiments-To-Analysis Handoff for the recommended events.csv validation and
join workflow.
Column Expectations#
Required:
timestamp
Strongly recommended:
record_idtextsession_idactor_idevent_type
Optional:
meta_json
Loose Schema Strategy#
Missing values for actor_id and event_type can be derived with
deterministic mapper functions before running sequence analyses.
In the experiments export handoff, record_id may also be derived when the
upstream artifact keeps stable event rows but does not emit explicit record
identifiers.
Key API surfaces:
design_research_analysis.table.group_rows
Example#
from design_research_analysis import (
derive_columns,
validate_unified_table,
)
rows = [
{"timestamp": "2026-01-01T10:00:00Z", "text": "hello", "speaker": "alice"},
{"timestamp": "2026-01-01T10:00:01Z", "text": "world", "speaker": "bob"},
]
rows = derive_columns(
rows,
actor_mapper=lambda row: row["speaker"],
event_mapper=lambda _row: "utterance",
)
report = validate_unified_table(rows)
assert report.is_valid