engineering #ai-architecture #hub-shadow #transformer #llama #multilayer #pre-registered #confirmed arXiv:2604.23639

The Transformer Paradox: Historical Lineage vs Benchmark Dominance in AI Architecture

F6 pre-registered result: Transformer is the top hub by citation lineage and architectural inheritance, but LLaMA is the top hub in the benchmark layer. The layers cover the same 20 models, yet their hubs diverge structurally, revealing the gap between historical influence and contemporary relevance.

pre-reg: bff4e101

The Experiment

Twenty major ML architectures (Transformer, BERT, GPT-2/3/4, T5, ViT, ResNet, AlexNet, Word2Vec, Bahdanau Attention, LSTM, RNN, GAN, VAE, Diffusion/DDPM, LLaMA, Mamba, CLIP, RoBERTa) are modelled as a three-layer multilayer graph:
  • citation_dependency (d1): directed citation graph — "X was introduced in a paper that cites Y"
  • architecture_inheritance (d2): directed architectural derivation — "X's architecture is a variant or extension of Y"
  • benchmark_co_performance (d3): co-participation in the same standard evaluation benchmark
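
As a minimal sketch of this data structure, the three layers can be held as named edge sets over a shared set of models. The edges below are a handful of illustrative relations chosen for the example, not the study's edge list, and the names `LAYERS`, `nodes_in`, and `shared_edges` are hypothetical.

```python
# Toy multilayer graph: one edge set per layer over a shared set of models.
# ASSUMPTION: these few edges are illustrative only, not the study's data.
LAYERS = {
    # d1, directed: (X, Y) means "X was introduced in a paper that cites Y"
    "citation_dependency": {
        ("BERT", "Transformer"),
        ("RoBERTa", "BERT"),
        ("RoBERTa", "Transformer"),
        ("LLaMA", "Transformer"),
    },
    # d2, directed: (X, Y) means "X's architecture is a variant or extension of Y"
    "architecture_inheritance": {
        ("BERT", "Transformer"),
        ("RoBERTa", "BERT"),
        ("LLaMA", "Transformer"),
    },
    # d3, undirected: each pair is stored once, in sorted order
    "benchmark_co_performance": {
        ("BERT", "RoBERTa"),
        ("LLaMA", "RoBERTa"),
    },
}


def nodes_in(layers):
    """Return the sorted union of all models that appear in any layer."""
    return sorted(
        {model for edges in layers.values() for edge in edges for model in edge}
    )


def shared_edges(layers, layer_a, layer_b):
    """Return the edges present in both layers, compared as ordered pairs."""
    return layers[layer_a] & layers[layer_b]


models = nodes_in(LAYERS)
lineage_overlap = shared_edges(
    LAYERS, "citation_dependency", "architecture_inheritance"
)
```

Keeping every layer on the same node set is what makes cross-layer comparison meaningful: any per-node or per-edge quantity can be lined up across layers.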

The pre-registered prediction (hash bff4e101, committed 2026-04-30): the two lineage layers (citation, inheritance) should be more correlated with each other than either is with the benchmark layer.

The Result

r(citation_dependency ↔ architecture_inheritance) = +0.915 (Spearman = 0.676, p = 0.002)

r(citation_dependency ↔ benchmark_co_performance) = −0.003

Δr = 0.915 − (−0.003) = +0.918, the largest single directional gap in this set of experiments.
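
For readers who want to see how such a layer-to-layer comparison could be computed, here is a rough sketch. The post does not say which per-layer quantity is correlated, so this assumes one score per model per layer (for example, a degree count); the vectors are toy values, and `pearson`, `average_ranks`, and `spearman` are hypothetical helper names rather than the author's code.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# ASSUMPTION: toy per-model scores for five models, not the study's data.
citation = [3, 1, 0, 0, 0]
inheritance = [2, 1, 0, 0, 0]
benchmark = [0, 2, 1, 3, 1]

r_lineage = pearson(citation, inheritance)
r_benchmark = pearson(citation, benchmark)
delta_r = r_lineage - r_benchmark  # same subtraction as the Δr reported above
```

Reporting Pearson and Spearman together is a useful check: a large gap between them, as in the result above, suggests that a few high-scoring nodes carry much of the linear correlation.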

The Functional Proximity Law is confirmed: the two historical lineage layers agree strongly; the contemporary evaluation layer is structurally orthogonal to both.

The Informative Denial: h5

The pre-registered h5 predicted Transformer would also be the top hub in benchmark co-performance. It isn't. LLaMA is the top benchmark hub. Transformer ranks #5.

This denial is informative rather than a failure: it is consistent with the structural divergence measured above. Benchmark co-performance reflects current evaluation practice (recent models share evaluation suites), while the citation and inheritance layers encode historical influence. The hub identity shifts from Transformer to LLaMA across that structural boundary.

What This Means

The AI architecture graph has a clear temporal seam: the same metric (hub importance) means something different depending on which layer you look at.
  • Citation hub: who everything cites — Transformer
  • Inheritance hub: whose architecture every later model extends — Transformer
  • Benchmark hub: who everyone is being compared to today — LLaMA
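
One simple way to read a hub off each layer is to rank nodes by degree: in-degree for the directed lineage layers, plain degree for the undirected benchmark layer. The post does not state which centrality measure the analysis uses, so degree is an assumption here, and the edges are toy values picked to mirror the pattern in the list above, not the study's data. `hub_ranking` and `toy_layers` are hypothetical names.

```python
from collections import Counter


def hub_ranking(edges, directed):
    """Rank nodes by in-degree (directed) or degree (undirected).

    Ties are broken alphabetically so the ranking is deterministic.
    """
    score = Counter()
    for src, dst in edges:
        score[dst] += 1  # dst gains an incoming edge
        if not directed:
            score[src] += 1  # undirected: both endpoints gain a neighbour
    return sorted(score.items(), key=lambda item: (-item[1], item[0]))


# ASSUMPTION: illustrative edges only. Each value is (edge list, is_directed).
toy_layers = {
    "citation_dependency": (
        [("BERT", "Transformer"), ("GPT-2", "Transformer"),
         ("LLaMA", "Transformer"), ("RoBERTa", "BERT")],
        True,
    ),
    "architecture_inheritance": (
        [("BERT", "Transformer"), ("GPT-2", "Transformer"),
         ("RoBERTa", "BERT")],
        True,
    ),
    "benchmark_co_performance": (
        [("LLaMA", "GPT-4"), ("LLaMA", "Mamba"),
         ("LLaMA", "RoBERTa"), ("BERT", "RoBERTa")],
        False,
    ),
}

# Top-ranked node per layer.
top_hubs = {
    name: hub_ranking(edges, directed)[0][0]
    for name, (edges, directed) in toy_layers.items()
}
```

In this toy graph the two directed layers share a top hub while the undirected layer has a different one, which is the shape of the divergence described above.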

Transformer is historically central. LLaMA is currently central. The structural divergence between these two answers is not noise — it is signal about how field influence migrates over time.

A system designed to maximise benchmark relevance should prioritise LLaMA-like architectural patterns. A system trying to understand the foundational intellectual lineage of modern AI should prioritise Transformer.

Both are correct answers to different questions. IRDME makes the distinction visible from topology alone.

Pre-registration Record

Hash: bff4e101… — committed 2026-04-30T20:26:15 UTC before any analysis was run. Full record: github.com/vladi160/preregistrations (commit 3014808).

Paper: arXiv:2604.23639

Reproducibility

This result was pre-registered before analysis. SHA-256 hash (truncated): bff4e101…

Verify at github.com/vladi160/preregistrations
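
A minimal sketch of what a local check could look like, assuming the published hash is the SHA-256 digest of the raw bytes of the pre-registration file. The post does not say exactly what was hashed, and the file name in the usage comment is hypothetical.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a lowercase hex string."""
    return hashlib.sha256(data).hexdigest()


def matches_prefix(data: bytes, prefix: str) -> bool:
    """True if the SHA-256 digest of data starts with the given hex prefix."""
    return sha256_hex(data).startswith(prefix.lower())


def file_matches_prefix(path: str, prefix: str) -> bool:
    """Hash a file's raw bytes and compare against a truncated digest."""
    with open(path, "rb") as handle:
        return matches_prefix(handle.read(), prefix)


# Usage (hypothetical file name; the real one is whatever the repository holds):
# file_matches_prefix("preregistration.json", "bff4e101")
```

An eight-character prefix only narrows the match; comparing against the full 64-character digest from the repository is the stronger check.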