AI / ML Research · universal_hub · Pre-registered · F6

AI Model Architecture Graph

IRDME identifies foundational AI architectures from graph structure alone, without reading the papers or counting citations. Transformer ranks #1 in both citation lineage and architecture inheritance, as predicted in a pre-registration committed before the analysis was run.

Nodes: 20 · Relations: 83 · Layers: 3 · r (structural pair): 0.9147

What was measured

20 major ML model architectures as nodes. Three independent layers encoding different types of relationship between them. All edges are verifiable from the original papers listed in the data sources.

d1 · citation_dependency

Model A is cited as a direct architectural predecessor in model B's paper. Edge exists only when the original paper explicitly credits the prior architecture as a structural foundation.

d2 · architecture_inheritance

Models A and B share a core structural mechanism: multi-head attention, convolutional feature extraction, sequential state processing, or latent generative framework. Layer is independent of citation practice.

d3 · benchmark_co_performance

Models A and B are evaluated together in the same benchmark comparison papers (GLUE, SuperGLUE, ImageNet, FID, SSM benchmarks). This is a behavioral signal — it encodes what models do on tasks, not how they are built.

Cross-layer hub correlation

IRDME computes Pearson r over hub-importance scores between each pair of layers. r ≈ 1 means the same nodes dominate both layers. r ≈ 0 means the layers reveal structurally independent pictures.
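As a minimal sketch of the comparison described above, the snippet below scores every node in each layer and then correlates the score vectors pairwise. It assumes undirected degree as the hub-importance score and uses toy edge lists, not the case-study data. The source does not specify IRDME's exact scoring, and `degree_scores` and `pearson_r` are hypothetical helper names, not IRDME functions.

```python
from collections import Counter
from math import sqrt


def degree_scores(edges, nodes):
    """Undirected degree of every node in one layer (0 if the node has no edge)."""
    counts = Counter()
    for a, b in edges:
        counts[a] += 1
        counts[b] += 1
    return [counts[n] for n in nodes]


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length score vectors."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Toy layers (illustrative only): node "a" is the hub in layers 1 and 2,
# but has no edges at all in layer 3.
nodes = ["a", "b", "c", "d"]
layer_1 = [("a", "b"), ("a", "c"), ("a", "d")]
layer_2 = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")]
layer_3 = [("b", "c"), ("c", "d"), ("b", "d")]

d1 = degree_scores(layer_1, nodes)
d2 = degree_scores(layer_2, nodes)
d3 = degree_scores(layer_3, nodes)

r_12 = pearson_r(d1, d2)  # layers that share a hub
r_13 = pearson_r(d1, d3)  # layers that do not
print(f"r(layer_1, layer_2) = {r_12:.4f}")
print(f"r(layer_1, layer_3) = {r_13:.4f}")
```

In this toy setup the two layers that share a hub correlate positively and the third does not, which is the pattern the cross-layer comparison is designed to detect.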

citation_dependency ↔ architecture_inheritance · CONFIRMED
r = 0.9147
betweenness r = 0.7346 · p = 0.002

Strong correlation. How papers cite predecessors and which structural mechanisms are reused agree closely on which architectures are central, on both the hub-importance score and betweenness.

citation_dependency ↔ benchmark_co_performance · BASELINE
r = −0.0025
betweenness r = −0.1213 · p = n/s

No correlation. Citation lineage and benchmark co-performance reveal structurally independent pictures of the field.

architecture_inheritance ↔ benchmark_co_performance · BASELINE
r ≈ 0.0
betweenness r = −0.0737 · p = n/s

No correlation. Shared structural mechanism and benchmark co-evaluation are independent dimensions of model organisation.

Functional Proximity Law: r(citation_dependency ↔ architecture_inheritance) = 0.9147 >> r(citation_dependency ↔ benchmark_co_performance) = −0.0025. The law fires. Structural layers agree on hubs; the behavioral layer does not.

When declared rank and structural rank agree

Transformer ranks #1 in citation_dependency (degree 11) and #1 in architecture_inheritance (degree 10). Pre-registered before analysis. Confirmed from the edge list alone.

citation_dependency: #1 transformer (degree 11)
architecture_inheritance: #1 transformer (degree 10)
benchmark_co_performance: #1 llama (degree 8)

Finding: Transformer is the structural centre of the AI field in both the way papers cite predecessors and the way architectural mechanisms propagate. In the behavioral layer (benchmark co-performance), llama displaces it as top hub — a structural divergence the layer analysis names explicitly.

Denied hypothesis h5 — named mechanism

Pre-registered prediction: Transformer would rank ≤ 3 in the benchmark_co_performance layer. Result: rank #5. Denied. The mechanism is named.

DENIED · h5

Transformer ranks #5 (not ≤ 3) in benchmark_co_performance. Top hub: llama.

Mechanism: recency bias in benchmark selection. Benchmark comparison papers systematically co-evaluate recent models on the same leaderboard. LLaMA, GPT-3, BERT, and RoBERTa were all actively benchmarked in the same 2022–2024 window. Transformer (2017 original) is cited as a foundation in nearly every paper but rarely appears as a direct benchmark entry: it is treated as infrastructure, not as a competitor. This is the opposite pattern to the structural layers, where recency of publication is irrelevant. A hub_shadow in behavioral space: Transformer's real structural centrality is invisible to benchmark co-performance alone.

Hub ranking by layer

Selected architectures. Rank = degree centrality position within each layer.

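| model | citation_dependency | architecture_inheritance | benchmark_co_perf | archetype |
|---|---|---|---|---|
| transformer | #1 | #1 | #5 | universal_hub |
| rnn | #2 | #10 | #20 | hub_shadow |
| gpt3 | #3 | #2 | #3 | universal_hub |
| lstm | #4 | #5 | #8 | relay |
| bert | #5 | #3 | #2 | universal_hub |
| resnet | #6 | #4 | #6 | relay |
| attention_mechanism | #7 | #11 | #20 | hub_shadow |
| vae | #8 | #14 | #10 | relay |
| clip | #9 | #15 | #9 | relay |
| gpt2 | #10 | #6 | #7 | relay |
| llama | #16 | #8 | #1 | chameleon |
| roberta | #19 | #12 | #4 | chameleon |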

12 selected rows. All 20 nodes in the raw output file.

Hub shadows in structural layers

Nodes that rank high in citation_dependency (the declared structural role) but low in architecture_inheritance (the mechanism layer). These are architectures the field cites heavily but whose structural mechanisms were not widely inherited.

rnn · hub_shadow
citation_dep #2 · arch_inherit #10 · benchmark #20 · gap = 8

RNN is the second-most cited predecessor in the corpus, but only 10th in architecture inheritance. The field cites RNN as a foundation but has broadly moved to attention and SSM mechanisms rather than recurrent connectivity.

attention_mechanism · hub_shadow
citation_dep #7 · arch_inherit #11 · benchmark #20 · gap = 4

Bahdanau attention is cited as a direct predecessor in many papers but its mechanism was absorbed into the Transformer rather than inherited independently. Virtually absent from benchmark evaluations as a standalone model.
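The gap values quoted above are consistent with a simple difference between the two structural ranks. The sketch below reproduces them from ranks copied out of the hub-ranking table; `structural_gap` is a hypothetical helper name, and reading the gap as inheritance rank minus citation rank is an inference from the numbers, not a formula stated in the source.

```python
# (citation_dependency rank, architecture_inheritance rank), copied from the
# hub-ranking table in this case study.
ranks = {
    "rnn": (2, 10),
    "attention_mechanism": (7, 11),
    "vae": (8, 14),
    "clip": (9, 15),
    "transformer": (1, 1),
}


def structural_gap(citation_rank, inheritance_rank):
    """Positive gap: cited more centrally than its mechanism is inherited."""
    return inheritance_rank - citation_rank


gaps = {name: structural_gap(*pair) for name, pair in ranks.items()}

# Largest gap first.
for name, gap in sorted(gaps.items(), key=lambda item: -item[1]):
    print(f"{name:20s} gap = {gap}")
```

This yields 8 for rnn and 4 for attention_mechanism, matching the figures above. It also yields 6 for vae and clip, which the table labels relay rather than hub_shadow, so the archetype assignment evidently depends on more than this gap alone (both hub shadows also rank #20 in the benchmark layer). The source does not state the full rule.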

Chameleons — rank inversion between structural and behavioral layers

Nodes that rank low in citation lineage but high in benchmark co-performance. High behavioral visibility, low structural position.

llama · chameleon
citation_dep #16 · arch_inherit #8 · benchmark #1

LLaMA ranks #1 in benchmark co-performance but #16 in citation dependency. It is the most benchmarked model in 2023–2024 comparison papers but contributes few novel architectural citations itself — it is a derivative of the GPT/Transformer lineage.

roberta · chameleon
citation_dep #19 · arch_inherit #12 · benchmark #4

RoBERTa ranks #4 in benchmark co-performance but #19 in citation dependency. It is a training procedure refinement of BERT with no new architectural mechanism, yet it appears in virtually every NLP benchmark comparison.
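One way to make the rank inversion concrete is to subtract the benchmark rank from the citation rank for every row of the hub-ranking table and sort by the result. This is a sketch under that assumption; `rank_inversion` is a hypothetical helper, and the source does not give the threshold IRDME uses to assign the chameleon label.

```python
# (citation_dependency rank, benchmark_co_performance rank), copied from the
# 12 rows of the hub-ranking table in this case study.
ranks = {
    "transformer": (1, 5),
    "rnn": (2, 20),
    "gpt3": (3, 3),
    "lstm": (4, 8),
    "bert": (5, 2),
    "resnet": (6, 6),
    "attention_mechanism": (7, 20),
    "vae": (8, 10),
    "clip": (9, 9),
    "gpt2": (10, 7),
    "llama": (16, 1),
    "roberta": (19, 4),
}


def rank_inversion(citation_rank, benchmark_rank):
    """Positive value: more central in benchmarks than in citation lineage."""
    return citation_rank - benchmark_rank


inversions = sorted(
    ((rank_inversion(c, b), name) for name, (c, b) in ranks.items()),
    reverse=True,
)

for score, name in inversions:
    print(f"{name:20s} inversion = {score:+d}")
```

Among the 12 listed rows, llama and roberta both score +15, and the next largest value is +3 (bert and gpt2), so the two chameleons are well separated from the rest on this measure.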

Pre-registered hypotheses

CONFIRMED · h1

r(citation_dependency ↔ architecture_inheritance) > r(citation_dependency ↔ benchmark_co_performance)

0.9147 > −0.0025

CONFIRMED · h2

r(citation_dependency ↔ architecture_inheritance) > 0.4 and significant

Pearson r = 0.9147 · Spearman r = 0.6757 · p = 0.002

CONFIRMED · h3

transformer is rank #1 in citation_dependency

rank #1 · degree 11 (next: rnn degree 5)

CONFIRMED · h4

transformer is rank #1 in architecture_inheritance

rank #1 · degree 10 (next: gpt3 degree 4)

DENIED · h5

transformer ranks ≤ 3 in benchmark_co_performance

rank #5 · top hub: llama (degree 8). Mechanism: recency bias in benchmark selection.
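The five hypotheses above are simple predicates over the reported statistics, so they can be re-checked mechanically. The sketch below does this with the values quoted in this case study. The dictionary keys are hypothetical names, and the 0.05 significance threshold is an assumption, since the source says only "significant" and does not say which statistic the reported p = 0.002 belongs to.

```python
# Values as reported in this case study.
reported = {
    "r_cit_arch": 0.9147,    # citation_dependency vs architecture_inheritance
    "r_cit_bench": -0.0025,  # citation_dependency vs benchmark_co_performance
    "p_cit_arch": 0.002,     # reported alongside h2
    "transformer_rank": {
        "citation_dependency": 1,
        "architecture_inheritance": 1,
        "benchmark_co_performance": 5,
    },
}

# Assumed significance threshold; not stated in the source.
ALPHA = 0.05

checks = {
    "h1": reported["r_cit_arch"] > reported["r_cit_bench"],
    "h2": reported["r_cit_arch"] > 0.4 and reported["p_cit_arch"] < ALPHA,
    "h3": reported["transformer_rank"]["citation_dependency"] == 1,
    "h4": reported["transformer_rank"]["architecture_inheritance"] == 1,
    "h5": reported["transformer_rank"]["benchmark_co_performance"] <= 3,
}

outcomes = {
    name: "CONFIRMED" if passed else "DENIED" for name, passed in checks.items()
}
for name, outcome in outcomes.items():
    print(name, outcome)
```

Writing the predicates down before the numbers are filled in is the point of pre-registration: the pass or fail outcome follows from the data, not from a post hoc choice of threshold.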

What this means

For ML researchers
  • r = 0.9147 between citation dependency and architecture inheritance means citation practice is a strong proxy for mechanism inheritance in this 20-node corpus. How a paper cites largely tracks how it builds.
  • LLaMA's rank inversion (citation rank #16 → benchmark rank #1) is not an anomaly. It is the defining property of a chameleon: structurally peripheral in the dependency graph, dominant in benchmark comparisons. High benchmark visibility, little structural novelty.
  • The RNN hub shadow (citation rank #2 → architecture rank #10, benchmark rank #20) quantifies a pattern the field knows qualitatively: foundational mechanisms become invisible in leaderboard comparisons once they are absorbed into infrastructure.

The denied h5: reading the absence

h5 predicted transformer would rank ≤ 3 in benchmark co-performance. It ranked #5. The denial mechanism is recency bias in benchmark selection: more recently benchmarked models (LLaMA, BERT, GPT-3, RoBERTa) displaced transformer in the benchmark layer even as it remained #1 in the structural layers.

Transformer's absence from the benchmark top-3 is itself the signal: it is treated as infrastructure, not a competitor. Benchmarks measure what is new, not what is foundational. The structural rank is the correct measure of influence. The benchmark rank measures recency.

Universal Layer Grammar: transformer is a universal hub in AI architecture for the same structural reason PVCL/PVCR are universal hubs in the C. elegans connectome. Each ranks among the top hubs in every independently defined layer (transformer: #1, #1, and #5 here). Domain-invariant topology.

Reproduce

Pre-registration hash bff4e101… was committed to github.com/vladi160/preregistrations before analysis was run.

# validate
irdme validate-experiment examples/experiments/ai_architecture_law.json
# commit-prediction (done — hash on GitHub)
irdme commit-prediction examples/experiments/ai_architecture_law.json --push
# run
irdme examples/experiments/ai_architecture_law.json outputs/output_ai_architecture.json
# verify
irdme verify-prediction examples/experiments/ai_architecture_law.json
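The commit-then-verify idea behind the pre-registration hash can be illustrated independently of the irdme CLI. The sketch below is an assumption throughout: it uses SHA-256 over a canonical JSON serialisation, the function name `prediction_digest` and the toy experiment dictionary are hypothetical, and the source does not say how irdme actually computes its hash.

```python
import hashlib
import json


def prediction_digest(experiment: dict) -> str:
    """SHA-256 hex digest of a canonical JSON serialisation (assumed scheme)."""
    canonical = json.dumps(experiment, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Toy experiment definition, not the real ai_architecture_law.json.
experiment = {
    "hypotheses": {
        "h5": "transformer ranks <= 3 in benchmark_co_performance",
    },
}

# Step 1: publish the digest before running the analysis.
committed = prediction_digest(experiment)

# Step 2: after the run, anyone can recompute the digest and compare.
unchanged = prediction_digest(experiment) == committed

# Editing a prediction after the fact changes the digest.
tampered = {
    "hypotheses": {
        "h5": "transformer ranks <= 5 in benchmark_co_performance",
    },
}
detected = prediction_digest(tampered) != committed

print("unchanged file verifies:", unchanged)
print("edited prediction detected:", detected)
```

Sorting keys and fixing separators makes the digest independent of key order and whitespace, so only a change to the content of the predictions alters it.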