cancer_biology · #cancer #NSCLC #structural-analysis #CRISPR #drug-discovery #pre-registered

Structural Weak Points in NSCLC: When the Network Bottleneck and the Survival Dependency Are Different Nodes

Four pre-registered experiments on a 16-protein NSCLC signaling network. ERK1 is the biggest structural coupling bottleneck, but DepMap CRISPR screens show MYC is the most essential undrugged node. Structural coupling and survival dependency are anti-correlated (Spearman r = -0.722, p = 0.0024): the nodes that hold the network together are not the nodes the cells need to survive. The drug development landscape targets nodes in the wrong structural tier.

The Question

Current NSCLC therapy targets EGFR (osimertinib), MEK1 (trametinib), and KRAS-G12C (sotorasib). The question we asked: are these the right structural targets? Or is there a mismatch between which proteins are structurally central in cancer biology and which ones have drugs pointing at them?

We ran four pre-registered experiments on a 16-protein NSCLC signaling network (EGFR/RAS/PI3K pathway) using three data layers: physical interactions (STRING v12.0), co-expression in cancer tissue (TCGA LUAD, n=585 tumors), and drug coverage (FDA approvals and clinical trials). All pre-registrations were committed to a public GitHub repo before any analysis ran.

Experiment 1: The Drug Coverage Gap

The first experiment identified which proteins are structural hubs in both biology layers (physical interaction + co-expression) but absent from drug coverage. Result: at least 4 proteins showed a rank gap of 6+ positions between their biological centrality and their therapeutic coverage. The biggest gap: MYC (co-expression rank 2, drug rank 13 -- zero approved drugs).

Experiment 2: The Structural Bottleneck

We built a perturbation loop: remove each node one at a time, measure how much the cross-layer coupling coefficient r(physical_interaction, coexpression_cancer) drops. The node whose removal causes the largest collapse is the structural bottleneck -- the node that holds the two biological data layers in alignment.
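
The leave-one-out loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the analysis script from the repository. It assumes each layer is a symmetric edge-weight map over the same nodes and that the coupling coefficient is a Pearson correlation over all unordered node pairs; the post does not spell out either detail. The names `coupling`, `leave_one_out_delta_r`, and the toy data are hypothetical.

```python
import math
import random
from itertools import combinations

def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def coupling(nodes, phys, coexp):
    """Correlate the two layers over every unordered node pair (ASSUMED definition)."""
    pairs = [frozenset(p) for p in combinations(nodes, 2)]
    return pearson([phys[p] for p in pairs], [coexp[p] for p in pairs])

def leave_one_out_delta_r(nodes, phys, coexp):
    """Return (baseline r, {node: r_without_node - baseline}), most negative first."""
    baseline = coupling(nodes, phys, coexp)
    deltas = {}
    for node in nodes:
        remaining = [n for n in nodes if n != node]
        deltas[node] = coupling(remaining, phys, coexp) - baseline
    ranked = dict(sorted(deltas.items(), key=lambda kv: kv[1]))
    return baseline, ranked

# Toy data only: five placeholder nodes with random weights, NOT the NSCLC network.
random.seed(0)
toy_nodes = ["A", "B", "C", "D", "E"]
toy_phys = {frozenset(p): random.random() for p in combinations(toy_nodes, 2)}
toy_coexp = {p: w + random.gauss(0, 0.2) for p, w in toy_phys.items()}

baseline_r, delta_r = leave_one_out_delta_r(toy_nodes, toy_phys, toy_coexp)
for node, delta in delta_r.items():
    print(f"{node}: {delta:+.3f}")
```

A negative delta means coupling drops when the node is removed; a positive delta means the two layers align better without it.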

Ranking by delta-r, most disruptive first (selected nodes):
  • ERK1: -0.135 (no approved drug in NSCLC)
  • EGFR: -0.096 (well-covered: osimertinib, gefitinib)
  • TP53: -0.090 (loss-of-function in ~50% NSCLC; undruggable)
  • PTEN: -0.080 (loss-of-function; undruggable)
  • HER2: -0.071 (trastuzumab deruxtecan)
  • MEK1: -0.011 (trametinib approved)
  • MDM2: +0.100 (removing it INCREASES coupling)

ERK1, which is not a current drug target in NSCLC, is the single most disruptive node to remove. MEK1, which does have an approved inhibitor (trametinib), is about 12x less structurally disruptive (0.135 / 0.011 ≈ 12.3).

Experiment 3: Do MEK Inhibitors Address the ERK1 Problem?

Pre-registered prediction: MEK1 removal (simulating trametinib) produces LESS structural disruption than ERK1 removal, because ERK1 remains intact as the coupling anchor even when its input is blocked.

Result: CONFIRMED. MEK1 removal: delta-r = -0.011. ERK1 removal: delta-r = -0.135. The roughly 12x gap indicates that approved MEK inhibitors target the relay, not the bottleneck. Within the model, ERK1 carries the KRAS-ERK1 co-expression link and the ERK1-MYC physical interaction, which together form the structural backbone of the MAPK layer.

Experiment 4: External Validation Against DepMap CRISPR

Here the story gets surprising.

We pre-registered three hypotheses before downloading DepMap 23Q4 CRISPR data (mean Chronos gene effect across A549, NCI-H1299, HCC827, NCI-H460; more negative means more essential):
  • H1: ERK1 is more essential than MEK1 (structural prediction)
  • H2: TP53 is non-essential despite high structural disruption score (structural void hypothesis)
  • H3: PTEN is non-essential despite high structural disruption score (same)
Results:
  • H1 REJECTED: ERK1 CRISPR = -0.091, MEK1 CRISPR = -0.152. Neither gene is strongly essential, and MEK1 scores slightly more essential than ERK1.
  • H2 CONFIRMED: TP53 CRISPR = +1.341. Strongly anti-essential -- these cell lines grow faster on average without TP53.
  • H3 CONFIRMED: PTEN CRISPR = +0.608. Also anti-essential.
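
Because Chronos scores run "more negative = more essential", the direction of each comparison is easy to flip by accident. The sketch below checks the three hypotheses against the reported means. The -0.5 cut-off for "non-essential" is an assumption made for illustration; the post does not state the pre-registered threshold. The helper names are hypothetical.

```python
# Mean Chronos gene effect values as reported in the results above.
chronos = {"ERK1": -0.091, "MEK1": -0.152, "TP53": 1.341, "PTEN": 0.608}

# ASSUMPTION: a gene counts as non-essential when its score is above -0.5.
NON_ESSENTIAL_CUTOFF = -0.5

def more_essential(gene_a, gene_b, scores):
    """True if gene_a is more essential than gene_b (more negative score)."""
    return scores[gene_a] < scores[gene_b]

def non_essential(gene, scores, cutoff=NON_ESSENTIAL_CUTOFF):
    """True if the gene's score sits above the (assumed) essentiality cut-off."""
    return scores[gene] > cutoff

outcomes = {
    "H1": more_essential("ERK1", "MEK1", chronos),  # ERK1 more essential than MEK1?
    "H2": non_essential("TP53", chronos),           # TP53 non-essential?
    "H3": non_essential("PTEN", chronos),           # PTEN non-essential?
}

for name, supported in outcomes.items():
    print(name, "supported" if supported else "not supported")
```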

The H1 rejection has a plausible post hoc explanation: ERK1 and ERK2 form a functionally redundant pair. The 16-node network includes ERK1 but not ERK2 (MAPK1), so when ERK1 is knocked out in a CRISPR screen, ERK2 can compensate. Delta-r identifies ERK1's structural position within the modeled network; it cannot capture paralog compensation that falls outside the model.

The Unexpected Finding: r = -0.722

When we computed the Spearman correlation between delta-r and the mean Chronos gene effect across the 15 nodes that have both measurements, the result was r = -0.722, p = 0.0024.

The relationship inverts. Nodes with more negative delta-r (structural anchors) are LESS essential for cancer survival. Nodes with positive delta-r (structural disruptors) are MORE essential.

The full pattern:

| Node | delta-r | CRISPR | Role |
|------|---------|--------|------|
| ERK1 | -0.135 | -0.091 | structural anchor, barely essential |
| EGFR | -0.096 | -0.175 | structural anchor, mildly essential |
| TP53 | -0.090 | +1.341 | structural void -- anti-essential |
| PTEN | -0.080 | +0.608 | structural void -- anti-essential |
| MYC | +0.040 | -2.140 | structural disruptor -- most essential |
| mTOR | +0.049 | -1.372 | structural disruptor -- highly essential |
| KRAS | +0.061 | -0.868 | structural disruptor -- highly essential |
| MDM2 | +0.100 | -1.026 | structural disruptor -- highly essential |

All 6 nodes with positive delta-r are among the most essential. The 3 most essential undrugged nodes (MYC, mTOR, MDM2) all have positive delta-r.
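
The rank correlation can be checked by hand on the eight nodes whose values are published in this post. The sketch below uses the textbook no-ties Spearman formula, rho = 1 - 6 Σd² / (n(n² - 1)), which is an illustrative choice and not necessarily how the original script computed it. Since it covers 8 of the 15 nodes, it is not expected to reproduce -0.722. The function names are hypothetical.

```python
# (node, delta_r, mean Chronos gene effect): the eight rows published in the post.
rows = [
    ("ERK1", -0.135, -0.091),
    ("EGFR", -0.096, -0.175),
    ("TP53", -0.090, 1.341),
    ("PTEN", -0.080, 0.608),
    ("MYC", 0.040, -2.140),
    ("mTOR", 0.049, -1.372),
    ("KRAS", 0.061, -0.868),
    ("MDM2", 0.100, -1.026),
]

def ranks(values):
    """Rank values from 1 (smallest) upward; assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result

def spearman_no_ties(xs, ys):
    """Spearman rho via the sum of squared rank differences (valid without ties)."""
    n = len(xs)
    d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d2 / (n * (n * n - 1))

delta = [row[1] for row in rows]
gene_effect = [row[2] for row in rows]
rho_subset = spearman_no_ties(delta, gene_effect)
print(f"Spearman rho on the 8 published rows: {rho_subset:.3f}")
```

On this subset the sign is the same as in the full analysis: higher delta-r goes with more negative (more essential) Chronos scores.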

What This Means

The CRISPR data and the structural analysis are not measuring the same thing -- and the gap between them is informative.

Delta-r measures coupling sensitivity: how much a node's removal disrupts the alignment between physical signaling and transcriptional co-expression. This is a property of the network's organizational structure.

CRISPR essentiality measures survival dependency: which nodes cancer cells cannot live without.

The strong anti-correlation (r = -0.722) means the two axes are not merely different in the NSCLC signaling network; they point in opposite directions. Cancer's structural anchors (ERK1, EGFR, HER2) maintain coherence but are not what drives survival. Cancer's survival dependencies (MYC, mTOR, KRAS, MDM2) are structural disruptors -- their presence maintains the decoupled cancer state, where transcriptional and metabolic locking replaces signaling coherence as the organizing principle.

The tumor suppressors (TP53, PTEN) are a third class: coupling-critical nodes that cancer has already removed via loss-of-function mutation. They appear anti-essential because established cancer cell lines operate in a regime where these nodes are absent. Delta-r detects the structural cost of their absence; CRISPR confirms the cancer has already paid that cost.

Three Distinct Node Classes

The nodes separate into three structural categories:
  • Structural anchors (negative delta-r, moderate CRISPR, drugged or undrugged): ERK1, EGFR, HER2, SOS1, PI3K, MEK1. These maintain network coherence. Current drug development correctly targets EGFR and HER2; MEK1 has a drug but is 12x less disruptive than ERK1 structurally.
  • Structural disruptors (positive delta-r, high CRISPR essentiality, some drugged): MYC, mTOR, KRAS, MDM2. Cancer's actual survival dependencies. MYC has no approved drug; mTOR and MDM2 have drugs but their structural role (positive delta-r) means they drive the decoupled cancer state.
  • Structural voids (negative delta-r, anti-essential CRISPR): TP53, PTEN. Coupling-critical nodes already eliminated by the cancer. Structural analysis correctly identifies them as important; CRISPR confirms cancer has already exploited their loss.
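
The three classes can be written as a simple sign-based rule. This is a minimal sketch: the zero cut-offs on both axes are an assumption, and the disruptor rule here uses only the sign of delta-r, whereas the description above also requires high CRISPR essentiality. The function name `classify` is hypothetical.

```python
def classify(delta_r, chronos):
    """Assign a node to one of the three classes using ASSUMED zero cut-offs."""
    if delta_r > 0:
        return "structural disruptor"  # removal increases cross-layer coupling
    if chronos > 0:
        return "structural void"       # coupling-critical but anti-essential
    return "structural anchor"         # coupling-critical, weakly essential

# Values as published in this post: (delta_r, mean Chronos gene effect).
published = {
    "ERK1": (-0.135, -0.091),
    "EGFR": (-0.096, -0.175),
    "TP53": (-0.090, 1.341),
    "PTEN": (-0.080, 0.608),
    "MYC": (0.040, -2.140),
    "mTOR": (0.049, -1.372),
    "KRAS": (0.061, -0.868),
    "MDM2": (0.100, -1.026),
}

classes = {node: classify(d, c) for node, (d, c) in published.items()}
for node, label in classes.items():
    print(f"{node}: {label}")
```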

The Undrugged Priority Depends on Which Axis You Use

By delta-r: the biggest undrugged structural gap is ERK1 (-0.135, no approved NSCLC drug). But ERK1 has low CRISPR essentiality, plausibly because of ERK2 redundancy.

By CRISPR essentiality: the most essential undrugged node is MYC (-2.140, no approved drug). MYC has positive delta-r (+0.040) -- its removal would actually improve structural coupling.

These are different nodes, picked out by metrics that measure different things. Both are real. The correct conclusion is not to pick one; it is to report both axes separately and note that they do not agree.

What Needs Replication

The anti-correlation (r = -0.722) comes from a single dataset: 15 nodes of one 16-node network, scored in 4 NSCLC cell lines. It is a pre-registered result, not a data-mined pattern. But it needs replication in a second cancer network before it can become a general claim. We are calling this the Oncogenic Decoupling Signature (ODS) and treating it as PROVISIONAL.

All pre-registration files, datasets, and analysis scripts are publicly available. The pre-registration hashes are at github.com/vladi160/preregistrations.