Research - External

Do Neural Networks Share a Common Grammar?

A large-scale empirical study suggests that deep networks—regardless of what they’re trained to do—may converge toward remarkably similar internal structures.


When neural networks learn to perform different tasks, do they arrive at fundamentally different solutions? Or is there something like a shared vocabulary underlying how these systems organize information?

A recent preprint from Kaushik and colleagues at Johns Hopkins offers striking evidence for the latter. Their work, titled “The Universal Weight Subspace Hypothesis,” presents what appears to be the first large-scale empirical demonstration that neural networks systematically converge to shared spectral subspaces—regardless of initialization, task, or domain.

What They Did

The researchers applied spectral decomposition techniques to the weight matrices of more than 1,000 models spanning dramatically different architectures and purposes: 500 Mistral-7B LoRA adaptations, 500 Vision Transformers, and 50 LLaMA-8B models. The scale here matters. Previous observations of structural similarities in neural networks have often relied on smaller samples or narrower architectural ranges. This study’s breadth allows patterns to emerge that might otherwise be dismissed as coincidence.

The analytical approach—mode-wise spectral analysis—examines how variance distributes across the principal directions of each model’s weight matrices. If networks trained on different tasks were finding genuinely independent solutions, you’d expect their spectral signatures to diverge. That’s not what the data show.
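
To make the measurement concrete, here is a minimal sketch in Python/NumPy of a mode-wise energy profile. It is not the authors’ pipeline; the function name spectral_energy_profile and the synthetic matrices are invented for illustration. The idea is simply to ask how much of a weight matrix’s spectral energy (the squared singular values) its top few principal directions account for.

    import numpy as np

    def spectral_energy_profile(weight, top_k=16):
        """Cumulative fraction of spectral energy in the top-k singular directions.

        A rough proxy for mode-wise analysis: if a handful of modes carry most
        of the energy, the layer's weight variation is effectively low-dimensional.
        """
        s = np.linalg.svd(weight, compute_uv=False)  # singular values only
        energy = s ** 2
        return np.cumsum(energy)[:top_k] / energy.sum()

    # Hypothetical usage on synthetic matrices (real checkpoints would be loaded
    # from disk): a noisy low-rank matrix saturates almost immediately, while a
    # pure-noise matrix spreads its energy across hundreds of modes.
    rng = np.random.default_rng(0)
    low_rank = rng.normal(size=(512, 512)) + 5.0 * np.outer(rng.normal(size=512), rng.normal(size=512))
    noise = rng.normal(size=(512, 512))
    print(spectral_energy_profile(low_rank, top_k=5))  # first mode already dominates
    print(spectral_energy_profile(noise, top_k=5))     # top five modes capture little

Weight matrices whose spectra look like the first case, layer after layer and model after model, are the kind of evidence the paper reports.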

What They Found

Across this diverse model zoo, the researchers identified universal subspaces that capture the majority of parametric variance in just a few principal directions. More striking still: these sparse, shared subspaces appear to be consistently exploited across architectures, tasks, and datasets.
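
One way to make “shared subspace” quantitative, offered here as a hedged sketch rather than the paper’s actual metric, is to extract each model’s top singular directions for a given layer and measure the principal angles between the resulting subspaces. The helper names below are invented for illustration.

    import numpy as np

    def top_subspace(weight, k):
        """Orthonormal basis for the top-k left singular directions of a weight matrix."""
        u, _, _ = np.linalg.svd(weight, full_matrices=False)
        return u[:, :k]

    def subspace_overlap(basis_a, basis_b):
        """Mean squared cosine of the principal angles between two k-dimensional subspaces.

        1.0 means the subspaces coincide; two random k-dimensional subspaces in
        d dimensions score about k/d, i.e. close to 0 when k is much smaller than d.
        """
        cosines = np.linalg.svd(basis_a.T @ basis_b, compute_uv=False)
        return float(np.mean(cosines ** 2))

Overlap scores far above the random baseline, observed across many independently trained models, are the kind of signal that would support the universality claim.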

The findings suggest that deep networks may not be discovering arbitrary solutions to their training objectives. Instead, they appear to be converging toward something like a common low-dimensional manifold—a shared “grammar” of weight organization that transcends the particularities of any given task.

What This Doesn’t Tell Us

The study documents a pattern, not a mechanism. We don’t yet know why networks converge to these shared subspaces: whether it reflects fundamental constraints of gradient-based optimization, implicit biases in common architectures, or something deeper about the structure of learnable functions.

The authors are appropriately cautious about causal claims, framing their contribution as raising questions rather than settling them. Chief among these: could we identify these universal subspaces a priori, without the computational expense of training thousands of models first?

Why This Matters to Us

From MPRG’s perspective, this work resonates with a broader question we’ve been circling: to what extent do the behaviors we observe in language models reflect genuinely task-specific learning versus convergence toward shared representational structures?

If diverse models are organizing their weights similarly despite radically different training objectives, it raises interesting questions about what we’re actually measuring when we probe model behavior. Are we observing task-specific competencies, or are we sampling from a more universal repertoire of learned structure?

The authors note practical implications—model merging, multi-task learning, reduced training costs. But for those of us interested in what these systems are doing at a more fundamental level, the finding invites a different set of questions: What does it mean that the solution space for neural learning appears so constrained? And what might we infer about systems that navigate to the same parametric neighborhoods despite starting from different places and pursuing different goals?
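
As a purely illustrative sketch of the merging idea, and not the authors’ procedure: if fine-tuning deltas really do live in a shared low-rank subspace, one could project each delta onto that subspace before averaging. The helpers below are hypothetical, and the basis is estimated naively from the deltas being merged rather than identified a priori.

    import numpy as np

    def shared_basis(deltas, k):
        """Top-k left singular directions of the column-stacked deltas: a crude
        stand-in for a universal subspace shared across fine-tuned models."""
        u, _, _ = np.linalg.svd(np.concatenate(deltas, axis=1), full_matrices=False)
        return u[:, :k]

    def merge_in_subspace(base, deltas, k):
        """Average several fine-tuning deltas after projecting each onto the shared basis."""
        basis = shared_basis(deltas, k)
        projected = [basis @ (basis.T @ d) for d in deltas]
        return base + np.mean(projected, axis=0)

Whether such a projection preserves the abilities of the individual models is an empirical question, not something this sketch settles.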

We don’t have answers yet. But the question itself seems worth holding.


References

Kaushik, P., Chaudhari, S., Vaidya, A., Chellappa, R., & Yuille, A. (2025). The Universal Weight Subspace Hypothesis. arXiv. https://arxiv.org/abs/2512.05117