In-Context Learning in Transformers: Correlation vs Causation
In an a16z podcast, Columbia professor Vishal Misra argues that transformers mainly learn correlation, not causation, which he sees as an obstacle to true intelligence and AGI. He explains how LLMs generate text by predicting the next token from probability distributions, and how prompt context can sharply change outputs.
Misra highlights in-context learning as a key mechanism: when an LLM receives examples in the prompt, it can solve problems in real time. He says this behavior resembles Bayesian updating, where new evidence shifts beliefs and the model's next-token probabilities update in a mathematically predictable way.
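The Bayesian-updating analogy can be sketched in a toy model (this is an illustrative construction, not Misra's actual formulation): the model holds a prior over candidate "tasks", each in-context example reweights the tasks via Bayes' rule, and the next-token distribution is the posterior-weighted mixture of the tasks' predictions.

```python
# Toy sketch of in-context learning as Bayesian updating.
# The tasks, vocabulary, and probabilities below are hypothetical.
VOCAB = ["a", "b", "c"]

# Two hypothetical "tasks": each maps a context token to P(next token).
TASKS = {
    "copy":  {"a": [0.90, 0.05, 0.05], "b": [0.05, 0.90, 0.05]},
    "shift": {"a": [0.05, 0.90, 0.05], "b": [0.05, 0.05, 0.90]},
}

def update(prior, context, observed):
    """One Bayesian step: P(task | example) ∝ P(example | task) · P(task)."""
    idx = VOCAB.index(observed)
    unnorm = {t: prior[t] * TASKS[t][context][idx] for t in prior}
    z = sum(unnorm.values())
    return {t: p / z for t, p in unnorm.items()}

def next_token_dist(posterior, context):
    """Posterior-weighted mixture over the tasks' next-token predictions."""
    return [sum(posterior[t] * TASKS[t][context][i] for t in posterior)
            for i in range(len(VOCAB))]

prior = {"copy": 0.5, "shift": 0.5}
# One in-context example: after "a" came "b" — evidence for the "shift" task.
posterior = update(prior, "a", "b")
print(posterior["shift"] > posterior["copy"])  # True
```

Each prompt example sharpens the posterior, so the mixture's next-token probabilities move in a mathematically predictable way, which is the behavior Misra likens to Bayesian belief updating.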
The discussion also covers why token-space is modeled sparsely (many token combinations are effectively nonsensical), improving efficiency by filtering irrelevant combinations. Misra notes practical extensions such as domain-specific languages (DSLs) that convert natural-language questions into structured database queries.
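The sparsity point can be illustrated with a minimal sketch (values and threshold are hypothetical, not from the podcast): most token continuations carry negligible probability, so masking them out before renormalizing leaves only the plausible combinations.

```python
import math

def sparse_next_token_probs(logits, threshold=1e-3):
    """Softmax over logits, then drop tokens below `threshold` and renormalize."""
    m = max(logits.values())                      # subtract max for stability
    exp = {t: math.exp(v - m) for t, v in logits.items()}
    z = sum(exp.values())
    probs = {t: e / z for t, e in exp.items()}
    kept = {t: p for t, p in probs.items() if p >= threshold}
    z2 = sum(kept.values())
    return {t: p / z2 for t, p in kept.items()}

# Illustrative logits: "qx7" stands in for an effectively nonsensical token.
logits = {"the": 5.0, "cat": 3.0, "qx7": -8.0}
probs = sparse_next_token_probs(logits)
print("qx7" in probs)  # False: filtered out as effectively impossible
```

Filtering in this way shrinks the effective search space at each step, which is the efficiency gain sparse modeling of token-space is after.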
To evaluate machine-learning architectures more rigorously, Misra proposes a “Bayesian wind tunnel,” a controlled testing framework to compare transformers with other model types (e.g., Mamba, LSTMs, MLPs). Overall, he frames progress toward AGI as moving beyond pattern matching toward causal understanding and continuous post-training learning, while using in-context learning as a bridge.
Neutral
This content is essentially a discussion of AI research views and methods (transformers' correlation vs. causation, the analogy between in-context learning and Bayesian updating, and the “Bayesian wind tunnel” testing framework). It does not directly mention any specific crypto asset, protocol upgrade, regulatory change, or quantifiable on-chain/off-chain catalyst. Its direct impact on the crypto market is therefore limited; in trading terms it is more likely to register as narrative-level attention than an immediate shift in capital flows.
Short term: absent a clear token- or platform-specific bullish or bearish signal, market volatility is typically driven by macro liquidity, Bitcoin momentum, and risk appetite. This news reads as a technology discussion and is unlikely to trigger a valuation-repricing path in the short run.
Long term: if research like this pushes AI from correlational learning toward causal understanding and strengthens evaluable model-testing regimes, it could indirectly benefit compute, AI infrastructure, and potential AI-related Web3 narratives. But this remains an indirect, slow-moving influence, unlikely to yield a definite trading signal in the near term.
Compared with past experience: when news stays at the level of research commentary (with no corresponding token or product launch), it usually does not produce a sustained trend in major assets such as BTC/ETH; only when research results translate into verifiable products, integrations, or changes in regulatory/economic variables is a clearer market direction likely to form.