Context Engineering：在大模型竞赛寒武纪中求生

Manus的这篇博客（Context Engineering：在大模型竞赛"寒武纪"中求生），虽不算一部血泪史，却真是一种极为理性的务实选择。抛弃自Bert时代开始的微调之路，转而拥抱外部大模型，走上了所谓的“上下文工程”（Context Engineering）。从KV-Cache技术入手，到系统状态管理，从认知注意力机制到自适应学习，每一步都体现了Agent开发中的理性避险哲学。唯一遗憾的是，如果再纳入多模型协作与协同管理的议题，文章便更可称为上乘之作。

回望过去，早期的NLP从业者无不铭记谷歌开源Bert时的激动，那似乎已是一个遥远的时代，毕竟转眼已是七八年光阴流逝。在当时，任务迁移意味着必须进行复杂的微调；然而GPT-3的横空出世，却带来了黎明前的第一道曙光。

对于创业公司而言，迅速适应市场是生存法则。所谓创新的本质，不正是快速决策与敏捷执行吗？Manus坦言，他们正是在那一刻放弃了自主研发大模型，转而积极融入大模型生态。与此同时，大部分行业中人却还深陷于应用场景中，苦苦追求搜索、图谱、意图识别的些微优化。

端到端的自研训练固然自由度极高，但它的致命缺陷同样明显：迭代缓慢，成本巨大。而更为严峻的是，团队数月心血倾注的成果，往往会被OpenAI的一次模型更新瞬间淘汰。这种残酷的技术迭代逻辑，正如进化中的物种更替，弱势者注定被历史无情地抛弃。

于是，“上下文工程”战略应运而生。通过对前沿模型的有效解耦和利用，企业得以充分捕获当下数波大模型浪潮的红利。或许正是这一明晰果敢的战略选择，带来了Manus融资五亿美元的成功故事。这已不再是一个纯粹的技术路线选择，而是关乎企业生存与进化的关键节点。与其费尽心血创造一个很快便濒临灭绝的“新物种”，不如成为一个驾驭强大模型的“共生体”，在浩荡的技术浪潮中与强者共存。

当然，这篇文章的价值并不仅限于战略选择本身。

例如，在成本控制方面，Manus详尽地探讨了如何最大化KV-Cache命中率。这种努力将模糊的“降本增效”口号转化为明确量化的技术指标。虽然与Azure的接口实现略有差异，但本质逻辑一致。Anthropic、OpenAI及Google等大厂，早已在连续stream内部自动实现KV-cache，用户只需维持稳定的connection或conversation_id即可自动享受提速降费，无须额外代码投入。

再如上下文处理的物理极限问题。即便模型的上下文窗口扩展到128K甚至1M级别，在处理网页、PDF等非结构化数据时仍然面临容量迅速被填满的挑战。如何通过外部文件系统构建更加持久、无限延展的上下文结构，已成为Agent设计者必须跨越的最终门槛。

此外，即便是任务失败的数据也具有特殊的价值。考虑到大型任务平均需要近50次工具调用，如何记录任务轨迹、如何分析错误原因，以及如何处理现实世界固有的复杂性与不确定性，最终塑造出在理想条件下近乎完美的Agent，这都值得深思。

在这条通往AGI的漫长道路上，Manus将上下文工程视为一种与现实世界有效、稳定、经济交互的“操作系统”。未来的竞争，或许已不再是单纯的大模型能力较量，而是一场关于“上下文架构”的生存之战。

只是，我们不禁要问：大模型的原厂们会如何看待这一切？毕竟，模型本身正以前所未有之势吞噬着一切。

每一次技术革新，都是一次生态位的重新分配：旧物种艰难挣扎，甚至灭绝；新物种快速崛起，占据核心生态位。今天的大模型之争，正是如此——在这个生态系统中，无数初创公司必须做出自己的进化选择：或者成为自我封闭、自我进化但命运未卜的孤岛物种，或者选择与强大的大模型形成共生关系，迅速融入这片崭新的生态系统。

过去，我们一直将模型视作“工具”，而如今模型已逐渐转化为一种“环境”甚至“生态”，我们自身反而逐渐演变成了栖息其中的“共生体”。在这种新型关系中，我们究竟还能掌握多少主动权？当模型的规模与能力持续增长，当上技术不断扩展人类认知边界，我们是否仍然是技术的主人，还是已不知不觉成为了技术生态系统的一部分？这是我们必须认真思考的问题，因为未来或许早已不再由我们单方面决定，而是取决于我们如何与技术共同进化。

EN Version:

Manus’ recent blog post may not be an emotional “tale of blood and tears,” but it stands as a lucid, pragmatic act of survival. Rather than remaining on the fine-tuning highway that has defined the field since the BERT era, Manus swerved to embrace external large language models (LLMs) and coined a new discipline—“Context Engineering.” From the first line of KV-Cache utilization and rigorous system-state management to cognitive attention routing and adaptive learning, every decision reflects a rational, risk-averse philosophy for agent development. The only lingering regret is that the post stops just short of discussing multi-model collaboration and orchestration; with that dimension, the essay would have bordered on the sublime.

Looking back, NLP veterans still remember the adrenaline rush when Google open-sourced BERT. It feels distant, yet only seven or eight years have slipped away. Task transfer then meant painstaking fine-tuning; the abrupt arrival of GPT-3 shattered that timeline and let the first rays of dawn into a static sky.

For startups, rapid adaptation is the law of nature. Innovation is nothing more than making decisions at light speed and executing them even faster. Manus publicly acknowledged abandoning in-house LLM training the moment that law became clear, choosing instead to plug directly into the new LLM ecosystem. Meanwhile, much of the industry was still neck-deep in marginal gains for search, knowledge graphs, and intent recognition.

End-to-end self-training offers maximum freedom, but its drawbacks are lethal: sluggish iteration and huge cost. Worse, months of effort can evaporate overnight with one OpenAI model update. Technological evolution is brutal; weaker species are simply erased from the timeline.

Thus emerged the strategy of Context Engineering. By cleanly decoupling application logic from the model itself and leveraging the cutting-edge capabilities of leading providers, companies capture successive waves of LLM dividends. Manus’ decisive adoption of this approach no doubt contributed to its $500 million fund-raise. By now, this is no longer a mere technology-stack choice; it is a life-and-death moment in a company’s evolutionary path. Instead of birthing a “new species” doomed to rapid extinction, it is wiser to evolve into a symbiotic organism that can ride the currents of dominant LLMs.

Of course, the article’s value extends well beyond that strategic epiphany.

For instance, Manus drills deep into cost control by maximizing KV-Cache hit rates, turning the fuzzy mantra of “cost efficiency” into a quantifiable metric. While their implementation diverges slightly from Azure’s interface, the physics remains identical. Major vendors—Anthropic, OpenAI, Google—already perform automatic KV-Cache within continuous streams. By maintaining the same connection and conversation ID, developers gain higher throughput and lower bills, no extra code required.

The post also confronts the hard ceiling of context windows. Even with 128 K-token or 1 M-token limits, unstructured sources such as PDFs or sprawling web pages can overflow memory in seconds. Building an external file-system layer that supplies persistent, virtually unbounded context has become the final boss fight for agent architects.

Even “failed” task data is gold. A complex job may invoke fifty tool calls; every failure trace is a fossil record to log, dissect, and replay. Only by respecting these fossils can we approach fault-tolerant, near-perfect agents.

On the long road to AGI, Manus casts Context Engineering as an operating system—one that enables agents to interact with the real world in a stable, economical, and scalable manner. The coming battles will no longer hinge on raw model power alone; they will hinge on the sophistication of context architectures.

Yet one cannot help but wonder how the model providers themselves view this shift. After all, their creations now consume everything in sight with unprecedented appetite.

Every technological revolution redistributes ecological niches: old species struggle, fade, or vanish; new species erupt and dominate. Today’s race toward larger models is no different—startups must decide whether to remain isolated island species or integrate as symbionts within the emerging ecosystem.

Historically, we saw models merely as tools. Now they are morphing into environments—entire ecosystems—while we evolve into the organisms living inside them. As models swell in scale and Context Engineering pushes the boundaries of human cognition, how much agency do we truly retain? Are we still masters of the code, or inhabitants of a digital biome we no longer control? The future will be written not by humans alone, but by how effectively we co-evolve with the machines we have unleashed.

Another Version: Manus の最新ブログは、「血と涙の物語」というよりも、冷徹なサバイバルの記録と言える。BERT 以来続いてきたファインチューニングの大道から果敢に舵を切り、外部の大型言語モデル（LLM）を積極的に受け入れ、「コンテクスト・エンジニアリング」という新たな流儀を打ち立てたのである。KV-Cache の活用と厳密なシステム状態管理、認知的アテンションのルーティング、適応学習に至るまで、あらゆる決定にはエージェント開発における理性的なリスク回避の哲学が貫かれている。唯一惜しまれるのは、マルチモデル協調とオーケストレーションの論点が触れられていないことだ。そこまで踏み込めば、まさに円熟の域に達していただろう。

往年の NLP 技術者なら、Google が BERT を公開した瞬間の高揚をいまも鮮明に覚えているはずだ。しかし、あれからわずか 7〜8 年しか経っていないとは信じ難い。当時、タスク移行には煩雑なファインチューニングが不可欠だった。だが GPT-3 の電撃的な登場は、その常識を一夜にして打ち砕き、新時代の曙光をもたらした。

スタートアップの世界では、迅速な適応こそが生存法則である。イノベーションとは、光速で意思決定し、それ以上の速度で実行することに他ならない。Manus はその瞬間、内製 LLM の開発を断念し、新興 LLM エコシステムに直結する道を選んだ。一方、業界の多くは依然として検索やナレッジグラフ、インテント認識の僅かな最適化に没頭していた。

エンドツーエンドの独自学習は自由度が高い反面、致命的な欠点を抱える。イテレーションが遅く、コストは莫大だ。しかも、数カ月分の労力が OpenAI のモデル更新一つで瞬時に無価値となる。技術進化の淘汰圧は容赦なく、劣位の種は歴史から抹消される。

こうして「コンテクスト・エンジニアリング」戦略が台頭した。アプリケーション層をモデル本体から明瞭に切り離し、最先端 LLM の能力を最大限に借りることで、企業は連続する技術波の配当を取り込める。Manus がこの路線を明快に選択したことは、5 億ドルの資金調達成功にも大きく寄与したに違いない。もはや単なる技術選択ではなく、企業の進化の岐路である。短命の「新種」を生むより、強大な LLM と共生し、激流を乗りこなす「共生体」となる方が賢明だ。

もっとも、ブログの価値は戦略論にとどまらない。

たとえばコスト管理の章では、KV-Cache のヒット率最大化に深く踏み込み、「コスト最適化」という曖昧なスローガンを定量指標へと転換した。実装は Azure の API とはやや異なるものの、原理は同一だ。Anthropic、OpenAI、Google などの大手は、連続ストリーム内で自動的に KV-Cache を実装している。同一の connection と conversation ID を維持するだけで、追加コードなしに高速化と料金削減が得られる。

さらに、コンテクストウィンドウの物理的上限にも言及している。128K トークン、あるいは 1M トークンまで拡張されても、長大な PDF や Web ページといった非構造データは容易にメモリを飽和させる。外部ファイルシステムを接続し、実質無限のコンテクスト格子を構築することが、エージェント設計者にとって最後の難関となりつつある。

失敗ログさえも貴重な資源だ。複雑なジョブは平均 50 回前後のツール呼び出しを要するが、そのすべてのトレースを記録し、解析し、再生することでのみ、耐障害性に優れた準完璧なエージェントへ近づける。化石記録を無視すれば、進化の履歴は失われる。

AGI への長い旅路において、Manus はコンテクスト・エンジニアリングを “OS” と位置づける。現実世界と安定・経済的に接続するための作動層だ。今後の主戦場はモデル単体の性能ではなく、いかに高度なコンテクストアーキテクチャを築けるかに移行する。

とはいえ、モデルの供給元はこの潮流をどう見ているのだろうか。彼らの創り出したモデルは、かつてない速さで世界を飲み込んでいる。

技術革命は生態的ニッチを再配分する。旧来種は抗い、衰え、やがて消える。新種は爆発的に増え、覇権を握る。現代の LLM 競争も同じ構図だ。スタートアップは孤島の種であり続けるか、巨大モデルと共生関係を結ぶかを選ばねばならない。

かつてモデルは単なる「道具」だった。今やそれ自体が「環境」、さらには「生態系」へと変貌しつつある。そして我々は、その内部で生きる生物へと進化している。モデルの規模が膨張し、コンテクスト技術が人間の認知限界を押し広げる時、我々は依然コードの支配者なのか、それとも制御を失ったデジタル生態系の住人なのか。未来はもはや人間だけが描くものではない。テクノロジーと共進化できるか否かが、運命を決める鍵となる。