An Agent That Gets Better on Its Own: Inside Orkas's Self-Evolution

Inside Orkas's local self-evolution loop: lightweight signals, background reflection, executable skills, skill metrics, and guardrails against learning the wrong lesson.

Most AI assistants are "use it and forget it." Correct a habit today and it repeats the same mistake tomorrow; teach it your team's particular workflow last week and this week it acts like it never heard of it. Every conversation starts from zero — and however smart the model is, it's still a smart person with amnesia.

Orkas is going after something else: letting the agent learn from its own daily use, distilling recurring experience so it can apply it next time on its own. Put plainly — it gets more useful the more you use it, and that "usefulness" grows toward you, your preferences, your domain, rather than something the model vendor preset for everyone.

This article unpacks how that mechanism is built. It's not as simple as "make the model remember the conversation" — behind it is a complete loop: observe itself → decide whether to reflect → actually reflect → write the conclusions into something reusable → use it again next time. We'll go through it one piece at a time.

The most important thing first: everything below — all the "observing," "recording," and "reflecting" — happens entirely on your own device. Run data, skills, the agent's understanding of itself — all of it lives locally as ordinary files. None of it is uploaded to Orkas's servers, and none of it is used for cross-user analysis or model training. "Self-evolution" means a program reading its own run records locally and improving itself locally — not collecting your data. This experience never leaves the machine, and it serves only you, on this one machine.

The loop, end to end

real use, over and over
      │  recorded locally: which tools were called, errors or not, corrected or not
      ▼
   signals accumulate
      │  signals extracted from the conversation in place, all kept on-device
      ▼
  decide whether to reflect
      │  weighted scoring across signals, fires only past a threshold; network hiccups don't count
      ▼
   reflect in the background
      │  not every turn — periodically, picking whichever agents qualify
      ▼
  distilled into two things
      │  ① reusable "skills"   ② an "understanding" of itself
      ▼
  carried in automatically next turn
      └──────────► back to the top, keep rolling

Every step in this loop has its subtleties. The places easiest to get wrong are exactly the two steps that feel simplest at a glance: when to reflect, and what to record once you do. Let's start from the front.

Step 1: observing itself at nearly zero cost

To learn from experience you first need "experience" to look at. At the end of each agent run, the program counts out — locally, on the spot — a few lightweight facts about the run: roughly how many tools were called this turn, whether anything errored, whether it was a transient error (like a network issue) or a real one, and whether the user corrected it on the spot. Just a handful of counts and flags, all computed on the machine, calling no model and sent nowhere.

This matters because it costs nothing in model spend. These are counted straight from the current turn's conversation record; there's no need to make an extra model call just to "analyze itself." If every turn required another model call for introspection, the cost and latency would be unbearable and the whole mechanism would never ship.

The "corrected or not" flag is mildly interesting. It's a purely heuristic local judgment, estimated on your device by matching a few phrasings in the message: Chinese phrases like "不对" / "应该是" / "重新" (roughly "that's not right," "it should be," "redo"), and English words like "wrong," "actually," "instead." It doesn't aim to be precise. It's only a signal, not a verdict, and the occasional false positive is fine, because it later gets weighted together with other signals; nothing is decided on it alone.
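To make the "handful of counts and flags" concrete, here is a minimal sketch of what such a per-turn tally could look like. Everything named here (`RunFacts`, `summarize_turn`, the event dictionary shape, the transient-error markers) is an assumption for illustration, not Orkas's actual code; only the example correction phrases come from the text above.

```python
import re
from dataclasses import dataclass

# Hypothetical phrase list, seeded with the examples quoted in the article.
CORRECTION_PATTERNS = [
    re.compile(r"不对|应该是|重新"),
    re.compile(r"\b(wrong|actually|instead)\b", re.IGNORECASE),
]

# Hypothetical substrings used to spot environmental (transient) failures.
TRANSIENT_MARKERS = ("timeout", "timed out", "connection reset", "rate limit")


@dataclass
class RunFacts:
    tool_calls: int = 0
    real_errors: int = 0
    transient_errors: int = 0
    user_corrected: bool = False


def looks_like_correction(message: str) -> bool:
    """Cheap phrase match: a signal, not a verdict."""
    return any(p.search(message) for p in CORRECTION_PATTERNS)


def is_transient(error_text: str) -> bool:
    lowered = error_text.lower()
    return any(marker in lowered for marker in TRANSIENT_MARKERS)


def summarize_turn(events: list[dict]) -> RunFacts:
    """Count facts straight from the turn's own record; no model call involved."""
    facts = RunFacts()
    for event in events:
        kind = event.get("type")
        if kind == "tool_call":
            facts.tool_calls += 1
            error = event.get("error")
            if error:
                if is_transient(error):
                    facts.transient_errors += 1
                else:
                    facts.real_errors += 1
        elif kind == "user_message" and looks_like_correction(event.get("text", "")):
            facts.user_corrected = True
    return facts
```

The whole pass is a single loop over data the program already has, which is why it adds no model spend.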

Step 2: when is it actually worth reflecting

This is the part of the whole mechanism I think shows the most craft.

The naïve approach is "reflect once you've accumulated N occurrences." But that's crude: three network timeouts in a row and three user corrections in a row are obviously not the same thing and shouldn't be treated alike. Orkas uses weighted multi-signal scoring: each noteworthy phenomenon is a signal carrying a weight; sum the weights of the signals this turn triggered, and reflect only if the total clears a threshold (0.7 by default).

The main signals look roughly like this:

| Signal | Weight | Trigger condition |
| --- | --- | --- |
| User correction | 0.9 | A user correction was detected this turn |
| Skill ineffective | 0.85 | A skill was loaded, yet the turn still errored |
| Recovered from error | 0.8 | Errored, but ultimately pulled it back |
| Hit a known weakness | 0.7 | The task hit a soft spot noted in the self-assessment |
| Task complexity | 0.5 | Tool-call count exceeded a certain number |

For example: a turn with both a user correction (0.9) and some complexity (0.5) sums to 1.4, well past 0.7, so it reflects; a turn that's only a bit complex (0.5) falls short and is let go. The weighting also reflects a judgment call — a direct user correction gets the highest weight, 0.9, because it's the highest signal-to-noise feedback there is: the user has plainly said you're wrong, so it's very likely worth recording.
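The scoring rule is small enough to sketch in a few lines. The weights and the 0.7 default are the ones quoted in this article; the signal key names and the choice of `>=` for "clears the threshold" are assumptions made for this sketch.

```python
# Weights as listed in the article; the key names are hypothetical.
SIGNAL_WEIGHTS = {
    "user_correction": 0.9,
    "skill_ineffective": 0.85,
    "recovered_from_error": 0.8,
    "known_weakness_hit": 0.7,
    "task_complexity": 0.5,
}

REFLECTION_THRESHOLD = 0.7  # default quoted in the article


def reflection_score(triggered: set[str]) -> float:
    """Sum the weights of the signals this turn triggered."""
    return sum(SIGNAL_WEIGHTS[name] for name in triggered)


def should_reflect(triggered: set[str], threshold: float = REFLECTION_THRESHOLD) -> bool:
    # Assumption: reaching the threshold exactly counts as clearing it.
    return reflection_score(triggered) >= threshold
```

With these numbers, `should_reflect({"user_correction", "task_complexity"})` is true (about 1.4), while `should_reflect({"task_complexity"})` is false (0.5).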

That one critical exemption

In the whole scoring logic, there's one rule I consider the watershed for whether this mechanism "learns the right things": transient errors never count.

Network timeouts, dropped connections, rate limits — these are environmental problems, not deficiencies in the agent's own capability. Fail to exclude them and something bad happens: a tool errors because of one chance network hiccup, and the reflection mechanism records it as "this tool is unreliable, use it less" — or even mangles or deletes a perfectly good skill. From then on the agent has learned a wrong lesson, and that mistake follows it around.

So the "recovered from error," "skill ineffective," and "hit a known weakness" signals all explicitly keep pure transient errors out. The reflection prompt repeats the reminder too: network-class errors are environmental, don't record them as weaknesses, don't touch the related skills. What a self-improving system should fear most isn't learning slowly — it's learning in the wrong direction. This exemption is what guards against exactly that.
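One way to picture the exemption is as a gate in front of the error-related signals. The sketch below is an assumption about how that gate might look: the `TurnOutcome` fields and the exact trigger conditions are hypothetical, and only the rule itself (a turn whose errors were all transient raises none of the three signals) comes from the text.

```python
from dataclasses import dataclass


@dataclass
class TurnOutcome:
    real_errors: int          # errors that are not environmental
    transient_errors: int     # timeouts, dropped connections, rate limits
    ended_ok: bool            # the turn ultimately succeeded
    skill_loaded: bool
    hit_known_weakness: bool


def error_signals(turn: TurnOutcome) -> set[str]:
    """Derive the three error-related signals, keeping pure transient errors out."""
    only_transient = turn.transient_errors > 0 and turn.real_errors == 0
    if only_transient:
        # Environmental noise says nothing about the agent's capability.
        return set()

    signals: set[str] = set()
    if turn.real_errors > 0 and turn.ended_ok:
        signals.add("recovered_from_error")
    if turn.real_errors > 0 and turn.skill_loaded:
        signals.add("skill_ineffective")
    if turn.hit_known_weakness:
        signals.add("known_weakness_hit")
    return signals
```

A turn with two timeouts and no real error yields an empty set, so it contributes nothing to the reflection score no matter which skill was loaded.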

Step 3: reflection runs in the background, not in your face

An easy trap: the moment you detect "time to reflect," stop and reflect right there. That makes the agent feel like it stutters now and then, wandering off to "ponder life" — a bad experience.

Orkas moves reflection to the background, on a fixed cadence. The scheduling rules, roughly:

  • Start a reflection cycle every so often (say, on the order of a dozen-plus hours).
  • Enforce a minimum cooldown between two reflections for the same agent (a few hours), so it doesn't run too often.
  • But if it hasn't reflected in too long (say, over a week), force one, so it doesn't drag on indefinitely.
  • Cap the number of agents picked per cycle, so it doesn't spread too thin at once.

There's a small design I'm fond of called the dirty gate: when a cycle starts, first check whether this agent has anything new since its last reflection — any new signals, any updated conversation records. If there's no movement at all, skip it this time and don't waste a (model-costing) reflection. Simple, but it saves a lot in practice.
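The scheduling rules and the dirty gate combine into a short selection function. This is a sketch under stated assumptions: the concrete durations, the per-cycle cap of 3, and the `AgentState` fields are illustrative values, not Orkas's real configuration. The text also doesn't say how a forced reflection interacts with the dirty gate; this sketch lets an overdue agent through regardless.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

COOLDOWN = timedelta(hours=4)      # "a few hours" (illustrative)
FORCE_AFTER = timedelta(days=7)    # "over a week" (illustrative)
MAX_AGENTS_PER_CYCLE = 3           # hypothetical cap


@dataclass
class AgentState:
    name: str
    last_reflected_at: Optional[datetime]
    last_activity_at: Optional[datetime]  # newest signal or conversation update


def is_dirty(agent: AgentState) -> bool:
    """Dirty gate: has anything changed since the last reflection?"""
    if agent.last_activity_at is None:
        return False
    if agent.last_reflected_at is None:
        return True
    return agent.last_activity_at > agent.last_reflected_at


def pick_agents(agents: list[AgentState], now: datetime) -> list[AgentState]:
    candidates = []
    for agent in agents:
        since = None if agent.last_reflected_at is None else now - agent.last_reflected_at
        overdue = since is not None and since >= FORCE_AFTER
        cooled_down = since is None or since >= COOLDOWN
        if overdue or (cooled_down and is_dirty(agent)):
            candidates.append(agent)
    # Longest-waiting agents first, then apply the per-cycle cap.
    candidates.sort(key=lambda a: a.last_reflected_at or datetime.min)
    return candidates[:MAX_AGENTS_PER_CYCLE]
```

The dirty check is a pair of timestamp comparisons, so skipping an idle agent costs essentially nothing compared with the model call it avoids.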

Step 4: how reflection actually works

When it's really time to reflect, the flow is: first organize the recent activity into a "packet," then pair it with a carefully written prompt and hand it to the model to read and summarize.

The packet has a budget: take at most a handful of recent conversations, add a few classes of system events, interleave them chronologically, and cap the total under a token limit (say, ten-thousand-plus). Not the whole history shoveled in — it wouldn't fit, and the signal-to-noise ratio would be poor.
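A budgeted packet builder might look roughly like this. The limits (`MAX_CONVERSATIONS`, `TOKEN_BUDGET`), the `Item` type, and the crude four-characters-per-token estimate are all assumptions for illustration; a real implementation would use the model's own tokenizer.

```python
from dataclasses import dataclass

MAX_CONVERSATIONS = 5   # "a handful" (illustrative)
TOKEN_BUDGET = 12_000   # "ten-thousand-plus" (illustrative)


@dataclass
class Item:
    timestamp: float
    kind: str   # "conversation" or "event"
    text: str


def estimate_tokens(text: str) -> int:
    """Rough stand-in for a tokenizer: about four characters per token."""
    return max(1, len(text) // 4)


def build_packet(conversations: list[Item], events: list[Item],
                 budget: int = TOKEN_BUDGET) -> list[Item]:
    # Keep only the most recent conversations, then interleave with events by time.
    recent = sorted(conversations, key=lambda i: i.timestamp)[-MAX_CONVERSATIONS:]
    merged = sorted(recent + list(events), key=lambda i: i.timestamp)

    packet: list[Item] = []
    used = 0
    # Walk newest-first so that when the budget runs out, the oldest material is dropped.
    for item in reversed(merged):
        cost = estimate_tokens(item.text)
        if used + cost > budget:
            break
        packet.append(item)
        used += cost
    packet.reverse()  # hand the model a chronological view
    return packet
```

Filling from the newest item backward is one design choice among several; it favors recency, which matches the idea of reflecting on what happened since the last cycle.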

What really takes care is the prompt. It requires the model to produce not "descriptions" but executable imperatives. The difference looks small and matters enormously. Compare:

✗ "The agent's output is sometimes too verbose; be mindful."

✓ "When answering family-office questions, never exceed 5 bullet points."

✗ "The user seems to prefer concise output."

✓ "When answering in a family-office context, always give the bottom line first, then the reasoning."

The prompt explicitly steers the model toward "never / always / when-then" structures with concrete trigger conditions. The reason is practical: a note that says "be mindful of being concise" tells the agent nothing actionable next time it reads it, whereas "never exceed 5 bullet points" can be followed directly. For self-improvement to be useful, what gets distilled has to be an instruction that lands — not a correct platitude.

After reflecting, the model can do a few things: create or modify a skill, update its understanding of itself, or — if there's genuinely nothing worth recording this window — just say "nothing to save." Letting it do nothing is itself an important design choice: don't force a learning outcome, lest you accumulate a pile of useless noise.

Distilled into two things

The output of reflection lands in two places.

One is skills. Each skill is a Markdown document with metadata — a frontmatter block recording the name, description, creation and update times, how many times it's been patched, and when it was last used — followed by the actual steps or key points:

---
name: "Weekly Report Export"
description: "Compile this week's data into the standard weekly-report format"
createdAt: "2025-01-01T00:00:00Z"
updatedAt: "2025-01-08T00:00:00Z"
patchCount: 2
lastUsedAt: "2025-01-09T10:00:00Z"
---

## Steps
1. ...
2. ...

Storing skills as files is a pragmatic choice: a person can read them directly and edit them directly — not locked away in some opaque database.

The other is an understanding of itself. This part is more like a memo the agent writes to itself, in two pieces: one notes "what I'm good at and where I tend to trip up," the other notes "the plays I've worked out for this user and this domain." Both have a length cap, forcing them to stay concise: the goal is accuracy, not length. At the start of the next conversation, this content is injected into the system prompt, so the agent walks in with "an understanding of itself."
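As a rough illustration of the injection step, the sketch below assembles a system prompt from a base prompt, the two memos, and a skill index. The section headings, the character limit, and the truncate-at-a-line-break policy are assumptions for this sketch; the article only says that both memos are length-capped and end up in the system prompt.

```python
MEMO_CHAR_LIMIT = 1200  # hypothetical cap


def clamp(memo: str, limit: int = MEMO_CHAR_LIMIT) -> str:
    """Enforce the length cap, cutting at a line break so no rule is left half-written."""
    memo = memo.strip()
    if len(memo) <= limit:
        return memo
    cut = memo.rfind("\n", 0, limit)
    return memo[: cut if cut > 0 else limit].rstrip()


def build_system_prompt(base: str, self_assessment: str, playbook: str,
                        skill_index: list[tuple[str, str]]) -> str:
    sections = [base.strip()]
    if self_assessment.strip():
        sections.append("## Self-assessment\n" + clamp(self_assessment))
    if playbook.strip():
        sections.append("## Playbook\n" + clamp(playbook))
    if skill_index:
        lines = "\n".join(f"- {name}: {description}" for name, description in skill_index)
        sections.append("## Available skills\n" + lines)
    return "\n\n".join(sections)
```

Only the skill names and descriptions go into the index here; the skill body would be read on demand, which is what makes "impressions" and "invocations" separately countable later on.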

Skills aren't write-only

If all you ever do is create skills, you accumulate a junkyard over time. So skills have a full lifecycle.

Beyond creation, the more common operation is actually patching: changing a small span in an existing skill rather than tearing it down and rewriting. Each patch bumps a counter and refreshes the update time. This lets a skill grow gradually with experience, instead of being rewritten wholesale at every turn.
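A patch operation on a skill file can be sketched with nothing but string handling. The field names (`patchCount`, `updatedAt`) match the frontmatter example shown earlier; the function name `apply_patch` and the rule that the target span must occur exactly once are assumptions for this sketch.

```python
import re


def apply_patch(skill_text: str, old: str, new: str, now_iso: str) -> str:
    """Replace one span in the skill body and update the frontmatter bookkeeping."""
    match = re.match(r"^---\n(.*?)\n---\n(.*)$", skill_text, re.DOTALL)
    if match is None:
        raise ValueError("skill file has no frontmatter block")
    frontmatter, body = match.group(1), match.group(2)

    # Assumption: refuse ambiguous patches rather than guess which span was meant.
    if body.count(old) != 1:
        raise ValueError("patch target must occur exactly once in the body")
    body = body.replace(old, new)

    # Bump the patch counter and refresh the update time.
    frontmatter = re.sub(
        r"^patchCount: (\d+)$",
        lambda m: f"patchCount: {int(m.group(1)) + 1}",
        frontmatter,
        flags=re.MULTILINE,
    )
    frontmatter = re.sub(
        r'^updatedAt: ".*"$',
        lambda m: f'updatedAt: "{now_iso}"',
        frontmatter,
        flags=re.MULTILINE,
    )
    return f"---\n{frontmatter}\n---\n{body}"
```

Because the rest of the file is carried over untouched, a patch keeps the skill's history of small refinements intact instead of replacing it with a fresh rewrite.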

There's a ceiling on count, too. The total number of skills is capped (say, 200); once full, adding a new one evicts an old one via LRU (least-recently-used) to make room. Eviction has a preference: kick out the ones never used since creation first — a skill that's never been read was probably never distilled right in the first place, and is better off making way.

Every time the agent reads a skill, its "last used time" refreshes. This timestamp both feeds LRU's eviction decision and lets the local mechanism tell which skills are genuinely in use and which are just taking up space.
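The eviction preference can be expressed as a two-tier choice of victim. In this sketch the names (`SkillMeta`, `add_skill`, `touch`) are hypothetical, and evicting the oldest of the never-used skills first is an assumption; the article only states that never-used skills go before used ones and that the rest follow LRU order.

```python
from dataclasses import dataclass
from typing import Optional

MAX_SKILLS = 200  # cap quoted in the article as an example


@dataclass
class SkillMeta:
    name: str
    created_at: str                     # ISO 8601 UTC strings sort chronologically
    last_used_at: Optional[str] = None  # None means never read since creation


def pick_eviction_victim(skills: list[SkillMeta]) -> SkillMeta:
    never_used = [s for s in skills if s.last_used_at is None]
    if never_used:
        # Preference: skills that were never read go first (oldest of them, by assumption).
        return min(never_used, key=lambda s: s.created_at)
    # Otherwise plain LRU: the skill whose last use is furthest in the past.
    return min(skills, key=lambda s: s.last_used_at)


def add_skill(skills: list[SkillMeta], new_skill: SkillMeta,
              limit: int = MAX_SKILLS) -> list[SkillMeta]:
    kept = list(skills)
    while len(kept) >= limit:
        kept.remove(pick_eviction_victim(kept))
    kept.append(new_skill)
    return kept


def touch(skill: SkillMeta, now_iso: str) -> None:
    """Called whenever the agent reads a skill; feeds the LRU decision."""
    skill.last_used_at = now_iso
```

One consequence worth noticing: under this rule a brand-new skill is itself "never used," so a real system would likely need some grace period before a fresh skill becomes eligible for eviction. That detail is not covered in the text.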

How you know whether a skill is actually useful

This is the step many "auto-learning" systems lazily skip: it learned something — but is it any good? Orkas turns this into a few metrics, locally. These metrics are computed for the on-machine evolution mechanism's own use — deciding which skill to revise or delete — and likewise never leave this machine.

The mechanism: at the start of each turn, the available skills appear in the system prompt's index, and each appearance counts as one "impression"; if the agent actually reads a skill that turn, that counts as one "invocation." From these counts come three metrics:

  • Invocation rate = invocations / impressions. A skill that sits there day after day with no takers has a low invocation rate, meaning it's either useless or described so that no one can tell when to use it.
  • Edit-after-hit rate = the share of times a skill was invoked but the user then edited the result by hand. High means what the skill produced isn't quite to the user's taste.
  • Ineffective rate = the share of times a skill was invoked but the turn ended in a (non-transient) error. High suggests something may be wrong with the skill itself.

Here you see the shadow of that exemption again: when computing the ineffective rate, transient errors don't count, and neither do turns the user manually halted partway — you can't hang a black mark on a perfectly good skill over one network hiccup.
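The three rates reduce to a few counts over per-turn records. In the sketch below the `SkillTurn` record and its field names are hypothetical. One more assumption: halted turns stay in the denominator of the ineffective rate and simply never count as failures, since the text doesn't say whether they are dropped from the denominator as well.

```python
from dataclasses import dataclass


@dataclass
class SkillTurn:
    shown: bool                 # skill appeared in the prompt index (impression)
    invoked: bool               # agent actually read the skill
    edited_after: bool = False  # user edited the result by hand afterwards
    real_error: bool = False    # turn ended in a non-transient error
    user_halted: bool = False   # user stopped the turn partway


def ratio(numerator: int, denominator: int) -> float:
    return numerator / denominator if denominator else 0.0


def skill_metrics(turns: list[SkillTurn]) -> dict[str, float]:
    impressions = sum(t.shown for t in turns)
    invoked = [t for t in turns if t.invoked]
    edits = sum(t.edited_after for t in invoked)
    # Transient errors are never recorded as real_error; halted turns are skipped too.
    failures = sum(t.real_error and not t.user_halted for t in invoked)
    return {
        "invocation_rate": ratio(len(invoked), impressions),
        "edit_after_hit_rate": ratio(edits, len(invoked)),
        "ineffective_rate": ratio(failures, len(invoked)),
    }
```

For example, a skill shown in 8 turns and read in 4 of them, with one hand edit and one genuine failure, comes out at an invocation rate of 0.5 and edit-after-hit and ineffective rates of 0.25 each.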

With these few numbers, skills go from "accumulating in a black box" to "something that can be evaluated and optimized." Which skill to revise or delete is no longer a gut call.

Closing the loop

Stringing the above together, one full cycle goes like this:

The agent works through real tasks, recording run data locally and marking signals in place as it goes. When the background reflection cycle comes due, it picks the agents that have new movement and have passed their cooldown, organizes each one's recent activity into a packet, and has the model review it against its current self-understanding — merging what should be merged, retiring what should be retired, distilling what should be distilled into new skills. The review's output becomes skills and self-understanding. On the next conversation, those skills go into the prompt index and the self-understanding goes into the system prompt, and the agent walks back in carrying what it learned last round. Then this round produces new metrics and signals, fed back to the start.

The loop rolls on, round after round. Not every round brings a dramatic leap, but the direction is one-way: toward understanding you better and repeating fewer of the same mistakes.

A few trade-offs worth naming

Looking back, a few decisions in this mechanism are key.

Introspection must be cheap. Observing itself uses zero-model-cost metrics; the genuinely expensive reflection is moved to the background, run infrequently, and gated by the dirty check first. Clamp down hard on the "expensive" part and the whole mechanism can actually run.

Better not to learn than to learn wrong. The transient-error exemption, allowing reflection to "save nothing," writing executable imperatives instead of vague descriptions — all point to the same judgment: for a self-improving system, learning in the wrong direction is far more dangerous than learning slowly.

What's learned must be visible, editable, and in your hands. Skills are plain-text files, self-understanding is a plain-text memo, skill effectiveness is checkable via metrics — and all these files sit on your own machine, not in the cloud. No black box anywhere; a human can open it and tweak it anytime.

Put brakes on learning. Count caps, LRU eviction, length limits — without these, "continuous learning" sooner or later becomes "continuous bloat." Forgetting, dropping, and pruning matter as much as remembering.

Wrapping up

Orkas's self-evolution is, at heart, adding a slow loop to the agent: the fast loop is the immediate response of each conversation; the slow loop is periodically looking back and distilling experience into something usable next time. The hard part isn't "making the model remember" — it's the easily-overlooked engineering judgments: how to tell which experiences are worth recording, how not to be thrown off by a chance failure, how to make what's learned genuinely executable, and how to prune it before it bloats.

Those judgments, taken together, turn "gets more useful the more you use it" from a marketing line into a mechanism that actually runs. An assistant that learns from you — and won't learn the wrong things — may be closer to what most people actually want than one that's merely smarter.
