Agents

An Agent That Gets Better on Its Own: Inside Orkas's Self-Evolution

Inside Orkas's local self-evolution loop: lightweight signals, background reflection, executable skills, skill metrics, and guardrails against learning the wrong lesson.

Orkas Team, Jun 10, 2026

Most AI assistants are "use it and forget it." Correct a habit today and it repeats the same mistake tomorrow; teach it your team's particular workflow last week and this week it acts like it never heard of it. Every conversation starts from zero — and however smart the model is, it's still a smart person with amnesia.

Orkas is going after something else: letting the agent learn from its own daily use, distilling recurring experience so it can apply it next time on its own. Put plainly — it gets more useful the more you use it, and that "usefulness" grows toward you, your preferences, your domain, rather than something the model vendor preset for everyone.

This article unpacks how that mechanism is built. It's not as simple as "make the model remember the conversation" — behind it is a complete loop: observe itself → decide whether to reflect → actually reflect → write the conclusions into something reusable → use it again next time. We'll go through it one piece at a time.

The most important thing first: everything below — all the "observing," "recording," and "reflecting" — happens entirely on your own device. Run data, skills, the agent's understanding of itself — all of it lives locally as ordinary files. None of it is uploaded to Orkas's servers, and none of it is used for cross-user analysis or model training. "Self-evolution" means a program reading its own run records locally and improving itself locally — not collecting your data. This experience never leaves the machine, and it serves only you, on this one machine.

The loop, end to end

real use, over and over
      │  recorded locally: which tools were called, errors or not, corrected or not
      ▼
   signals accumulate
      │  signals extracted from the conversation in place, all kept on-device
      ▼
  decide whether to reflect
      │  weighted scoring across signals, fires only past a threshold; network hiccups don't count
      ▼
   reflect in the background
      │  not every turn — periodically, picking whichever agents qualify
      ▼
  distilled into two things
      │  ① reusable "skills"   ② an "understanding" of itself
      ▼
  carried in automatically next turn
      └──────────► back to the top, keep rolling

Every step in this loop has its subtleties. The places easiest to get wrong are exactly the two steps that feel simplest at a glance: when to reflect, and what to record once you do. Let's start from the front.

Step 1: observing itself at nearly zero cost

To learn from experience you first need "experience" to look at. At the end of each agent run, the program counts out — locally, on the spot — a few lightweight facts about the run: roughly how many tools were called this turn, whether anything errored, whether it was a transient error (like a network issue) or a real one, and whether the user corrected it on the spot. Just a handful of counts and flags, all computed on the machine, calling no model and sent nowhere.

This matters because it costs nothing in model spend. These are counted straight from the current turn's conversation record; there's no need to make an extra model call just to "analyze itself." If every turn required another model call for introspection, the cost and latency would be unbearable and the whole mechanism would never ship.

The "corrected or not" bit deserves a closer look. It's a purely heuristic local judgment, made by matching a few phrasings in the user's message on your device: Chinese such as "不对" ("that's wrong"), "应该是" ("it should be"), and "重新" ("redo"), and English such as "wrong," "actually," and "instead." It doesn't aim to be precise. It's only a signal, not a verdict, and the occasional false positive is fine, because it later gets weighted together with other signals; nothing is decided on it alone.
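A minimal sketch of such a phrase-matching heuristic, assuming a plain regular-expression check. The function name `looks_like_correction` and the exact pattern list are hypothetical; the article only names these six phrasings as examples, and the real list is presumably longer.

```python
import re

# Hypothetical pattern list built from the examples in the text.
CORRECTION_PATTERNS = [
    r"不对", r"应该是", r"重新",                    # Chinese phrasings
    r"\bwrong\b", r"\bactually\b", r"\binstead\b",  # English phrasings
]
_CORRECTION_RE = re.compile("|".join(CORRECTION_PATTERNS), re.IGNORECASE)


def looks_like_correction(message: str) -> bool:
    """Return True if the user message matches any correction phrasing.

    This is a cheap local signal, not a verdict: false positives are
    expected and are meant to be absorbed by the later weighted scoring.
    """
    return _CORRECTION_RE.search(message) is not None
```

A message such as "Actually, use the Q3 numbers instead" would be flagged, while "Thanks, looks great" would not. No model call is involved, which is the point of the design.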

Step 2: when is it actually worth reflecting

This is the part of the whole mechanism I think shows the most craft.

The naïve approach is "reflect once you've accumulated N occurrences." But that's crude: three network timeouts in a row and three user corrections in a row are obviously not the same thing and shouldn't be treated alike. Orkas uses weighted multi-signal scoring: each noteworthy phenomenon is a signal carrying a weight; sum the weights of the signals this turn triggered, and reflect only if the total clears a threshold (0.7 by default).

The main signals look roughly like this:

Signal	Weight	Trigger condition
User correction	0.9	A user correction was detected this turn
Skill ineffective	0.85	A skill was loaded, yet the turn still errored
Recovered from error	0.8	Errored, but ultimately pulled it back
Hit a known weakness	0.7	The task hit a soft spot noted in the self-assessment
Task complexity	0.5	Tool-call count exceeded a certain number

For example: a turn with both a user correction (0.9) and some complexity (0.5) sums to 1.4, well past 0.7, so it reflects; a turn that's only a bit complex (0.5) falls short and is let go. The weighting also reflects a judgment call — a direct user correction gets the highest weight, 0.9, because it's the highest signal-to-noise feedback there is: the user has plainly said you're wrong, so it's very likely worth recording.
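The scoring rule can be sketched in a few lines. The weights and the 0.7 default come from the table above; the signal identifiers, the function names, and the choice of `>=` for "clears the threshold" are assumptions made for illustration.

```python
# Weights from the table above; the key names are hypothetical.
WEIGHTS = {
    "user_correction": 0.9,
    "skill_ineffective": 0.85,
    "recovered_from_error": 0.8,
    "known_weakness": 0.7,
    "task_complexity": 0.5,
}
DEFAULT_THRESHOLD = 0.7


def reflection_score(signals: set[str]) -> float:
    """Sum the weights of the signals a turn triggered."""
    return sum(WEIGHTS[name] for name in signals)


def should_reflect(signals: set[str], threshold: float = DEFAULT_THRESHOLD) -> bool:
    """Assumption: a total equal to the threshold counts as clearing it."""
    return reflection_score(signals) >= threshold
```

With these definitions, `should_reflect({"user_correction", "task_complexity"})` is true (the total is about 1.4), and `should_reflect({"task_complexity"})` is false (0.5 falls short of 0.7).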

That one critical exemption

In the whole scoring logic, there's one rule I consider the watershed for whether this mechanism "learns the right things": transient errors never count.

Network timeouts, dropped connections, rate limits — these are environmental problems, not deficiencies in the agent's own capability. Fail to exclude them and something bad happens: a tool errors because of one chance network hiccup, and the reflection mechanism records it as "this tool is unreliable, use it less" — or even mangles or deletes a perfectly good skill. From then on the agent has learned a wrong lesson, and that mistake follows it around.

So the "recovered from error," "skill ineffective," and "hit a known weakness" signals all explicitly keep pure transient errors out. The reflection prompt repeats the reminder too: network-class errors are environmental, don't record them as weaknesses, don't touch the related skills. What a self-improving system should fear most isn't learning slowly — it's learning in the wrong direction. This exemption is what guards against exactly that.
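One way to picture the exemption, as a sketch only: filter transient errors out before any error-related signal is derived. The error labels, the `error_signals` function, and the rule that a turn with only transient errors yields no error signals at all are assumptions, not Orkas's actual code. The "hit a known weakness" signal is left out because it depends on the self-assessment.

```python
# Hypothetical labels for environmental failures named in the text.
TRANSIENT_KINDS = {"timeout", "connection_dropped", "rate_limited"}


def is_transient(error_kind: str) -> bool:
    """Environmental problems, not deficiencies of the agent."""
    return error_kind in TRANSIENT_KINDS


def error_signals(error_kinds: list[str], skill_loaded: bool, recovered: bool) -> set[str]:
    """Derive error-related signals, ignoring purely transient failures."""
    real_errors = [kind for kind in error_kinds if not is_transient(kind)]
    signals: set[str] = set()
    if not real_errors:
        # Only network-class noise (or no errors): nothing to learn from.
        return signals
    if skill_loaded:
        signals.add("skill_ineffective")
    if recovered:
        signals.add("recovered_from_error")
    return signals
```

A turn that only hit a timeout and a rate limit produces an empty set, so it can never push a skill toward being revised or deleted.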

Step 3: reflection runs in the background, not in your face

An easy trap: the moment you detect "time to reflect," stop and reflect right there. That makes the agent feel like it stutters now and then, wandering off to "ponder life" — a bad experience.

Orkas moves reflection to the background, on a fixed cadence. The scheduling rules, roughly:

Start a reflection cycle every so often (say, on the order of a dozen-plus hours).
Enforce a minimum cooldown between two reflections for the same agent (a few hours), so it doesn't run too often.
But if it hasn't reflected in too long (say, over a week), force one, so it doesn't drag on indefinitely.
Cap the number of agents picked per cycle, so it doesn't spread too thin at once.

There's a small design touch I'm fond of, called the dirty gate: when a cycle starts, first check whether this agent has anything new since its last reflection, such as new signals or updated conversation records. If nothing has changed, skip it this time and don't waste a (model-costing) reflection. Simple, but it saves a lot in practice.
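The scheduling rules and the dirty gate can be combined into one selection function. This is a sketch under stated assumptions: the concrete values (4-hour cooldown, 7-day forced reflection, 3 agents per cycle), the `AgentState` fields, the longest-waiting-first ordering, and the choice to let a forced reflection bypass the dirty gate are all hypothetical, since the article gives only rough magnitudes.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

COOLDOWN = timedelta(hours=4)     # "a few hours" (assumed value)
FORCE_AFTER = timedelta(days=7)   # "over a week" (assumed value)
MAX_AGENTS_PER_CYCLE = 3          # per-cycle cap (assumed value)


@dataclass
class AgentState:
    name: str
    last_reflected_at: datetime
    last_activity_at: datetime  # newest signal or conversation update


def is_dirty(agent: AgentState) -> bool:
    """Dirty gate: has anything happened since the last reflection?"""
    return agent.last_activity_at > agent.last_reflected_at


def pick_agents(agents: list[AgentState], now: datetime) -> list[AgentState]:
    """Choose which agents reflect in this cycle."""
    eligible = []
    for agent in agents:
        since = now - agent.last_reflected_at
        if since < COOLDOWN:
            continue  # reflected too recently
        if is_dirty(agent) or since > FORCE_AFTER:
            eligible.append(agent)
    # Assumption: agents that have waited longest go first.
    eligible.sort(key=lambda agent: agent.last_reflected_at)
    return eligible[:MAX_AGENTS_PER_CYCLE]
```

An agent with no new activity costs nothing in this scheme: it is filtered out before any model call is made.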

Step 4: how reflection actually works

When it's really time to reflect, the flow is: first organize the recent activity into a "packet," then pair it with a carefully written prompt and hand it to the model to read and summarize.

The packet has a budget: take at most a handful of recent conversations, add a few classes of system events, interleave them chronologically, and cap the total under a token limit (say, ten-thousand-plus). Not the whole history shoveled in — it wouldn't fit, and the signal-to-noise ratio would be poor.
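A sketch of budgeted packet assembly follows. The `Item` type, the limit of 5 conversations, the 12,000-token budget, the 4-characters-per-token estimate, and the policy of dropping the oldest items first are assumptions; the article only says "a handful" of conversations and a limit of "ten-thousand-plus" tokens.

```python
from dataclasses import dataclass


@dataclass
class Item:
    timestamp: float
    text: str


def estimate_tokens(text: str) -> int:
    """Rough stand-in for a real tokenizer (assumed: about 4 chars per token)."""
    return max(1, len(text) // 4)


def build_packet(
    conversations: list[list[Item]],  # ordered oldest to newest
    events: list[Item],
    max_conversations: int = 5,
    token_budget: int = 12_000,
) -> list[Item]:
    """Interleave recent conversations and system events chronologically,
    keeping the newest items that fit inside the token budget."""
    items = [item for conv in conversations[-max_conversations:] for item in conv]
    items.extend(events)
    items.sort(key=lambda item: item.timestamp)

    kept: list[Item] = []
    used = 0
    for item in reversed(items):  # walk from newest to oldest
        cost = estimate_tokens(item.text)
        if used + cost > token_budget:
            break
        kept.append(item)
        used += cost
    kept.reverse()  # restore chronological order
    return kept
```

Cutting from the old end keeps the packet focused on what happened most recently, which is where fresh corrections and errors live.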

What really takes care is the prompt. It requires the model to produce not "descriptions" but executable imperatives. The difference looks small and matters enormously. Compare:

✗ "The agent's output is sometimes too verbose; be mindful."
✓ "When answering family-office questions, never exceed 5 bullet points."

✗ "The user seems to prefer concise output."
✓ "When answering in a family-office context, always give the bottom line first, then the reasoning."

The prompt explicitly steers the model toward "never / always / when-then" structures with concrete trigger conditions. The reason is practical: a note that says "be mindful of being concise" tells the agent nothing actionable next time it reads it, whereas "never exceed 5 bullet points" can be followed directly. For self-improvement to be useful, what gets distilled has to be an instruction that lands — not a correct platitude.

After reflecting, the model can do a few things: create or modify a skill, update its understanding of itself, or — if there's genuinely nothing worth recording this window — just say "nothing to save." Letting it do nothing is itself an important design choice: don't force a learning outcome, lest you accumulate a pile of useless noise.

Distilled into two things

The output of reflection lands in two places.

One is skills. Each skill is a Markdown document with metadata — a frontmatter block recording the name, description, creation and update times, how many times it's been patched, and when it was last used — followed by the actual steps or key points:

---
name: "Weekly Report Export"
description: "Compile this week's data into the standard weekly-report format"
createdAt: "2025-01-01T00:00:00Z"
updatedAt: "2025-01-08T00:00:00Z"
patchCount: 2
lastUsedAt: "2025-01-09T10:00:00Z"
---

## Steps
1. ...
2. ...

Storing skills as files is a pragmatic choice: a person can read them directly and edit them directly — not locked away in some opaque database.

The other is an understanding of itself. This part is more like a memo the agent writes to itself, in two pieces: one notes "what I'm good at and where I tend to trip up," the other notes "the plays I've worked out for this user and this domain." Both have a length cap, which forces them to stay concise: the goal is accuracy, not length. At the start of the next conversation, this content is injected into the system prompt, so the agent walks in with "an understanding of itself."

Skills aren't write-only

If you only ever create skills, you accumulate a junkyard over time. So skills have a full lifecycle.

Beyond creation, the more common operation is actually patching: changing a small span in an existing skill rather than tearing it down and rewriting. Each patch bumps a counter and refreshes the update time. This lets a skill grow gradually with experience, instead of being rewritten wholesale at every turn.
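A sketch of what a patch could look like against the skill file format shown earlier. The `patch_skill` function and its string-replacement approach are hypothetical; only the frontmatter field names (`patchCount`, `updatedAt`) come from the example file.

```python
import re
from datetime import datetime, timezone

EXAMPLE_SKILL = """\
---
name: "Weekly Report Export"
description: "Compile this week's data into the standard weekly-report format"
createdAt: "2025-01-01T00:00:00Z"
updatedAt: "2025-01-08T00:00:00Z"
patchCount: 2
lastUsedAt: "2025-01-09T10:00:00Z"
---

## Steps
1. ...
2. ...
"""


def patch_skill(text: str, old: str, new: str, now: datetime) -> str:
    """Replace one small span in the body, bump patchCount, refresh updatedAt."""
    # Split into: text before the first fence (empty), frontmatter, body.
    _, front, body = text.split("---\n", 2)
    if old not in body:
        raise ValueError("span to patch not found in skill body")
    body = body.replace(old, new, 1)

    front = re.sub(
        r"^patchCount: (\d+)$",
        lambda m: f"patchCount: {int(m.group(1)) + 1}",
        front,
        flags=re.MULTILINE,
    )
    stamp = now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    front = re.sub(r'^updatedAt: ".*"$', f'updatedAt: "{stamp}"', front, flags=re.MULTILINE)
    return f"---\n{front}---\n{body}"
```

Calling `patch_skill(EXAMPLE_SKILL, "1. ...", "1. Pull the week's numbers", now)` leaves `createdAt` and every other body line untouched, which is what distinguishes a patch from a rewrite.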

There's a ceiling on count, too. The total number of skills is capped (say, 200); once full, adding a new one evicts an old one via LRU (least-recently-used) to make room. Eviction has a preference: kick out the ones never used since creation first — a skill that's never been read was probably never distilled right in the first place, and is better off making way.

Every time the agent reads a skill, its "last used time" refreshes. This timestamp both feeds LRU's eviction decision and lets the local mechanism tell which skills are genuinely in use and which are just taking up space.
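The eviction preference can be expressed as a sort key. In this sketch the `Skill` type, the use of `None` for "never used," and the tie-breaking by creation time are assumptions; the cap of 200 is the article's own example figure.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

MAX_SKILLS = 200  # example cap from the text


@dataclass
class Skill:
    name: str
    created_at: datetime
    last_used_at: Optional[datetime] = None  # None means never read (assumed)


def eviction_key(skill: Skill) -> tuple[bool, datetime]:
    """Lower keys are evicted first: never-used skills, then least recently used."""
    has_been_used = skill.last_used_at is not None
    return (has_been_used, skill.last_used_at or skill.created_at)


def add_skill(skills: list[Skill], new: Skill, max_skills: int = MAX_SKILLS) -> list[Skill]:
    """Return a new list containing `new`, evicting old skills if at the cap."""
    kept = list(skills)
    while len(kept) >= max_skills:
        kept.remove(min(kept, key=eviction_key))
    kept.append(new)
    return kept
```

Because `False` sorts before `True`, a skill that has never been read is evicted ahead of any skill that has been used, even one used long ago.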

How you know whether a skill is actually useful

This is the step many "auto-learning" systems lazily skip: it learned something — but is it any good? Orkas turns this into a few metrics, locally. These metrics are computed for the on-machine evolution mechanism's own use — deciding which skill to revise or delete — and likewise never leave this machine.

The mechanism: at the start of each turn, the available skills appear in the system prompt's index — that's one "impression"; if the agent actually reads a skill that turn, that's one "invocation." Compare the two and you get the first metric —

Invocation rate = invocations / impressions. A skill that sits there day after day with no takers has a low invocation rate, meaning it's either useless or described so that no one can tell when to use it.
Edit-after-hit rate = the share of times a skill was invoked but the user then edited the result by hand. High means what the skill produced isn't quite to the user's taste.
Ineffective rate = the share of times a skill was invoked but the turn ended in a (non-transient) error. High suggests something may be wrong with the skill itself.

Here you see the shadow of that exemption again: when computing the ineffective rate, transient errors don't count, and neither do turns the user manually halted partway — you can't hang a black mark on a perfectly good skill over one network hiccup.
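The three rates can be sketched with a small counter object per skill. The `SkillCounters` type, the `record_turn` function, and the error labels are hypothetical. One further assumption: exempted turns (transient errors, user halts) are kept out of the numerator of the ineffective rate but still count as invocations in the denominator, since the article does not say which way this goes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SkillCounters:
    impressions: int = 0   # turns where the skill appeared in the index
    invocations: int = 0   # turns where the agent actually read it
    edited_after: int = 0  # invocations followed by a manual user edit
    ineffective: int = 0   # invocations ending in a non-transient error


def record_turn(
    c: SkillCounters,
    invoked: bool,
    user_edited: bool = False,
    error: Optional[str] = None,  # None, "transient", or "real" (assumed labels)
    halted_by_user: bool = False,
) -> None:
    """Update counters for one turn in which the skill was listed."""
    c.impressions += 1
    if not invoked:
        return
    c.invocations += 1
    if user_edited:
        c.edited_after += 1
    # The exemption: transient errors and user halts never count against a skill.
    if error == "real" and not halted_by_user:
        c.ineffective += 1


def _ratio(numerator: int, denominator: int) -> float:
    return numerator / denominator if denominator else 0.0


def invocation_rate(c: SkillCounters) -> float:
    return _ratio(c.invocations, c.impressions)


def edit_after_hit_rate(c: SkillCounters) -> float:
    return _ratio(c.edited_after, c.invocations)


def ineffective_rate(c: SkillCounters) -> float:
    return _ratio(c.ineffective, c.invocations)
```

Everything here is plain counting over local run records, consistent with the article's claim that the metrics need no model call and never leave the machine.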

With these few numbers, skills go from "accumulating in a black box" to "something that can be evaluated and optimized." Which skill to revise or delete is no longer a gut call.

Closing the loop

Stringing the above together, one full cycle goes like this:

The agent works through real tasks, recording run data locally and marking signals in place as it goes. When the background reflection cycle comes due, it picks the agents that have new movement and have passed their cooldown, organizes each one's recent activity into a packet, and has the model review it against its current self-understanding — merging what should be merged, retiring what should be retired, distilling what should be distilled into new skills. The review's output becomes skills and self-understanding. On the next conversation, those skills go into the prompt index and the self-understanding goes into the system prompt, and the agent walks back in carrying what it learned last round. Then this round produces new metrics and signals, fed back to the start.

The loop rolls on, round after round. Not every round brings a dramatic leap, but the direction is one-way: toward understanding you better and repeating fewer of the same mistakes.

A few trade-offs worth naming

Looking back, a few decisions in this mechanism are key.

Introspection must be cheap. Observing itself uses zero-model-cost metrics; the genuinely expensive reflection is moved to the background, run infrequently, and gated by the dirty check first. Clamp down hard on the "expensive" part and the whole mechanism can actually run.

Better not to learn than to learn wrong. The transient-error exemption, allowing reflection to "save nothing," writing executable imperatives instead of vague descriptions — all point to the same judgment: for a self-improving system, learning in the wrong direction is far more dangerous than learning slowly.

What's learned must be visible, editable, and in your hands. Skills are plain-text files, self-understanding is a plain-text memo, skill effectiveness is checkable via metrics — and all these files sit on your own machine, not in the cloud. No black box anywhere; a human can open it and tweak it anytime.

Put brakes on learning. Count caps, LRU eviction, length limits — without these, "continuous learning" sooner or later becomes "continuous bloat." Forgetting, dropping, and pruning matter as much as remembering.

Wrapping up

Orkas's self-evolution is, at heart, adding a slow loop to the agent: the fast loop is the immediate response of each conversation; the slow loop is periodically looking back and distilling experience into something usable next time. The hard part isn't "making the model remember" — it's the easily-overlooked engineering judgments: how to tell which experiences are worth recording, how not to be thrown off by a chance failure, how to make what's learned genuinely executable, and how to prune it before it bloats.

Those judgments, taken together, turn "gets more useful the more you use it" from a marketing line into a mechanism that actually runs. An assistant that learns from you — and won't learn the wrong things — may be closer to what most people actually want than one that's merely smarter.

A maioria dos assistentes de IA são do tipo "use e esqueça". Corrija um hábito hoje e ele repetirá o mesmo erro amanhã; ensine a ele o fluxo de trabalho específico de sua equipe na semana passada e esta semana ele age como se nunca tivesse ouvido falar dele. Toda conversa começa do zero e, por mais inteligente que seja o modelo, ainda é uma pessoa inteligente com amnésia.

Orkas está buscando outra coisa: deixar o agente aprender com seu próprio uso diário, destilando experiências recorrentes para que ele possa aplicá-las por conta própria na próxima vez. Simplificando: ele se torna mais útil quanto mais você o usa, e essa "utilidade" cresce para você, suas preferências, seu domínio, em vez de algo que o fornecedor do modelo predefiniu para todos.

Este artigo explica como esse mecanismo é construído. Não é tão simples como “fazer o modelo se lembrar da conversa” – por trás disso há um ciclo completo: observar a si mesmo → decidir se deve refletir → realmente refletir → escrever as conclusões em algo reutilizável → usá-lo novamente na próxima vez. Analisaremos uma parte de cada vez.

A coisa mais importante primeiro: tudo abaixo — toda a "observação", "gravação" e "reflexão" — acontece inteiramente em seu próprio dispositivo. Execute dados, habilidades, a compreensão que o agente tem de si mesmo — tudo isso reside localmente como arquivos comuns. Nada disso é carregado nos servidores do Orkas e nada é usado para análise entre usuários ou treinamento de modelo. "Autoevolução" significa um programa lendo seus próprios registros de execução localmente e melhorando-se localmente - sem coletar seus dados. Essa experiência nunca sai da máquina e serve apenas a você, nesta máquina.

O loop, de ponta a ponta

uso real, repetidamente
      │ registrado localmente: quais ferramentas foram chamadas, erros ou não, corrigidas ou não
      ▼
   sinais se acumulam
      │ sinais extraídos da conversa no local, todos mantidos no dispositivo
      ▼
  decidir se refletirá
      │ pontuação ponderada entre sinais, dispara apenas além de um limite; soluços de rede não contam
      ▼
   refletir no fundo
      │ não em todos os turnos — periodicamente, escolhendo os agentes qualificados
      ▼
  destilado em duas coisas
      │ ① "habilidades" reutilizáveis ② uma "compreensão" de si mesmo
      ▼
  transportado automaticamente no próximo turno
      └──────────► voltar ao topo, continuar rolando

Cada etapa desse ciclo tem suas sutilezas. Os pontos mais fáceis de errar são exatamente as duas etapas que parecem mais simples à primeira vista: quando refletir e o que registrar depois de fazer isso. Vamos começar pela frente.

Etapa 1: observar a si mesmo com custo quase zero

Para aprender com a experiência você primeiro precisa de “experiência” para olhar. No final de cada execução do agente, o programa conta — localmente, no local — alguns fatos leves sobre a execução: aproximadamente quantas ferramentas foram chamadas neste turno, se houve algum erro, se foi um erro transitório (como um problema de rede) ou real, e se o usuário o corrigiu no local. Apenas algumas contagens e sinalizadores, todos computados na máquina, sem chamar nenhum modelo e não serem enviados a lugar nenhum.

Isso é importante porque não custa nada nos gastos do modelo. Estes são contados diretamente do registro da conversa do turno atual; não há necessidade de fazer uma chamada extra de modelo apenas para "analisar-se". Se cada curva exigisse outro modelo de introspecção, o custo e a latência seriam insuportáveis e todo o mecanismo nunca seria lançado.

A parte "corrigido ou não" é levemente interessante. É um julgamento local puramente heurístico, estimado pela correspondência de algumas frases na mensagem em seu dispositivo, como "não está certo", "deveria ser" ou "refaça", além de termos como errado, na verdade e em vez disso. Não pretende ser preciso — é apenas um sinal, não um veredicto, e os falsos positivos ocasionais são aceitáveis, porque mais tarde são ponderados juntamente com outros sinais; nada é decidido sozinho.

Etapa 2: quando realmente vale a pena refletir

Essa é a parte de todo o mecanismo que acho que mostra mais habilidade.

A abordagem ingênua é "refletir depois de acumular N ocorrências". Mas isso é grosseiro: três tempos limite de rede seguidos e três correções de usuário seguidas obviamente não são a mesma coisa e não devem ser tratados da mesma forma. Orkas usa pontuação multissinal ponderada: cada fenômeno digno de nota é um sinal que carrega um peso; some os pesos dos sinais acionados neste turno e reflita apenas se o total ultrapassar um limite (0,7 por padrão).

Os principais sinais são mais ou menos assim:

Sinal	Peso	Condição de gatilho
Correção do usuário	0,9	Uma correção do usuário foi detectada neste turno
Habilidade ineficaz	0,85	Uma habilidade foi carregada, mas o turno ainda com erro
Recuperado do erro	0,8	Erro, mas finalmente recuperou
Atingiu um ponto fraco conhecido	0,7	A tarefa atingiu um ponto fraco observado na autoavaliação
Tarefa complexidade	0,5	A contagem de chamadas de ferramenta excedeu um determinado número

Por exemplo: uma curva com correção do usuário (0,9) e alguma complexidade (0,5) soma 1,4, bem além de 0,7, então reflete; uma curva que é apenas um pouco complexa (0,5) falha e é abandonada. A ponderação também reflete um julgamento – uma correção direta do usuário recebe o peso mais alto, 0,9, porque é o feedback sinal-ruído mais alto que existe: o usuário disse claramente que você está errado, então provavelmente vale a pena registrar.

Aquela isenção crítica

Em toda a lógica de pontuação, há uma regra que considero o divisor de águas para saber se esse mecanismo "aprende as coisas certas": erros transitórios nunca contam.

Tempos limite de rede, queda de conexões, limites de taxa — estes são problemas ambientais, não deficiências na capacidade do próprio agente. Deixar de excluí-los e algo ruim acontece: uma ferramenta falha devido a um soluço de rede casual, e o mecanismo de reflexão registra isso como "esta ferramenta não é confiável, use-a menos" - ou até mesmo deturpa ou exclui uma habilidade perfeitamente boa. A partir de então, o agente aprendeu uma lição errada, e esse erro o acompanha.

Portanto, os sinais "recuperado do erro", "habilidade ineficaz" e "atingir uma fraqueza conhecida" mantêm explicitamente os erros transitórios puros fora. O prompt de reflexão também repete o lembrete: erros de classe de rede são ambientais, não os registre como pontos fracos, não toque nas habilidades relacionadas. O que um sistema de autoaperfeiçoamento mais deveria temer não é aprender lentamente – é aprender na direção errada. Essa isenção é o que protege exatamente contra isso.

Etapa 3: a reflexão ocorre em segundo plano, não na sua cara

Uma armadilha fácil: no momento em que você detectar “hora de refletir”, pare e reflita ali mesmo. Isso faz com que o agente sinta que gagueja de vez em quando, vagando para "pensar sobre a vida" - uma experiência ruim.

Orkas move o reflexo para o fundo, em uma cadência fixa. As regras de agendamento, aproximadamente:

Inicie um ciclo de reflexão de vez em quando (digamos, na ordem de mais de uma dúzia de horas).
Imponha um tempo de espera mínimo entre duas reflexões para o mesmo agente (algumas horas), para que ele não seja executado com muita frequência.
Mas se não for refletido há muito tempo (digamos, mais de uma semana), force um, para que não se arraste indefinidamente.
Limite o número de agentes escolhidos por ciclo, para que não se espalhe muito. de uma vez.

Há um pequeno design que eu gosto chamado de portão sujo: quando um ciclo começa, primeiro verifique se este agente tem algo novo desde sua última reflexão – quaisquer novos sinais, quaisquer registros de conversa atualizados. Se não houver nenhum movimento, pule desta vez e não desperdice uma reflexão (de custeio do modelo). Simples, mas economiza muito na prática.

Etapa 4: como a reflexão realmente funciona

Quando chegar a hora de refletir, o fluxo é: primeiro organize a atividade recente em um "pacote", depois combine-a com um prompt cuidadosamente escrito e entregue-o ao modelo para leitura e resumo.

O pacote tem um orçamento: pegue no máximo algumas conversas recentes, adicione algumas classes de eventos do sistema, intercale-as cronologicamente e limite o total abaixo de um limite de tokens (digamos, mais de dez mil). Nem toda a história incluída - ela não caberia e a relação sinal-ruído seria ruim.

O que realmente importa é o prompt. Requer que o modelo produza não "descrições", mas imperativos executáveis. A diferença parece pequena e é enormemente importante. Comparar:

✗ "A resposta do agente às vezes é muito detalhada; esteja atento."
✓ "Ao responder perguntas do family office, nunca exceda 5 marcadores."

✗ "O usuário parece preferir resultados concisos."
✓ "Ao responder em um contexto de family office, sempre forneça primeiro o resultado final e depois o raciocínio."

O prompt direciona explicitamente o modelo para estruturas "nunca/sempre/quando-então" com condições de acionamento concretas. A razão é prática: uma nota que diz “tenha cuidado para ser conciso” não diz ao agente nada acionável na próxima vez que a ler, enquanto “nunca exceda 5 marcadores” pode ser seguido diretamente. Para que o autoaperfeiçoamento seja útil, o que é destilado tem que ser uma instrução que acerte - e não um lugar-comum correto.

Depois de refletir, o modelo pode fazer algumas coisas: criar ou modificar uma habilidade, atualizar seu entendimento de si mesmo ou — se realmente não houver nada que valha a pena registrar nesta janela — apenas dizer "nada para salvar". Deixá-lo não fazer nada é em si uma escolha importante de design: não force um resultado de aprendizagem, para não acumular uma pilha de ruído inútil.

Destilado em duas coisas

A saída da reflexão fica em dois lugares.

Uma delas são as habilidades. Cada habilidade é um documento Markdown com metadados — um bloco frontmatter registrando o nome, a descrição, os tempos de criação e atualização, quantas vezes foi corrigido e quando foi usado pela última vez — seguido pelas etapas reais ou pontos-chave:

---
nome: "Exportação de relatório semanal"
description: "Compile os dados desta semana no formato padrão de relatório semanal"
criado em: "2025-01-01T00:00:00Z"
atualizadoEm: "2025-01-08T00:00:00Z"
contagem de patches: 2
últimoUsadoEm: "2025-01-09T10:00:00Z"
---

## Etapas
1. ...
2. ...

Armazenar habilidades como arquivos é uma escolha pragmática: uma pessoa pode lê-las diretamente e editá-las diretamente, e não trancá-las em algum banco de dados opaco.

A outra é a compreensão de si mesmo. Esta parte é mais como um memorando que o agente escreve para si mesmo, em duas partes: uma anota "no que sou bom e onde tenho tendência a tropeçar", a outra anota "as jogadas que elaborei para este usuário e este domínio". Ambos têm um limite de comprimento, forçando-os a serem concisos - não mais é melhor, mas mais verdadeiro é melhor. No início da próxima conversa, esse conteúdo é injetado no prompt do sistema, para que o agente entre com "uma compreensão de si mesmo".

Habilidades não são apenas escritas

Basta criar habilidades e você acumula um ferro-velho com o tempo. Portanto, as habilidades têm um ciclo de vida completo.

Além da criação, a operação mais comum é, na verdade, aplicar patches: alterar uma pequena extensão de uma habilidade existente em vez de desmontá-la e reescrevê-la. Cada patch supera um contador e atualiza o tempo de atualização. Isso permite que uma habilidade cresça gradualmente com a experiência, em vez de ser reescrita a cada passo.

Também existe um limite máximo de contagem. O número total de habilidades é limitado (digamos, 200); uma vez cheio, adicionar um novo despeja um antigo via LRU (usado menos recentemente) para liberar espaço. O despejo tem uma preferência: primeiro expulse aqueles que nunca foram usados desde a criação — uma habilidade que nunca foi lida provavelmente nunca foi destilada corretamente, e é melhor abrir caminho.

Cada vez que o agente lê uma habilidade, seu "horário da última utilização" é atualizado. Este carimbo de data/hora alimenta a decisão de despejo da LRU e permite que o mecanismo local diga quais habilidades estão realmente em uso e quais estão apenas ocupando espaço.

Como você sabe se uma habilidade é realmente útil

Esta é a etapa que muitos sistemas de "aprendizagem automática" ignoram preguiçosamente: eles aprenderam alguma coisa - mas será que é bom? Orkas transforma isso em algumas métricas, localmente. Essas métricas são calculadas para uso próprio do mecanismo de evolução na máquina - decidindo qual habilidade revisar ou excluir - e da mesma forma nunca sai desta máquina.

O mecanismo: no início de cada turno, as habilidades disponíveis aparecem no índice do prompt do sistema — essa é uma "impressão"; se o agente realmente ler uma habilidade naquele turno, isso será uma “invocação”. Compare os dois e você obterá a primeira métrica —

Taxa de invocação = invocações/impressões. Uma habilidade que permanece lá dia após dia sem compradores tem uma baixa taxa de invocação, o que significa que é inútil ou descrita de forma que ninguém sabe quando usá-la.
Taxa de edição após acerto = a proporção de vezes que uma habilidade foi invocada, mas o usuário editou o resultado manualmente. Alto significa que a habilidade produzida não é exatamente do gosto do usuário.
Taxa ineficaz = a proporção de vezes que uma habilidade foi invocada, mas o turno terminou com um erro (não transitório). Alto sugere que algo pode estar errado com a habilidade em si.

Aqui você vê a sombra dessa isenção novamente: ao calcular a taxa ineficaz, os erros transitórios não contam, nem as interrupções manuais do usuário no meio do caminho — você não pode colocar uma marca preta em uma habilidade perfeitamente boa por causa de um problema de rede.

Com esses poucos números, as habilidades passam de “acumular em uma caixa preta” para “algo que pode ser avaliado e otimizado”. Qual habilidade revisar ou excluir não é mais uma decisão instintiva.

Fechando o ciclo

Juntando os itens acima, um ciclo completo é assim:

O agente trabalha em tarefas reais, registrando os dados da execução localmente e marcando os sinais à medida que avançam. Quando o ciclo de reflexão em segundo plano chega ao vencimento, ele escolhe os agentes que têm novo movimento e já passaram do tempo de espera, organiza a atividade recente de cada um em um pacote e faz com que o modelo os revise em relação à sua autocompreensão atual – mesclando o que deveria ser mesclado, retirando o que deveria ser retirado, destilando o que deveria ser destilado em novas habilidades. O resultado da revisão torna-se habilidades e autocompreensão. Na próxima conversa, essas habilidades vão para o índice de prompt e a autocompreensão vai para o prompt do sistema, e o agente volta carregando o que aprendeu na última rodada. Em seguida, esta rodada produz novas métricas e sinais, que são transmitidos desde o início.

O loop continua, rodada após rodada. Nem toda rodada traz um salto dramático, mas a direção é unidirecional: entender você melhor e repetir menos dos mesmos erros.

A few trade-offs worth mentioning

Looking back, a few decisions in this mechanism are pivotal.

Introspection has to be cheap. Observation itself uses metrics with zero model cost; the genuinely expensive reflection is moved to the background, runs infrequently, and is gated first by the dirty check. Clamp down hard on the "expensive" part and the whole mechanism becomes viable in practice.

Better not to learn than to learn wrong. The transient-error exemption, letting reflection "save nothing," and writing executable imperatives instead of vague descriptions all point to the same judgment: for a self-improving system, learning in the wrong direction is far more dangerous than learning slowly.

What it has learned must be visible, editable, and in your hands. Skills are plain-text files, self-understanding is a plain-text memo, skill effectiveness can be checked through metrics, and all of these files sit on your own machine, not in the cloud. There is no black box anywhere; a human can open it up and adjust it at any time.

Put brakes on learning. Count caps, LRU eviction, length limits: without these, "continuous learning" sooner or later becomes "continuous bloat." Forgetting, discarding, and pruning matter as much as remembering.

In closing

Orkas's self-evolution is, at bottom, a slow loop added to the agent: the fast loop is the immediate response in each conversation; the slow loop periodically looks back and distills experience into something usable next time. The hard part is not "making the model remember." It is the easily overlooked engineering judgments: how to tell which experiences are worth recording, how to avoid being thrown off by a chance failure, how to make what was learned genuinely executable, and how to prune it before it bloats.

Taken together, these judgments turn "it gets more useful the more you use it" from a marketing line into a mechanism that actually runs. An assistant that learns from you (and doesn't learn the wrong things) may be closer to what most people really want than one that is merely smarter.

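Most AI assistants are "use it and forget it." You correct a habit today and it makes the same mistake tomorrow; you taught it your team's particular workflow last week and this week it acts as if it never heard of it. Every conversation starts from zero, and however smart the model is, it is still a smart person with amnesia.

Orkas is after something different: letting the agent learn from its own daily use, distilling recurring experience so it can apply it on its own next time. Put plainly, it gets handier the more you use it, and that "handiness" grows toward you, your preferences, and your domain, rather than being something the model vendor preset for everyone.

This article takes apart how that mechanism is built. It isn't as simple as "make the model remember the conversation"; behind it is a complete closed loop: observe itself → decide whether to reflect → actually reflect → write the conclusions into something reusable → use it again next time. We'll go through it piece by piece.

The most important point first: all the "observing," "recording," and "reflecting" described below happens on your own device from start to finish. Run data, skills, and the agent's understanding of itself all live locally as ordinary files. None of it is uploaded to Orkas's servers, and none of it is used for cross-user analysis or model training. "Self-evolution" means a program reading its own run records locally and improving itself locally, not collecting your data. This experience never leaves the machine, and it serves only you, on this one machine.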

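The whole loop first

Real use, over and over
      │  recorded locally: which tools were called, whether anything failed, whether it was corrected
      ▼
   Signals accumulate
      │  signals are extracted from the conversation on the spot; all stay on this machine
      ▼
  Decide whether to reflect
      │  weighted score over several signals; fires only past the threshold; network jitter doesn't count
      ▼
   Reflection runs in the background
      │  not every turn; every so often, qualifying agents are picked for one
      ▼
  Distilled into two things
      │  ① reusable "skills"  ② "understanding" of itself
      ▼
  Carried automatically into the next turn
      └──────────► back to the top, and around again

Every step in this loop takes care. The places easiest to get wrong are precisely the two steps that intuitively seem simplest: when to reflect, and what to write down afterwards. Let's start from the front.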

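Step 1: Observing itself at almost zero cost

To learn from experience, there first has to be "experience" to look at. At the end of each agent run, the program counts a few very lightweight facts locally, on the spot: roughly how many tool calls this turn made, whether anything failed, whether the failure was a transient error such as a network issue or a real error, and whether the user corrected it on the spot. That is all: a handful of counters and flags, computed on this machine, with no model call and nothing sent anywhere.

This matters because it costs nothing in model spend. These facts are counted directly from the turn's conversation record, with no extra model call to "analyze itself." If every turn needed another model call for introspection, neither cost nor latency would hold up, and the mechanism could never ship.

"Whether it was corrected" is the slightly interesting one. It is a purely heuristic local check that estimates by matching certain phrasings in the message on your device, such as 「不对」 ("that's wrong"), 「应该是」 ("it should be"), and 「重新」 ("redo") in Chinese, or wrong, actually, and instead in English. It doesn't aim for precision: this is a signal, not a verdict, and occasional misfires are perfectly acceptable, because it is later weighted together with other signals and never decides anything on its own.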
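A minimal sketch of this kind of zero-model-cost observation follows. The event shape (`type`, `transient`, `text`), the `TurnStats` record, and the exact phrase list are assumptions for illustration; the article only names a few example phrases.

```python
import re
from dataclasses import dataclass

# Hypothetical phrase list built from the examples the article mentions.
CORRECTION_PATTERN = re.compile(
    r"不对|应该是|重新|\b(?:wrong|actually|instead)\b", re.IGNORECASE
)


@dataclass
class TurnStats:
    tool_calls: int
    real_errors: int
    transient_errors: int
    corrected: bool


def observe_turn(events: list[dict]) -> TurnStats:
    """Count lightweight facts straight from one turn's transcript; no model call."""
    tool_calls = sum(1 for e in events if e["type"] == "tool_call")
    transient = sum(1 for e in events if e["type"] == "error" and e.get("transient"))
    real = sum(1 for e in events if e["type"] == "error" and not e.get("transient"))
    # A crude phrase match: a signal, not a verdict, so false positives are tolerated.
    corrected = any(
        e["type"] == "user_message" and CORRECTION_PATTERN.search(e["text"]) is not None
        for e in events
    )
    return TurnStats(tool_calls, real, transient, corrected)
```

The heuristic will misfire on a sentence like "actually, that looks great", which is acceptable here precisely because the flag only contributes one weighted signal to a later score.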

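Step 2: When is reflection worth it

This is the part of the whole mechanism that I think shows the most craft.

The naive approach is "reflect once every N turns." But that is crude: three network timeouts in a row and three user corrections in a row are clearly not the same thing and shouldn't be treated alike. Orkas uses weighted multi-signal scoring: each kind of noteworthy event is a signal with a weight, the weights of the signals fired this turn are summed, and reflection happens only when the sum exceeds a threshold (0.7 by default).

The main signals look roughly like this:

Signal	Weight	Trigger
User correction	0.9	A user correction was detected this turn
Skill ineffective	0.85	A skill was loaded, yet the turn still went wrong
Recovered from error	0.8	Something failed but was ultimately rescued
Hit a known weakness	0.7	The task hit a weak spot recorded in the self-assessment
Task complexity	0.5	The tool-call count exceeded a certain number

An example: a turn that has both a user correction (0.9) and is fairly complex (0.5) adds up to 1.4, well past 0.7, and triggers reflection; a turn that is only somewhat complex (0.5) falls short and is let go. The weights reflect trade-offs too: a direct user correction gets the highest weight, 0.9, because it is the feedback with the highest signal-to-noise ratio. When the user says outright that you were wrong, it is most likely worth recording.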
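The scoring rule can be sketched as a weighted sum over the signals fired in a turn. The weights and the 0.7 default come from the table above; the signal key names are made up for the example, and comparing with `>=` is an assumption: the article says the score must pass the threshold but does not say what happens at exactly 0.7, which matters for the "known weakness" signal.

```python
# Weights from the table above; the dictionary keys are hypothetical names.
SIGNAL_WEIGHTS = {
    "user_correction": 0.9,
    "skill_ineffective": 0.85,
    "error_recovery": 0.8,
    "known_weakness": 0.7,
    "task_complexity": 0.5,
}
REFLECTION_THRESHOLD = 0.7  # default given in the article


def reflection_score(signals: set[str]) -> float:
    """Sum the weights of every signal that fired this turn."""
    return sum(SIGNAL_WEIGHTS[name] for name in signals)


def should_reflect(signals: set[str], threshold: float = REFLECTION_THRESHOLD) -> bool:
    # Assumption: a score exactly at the threshold counts as reaching it.
    return reflection_score(signals) >= threshold
```

With these numbers, `should_reflect({"user_correction", "task_complexity"})` is true (the score is about 1.4), while `should_reflect({"task_complexity"})` is false.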

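That crucial exemption

Within the whole scoring logic, there is one rule I consider the watershed for whether this mechanism "learns the right thing": transient errors never count.

Network timeouts, dropped connections, rate limiting: these are environment problems, not defects in the agent's own ability. If they aren't excluded, something bad happens: a tool errors once because of a random network blip, and the reflection mechanism records it as "this tool is unreliable, use it less," or even breaks or deletes a skill that was perfectly fine. From then on the agent has learned a wrong lesson, and that error stays with it.

So the "recovered from error," "skill ineffective," and "hit a known weakness" signals all explicitly keep purely transient errors out. The reflection prompt repeats the reminder: network-type errors are environment problems; don't record them as weaknesses and don't touch the related skills. What a self-improving system should fear most is not learning slowly but learning in the wrong direction, and that is exactly what this exemption guards against.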

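Step 3: Reflection runs in the background, not in front of you

An easy trap is to stop and reflect on the spot as soon as "time to reflect" is detected. That makes the agent feel like it stalls every so often and drifts off to "ponder life," which is a poor experience.

Orkas moves reflection to the background and runs it on a fixed cadence. The scheduling rules are roughly:

A reflection cycle starts at a regular interval (say, every ten-plus hours);
the same agent has a minimum cooldown between two reflections (say, a few hours), so it isn't reflected on too often;
but if an agent hasn't reflected for too long (say, over a week), one is forced, so it doesn't keep getting put off;
the number of agents picked per cycle is capped, to avoid spreading too wide at once.

Another small design I'm fond of is the dirty check (dirty gate): when a reflection cycle starts, it first checks whether this agent has anything new since its last reflection, meaning any new signals or any update to the conversation records. If nothing has moved at all, the agent is skipped this time rather than wasting a reflection (which costs model spend). Simple, but the savings are real.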
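The selection step of a cycle might look like the sketch below. The constants are placeholders for the article's "a few hours" and "over a week", and `AgentState` is a hypothetical record. Two further assumptions: the forced weekly reflection overrides the dirty gate, and the longest-waiting agents are picked first when the cap applies. The timer that starts each cycle is left out.

```python
from dataclasses import dataclass
from typing import Optional

HOUR = 3600.0
COOLDOWN = 4 * HOUR           # assumed value for "a few hours"
FORCE_AFTER = 7 * 24 * HOUR   # assumed value for "over a week"
MAX_AGENTS_PER_CYCLE = 3      # assumed cap


@dataclass
class AgentState:
    name: str
    last_reflected_at: Optional[float]  # epoch seconds; None means never reflected
    last_activity_at: float             # newest signal or transcript update


def pick_agents(agents: list[AgentState], now: float,
                max_agents: int = MAX_AGENTS_PER_CYCLE) -> list[AgentState]:
    picked = []
    for agent in agents:
        never = agent.last_reflected_at is None
        since = float("inf") if never else now - agent.last_reflected_at
        if since < COOLDOWN:
            continue  # reflected too recently
        # Dirty gate: only spend a model call if something changed since last time.
        dirty = never or agent.last_activity_at > agent.last_reflected_at
        overdue = since > FORCE_AFTER
        if dirty or overdue:
            picked.append(agent)
    # Longest-waiting agents first, then apply the per-cycle cap.
    picked.sort(key=lambda a: a.last_reflected_at or 0.0)
    return picked[:max_agents]
```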

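Step 4: How reflection actually works

When it is really time to reflect, the flow is: first organize the recent activity into a "packet" of material, then pair it with a carefully written prompt and hand it to the model to read and summarize.

The material has a budget: at most the most recent few conversations, plus a few kinds of system events, interleaved in chronological order, with the total held under a token cap (say, a little over ten thousand). It does not cram in all of history, which wouldn't fit and would have a poor signal-to-noise ratio anyway.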
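One way to build such a budgeted packet is sketched below. Everything concrete here is an assumption: the `Item` record, the default of 5 conversations and 12,000 tokens, the rule of filling the budget newest-first, and the character-based token estimate, which stands in for a real tokenizer.

```python
from dataclasses import dataclass


@dataclass
class Item:
    timestamp: float
    kind: str   # "conversation" or "system_event" (assumed labels)
    text: str


def estimate_tokens(text: str) -> int:
    # Crude stand-in for a tokenizer: roughly four characters per token.
    return max(1, len(text) // 4)


def build_packet(items: list[Item], max_conversations: int = 5,
                 token_budget: int = 12_000) -> list[Item]:
    conversations = sorted(
        (i for i in items if i.kind == "conversation"), key=lambda i: i.timestamp
    )[-max_conversations:]  # keep only the most recent conversations
    events = [i for i in items if i.kind == "system_event"]

    chosen, used = [], 0
    # Walk newest-first so the budget is spent on the most recent activity.
    for item in sorted(conversations + events, key=lambda i: i.timestamp, reverse=True):
        cost = estimate_tokens(item.text)
        if used + cost > token_budget:
            break
        chosen.append(item)
        used += cost
    # Hand the model a chronologically interleaved packet.
    return sorted(chosen, key=lambda i: i.timestamp)
```

Stopping at the first item that does not fit keeps the packet a contiguous recent window; skipping oversized items and continuing would pack more in at the cost of gaps in the timeline.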

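What really takes care is the prompt. It asks the model to produce not "descriptions" but executable imperatives. The difference looks small and matters enormously. Compare:

✗ "The agent's output is sometimes too verbose; take care."
✓ "When answering family-office questions, never exceed 5 bullet points."

✗ "The user seems to prefer concise output."
✓ "When replying in a family-office context, always give the conclusion first, then the reasoning."

The prompt explicitly steers the model toward structures with concrete trigger conditions, such as "never / always / when ... then ...". The reason is practical: a note saying "be mindful of concision" leaves the agent not knowing what to do the next time it reads it, while "never exceed 5 bullet points" can be followed directly. For self-improvement to be useful, what gets distilled must be instructions that can be acted on, not correct platitudes.

After reflecting, the model can do several things: create or modify a skill, update its understanding of itself, or, if there really was nothing worth recording in this period, say plainly "nothing to save." Allowing it to do nothing is itself an important design choice: it doesn't force learning outcomes, which avoids piling up useless noise.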

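Distilled into two things

The output of reflection lands in two places.

One is skills. Each skill is a Markdown document with metadata: a frontmatter block at the top records the name, description, creation and update times, how many times it has been patched, and when it was last used, and the body holds the concrete steps or key points:

---
name: "Export weekly report"
description: "Compile this week's data into the fixed weekly report format"
createdAt: "2025-01-01T00:00:00Z"
updatedAt: "2025-01-08T00:00:00Z"
patchCount: 2
lastUsedAt: "2025-01-09T10:00:00Z"
---

## Steps
1. ...
2. ...

Storing skills as files is a very pragmatic choice: people can read and edit them directly, and they aren't locked in some opaque database.

The other is its understanding of itself. This part is more like a memo the agent writes to itself, in two parts: one records "what I'm good at and where I tend to trip up," the other records "the plays I've worked out for this user and this domain." Both have length caps, forcing them to stay tight: more accurate is better, not longer. When the next conversation starts, this content is injected into the system prompt, so the agent goes in with "what it knows about itself."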

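Skills aren't add-only

If all it can do is create skills, the pile eventually becomes a junkyard. So skills have a full lifecycle.

Beyond creation, the more common operation is actually patching: changing a small section of an existing skill rather than tearing it down and starting over. Each patch increments the patch count and refreshes the update time. This lets a skill grow gradually with experience instead of being rewritten wholesale at every turn.

There is a ceiling on count too. The total number of skills is capped (say, two hundred); once at the cap, adding a new one evicts an old one by LRU (least recently used) to make room. Eviction also has a preference: it first kicks out skills that have never been used since creation. A skill that has never been read was most likely distilled badly in the first place, and is better off giving up its slot.

Each time the agent reads a skill, its "last used time" is refreshed. This timestamp both feeds the LRU eviction decision and lets the local mechanism tell which skills are really in use and which are just taking up space.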
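The eviction preference can be expressed as a sort key, as in the sketch below. The `Skill` record and function names are hypothetical, and representing a never-used skill by a missing `last_used_at` is an assumption; among never-used skills the sketch evicts the oldest first, which the article does not specify.

```python
from dataclasses import dataclass
from typing import Optional

MAX_SKILLS = 200  # the article's example cap


@dataclass
class Skill:
    name: str
    created_at: float
    last_used_at: Optional[float] = None  # None means never read since creation


def eviction_key(skill: Skill) -> tuple[int, float]:
    # Never-used skills sort first (oldest creation first),
    # then used skills ordered by least recent use.
    if skill.last_used_at is None:
        return (0, skill.created_at)
    return (1, skill.last_used_at)


def add_skill(skills: dict[str, Skill], new: Skill,
              limit: int = MAX_SKILLS) -> Optional[str]:
    """Insert a skill; if the store is full, evict one and return its name."""
    evicted = None
    if new.name not in skills and len(skills) >= limit:
        victim = min(skills.values(), key=eviction_key)
        evicted = skills.pop(victim.name).name
    skills[new.name] = new
    return evicted


def touch(skill: Skill, now: float) -> None:
    # Called whenever the agent reads the skill.
    skill.last_used_at = now
```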

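How to tell whether a skill is actually useful

This is a step many "auto-learning" systems lazily skip: it learned something, but did it help? Orkas turns this into a few metrics, computed locally. The metrics are used only by this machine's own evolution mechanism, to decide which skills to revise or delete, and likewise never leave the machine.

The mechanism works like this: at the start of each conversation turn, the available skills appear in the index in the system prompt, which counts as an "impression"; if the agent actually reads a skill this turn, that counts as an "invocation." Comparing the two gives the first metric:

Invocation rate = invocations / impressions. A skill that sits there every day and never gets used has a low invocation rate, which means it is either useless or described so that nobody can tell when to use it.
Edit rate = the share of invocations after which the user went on to change the result by hand. A high value means what the skill produces isn't quite to the user's taste.
Ineffective rate = the share of invocations where the turn nonetheless ended in a (non-transient) error. A high value means the skill itself may have a problem.

Here the shadow of the exemption shows up again: when computing the ineffective rate, transient errors likewise don't count, and neither do turns the user deliberately stopped midway. A single network blip shouldn't put a black mark on a skill that works fine.

With these few numbers, skills go from "piling up in a black box" to something that can be evaluated and optimized. Which skill to revise or delete is no longer decided on a hunch.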

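Closing the loop

Stringing the pieces above together, one full cycle goes like this:

The agent does its work across real tasks, noting run data locally and tagging signals on the spot as it goes. When the background reflection cycle comes due, it picks the agents that have new activity and are past their cooldown, organizes each one's recent activity into material, and has the model review it against the agent's current self-understanding: merging what should be merged, retiring what should be retired, and distilling new skills where warranted. The output of the review lands as skills and self-understanding. In the next conversation, those skills go into the prompt index, the self-understanding goes into the system prompt, and the agent returns carrying what it learned last round. Then that round produces new metrics and signals, which feed back into the start.

The loop keeps rolling like this, lap after lap. Not every lap brings a dramatic improvement, but the direction is one-way: toward understanding you better and making the same mistake less often.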

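A few trade-offs worth mentioning

Looking back, a few decisions in this mechanism are quite pivotal.

Introspection has to be cheap. Self-observation uses metrics with zero model cost, and the truly expensive reflection is moved to the background, runs infrequently, and must first pass the dirty check. Only by holding the "expensive" part firmly down can the mechanism run at all.

Better not to learn than to learn crooked. The transient-error exemption, allowing reflection to "save nothing," and writing executable imperatives rather than vague descriptions all point to the same judgment: for a self-improving system, learning in the wrong direction is far more dangerous than learning slowly.

What it learns must be visible, editable, and in your hands. Skills are plain-text files, self-understanding is a plain-text memo, skill effectiveness can be checked through metrics, and all of these files sit on your own machine, not in the cloud. The mechanism has no black box; a person can open it up at any time and change it.

Put brakes on learning. Count caps, LRU eviction, length limits: without these, "continuous learning" sooner or later turns into "continuous bloat." Being able to forget, discard, and trim matters as much as being able to remember.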

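Summary

Orkas's self-evolution is, in essence, a slow loop added to the agent: the fast loop is the immediate response in each conversation, and the slow loop looks back every so often and distills experience into something usable next time. The difficulty is not in "making the model remember" but in the easily overlooked engineering judgments: how to decide which experiences are worth recording, how to avoid being led astray by a chance failure, how to make what was learned truly executable, and how to prune it in time before it bloats.

Together, these judgments turn "it gets handier the more you use it" from a line of product marketing into a mechanism that really turns. An assistant that learns from you without learning crooked may be closer to what most people really want than one that is simply smarter.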

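Most AI assistants are "use and forget." Fix a habit today and it makes the same mistake tomorrow. Teach it your team's particular procedure last week and this week it behaves as if hearing it for the first time. Every conversation starts from zero, and however smart the model is, it ends up like a smart person with amnesia.

Orkas is aiming at a different shape: the agent learns from daily use, distills recurring experience, and can apply it by itself next time. Put plainly, it becomes more useful the more you use it, and that usefulness grows toward you, your preferences, and your domain, rather than being something the model vendor decided for everyone.

This article takes that mechanism apart. It is not just about "making it remember conversations." Behind it is a closed loop: observe → decide whether to reflect → actually reflect → write the result out in a reusable form → use it next time.

The most important point first: the observation, recording, and reflection described here all happen entirely on your device. Run data, skills, and the agent's self-understanding are stored as ordinary local files. They are not uploaded to Orkas's servers and are not used for cross-user analysis or model training. Self-evolution means a program reading its own run records locally and improving locally.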

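The whole loop

Real use, repeated
      │  recorded locally: which tools were called, whether anything failed, whether it was corrected
      ▼
   Signals accumulate
      │  extracted from the conversation on the spot; all kept on-device
      ▼
  Decide whether to reflect
      │  signals are weighted and fire only past the threshold; network jitter is excluded
      ▼
   Reflection in the background
      │  not every turn; qualifying agents are picked periodically
      ▼
  Distilled into two things
      │  ① reusable skills   ② understanding of itself
      ▼
  Carried automatically into the next turn
      └──────────► back to the start

The easy places to go wrong in this loop are the two that look simple: when to reflect, and what to record as the result of reflection.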

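Step 1: Observing itself at almost zero cost

To learn from experience, there must first be experience to look at. At the end of each agent run, Orkas counts a few lightweight facts on the spot: roughly how many times tools were called this turn, whether there was an error, whether it was a transient error such as a network issue or a real failure, and whether the user corrected it on the spot. All of it is computed locally, with no model call and nothing transmitted.

This matters because it incurs no model cost. If every turn's self-analysis required an extra model call, cost and latency would balloon and the mechanism could not fit into everyday use.

"Whether it was corrected" is a signal, not a definitive judgment. On the device, it picks up expressions such as 「不对」, 「应该是」, and 「重新」, or wrong, actually, and instead. Some false positives are fine, because downstream the flag is weighted with other signals and never makes a decision alone.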

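Step 2: When is reflection worth it

The simple approach is "reflect after N occurrences," but that is too coarse. Three network timeouts and three user corrections are not the same thing. Orkas uses weighted multi-signal scoring: each kind of event carries a weight, and reflection happens only when the sum of the signals fired this turn exceeds the threshold, 0.7 by default.

Signal	Weight	Trigger
User correction	0.9	A user correction was detected this turn
Skill ineffective	0.85	A skill was read but the turn still failed
Recovered from error	0.8	Something failed but was eventually recovered
Hit a known weakness	0.7	The task hit a weakness listed in the self-assessment
Task complexity	0.5	The tool-call count exceeded a certain number

User correction carries a high weight of 0.9. An explicit "that's wrong" has a high signal-to-noise ratio and is likely worth recording. A turn that was merely a bit complex, on the other hand, scores 0.5, stays under the threshold, and doesn't force any learning.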

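The most important exception

The most important line drawn in this scoring is that transient errors are not counted. Network timeouts, dropped connections, and rate limits are problems of the environment, not shortfalls in the agent's ability.

Get this wrong and a single chance network error can teach the agent that "this tool is unreliable," with the risk of breaking or deleting a good skill. For a self-improving system, the scariest thing is not learning slowly but learning in the wrong direction.

For that reason, the recovered-from-error, skill-ineffective, and known-weakness signals exclude purely transient errors. The reflection prompt repeats the same caution: network-type failures are not to be recorded as weaknesses, and the related skills are left untouched.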
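A sketch of how the exemption could gate the error-related signals is shown below. The marker strings used to recognize transient errors, the signal names, and the rule that a recovered turn emits only the recovery signal are all assumptions for illustration; a real system would more likely classify errors by type or status code than by message text.

```python
# Hypothetical markers for environment problems (timeouts, dropped links, rate limits).
TRANSIENT_MARKERS = ("timeout", "timed out", "connection reset", "rate limit")


def is_transient(error_message: str) -> bool:
    message = error_message.lower()
    return any(marker in message for marker in TRANSIENT_MARKERS)


def error_signals(errors: list[str], recovered: bool, skill_loaded: bool) -> set[str]:
    """Derive error-related signals for one turn, ignoring purely transient errors."""
    real_errors = [e for e in errors if not is_transient(e)]
    signals: set[str] = set()
    if not real_errors:
        # Only environment noise (or no errors at all): nothing to learn from.
        return signals
    if recovered:
        signals.add("error_recovery")
    elif skill_loaded:
        signals.add("skill_ineffective")
    return signals
```

A turn whose only failure was a timeout therefore contributes nothing to the reflection score, no matter which skills were loaded.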

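Step 3: Reflection runs backstage, not out front

If the agent is stopped to reflect on the spot the moment it learns it "should reflect," the user experience is interrupted. Orkas moves reflection to the background and runs it on a fixed cadence.

A reflection cycle starts at a regular interval.
Reflections of the same agent are separated by a cooldown, so they aren't excessively frequent.
An agent that has gone a long time without reflecting is forcibly included.
The number of agents picked in one cycle is limited.

There is also a dirty gate. At the start of a cycle, it checks whether there are new signals or conversation updates since the last reflection. If nothing has moved, the agent is skipped. It is a small mechanism, but it cuts unnecessary model cost substantially.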

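Step 4: What reflection consists of

When reflection is needed, the recent activity is gathered into a packet and handed to the model together with a carefully written prompt. The packet has a budget: a few recent conversations and the important system events are arranged chronologically and kept under a token cap. The full history is not stuffed in.

The crux of the prompt is that the output must be executable instructions rather than "descriptions."

✗ "Output tends to run long, so take care."
✓ "When answering family-office questions, use no more than 5 bullet points."

✗ "The user seems to prefer concise answers."
✓ "In a family-office context, always give the conclusion first, then the reasons."

"Take care" doesn't translate into action when read next time. "No more than 5" can be executed as is. What self-improvement should leave behind is not a correct impression but an instruction usable next time.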

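Two destinations

The results of reflection go mainly to two places.

One is skills. Each skill is a Markdown file with metadata. The frontmatter holds the name, description, creation time, update time, patch count, and last-used time, and the steps or key points are written below it. Because it is a file, a human can read and edit it directly; it isn't shut inside an opaque database.

---
name: "Weekly Report Export"
description: "Compile this week's data into the standard weekly report format"
createdAt: "2025-01-01T00:00:00Z"
updatedAt: "2025-01-08T00:00:00Z"
patchCount: 2
lastUsedAt: "2025-01-09T10:00:00Z"
---

## Steps
1. ...
2. ...

The other is its understanding of itself. This is close to a memo the agent writes to itself, briefly recording "what I'm good at and where I stumble" and "the plays that worked for this user and this domain." Because it goes into the system prompt at the start of the next conversation, the agent comes in with that self-understanding.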

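Skills aren't write-and-forget

Merely creating skills leaves things messier as time goes on. So skills are given a lifecycle.

The operation more common than creation is the patch: only a small part of an existing skill is rewritten, and the patch count and update time are updated. Rather than rewriting the whole thing each time, the skill is grown bit by bit along with experience.

There is also a cap on the number. Once a certain count is exceeded, skills are evicted by LRU, that is, starting with those not used recently. A skill that has never been read since creation was probably distilled poorly to begin with, so it goes first. Each time the agent reads a skill, its last-used time is updated, and that value informs the cleanup decision.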

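Measuring whether a skill works

Many auto-learning systems never check "after learning, did it help?" Orkas makes this a set of local metrics. These metrics are used only by the on-machine evolution mechanism and do not leave the machine.

Invocation rate = invocations / impressions. A skill that is presented but never read may be useless, or may be described so badly that it isn't clear when to use it.
Edit-after-hit rate = the share of uses after which the user fixed the result by hand. If it is high, the output doesn't match the user's preferences.
Ineffective rate = the share of turns using the skill that ended in a non-transient error. If it is high, the skill itself may have a problem.

Here too, transient errors are excluded. A momentary network wobble should not give a good skill a bad rating.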

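The loop closes

The agent leaves run data and signals locally as it gets on with real work. When the background reflection cycle comes around, it picks the agents that have new activity and are past their cooldown, packs their recent activity into a packet, and reviews it against the current self-understanding. What should be merged is merged, what should be dropped is dropped, and the rest is distilled into new skills and understanding. In the next conversation, these go into the prompt index and the system prompt. Then new metrics and signals are produced again, and it returns to the start.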

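Design trade-offs

Introspection has to be cheap. Observation uses zero-model-cost metrics, and the expensive reflection runs in the background, infrequently, and only after passing the dirty check.

Better not to learn than to learn wrong. The transient-error exemption, the option to save nothing, and writing executable instructions instead of vague descriptions all rest on the same judgment.

What is learned should be visible, editable, and at hand. Skills are plain-text files, self-understanding is plain text too, and effectiveness can be checked through metrics. Everything is local.

Learning needs brakes. Without count caps, LRU, and length limits, continuous learning becomes continuous bloat. Forgetting, discarding, and pruning are as important as remembering.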

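Summary

Orkas's self-evolution is a matter of adding a slow loop to the agent. The fast loop is the immediate response in each conversation, and the slow loop looks back at a regular interval and distills experience into a form usable next time. The hard part is not "making the model remember." It is deciding which experiences to keep, not letting chance failures distort learning, making what was learned executable, and pruning before it bloats.

With these judgments combined, "it gets more useful the more you use it" becomes not a slogan but a mechanism that actually runs. An assistant that learns from you, and doesn't learn the wrong things, may be closer to what most people really want than one that is merely smart.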