Changelog ​

All notable changes to confused-ai are documented here.
Format: Keep a Changelog · Semantic Versioning

Full changelog

The authoritative CHANGELOG.md lives in the repository root.
View on GitHub →

v2.3.0 (Current)

Added ​

  • 9 Extended Multi-Agent Orchestration Patterns (@confused-ai/orchestration): Mixture-of-Agents (MoA), Actor-Critic loops, Socratic tutor guiding, Prompt Chaining pipelines, Program-of-Thought code sandbox runtimes, Skeleton-of-Thought parallel generation, Step-Back conceptual abstraction solvers, Rejection Sampling (Best-of-N) evaluations, and validation-driven Self-Correction.
  • createGSDCoordinator() (Get Shit Done): spec-driven workflow coordinator that executes project goals in Plan-Execute-Verify phases, using a workspace .planning folder to isolate contexts.
  • createRalphLoop() (RALF): autonomous cycle executor that uses fresh session isolation to prevent context bloat while propagating iteration summaries.
  • Mastermind Context Compression (@confused-ai/compression): a multi-stage intelligent context compression suite featuring:
    • CacheAligner (KV-cache prefix alignment).
    • Specialized crushers (smart-crusher for JSON, minifiers for Code, log timestamp/duplicate aggregators, XML, CSV).
    • Sliding-window group budget enforcers to prevent orphaned tool call/result pairs.
    • Code & Context Reduction (CCR) annotations with retrieveTool for on-demand details recall.
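
The group-budget idea above can be illustrated with a small sketch. This is not the @confused-ai/compression implementation: the `Msg` shape and the `groupMessages` and `trimToBudget` helpers are hypothetical, and character counts stand in for token counts. It only shows that a tool call and its result are budgeted as one unit, so trimming never keeps one without the other.

```typescript
// Hypothetical message shape for illustration only.
type Msg = { role: "user" | "assistant" | "tool"; text: string; callId?: string };

// Put each tool result in the same group as the call that produced it.
function groupMessages(msgs: Msg[]): Msg[][] {
  const groups: Msg[][] = [];
  for (const m of msgs) {
    const last = groups[groups.length - 1];
    const joinsLast =
      m.role === "tool" &&
      last !== undefined &&
      last[0].callId !== undefined &&
      last[0].callId === m.callId;
    if (joinsLast) {
      last.push(m);
    } else {
      groups.push([m]);
    }
  }
  return groups;
}

// Keep the newest whole groups that fit the budget (characters as a stand-in for tokens).
function trimToBudget(msgs: Msg[], budget: number): Msg[] {
  const groups = groupMessages(msgs);
  const kept: Msg[][] = [];
  let used = 0;
  for (let i = groups.length - 1; i >= 0; i--) {
    const size = groups[i].reduce((n, m) => n + m.text.length, 0);
    if (used + size > budget) break; // stop at the first group that does not fit
    kept.unshift(groups[i]);
    used += size;
  }
  return ([] as Msg[]).concat(...kept);
}
```

Because the walk goes from newest to oldest and stops at the first group that does not fit, the result is always a contiguous suffix of the conversation made of complete groups.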

v1.1.7 ​

Added ​

  • DbScheduleStore (@confused-ai/scheduler): bridges ScheduleManager with any AgentDb backend. Persist schedules to SQLite, Postgres, MySQL, MongoDB, Redis, DynamoDB, or Turso with no custom glue code. See Scheduler → Production persistence.
  • DB health in /health endpoint: createHttpService now accepts a db?: AgentDb option. When provided, GET /health (and /v1/health) runs a live db.health() probe. Returns HTTP 503 with { status: 'degraded' } when the database is unreachable.
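
The health behaviour described above might look roughly like the following. This is a sketch under assumptions, not the createHttpService code: `HealthProbe` and `healthResponse` are hypothetical names, and the `'ok'` body for the healthy case is assumed (the entry only specifies the 503 / `'degraded'` case).

```typescript
// Hypothetical minimal interface; the real AgentDb type has more members.
interface HealthProbe {
  health(): Promise<unknown>;
}

interface HealthResult {
  status: number;
  body: { status: string };
}

// Decide the HTTP status and body for a health request.
async function healthResponse(db?: HealthProbe): Promise<HealthResult> {
  if (!db) {
    // No database configured: nothing to probe.
    return { status: 200, body: { status: "ok" } }; // 'ok' is an assumed value
  }
  try {
    await db.health(); // live probe
    return { status: 200, body: { status: "ok" } };
  } catch {
    // Probe failed: report degraded with 503, as the entry above describes.
    return { status: 503, body: { status: "degraded" } };
  }
}
```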

Fixed ​

  • @confused-ai/db, uuid() security: all 9 backends (InMemory, SQLite, Postgres, MongoDB, Redis, JSON, MySQL, DynamoDB, Turso) now generate IDs with crypto.randomUUID() instead of the previous Math.random()-based implementation.
  • @confused-ai/db, init() race condition: concurrent callers no longer double-initialize the connection. All async backends (Postgres, MongoDB, MySQL, DynamoDB, Turso) now share a single _initPromise guard.
  • PostgresAgentDb: getKnowledgeItems(), getTrace(), and getTraces() now correctly re-serialize JSONB content and metadata columns to string (the pg driver returns these as parsed objects).
  • MongoAgentDb: all findOne and find calls now include { projection: { _id: 0 } } so MongoDB's internal _id field is never included in returned rows.
  • DynamoDbAgentDb: constructor now calls validateTableNames() to catch invalid table names at construction time instead of silently failing at runtime.
  • DbSessionStore: now() helper now returns Unix epoch seconds (Math.floor(Date.now() / 1000)) to match the AgentDb timestamp contract (it previously returned milliseconds, making created_at/updated_at off by a factor of 1000).
  • TursoAgentDb: single-row casts (LibSqlRow → SessionRow, MemoryRow, etc.) now use the as unknown as T double-cast pattern, fixing TypeScript strict-mode errors.
  • PostgresAgentDb: close() method was accidentally stripped during a refactor; restored.
  • Unified class API: SimpleAgent and LegacyAgent were removed from public exports; Agent is now the only class-based API and includes both legacy defaults and modern fluent methods.
  • Durable runtime lifecycle correctness: resume now rejects terminal workflows consistently; terminal-state handling no longer allows invalid resume paths.
  • CQRS error propagation: event-bus handler failures are now surfaced via AggregateError after handlers run.
  • State machine lifecycle hardening: start() is idempotent; transition commits in send() and jumpTo() are atomic (state changes apply only after target onEntry succeeds).
  • Snapshot restore semantics: snapshots persist startup status (started); restore defaults legacy snapshots to started to avoid duplicate initial onEntry execution.
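
For readers unfamiliar with the shared-promise guard mentioned in the init() fix, here is a minimal sketch of the general pattern. `LazyDb`, `connect`, and `connectCount` are hypothetical; only the `_initPromise` field name comes from the entry above, and resetting the guard on failure is a design choice of this sketch rather than documented behaviour.

```typescript
class LazyDb {
  private _initPromise: Promise<void> | null = null;
  public connectCount = 0; // exposed only so the effect is observable

  // Stand-in for an expensive async connection step.
  private async connect(): Promise<void> {
    await new Promise<void>((resolve) => setTimeout(resolve, 5));
    this.connectCount++;
  }

  // Every caller, concurrent or not, awaits the same promise.
  init(): Promise<void> {
    if (!this._initPromise) {
      this._initPromise = this.connect().catch((err) => {
        this._initPromise = null; // allow a retry after a failed attempt
        throw err;
      });
    }
    return this._initPromise;
  }
}
```

The guard is assigned synchronously before any `await`, which is what closes the window in which two callers could both start connecting.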

v1.1.6 ​

Changed ​

Monorepo restructure: packages fully independent

  • All source code now lives in independently built workspace packages under packages/. The src/ directory is retained as a backward-compatible re-export barrel, so there are no breaking API changes.
  • packages/tools rewritten with clean functional defineTool implementations; removed all class-based files with broken relative imports.
  • packages/test-utils is now a fully standalone package: createMockLLM, createMockAgent, runScenario with zero cross-package dependencies.
  • CI pipeline updated to 4 sequential jobs: typecheck → lint → test (Node 18 / 20 / 22) → build all packages.

Fixed ​

  • router/selectForBudget: removed incorrect × 1,000,000 scaling; budget comparison is now a direct dollar-per-million comparison.
  • adapter-redis/session-store: removed unnecessary optional chain on non-null hGetAll result; fixed template literal number type.
  • tools/types.ts: migrated from deprecated ZodTypeAny to z.ZodType; _def private field access replaced with .def.
  • Removed 33 broken package copies that had relative src/-path imports causing circular resolution failures.
  • Documentation URLs updated from rvuyyuru2.github.io/agent-framework to confused-ai.github.io/confused-ai throughout all docs.
  • Version consistency: ARCHITECTURE.md and SECURITY.md now match the package.json version.

Security ​

  • SECURITY.md: added ShellTool sandbox requirements section documenting blocked command patterns.
  • SECURITY.md: documented RedisRateLimiter as the required solution for multi-instance distributed rate limiting.
  • README.md: qualified audit-logging claim; removed unqualified SOC 2 / HIPAA label and added compliance footnote.

v1.1.0 ​

Added ​

agent.stream(): async iterable streaming on every agent

  • Every CreateAgentResult now has a built-in stream(prompt, options?) method
  • Returns an AsyncIterable<string>; chunks arrive as the LLM generates them
  • Works with for await loops; no extra setup; accepts all run() options except onChunk
```ts
for await (const chunk of agent.stream('Explain quantum computing')) {
  process.stdout.write(chunk);
}
```

defineAgent() builder: budget(), checkpoint(), adapters()

  • .budget(config): USD spend caps without dropping down to createAgent()
  • .checkpoint(store): durable crash recovery wired in one line
  • .adapters(registry): plug in adapter registry or explicit bindings
  • Full builder method table now documented in Creating Agents

Performance ​

AgenticRunner: Zod→JSON Schema cached per agent (not per run)

  • Tool definitions (Zod → JSON Schema) are computed once in the constructor and reused
  • Zero toolToLLMDef() overhead on hot-path run() calls after initial agent creation
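
The caching change can be pictured with a toy runner. Everything here is hypothetical (`ToolSpec`, `LLMToolDef`, `toLLMDefSketch`, `RunnerSketch`) and no real Zod conversion takes place; the sketch only shows the conversion happening once at construction instead of on each run.

```typescript
// Hypothetical shapes for illustration; the real tool and LLM definition types differ.
type ToolSpec = { name: string; description: string };
type LLMToolDef = { type: "function"; function: { name: string; description: string } };

let conversions = 0; // counts conversions so the caching effect is observable

// Stand-in for the schema conversion step.
function toLLMDefSketch(tool: ToolSpec): LLMToolDef {
  conversions++;
  return { type: "function", function: { name: tool.name, description: tool.description } };
}

class RunnerSketch {
  private readonly toolDefs: LLMToolDef[];

  constructor(tools: ToolSpec[]) {
    // Convert once, up front.
    this.toolDefs = tools.map((t) => toLLMDefSketch(t));
  }

  // The hot path reuses the cached definitions.
  run(prompt: string): { prompt: string; tools: LLMToolDef[] } {
    return { prompt, tools: this.toolDefs };
  }
}
```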

Tool execution: timer leak fixed

  • Promise.race timeout timer is now always cleared via .finally() on every tool call
  • Prevents 30-second timer handles accumulating in long-running processes
  • Timing now uses performance.now() for sub-millisecond accuracy
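
A minimal sketch of the Promise.race plus .finally() pattern described above. `withTimeout` is a hypothetical helper, not the framework's internal function.

```typescript
// Race the work against a timeout and always clear the timer afterwards.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
  });
  // finally() runs whether the race resolved or rejected, so the timer handle never lingers.
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}
```

Without the cleanup step, each call that finishes early would leave its timer scheduled until it fires, which is the accumulation the fix refers to.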

AuditPlugin: O(1) event queries

  • Internal Map indexes maintained on every onEvent() call
  • getEventsByType(), getEventsForNode(), getEventsForExecution() are all O(1) index lookups
  • Previously O(n) full scans; this eliminates the bottleneck on high-event-volume workflows
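
The indexing approach can be sketched as follows. `AuditEvent` and `EventIndexSketch` are hypothetical and much smaller than the real plugin; the method names mirror those listed above purely for readability.

```typescript
// Hypothetical event shape for illustration.
type AuditEvent = { type: string; nodeId: string; executionId: string };

class EventIndexSketch {
  private byType = new Map<string, AuditEvent[]>();
  private byNode = new Map<string, AuditEvent[]>();

  private static add(index: Map<string, AuditEvent[]>, key: string, e: AuditEvent): void {
    const bucket = index.get(key);
    if (bucket) {
      bucket.push(e);
    } else {
      index.set(key, [e]);
    }
  }

  // Update every index on write so reads never scan the full event list.
  onEvent(e: AuditEvent): void {
    EventIndexSketch.add(this.byType, e.type, e);
    EventIndexSketch.add(this.byNode, e.nodeId, e);
  }

  // Each query is a single Map lookup.
  getEventsByType(type: string): AuditEvent[] {
    return this.byType.get(type) ?? [];
  }

  getEventsForNode(nodeId: string): AuditEvent[] {
    return this.byNode.get(nodeId) ?? [];
  }
}
```

The trade-off is a little extra work and memory per event in exchange for lookups that do not grow with the total number of events.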

OpenTelemetryPlugin: OTel module imported once and cached

  • The @opentelemetry/api dynamic import is now cached after the first successful load
  • Previously re-imported on every onNodeStart() call

v1.0.0 ​

Added ​

Reasoning Module ​

  • ReasoningManager: chain-of-thought and self-critique loops over any generate function
  • ReasoningEventType discriminated union: step, action, complete, error
  • NextAction typed decision point: continue | finish | backtrack | escalate
  • ReasoningStore: pluggable trace persistence (audit, replay, fine-tuning)

Scheduler Module ​

  • ScheduleManager: CRUD for cron-based job schedules with pluggable ScheduleStore + ScheduleRunStore
  • InMemoryScheduleStore (dev) / SqliteScheduleStore (prod)
  • In-process handler registry; no HTTP endpoint required
  • Full lifecycle: create / update / delete / enable / disable / triggerNow / listRuns

CompressionManager ​

  • Transparent context-window compression before LLM calls; truncate | summarise | rolling strategies

ContextProvider ​

  • Retrieves and injects grounding documents into the system prompt or user message at run time
  • Pluggable ContextBackend: InMemoryContextBackend, SqliteContextBackend

Freedom Layer: bare / compose / pipe

  • bare(opts): zero-defaults agent; caller owns LLM, tools, hooks, everything
  • compose(...agents, opts?): sequential pipeline; the output of each agent becomes the input of the next
  • compose options: when (conditional routing) + transform (reshape data between steps)
  • pipe(agent).then(agent).run(prompt): builder-style equivalent to compose()
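
The sequential semantics of compose() can be illustrated with plain async functions. `Step` and `composeSketch` are hypothetical stand-ins; real agents, the `when` and `transform` options, and the pipe() builder are not modelled here.

```typescript
// A step takes text in and produces text out, like a very reduced agent.
type Step = (input: string) => Promise<string>;

// Run the steps in order, feeding each output into the next step.
function composeSketch(...steps: Step[]): Step {
  return async (input: string) => {
    let value = input;
    for (const step of steps) {
      value = await step(value);
    }
    return value;
  };
}

// Two toy steps standing in for agents.
const shout: Step = async (s) => s.toUpperCase();
const exclaim: Step = async (s) => s + "!";
```

Calling `composeSketch(shout, exclaim)("hi")` runs `shout` first and passes its result to `exclaim`.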

Eval Regression Suite ​

  • runEvalSuite: labeled dataset, per-sample scoring, baseline comparison, CI exit code
  • InMemoryEvalStore / SqliteEvalStore: durable baseline persistence across CI jobs
  • setBaseline: true promotes the current run as the new reference; regressionThreshold sets the allowable score drop
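
As an illustration of the baseline comparison, the check below treats regressionThreshold as the largest tolerated drop from the baseline score. `checkRegression` and `RegressionResult` are hypothetical, and the exact comparison runEvalSuite performs is an assumption.

```typescript
interface RegressionResult {
  drop: number;     // how far the current score fell below the baseline
  passed: boolean;  // true when the drop is within the threshold
  exitCode: 0 | 1;  // value a CI wrapper could pass to process.exit
}

function checkRegression(
  baselineScore: number,
  currentScore: number,
  regressionThreshold: number,
): RegressionResult {
  const drop = baselineScore - currentScore; // negative when the score improved
  const passed = drop <= regressionThreshold;
  return { drop, passed, exitCode: passed ? 0 : 1 };
}
```

An improved score yields a negative drop and therefore always passes under this rule.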

Real-World Example Library (4 new runnable examples) ​

  • examples/reasoning-agent.ts: Incident Triage Bot (no API key needed)
  • examples/scheduled-agent.ts: Nightly Market Digest (no API key needed)
  • examples/code-review-pipeline.ts: PR Code Review Pipeline (no API key needed)
  • examples/eval-regression.ts: CI Eval Regression Guard (no API key needed)

Documentation ​


v0.7.0 ​

Added ​

Budget Enforcement ​

  • budget?: BudgetConfig added to CreateAgentOptions; hard USD caps per run and per user
  • BudgetEnforcer instantiated in factory, recordAndCheck(userId) called after the run loop
  • userId?: string added to AgenticRunConfig for per-user cap enforcement
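
A rough sketch of per-user cap enforcement follows. `BudgetSketch` is hypothetical: the real BudgetEnforcer's constructor and the exact signature of recordAndCheck are not shown in this changelog, so the `costUsd` parameter and the thrown error are assumptions.

```typescript
class BudgetSketch {
  private readonly maxUsdPerUser: number;
  private spent = new Map<string, number>();

  constructor(maxUsdPerUser: number) {
    this.maxUsdPerUser = maxUsdPerUser;
  }

  // Record the cost of a finished run, then fail if the user's running total exceeds the cap.
  recordAndCheck(userId: string, costUsd: number): void {
    const total = (this.spent.get(userId) ?? 0) + costUsd;
    this.spent.set(userId, total);
    if (total > this.maxUsdPerUser) {
      throw new Error(`budget exceeded for ${userId}: $${total} > $${this.maxUsdPerUser}`);
    }
  }
}
```

Because the check runs after the cost is recorded, the run that crosses the cap is the one that fails, which matches a check placed after the run loop.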

HITL Approval HTTP Endpoints ​

  • GET /v1/approvals: list pending approvals
  • POST /v1/approvals/:id: submit decision { approved, comment, decidedBy }
  • approvalStore?: ApprovalStore added to CreateHttpServiceOptions

Distributed Trace Context ​

  • W3C traceparent / tracestate extraction from incoming HTTP request headers
  • traceId from incoming trace propagated in JSON and SSE responses
  • src/observability/trace-context.ts: extractTraceContext(), injectTraceContext()
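
For orientation, a W3C traceparent header has the form `version-traceid-parentid-flags`, all lowercase hex. The parser below is a sketch of that format only; `parseTraceparent` and `TraceContext` are hypothetical and are not the extractTraceContext() implementation. As a simplification it accepts only the exact four-field layout and ignores tracestate.

```typescript
type TraceContext = { version: string; traceId: string; parentId: string; sampled: boolean };

function parseTraceparent(header: string | undefined): TraceContext | null {
  if (!header) return null;
  // 2 hex version, 32 hex trace-id, 16 hex parent-id, 2 hex flags.
  const m = /^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/.exec(header.trim());
  if (!m) return null;
  const [, version, traceId, parentId, flags] = m;
  // Version ff and all-zero ids are invalid per the specification.
  if (version === "ff" || /^0+$/.test(traceId) || /^0+$/.test(parentId)) return null;
  // The lowest flag bit marks the trace as sampled.
  return { version, traceId, parentId, sampled: (parseInt(flags, 16) & 1) === 1 };
}
```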

v0.6.0 ​

Added ​

Testing Module (confused-ai/testing) ​

  • MockToolRegistry: records all invocations; calls(), lastCall(), reset()
  • createTestAgent(): zero-config test harness with MockLLMProvider + MockSessionStore
  • createTestHttpService(): integration test helper on a random port

HTTP Runtime ​

  • X-Request-ID correlation header on every response
  • rateLimit middleware option in CreateHttpServiceOptions
  • auditStore option: SQLite-backed persistent audit log
  • WebSocket transport (websocket: true): attaches to existing http.Server
  • Admin API (adminApi: true): /admin/health, /admin/agents, /admin/audit, /admin/stats, /admin/checkpoints

Adapter System (confused-ai/adapters) ​

  • 20-category adapter system covering SQL, NoSQL, vector, cache, object storage, message queues, observability, embedding, session, memory, guardrail, RAG, tool registry, auth, rate limiting, audit log
  • createProductionSetup(): opinionated full-stack wiring with progressive upgrade path

LLM Router (confused-ai/llm) ​

  • LLMRouter: intelligent routing by task type, complexity, and strategy
  • Four strategies: balanced, cost, quality, speed
  • Factories: createBalancedRouter, createCostOptimizedRouter, createQualityFirstRouter, createSpeedOptimizedRouter

Deployment Templates (/templates) ​

  • Dockerfile, docker-compose.yml, fly.toml, render.yaml, k8s.yaml
  • Grafana dashboard JSON (grafana-dashboard.json)

DX Improvements ​

  • defineTool() helper: AI SDK-style fluent builder with Zod schemas, needsApproval, streaming hooks
  • createWorkflow().then(step).commit(): Mastra-style typed step workflows
  • createStepWorkflow, StepWorkflow, StepWorkflowBuilder, StepWorkflowStep exports

Resilience ​

  • withResilience(): circuit breaker + rate limiter + retry + health check wrapper
  • RedisRateLimiter: distributed rate limiting via Redis

Secret Manager (confused-ai/config) ​

  • createSecretManager() with adapters: EnvSecretManagerAdapter, AwsSecretsManagerAdapter, AzureKeyVaultAdapter, VaultAdapter, GcpSecretManagerAdapter

Orchestration Extensions ​

  • AgentRouter: capability-based, round-robin, least-loaded routing
  • HandoffProtocol: structured agent-to-agent task handoff with tracing
  • ConsensusProtocol: multi-agent voting (majority, unanimous, weighted, best-of-n)
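
Of the voting modes listed for ConsensusProtocol, majority voting is the simplest to picture. The function below is a hypothetical sketch, not the protocol's API; it assumes a strict majority (more than half of the votes) and returns null when no option reaches it.

```typescript
// Return the option chosen by more than half of the voters, or null if there is none.
function majorityVote(votes: string[]): string | null {
  const counts = new Map<string, number>();
  for (const v of votes) {
    counts.set(v, (counts.get(v) ?? 0) + 1);
  }
  let winner: string | null = null;
  counts.forEach((n, option) => {
    // At most one option can hold a strict majority.
    if (n > votes.length / 2) winner = option;
  });
  return winner;
}
```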

v0.5.0 ​

Added ​

  • Checkpoint/resume for long-running agents: checkpointStore? in AgenticRunnerConfig
  • createSqliteSessionStoreSync: sync init, safe for factory use
  • Persistent user profiles and learning modes
  • Eval dataset persistence: EvalStore, InMemoryEvalStore, SqliteEvalStore, runEvalSuite
  • Plugin system: confused-ai/plugins with built-in logging, rate-limit, telemetry plugins
  • Contracts layer: confused-ai/contracts for shared interfaces without runtime code

v0.4.0 ​

Added ​

  • Full adapter system for all infrastructure categories
  • Multi-tenancy with createTenantContext()
  • JWT RBAC on HTTP routes
  • SOC 2 / HIPAA audit trail

v0.3.0 ​

Added ​

  • ReAct agentic loop with createAgent
  • createHttpService HTTP runtime with OpenAPI
  • 50+ built-in tools
  • RAG / KnowledgeEngine

v0.1.0 ​

Initial release.

Released under the MIT License.