Studio is now the single browser surface for all plugin flows
Every plugin CLI flow (login, trace plan confirmation, dataset review, template preview) now opens inside Studio instead of launching a separate browser window. If Studio is already open, the plugin navigates it in place rather than opening a new window. This means fewer browser tabs, a consistent UI, and the ability to stay in one window while working with the assistant.
Headless login is now available for environments where a browser can’t reach your terminal (SSH, cloud IDEs, CI). Visit /studio/auth/claude in any browser, sign in, copy the token, and paste it back into your coding agent.
Closing Studio during a trace plan confirmation now cleanly cancels the operation instead of leaving the CLI in an error state.
Automatic package rename in update flow
Running /bitfab:update now detects the legacy bitfab npm package and offers to switch it to @bitfab/sdk. The update flow removes the old package, installs the new one, and rewrites imports in your source files. If you’re already on @bitfab/sdk, nothing changes; the flow works as before.
Studio URL guard and auth verification
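For the package rename entry above, the import rewrite step can be pictured with a small sketch. This is an illustration only, not the plugin's actual implementation: the helper name `rewriteBitfabImports` and the regular expression are assumptions, and a real rewrite would likely work on a parsed syntax tree rather than raw text.

```typescript
// Hypothetical helper (assumption): rewrites module specifiers that are exactly
// "bitfab" to "@bitfab/sdk". It covers `from "bitfab"`, `require("bitfab")`,
// and dynamic `import("bitfab")`; bare side-effect imports are not handled.
function rewriteBitfabImports(source: string): string {
  const specifier = /(\bfrom\s*|\brequire\(\s*|\bimport\(\s*)(["'])bitfab\2/g;
  return source.replace(
    specifier,
    (_match: string, prefix: string, quote: string) =>
      `${prefix}${quote}@bitfab/sdk${quote}`,
  );
}

// Only the exact "bitfab" specifier changes; "bitfab-extras" is a made-up
// package name used to show that look-alike names are left alone.
const beforeRewrite =
  'import { Bitfab } from "bitfab";\nimport extras from "bitfab-extras";';
const afterRewrite = rewriteBitfabImports(beforeRewrite);
```

Files that already import from @bitfab/sdk pass through unchanged, which matches the "nothing changes" behavior described for projects that have already switched.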
Studio now always opens at the correct /studio path. Previously, certain launch conditions could cause the Studio window to open at the site root instead of the Studio interface. A path guard now normalizes the URL before the browser window opens.
The plugin’s auth status check now verifies your API key against the current server. If you switch between servers (e.g., local development to production), the status command correctly reports that re-authentication is needed instead of showing a stale “authenticated” state.
Live-streaming dataset pages in Studio
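For the Studio URL guard entry above, a path guard of this kind could look roughly like the sketch below. The function name `ensureStudioPath` and the exact normalization rules are assumptions; the changelog only states that the URL is normalized to the /studio path before the window opens.

```typescript
// Hypothetical path guard (assumption): if the URL does not already point at
// /studio or a page beneath it, rewrite the path to /studio and keep any
// query string or fragment.
function ensureStudioPath(rawUrl: string): string {
  const match = /^(https?:\/\/[^/?#]+)([^?#]*)(.*)$/.exec(rawUrl);
  if (match === null) {
    throw new Error(`Not an absolute http(s) URL: ${rawUrl}`);
  }
  const [, siteOrigin, pathPart, rest] = match;
  const isStudio = pathPart === "/studio" || pathPart.startsWith("/studio/");
  return isStudio ? rawUrl : `${siteOrigin}/studio${rest}`;
}

// example.test is a placeholder host, not a real Bitfab URL.
const fromSiteRoot = ensureStudioPath("https://example.test/?token=abc");
const alreadyStudio = ensureStudioPath("https://example.test/studio/datasets");
```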
When the Studio assistant creates or picks a dataset, the dataset review page now opens immediately instead of waiting for all traces to be labeled and attached first. Traces appear on the page in real time as the agent finds, labels, and attaches them. Label changes on traces already in a dataset also update live, so you can watch rows move between the “Agent labeled,” “Labeled,” and “Unlabeled” sections without refreshing.
If the page is empty while the agent is still working, a “Building your dataset” indicator shows that traces are on the way.
Trace plan review page scrolls
The trace plan review page now scrolls when a plan has more captured nodes than fit on screen. Before, long plans clipped the Advanced selection toggle and any inline error messages below the fold; opening a plan with around 40 captured nodes now scrolls cleanly through the full call tree.
TypeScript SDK available as @bitfab/sdk
The TypeScript SDK is now published under the scoped package name @bitfab/sdk in addition to the existing bitfab package. Both names resolve to the same code and will stay in sync on every release. If you prefer scoped package names for clarity in your package.json, you can switch your import at any time: import { Bitfab } from "bitfab" and import { Bitfab } from "@bitfab/sdk" work identically.
May 20, 2026
TypeScript SDK, Python SDK, Ruby SDK, Plugins
TypeScript SDK v0.13.1, Python SDK v0.13.1, Ruby SDK v0.12.1, Plugins v0.6.14
SDK serialization hardening
Trace spans now ship reliably even when function inputs or outputs are difficult to serialize. Objects with circular references, oversized payloads (over 512 KB), or classes that throw during serialization no longer cause lost spans. Instead, the SDK replaces the problematic value with a descriptive <unserializable: ClassName (reason)> stub so the span still appears in your traces with full timing and metadata.
This fix applies to the TypeScript, Python, and Ruby SDKs. No code changes are needed on your side; update to the latest SDK version to get the improvement automatically.
Plugin can query Studio browser state
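For the serialization hardening entry above, the fallback behavior can be illustrated with a minimal sketch. It is not the SDK's code: the function names, the reason strings, and the use of string length as a stand-in for byte size are all assumptions. Only the stub format and the 512 KB threshold come from the entry itself.

```typescript
// Approximates the 512 KB limit by counting UTF-16 code units (assumption).
const MAX_SERIALIZED_CHARS = 512 * 1024;

// Builds the stub in the <unserializable: ClassName (reason)> shape.
function describeStub(value: unknown, reason: string): string {
  const className: string =
    typeof value === "object" && value !== null
      ? (Object.getPrototypeOf(value)?.constructor?.name ?? "Object")
      : typeof value;
  return `<unserializable: ${className} (${reason})>`;
}

// Depth-first walk that reports true only for genuine cycles, not for
// objects that are merely referenced from two places.
function hasCycle(value: unknown, ancestors: Set<object> = new Set()): boolean {
  if (typeof value !== "object" || value === null) return false;
  if (ancestors.has(value)) return true;
  ancestors.add(value);
  const found = Object.values(value).some((child) => hasCycle(child, ancestors));
  ancestors.delete(value);
  return found;
}

// Returns JSON for the value, or a JSON-encoded stub when it cannot be shipped.
function safeSerialize(value: unknown): string {
  try {
    if (hasCycle(value)) {
      return JSON.stringify(describeStub(value, "circular reference"));
    }
    const json: string | undefined = JSON.stringify(value);
    if (json === undefined) {
      return JSON.stringify(describeStub(value, "not JSON-representable"));
    }
    if (json.length > MAX_SERIALIZED_CHARS) {
      return JSON.stringify(describeStub(value, "too large"));
    }
    return json;
  } catch {
    return JSON.stringify(describeStub(value, "serialization threw"));
  }
}

// A class whose serialization hook always throws, for demonstration.
class Exploding {
  toJSON(): never {
    throw new Error("boom");
  }
}

const selfReferencing: Record<string, unknown> = {};
selfReferencing["self"] = selfReferencing;
const circularStub = safeSerialize(selfReferencing);
const throwingStub = safeSerialize(new Exploding());
```

The point of the design is that every failure path still returns a string, so the span that carries the value can be sent with its timing and metadata intact.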
Plugins can now check whether the Studio browser tab is connected and which page is currently active via the new getStudioState function. This lets the plugin make smarter decisions before navigating, for example skipping a navigation command when Studio is already on the target page, or surfacing a connection warning when the browser tab has been closed.
Studio crash recovery and agent navigation guardrails
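For the Studio browser state entry above, the decision a plugin makes before navigating might be sketched as follows. The shape of the state object (`connected`, `currentPath`) and the decision names are assumptions for illustration; the changelog does not document the actual return type of getStudioState.

```typescript
// Assumed shape of the state a plugin might get back (hypothetical fields).
interface StudioState {
  connected: boolean;
  currentPath: string | null;
}

type NavigationDecision = "navigate" | "skip" | "warn-disconnected";

// Decide what to do before issuing a navigation command.
function decideNavigation(state: StudioState, targetPath: string): NavigationDecision {
  if (!state.connected) return "warn-disconnected"; // the browser tab was closed
  if (state.currentPath === targetPath) return "skip"; // already on the target page
  return "navigate";
}

// The paths below are placeholders, not documented Studio routes.
const decisionWhenOnTarget = decideNavigation(
  { connected: true, currentPath: "/studio/datasets" },
  "/studio/datasets",
);
```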
Studio now shows a recoverable error screen when a runtime error or unexpected crash occurs during a session. Instead of a blank page or a full 404, you see a “Try again” button that retries without losing your session context.
Agent-initiated navigation is now validated against a known route whitelist. When an agent tries to navigate to an invalid or out-of-scope path, it receives an immediate navigation-blocked event with a reason string instead of waiting for a 12-second timeout. This helps agents self-correct faster when a requested page doesn’t exist.
Studio stays connected through long conversations
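For the navigation guardrails entry above, validating a path against a route whitelist could look like this sketch. The route patterns, the reason strings, and the `navigated` result type are assumptions; only the navigation-blocked event name and the presence of a reason string come from the entry.

```typescript
type NavigationResult =
  | { type: "navigated"; path: string }
  | { type: "navigation-blocked"; path: string; reason: string };

// Hypothetical route patterns; the real whitelist is not listed in the changelog.
const KNOWN_ROUTES: RegExp[] = [
  /^\/studio$/,
  /^\/studio\/datasets\/[^/]+$/,
  /^\/studio\/experiments\/[^/]+$/,
];

// Answer immediately with a blocked event instead of letting the caller time out.
function validateNavigation(path: string): NavigationResult {
  if (path !== "/studio" && !path.startsWith("/studio/")) {
    return { type: "navigation-blocked", path, reason: "out-of-scope path" };
  }
  if (!KNOWN_ROUTES.some((route) => route.test(path))) {
    return { type: "navigation-blocked", path, reason: "unknown route" };
  }
  return { type: "navigated", path };
}

const blockedResult = validateNavigation("/studio/does-not-exist");
```

Returning the reason synchronously is what lets an agent correct the path right away rather than discovering the problem after a timeout.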
The Bitfab plugin now persists the link between your coding agent conversation and your Studio session. Previously, when a long conversation triggered context compaction, the agent lost track of which Studio window it had opened, requiring you to reopen Studio manually. Now the mapping is written to disk and recovered automatically after compaction, so Studio commands continue working seamlessly in extended sessions.
Code change diffs in experiments
When an experiment replays traces against a code change, you can now view the exact diff that was tested. Click the file stats on any experiment card or the code-change pill in the trace detail header to open a side-by-side diff modal. The modal also shows how the dataset reacted overall (fixed, regressed, still passing, still failing) or, when opened from a single trace, whether that specific trace flipped.
Fixed plugin login falling through to manual paste flow
Signing in to a Bitfab plugin (Claude Code, Cursor, or Codex) via the browser now reliably completes the automatic handoff back to your terminal. Previously, the login page could lose the callback parameters during a redirect, causing every login to fall through to the manual “copy and paste this token” flow even when the browser and terminal were on the same machine.
Investigate a trace function with /bitfab:assistant
Run /bitfab:assistant investigate [<key>] to characterize an issue in a trace function without going through the full assistant flow. The agent reads recent traces and your code based on what you describe, then offers three follow-ups: stop with an in-chat summary, save a written report under .bitfab/analysis/, or hand off to dataset building when the findings include reproducible failures worth labeling. The function key is optional; when omitted, the agent picks it from your description or asks.
Agent-initiated Studio session close
Agents can now programmatically end a Studio session when their work is complete. The closeStudio() helper sends a completion event with an optional message, and the browser automatically closes or shows a “Session Complete” screen with the agent’s message. This replaces the need for users to manually click “End session” when the agent is done.
Trace plan confirmation lands inside Studio
The Bitfab plugin’s trace-plan confirmation page (where you review which spans your function will capture) now renders inside your existing Studio tab during /bitfab:setup instead of spawning a second browser window. Studio’s header and agent indicator stay visible while you decide; Confirm or Cancel keeps the tab open for the rest of the flow, so there are no more orphan windows.
If no Studio is running (you invoked /bitfab:setup outside an /bitfab:assistant session), the confirmation falls back to the standalone chromeless window as before.
Annotate a closed trace from any process
The TypeScript and Python SDKs now expose a detached client.getTrace(id) handle that lets you add context, merge metadata, or set the session id on a trace after its root span has closed. The handle works from any process, thread, or agent that knows the trace id, with no shared in-memory state. This is useful when a downstream worker or a forked AI agent needs to attach information to the original conversation’s trace.
Available in bitfab v0.13.0 for TypeScript and Python.
Smarter Studio session management
The assistant flow now reuses an existing Studio session instead of opening a new browser window each time. If Studio becomes unresponsive (tab closed, page crashed), the agent detects this within 12 seconds and offers options to refresh the tab or open a fresh session.
Live activity progress in Studio
Studio now shows which phase the assistant is working on in real time. As the skill progresses through steps like identifying the trace function, building a dataset, or running experiments, the header displays the active phase name with a live elapsed timer. When one phase completes and the next begins, you see the previous phase’s duration before it transitions.
If the agent disconnects or crashes mid-phase, the activity indicator automatically resets within 30 seconds instead of showing stale state indefinitely.
Open a trace plan from inside /bitfab:assistant
Ask the assistant to “open the trace plan for X” (or “show me what’s captured”) and it now routes your open Studio tab to that function’s most recent trace plan in place. The Studio shell stays mounted around the plan, so your agent session, header, and connection indicator persist across the navigation, and no new browser tab pops up. The canonical /trace-plan/[id] URL still works as a standalone shareable link outside Studio.
Click any trace while reviewing a dataset
Fixed a bug that blocked clicks while a trace detail was open. You can now switch traces or press Done without closing the open one first.
Redesigned trace planner
The trace planner now leads with what you actually need to know: a validation summary at the top that calls out anything blocking replay (live writes inside captured spans, missing samples, disconnected roots), then a flow diagram of the captured spans and a sample-trace preview of how the recorded trace will look in the viewer. The legacy two-pane tree picker is still there, tucked behind an Advanced selection toggle for power users. Confirm and Cancel still flow through the same Cmd+Enter / Esc handoff, so muscle memory carries over.
Live agent connection indicator in Studio
Studio now shows a real-time connection status in the header. A green dot with “Agent connected” appears when the coding agent is actively polling, and transitions to a gray dot with “Awaiting agent” if the agent disconnects. The indicator updates instantly when the agent reconnects, with no page refresh needed.
Mutual presence detection for plugins
The agent plugin now receives browserConnected in its poll response, indicating whether a user has Studio open in the browser. This enables plugins to adapt their behavior based on whether someone is actively watching the session.
Studio is now the default assistant mode
The /assistant skill now opens Studio automatically on every invocation. You no longer need to pass a studio argument to get the companion browser surface. Studio is always there, from start to finish.
Studio opens directly at the relevant page
When you start in dataset or experiment mode (/assistant dataset <key> or /assistant experiment <key>), Studio now opens directly at that function’s datasets or experiments page instead of opening at the root and navigating after. This shaves a few seconds off each focused session and puts you in context immediately.
Logout redirects to sign-in page
Signing out no longer lands on a blank page. You’re now redirected to the sign-in page, where you can immediately log back in or close the tab.
Session log capture fix and standalone opt-in
Session log capture now works correctly after opting in during setup. A configuration mismatch previously caused the plugin to silently skip session capture even when you’d consented, so no session data was being collected. You can also now toggle session log capture on or off by running /bitfab:setup session-logs, a standalone mode that doesn’t require authentication.
Trace viewer skips empty spans on open
Opening any trace now lands on the first span that has data instead of a blank trace root or an empty span. This applies across the dashboard: trace detail pages, the labeling panel, the experiments comparison view, the dataset detail panel, and the template preview studio. The hard template filter still hides non-matching spans; you just no longer have to scroll past empty ones to see meaningful content.
Studio navigation events for coding agents
When you navigate between pages in Studio, the coding agent now receives real-time navigation events with the current path. This gives the agent immediate awareness of where you are in Studio, so it can tailor its responses and actions to the page you’re viewing without needing to ask.
Studio sign-in stays within the Studio shell
When your coding agent opens Studio and you’re not signed in, you now see a branded sign-in page inside the Studio window instead of being redirected to the main Bitfab login. The session context persists across the sign-in flow, so the agent picks up exactly where it left off once you authenticate. The CLI receives real-time auth-required and authenticated events, letting it wait for sign-in without polling.
Accurate offline SDK update checks
The plugin’s session-start update check now always reports the correct latest SDK versions. Previously the baked version snapshot could lag behind by one release, causing the plugin to miss update notifications or report you were up to date when a newer SDK was available.
Hill-climb from existing labels in /bitfab:assistant
When you start a new dataset for a function that already has validated labels, the assistant now offers a Reuse option that seeds the dataset with those labels instead of starting from scratch. Pick Reuse when you’re spinning up a different cut for experimentation but want to keep the labeling work you already trust. Define and Open are still there for the from-scratch and broad-sample cases.
Replay verdicts persist with a coverage gate
After a replay in Phase 5, the assistant writes its pass/fail verdicts on the replay traces through a bundled script that verifies every replay trace got a verdict before moving on. Previously a verdict could be lost mid-session if the agent forgot to persist it; now the script enforces full coverage before continuing. If a trace is genuinely ambiguous, you can record it as an explicit skip rather than leaving it silently unverdicted.
Plugins surface which Bitfab org they’re writing to
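For the replay verdict coverage gate described above, the core check can be sketched in a few lines. The type and function names are assumptions, and the bundled script itself is not shown in the changelog; this only illustrates the idea that an explicit skip counts as coverage while a missing verdict does not.

```typescript
// An explicit "skip" is a recorded decision, so it satisfies the gate.
type Verdict = "pass" | "fail" | "skip";

// Returns the replay trace ids that have no recorded verdict at all.
function findUnverdictedTraces(
  replayTraceIds: string[],
  verdicts: Map<string, Verdict>,
): string[] {
  return replayTraceIds.filter((traceId) => !verdicts.has(traceId));
}

// Refuses to continue until every replay trace has a verdict.
function assertFullCoverage(
  replayTraceIds: string[],
  verdicts: Map<string, Verdict>,
): void {
  const missing = findUnverdictedTraces(replayTraceIds, verdicts);
  if (missing.length > 0) {
    throw new Error(`Missing verdicts for: ${missing.join(", ")}`);
  }
}

// Trace ids here are made up for the example.
const recordedVerdicts = new Map<string, Verdict>([
  ["trace-1", "pass"],
  ["trace-2", "skip"],
]);
const stillMissing = findUnverdictedTraces(
  ["trace-1", "trace-2", "trace-3"],
  recordedVerdicts,
);
```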
The plugin MCP now flags which Bitfab org it reads and writes from, so you’ll catch mismatches between your project’s BITFAB_API_KEY and the org open in your Studio tab before traces land somewhere unexpected. Coding agents now call get_api_key_context at the start of a plugin MCP session, and again whenever you mention data you just wrote isn’t visible in Studio. The same tightening applies to the remote MCP server in the Dashboard for direct (non-plugin) callers.
Ruby SDK: skip child spans during replay
When you replay historical traces through client.replay(...), you can now have child spans return their recorded outputs instead of running real code. Three strategies control which children get short-circuited:
mock: "none" (default) reruns every child span as before.
mock: "all" returns historical output for every child.
mock: "marked" returns historical output only for spans declared with mock_on_replay: true, and runs everything else real.
Use mock: "marked" to iterate on agent logic without paying for the marked child calls on each replay. Use mock: "all" for the cheapest possible replay (only the root function runs real code). This brings the Ruby SDK to parity with the existing mock option in the Python and TypeScript SDKs.
Ruby SDK: fluent wrapper for shared trace function keys
client.get_function(key) returns a wrapper bound to that trace function key, so you can wrap multiple methods or classes without repeating the key on every call. This mirrors client.get_function in the Python SDK and client.getFunction in TypeScript.
Accurate experiment counts with multi-label traces
Experiment pass/fail counts now correctly deduplicate traces that have labels from multiple sources (human review, approved agent, unapproved agent). Previously, a trace with both a human and an agent label could be double-counted in experiment totals. The viewer now picks the highest-priority label per trace: human labels take precedence over approved agent labels, which take precedence over unapproved ones.
Experiments auto-label replayed traces
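For the multi-label deduplication entry above, the priority rule (human over approved agent over unapproved agent) can be sketched as below. The type names, field names, and source identifiers are assumptions for illustration, not the dashboard's data model.

```typescript
type LabelSource = "human" | "approved_agent" | "unapproved_agent";

interface TraceLabel {
  traceId: string;
  source: LabelSource;
  passed: boolean;
}

// Lower number wins: human, then approved agent, then unapproved agent.
const SOURCE_PRIORITY: Record<LabelSource, number> = {
  human: 0,
  approved_agent: 1,
  unapproved_agent: 2,
};

// Keep exactly one label per trace, choosing the highest-priority source.
function pickLabelPerTrace(labels: TraceLabel[]): Map<string, TraceLabel> {
  const best = new Map<string, TraceLabel>();
  for (const label of labels) {
    const current = best.get(label.traceId);
    if (
      current === undefined ||
      SOURCE_PRIORITY[label.source] < SOURCE_PRIORITY[current.source]
    ) {
      best.set(label.traceId, label);
    }
  }
  return best;
}

// Count each trace once, using only its chosen label.
function countResults(labels: TraceLabel[]): { passed: number; failed: number } {
  let passed = 0;
  let failed = 0;
  pickLabelPerTrace(labels).forEach((label) => {
    if (label.passed) passed += 1;
    else failed += 1;
  });
  return { passed, failed };
}

// trace-1 has two labels that disagree; the human label decides the outcome.
const dedupedTotals = countResults([
  { traceId: "trace-1", source: "unapproved_agent", passed: true },
  { traceId: "trace-1", source: "human", passed: false },
  { traceId: "trace-2", source: "approved_agent", passed: true },
]);
```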
When you run an experiment through the assistant, replayed traces now receive agent labels automatically. The experiment viewer shows pass/fail results immediately after a replay completes, without requiring a manual labeling step first.
May 14, 2026
Plugins, TypeScript SDK, Python SDK
Plugins v0.5.11, TypeScript SDK v0.12.2, Python SDK v0.12.1
Replay with mocks: shared-key spans return the correct output
Fixed an off-by-one in the replay mockTree when the function under test and one of its children share one traceFunctionKey (the canonical getFunction(key).withSpan(...) pattern). The marked child was returning the root’s historical output instead of its own. The mockTree is now keyed by (traceFunctionKey, spanName, callIndex), which also unblocks recursive same-key replays.
Mocked non-async Promise-returning functions stay Promises
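For the mockTree fix above, keying recorded outputs by (traceFunctionKey, spanName, callIndex) can be illustrated with a small sketch. This is not the SDK's internal structure: the names are hypothetical, and counting call indices per (key, span name) pair in recorded order is an assumption about how the index is assigned.

```typescript
interface RecordedSpan {
  traceFunctionKey: string;
  spanName: string;
  output: unknown;
}

// Composite lookup key built from all three parts.
function mockKey(traceFunctionKey: string, spanName: string, callIndex: number): string {
  return JSON.stringify([traceFunctionKey, spanName, callIndex]);
}

// Index recorded outputs so spans sharing a function key never collide.
function buildMockTree(recorded: RecordedSpan[]): Map<string, unknown> {
  const tree = new Map<string, unknown>();
  const callCounts = new Map<string, number>();
  for (const span of recorded) {
    const counterKey = JSON.stringify([span.traceFunctionKey, span.spanName]);
    const callIndex = callCounts.get(counterKey) ?? 0;
    callCounts.set(counterKey, callIndex + 1);
    tree.set(mockKey(span.traceFunctionKey, span.spanName, callIndex), span.output);
  }
  return tree;
}

// The root and its child share one key; the child runs twice. All names are made up.
const sampleMockTree = buildMockTree([
  { traceFunctionKey: "support-agent", spanName: "root", output: "root-output" },
  { traceFunctionKey: "support-agent", spanName: "lookup", output: "first-lookup" },
  { traceFunctionKey: "support-agent", spanName: "lookup", output: "second-lookup" },
]);
const secondLookupOutput = sampleMockTree.get(mockKey("support-agent", "lookup", 1));
```

Because the span name and call index are part of the key, a child that shares the root's function key resolves to its own recorded output, and repeated or recursive calls each get a distinct slot.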
If you wrap a function fetchX() { return fetch(...) } (no async, but returns a Promise) and mock it during replay, the mocked return is now a Promise, not a raw value. Downstream .then(...) callers no longer crash. This is detected at wrap time.
/bitfab:assistant experiments auto-pick parallel or serial
Phase 5 of the assistant skill now checks whether subagent worktrees inherit bypass permissions before forking parallel experiments. If permissions.defaultMode: "bypassPermissions" is set in committed .claude/settings.json or ~/.claude/settings.json, experiments fork to worktree-isolated subagents; otherwise they run serially in the main agent. Cursor and Codex always run serial since they don’t support worktree-isolated subagent calls.
Organization switcher fix
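For the parallel-or-serial entry above, the decision rule can be sketched as a pure function over already-parsed settings files. The host identifiers and function name are assumptions; the permissions.defaultMode field and the "bypassPermissions" value are taken from the entry. Reading and parsing the two settings files is left out.

```typescript
// Only the field relevant to this decision is modeled.
interface ClaudeSettings {
  permissions?: { defaultMode?: string };
}

// Hypothetical identifiers for the three supported coding agents.
type AgentHost = "claude-code" | "cursor" | "codex";
type ExperimentMode = "parallel" | "serial";

// settingsFiles stands for the parsed contents of the committed
// .claude/settings.json and ~/.claude/settings.json, whichever exist.
function pickExperimentMode(
  host: AgentHost,
  settingsFiles: ClaudeSettings[],
): ExperimentMode {
  // Cursor and Codex always run serially.
  if (host !== "claude-code") return "serial";
  const bypassEnabled = settingsFiles.some(
    (settings) => settings.permissions?.defaultMode === "bypassPermissions",
  );
  return bypassEnabled ? "parallel" : "serial";
}

const modeWithBypass = pickExperimentMode("claude-code", [
  {},
  { permissions: { defaultMode: "bypassPermissions" } },
]);
```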
Fixed the organization switcher dropdown not appearing in the header. After upgrading to Clerk v7, the switcher silently returned no memberships, making it impossible to switch between teams. The switcher now reliably shows all your organizations, with your personal workspace listed first and the rest sorted alphabetically.
Live agent activity in Studio
The Studio home page now shows what your coding agent is doing in real time. While the assistant is working, the agent card highlights green and displays the current tool action (e.g., “Reading traces…”, “Creating grader…”). When the agent finishes or goes idle, the card fades back to its neutral state. Activity persists across page navigation within Studio, so you won’t lose track of the agent’s progress.
Studio detects which coding agent opened it
When Studio is launched from Cursor or Codex, the UI now shows that agent’s logo and name instead of defaulting to Claude Code. The welcome page, header, and “Return to” button all reflect the agent that started the session.
Replay failure handling in the assistant skill
/bitfab:assistant now separates infrastructure failures (missing DB rows, rejected writes) from real regressions during replay, and keeps unreplayable traces out of the pass rate. When a child span fails environmentally, it suggests either flipping the span to mockOnReplay or pointing replay at the trace’s source environment.
Replay mocks return the correct child span’s output
Fixed ordering bugs in mock: "marked" that caused a marked child span to return a sibling’s historical output instead of its own. Upgrade to TypeScript SDK 0.12.1 if you’re using mock: "marked" on 0.12.0.
May 13, 2026
Plugins, TypeScript SDK, Python SDK, Dashboard
Plugins v0.5.7, TypeScript SDK v0.12.0, Python SDK v0.12.0
Mock child spans during replay
When you replay a recorded trace against new code, child spans sometimes fail locally for reasons unrelated to what you’re iterating on, like a paid API key you don’t have set, a flaky external service, or a production database row that isn’t seeded in your local environment. Replay now supports skipping those children and returning their recorded outputs instead, so the root function can still run.
Pass a mock strategy to replay() to control it. "none" (default) runs every child for real. "all" returns the historical output for every descendant. "marked" only short-circuits descendants you’ve tagged at definition time, leaving everything else to run real, which is the iteration-friendly mode.
Tag a span with mockOnReplay: true in TypeScript or mock_on_replay=True in Python, then replay with mock: "marked" (TS) or mock="marked" (Python). The flagged child returns its recorded output and downstream spans run real code, so you can iterate on the analysis or formatting steps without standing up the upstream dependency.
When the assistant skill is replaying a function and a child span fails environmentally, it’ll now suggest this fix directly. Full docs: TypeScript SDK and Python SDK reference under “Mocking child spans during replay”.
Reliable focus restoration for macOS terminals
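For the three mock strategies described above, the selection logic can be modeled in a few lines. This is a conceptual sketch, not the SDK's replay implementation: `ChildSpan`, `shouldMock`, and `resolveChild` are hypothetical names, and only the strategy values and the mockOnReplay flag come from the entry.

```typescript
type MockStrategy = "none" | "all" | "marked";

// Minimal model of a recorded child span (hypothetical shape).
interface ChildSpan {
  spanName: string;
  mockOnReplay: boolean;
  recordedOutput: unknown;
}

// Decide whether a child returns its recorded output or runs for real.
function shouldMock(strategy: MockStrategy, span: ChildSpan): boolean {
  switch (strategy) {
    case "none":
      return false; // every child runs real code
    case "all":
      return true; // every child returns its recorded output
    case "marked":
      return span.mockOnReplay; // only children tagged at definition time
  }
}

// Either short-circuit with the recorded output or invoke the real function.
function resolveChild(
  strategy: MockStrategy,
  span: ChildSpan,
  runReal: () => unknown,
): unknown {
  return shouldMock(strategy, span) ? span.recordedOutput : runReal();
}

// A tagged upstream call and an untagged formatting step (made-up spans).
const paidApiSpan: ChildSpan = {
  spanName: "call-paid-api",
  mockOnReplay: true,
  recordedOutput: "recorded",
};
const formattingSpan: ChildSpan = {
  spanName: "format-answer",
  mockOnReplay: false,
  recordedOutput: "old",
};
const paidApiResult = resolveChild("marked", paidApiSpan, () => "live");
const formattingResult = resolveChild("marked", formattingSpan, () => "live");
```

Under "marked", the tagged span never touches the upstream dependency while the untagged one still exercises the new code, which is the property that makes this mode useful for iteration.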
When clicking “Return to coding agent” in Studio, focus now reliably returns to the correct terminal app. The previous approach could target the wrong window if the terminal’s environment was modified (common inside Claude Code). The plugin now identifies your terminal by walking the process tree to find the parent application. For iTerm2 users with multiple windows, focus targets the exact session pane.
Persistent Studio session for the assistant flow
Add studio to any /bitfab:assistant invocation (e.g., /bitfab:assistant studio) to keep a single Studio window open for the entire flow. Dataset review and experiment results open inside the same window instead of launching separate ones, so you stay in one place while iterating. Without the studio argument, the flow works exactly as before.
See which template renders each span at a glance
Iterating on the right template is faster when you can tell which one runs for the span you’re looking at. In the template preview, click or arrow-key through any span in the trace viewer and the matching card in the left rail lights up in that span’s color; that’s the template to edit.
Know when your coding agent is mid-edit
Stay out of the agent’s way and watch its work land in context. When your agent saves a template, the studio names who is editing (Claude Code, Cursor, or Codex), pulses the affected card in the rail, and outlines the exact region inside the rendered span. This happens even on instant saves, so you don’t miss it.
Chat session capture
Bitfab plugins can now capture your coding-agent chat sessions and send them to the dashboard. Session capture is opt-in: enable it by setting BITFAB_CAPTURE_SESSIONS=true or adding "captureSessions": true to ~/.config/bitfab/config.json. Nothing is captured until you explicitly turn it on.
Once enabled, sessions are only recorded after you invoke a Bitfab tool or slash command in the same conversation, so ordinary non-Bitfab conversations are never captured. Works across Claude Code, Cursor, and Codex.
Cross-platform focus restoration
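For the chat session capture entry above, the two-step gate (explicit opt-in, then a Bitfab invocation in the same conversation) can be sketched as follows. The function names are assumptions and the environment is passed in as a plain record so the logic stays testable; the variable name BITFAB_CAPTURE_SESSIONS and the captureSessions key come from the entry. Treating the two opt-in sources as a simple OR is an assumption about how they combine.

```typescript
// Parsed contents of ~/.config/bitfab/config.json, reduced to the relevant key.
interface BitfabConfig {
  captureSessions?: boolean;
}

// Either opt-in source enables capture (assumed to combine as a logical OR).
function isSessionCaptureEnabled(
  env: Record<string, string | undefined>,
  config: BitfabConfig,
): boolean {
  return env["BITFAB_CAPTURE_SESSIONS"] === "true" || config.captureSessions === true;
}

// Even when enabled, a conversation is recorded only after a Bitfab tool or
// slash command has been invoked in it.
function shouldRecordConversation(
  captureEnabled: boolean,
  bitfabToolInvoked: boolean,
): boolean {
  return captureEnabled && bitfabToolInvoked;
}

const optedInViaEnv = isSessionCaptureEnabled({ BITFAB_CAPTURE_SESSIONS: "true" }, {});
const recordsPlainChat = shouldRecordConversation(optedInViaEnv, false);
```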
When a plugin opens a browser window (OAuth login, Studio preview), focus now returns to your terminal or editor automatically on Linux and Windows. Previously this only worked on macOS. If platform tools aren’t available (e.g., Wayland on Linux), the handoff completes normally without focus restoration.
Studio connection errors surface immediately
When your coding agent opens the Studio preview, connection problems (expired API key, network timeout) are now caught before the browser window opens. Previously, errors could surface mid-session after you’d already started editing.
Click-to-target template editing
In the template preview studio, you can now click directly on a rendered span to tell your coding agent exactly which region you want changed. No more “make the user message smaller” guesswork: point at the element and describe the change.
Live preview auto-refresh
Templates saved in the studio now re-render in the preview automatically. Previously, you had to reload the page to see your changes.
Template reference for coding agents
The new get_template_reference MCP tool returns a catalog of every editable region in the standard template, so coding agents can discover what’s available without you having to describe it.
Template preview is faster on large functions
The template preview page loads significantly faster for functions with many spans. Pages that previously made dozens of parallel requests now resolve in a single batched call.
Template rendering page: template-first layout
The template rendering page now starts from the templates instead of starting from a trace. You pick a template and see exactly which spans it affects.
The new three-column layout shows all templates for a function on the left, the current trace in the center (with non-matching spans dimmed), and affected spans across recent traces on the right. If the current trace has no spans for the selected template, the viewer auto-navigates to one that does.
API key context
Coding agents can now call get_api_key_context to find out which organization and environment their API key belongs to before sending traces. No more guesswork.
API key descriptions
You can now add a description when creating API keys in the dashboard. Descriptions show up in the key list and are returned by get_api_key_context, so your coding agent can tell you which key it’s using without you having to check.