SDKs¶

Two first-party clients — Python (async) and TypeScript / browser. They wrap the WebSocket protocol, handle auth, and demux the binary audio stream.

Python — `audvoice-client`¶

pip install audvoice-client
pip install "audvoice-client[mic]"   # adds sounddevice + numpy for CLI demos

API surface:

Method	Purpose
`AudVoiceClient(base_url, api_key)`	Construct
`connect()` / `close()` / `async with`	Lifecycle
`update_session(**cfg)`	Set voice, languages, model, tools, instructions, RAG
`send_audio(pcm)`	PCM16 mono 16 kHz frames
`send_text(text)`	Inject user message (skip STT)
`commit_input()` / `cancel_response()`	Forced end-of-turn / cancel current reply
`send_tool_result(call_id, output)`	Return a tool result
`events() -> AsyncIterator[dict]`	All JSON events
`audio() -> AsyncIterator[(sid, bytes)]`	Decoded binary audio chunks (PCM16 24 kHz)

See packages/client_py/README.md and tests/live_ws.py for full examples.

TypeScript — `@audvoice/client`¶

npm install @audvoice/client

import { AudVoiceClient } from "@audvoice/client";

const client = new AudVoiceClient({ baseUrl: "https://voice.example", apiKey: "…" });
await client.connect();
await client.updateSession({ voice: "ar-AE-FatimaNeural", languages: ["ar-AE", "en-US"] });

client.on("transcript.final", (e) => console.log(e.text));
client.on("audio", ({ pcm }) => playPcm24k(pcm));   // Int16Array

await client.sendAudio(pcm16k);  // ArrayBuffer / Int16Array

Works in browsers and Node ≥ 18. For older Node, pass fetchImpl and WebSocketImpl (e.g. from ws).

Other languages¶

The wire is plain JSON-over-WebSocket plus binary PCM frames; any language can talk to it. See protocol.md. The minimum implementation is ~100 lines.

If you write one (Go, Rust, Swift, Kotlin, .NET), open a PR — we'll list it here.

Publishing¶

Python → PyPI¶

cd packages/client_py
pip install build twine
python -m build                                # produces dist/audvoice-client-*.whl
python -m twine upload dist/*                  # needs PYPI_API_TOKEN

TypeScript → npm¶

cd packages/client_js
npm install
npm run build
npm publish --access public                    # needs NPM_TOKEN

Versioning¶

Both packages follow SemVer. The wire protocol carries its own version under session.created.protocol_version (planned) — bump major when you change message shapes.

Embedding the orchestrator¶

You can also embed the FastAPI app (audvoice.main:app) inside a larger service:

from fastapi import FastAPI
from audvoice.main import app as voice_app

root = FastAPI()
root.mount("/voice", voice_app)         # now /voice/v1/sessions, /voice/v1/voice

Or import individual pieces (Session, LlmRunner, SttPipe, TtsPipe) if you want to write a different transport (Server-Sent Events, gRPC, ACS SIP, etc.).