AsyncFlow API

Text to Speech (WebSocket)

wss://api.async.ai/text_to_speech/websocket/ws
The Text-to-Speech WebSocket API streams audio from partial text while keeping prosody consistent across chunks. Use it when your text arrives incrementally, for example from real-time transcription or chat.
It may be less suitable when the full text is available upfront (a plain HTTP request is simpler and has lower latency) or when you need to prototype quickly (a WebSocket client takes more work to set up).

Handshake

WSS wss://api.async.ai/text_to_speech/websocket/ws

Query parameters

| Name | Type | Required | Description |
|---|---|---|---|
| api_key | string | Yes | Async API key. |
| version | string | Yes | API version (example: v1). |
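
As a minimal sketch of how the two required parameters attach to the handshake URL: the helper name `build_ws_url` is hypothetical, and the default `v1` is taken from the example value shown on this page rather than from any documented default.

```python
from urllib.parse import urlencode

# Endpoint as given on this page.
WS_ENDPOINT = "wss://api.async.ai/text_to_speech/websocket/ws"


def build_ws_url(api_key: str, version: str = "v1") -> str:
    """Return the handshake URL carrying both required query parameters.

    Hypothetical helper; "v1" mirrors the example value on this page.
    """
    if not api_key:
        raise ValueError("api_key is required")
    if not version:
        raise ValueError("version is required")
    # urlencode percent-escapes characters that are unsafe in a query string.
    query = urlencode({"api_key": api_key, "version": version})
    return f"{WS_ENDPOINT}?{query}"
```

Passing the result to a WebSocket client opens the connection; keeping the key out of logs is advisable because it travels in the URL.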

Send

initializeConnection object (required)

| Property | Type | Required | Description |
|---|---|---|---|
| model_id | string | Yes | Model ID (example: "asyncflow_v2.0"). |
| voice | object | Yes | Dictionary with keys "mode" and "id" (example: {"mode": "id", "id": "e0f39dc4-f691-4e78-bba5-5c636692cc04"}). |
| output_format | object | No | Dictionary with keys "container", "encoding", "sample_rate", "bit_rate". Defaults to {"container": "raw", "encoding": "pcm_s16le", "sample_rate": 44100}. |
| language | string | No | Language of the generated speech. |

For additional details, see the Text-to-Speech endpoint, which uses almost the same parameters.
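
The initializeConnection payload could be assembled along these lines. This is a sketch only: `build_initialize_message` is a hypothetical helper, and it assumes that omitting an optional field lets the server apply its documented default.

```python
import json


def build_initialize_message(model_id, voice_id, output_format=None, language=None) -> str:
    """Serialize an initializeConnection message (hypothetical helper).

    Optional fields are left out when not supplied, on the assumption that
    the server then falls back to its documented defaults.
    """
    message = {
        "model_id": model_id,
        # The table describes voice as a dictionary with "mode" and "id".
        "voice": {"mode": "id", "id": voice_id},
    }
    if output_format is not None:
        message["output_format"] = output_format
    if language is not None:
        message["language"] = language
    return json.dumps(message)
```

For example, `build_initialize_message("asyncflow_v2.0", "e0f39dc4-f691-4e78-bba5-5c636692cc04")` yields a message with only the two required properties.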
sendText object (required)

| Property | Type | Required | Description |
|---|---|---|---|
| transcript | string | Yes | New text chunk; always ends with a single space. |
| force | boolean | No | Force synthesis even if there are not enough characters in the buffer. Defaults to false. |
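
One way to satisfy the single-trailing-space rule is to normalize each chunk before serializing it. The sketch below uses a hypothetical helper, `build_send_text_message`, and follows the `transcript` field name from the table; the example flow on this page shows a `text` key instead, so the field name is worth confirming against the live API.

```python
import json


def build_send_text_message(chunk: str, force: bool = False) -> str:
    """Serialize one sendText message whose text ends with exactly one space.

    Hypothetical helper; the "transcript" key follows the property table.
    """
    if not chunk.strip():
        # An empty string is reserved for closing the connection.
        raise ValueError("chunk must contain non-whitespace text")
    # Drop any trailing whitespace, then append the single required space.
    message = {"transcript": chunk.rstrip() + " "}
    if force:
        message["force"] = True
    return json.dumps(message)
```

Setting `force=True` would be appropriate for the last short fragment of an utterance, when no further text is coming to fill the buffer.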
closeConnection object (required)

| Property | Type | Required | Description |
|---|---|---|---|
| text | string | Yes | Empty string to finish. |

Receive

audioOutput object (streamed)

| Field | Type | Required | Description |
|---|---|---|---|
| audio | string | Yes | Base64-encoded audio chunk. |
| final | boolean | Yes | Whether this is the final response for the request. |
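
To make streamed chunks playable, they can be decoded and wrapped in a WAV container. The sketch below assumes raw `pcm_s16le` output at 44100 Hz (the documented default) and mono audio; the channel count is not stated on this page, so `channels=1` is an assumption, as is the helper name `pcm_chunks_to_wav`.

```python
import base64
import io
import wave


def pcm_chunks_to_wav(b64_chunks, sample_rate=44100, channels=1) -> bytes:
    """Join Base64 audio chunks and wrap them in a WAV header.

    Assumes raw pcm_s16le samples; channels=1 is a guess, not documented.
    """
    pcm = b"".join(base64.b64decode(chunk) for chunk in b64_chunks)
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # pcm_s16le uses 2 bytes per sample
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buffer.getvalue()
```

This approach would not suit the `pcm_f32le` encoding used in the example flow, since the standard `wave` module writes integer PCM only.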
finalOutput object

| Field | Type | Required | Notes |
|---|---|---|---|
| audio | string | Yes | Always "". |
| final | boolean | Yes | Always true; generation complete. |
Error response object

| Field | Type | Required | Notes |
|---|---|---|---|
| error_code | string | Yes | Error code identifying the type of error. |
| message | string | Yes | Human-readable error message. |
| extra | object | No | Additional error details. |
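
A receiver needs to tell the three message shapes apart. The following is a sketch under the assumption that an error response is recognizable by the presence of `error_code`; `TTSError` and `parse_server_message` are hypothetical names.

```python
import base64
import json


class TTSError(Exception):
    """Raised for an error response (hypothetical exception type)."""

    def __init__(self, error_code, message, extra=None):
        super().__init__(f"{error_code}: {message}")
        self.error_code = error_code
        self.extra = extra


def parse_server_message(raw: str):
    """Return (audio_bytes, is_final), or raise TTSError for an error response."""
    data = json.loads(raw)
    if "error_code" in data:
        raise TTSError(data["error_code"], data.get("message", ""), data.get("extra"))
    # finalOutput carries an empty audio string, which decodes to b"".
    return base64.b64decode(data.get("audio", "")), bool(data.get("final", False))
```

A read loop can append the returned bytes to a buffer and stop once `is_final` is true.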

Example handshake & message flow

Handshake — GET /text_to_speech/websocket/ws
↑ (send) initializeConnection — {"model_id": "asyncflow_v2.0",...
{
  "model_id": "asyncflow_v2.0",
  "voice": {
    "mode": "id",
    "id": "e0f39dc4-f691-4e78-bba5-5c636692cc04"
  },
  "output_format": {
    "container": "raw",
    "encoding": "pcm_f32le",
    "sample_rate": 44100
  }
}
↑ (send) sendText — {"text":"Welcome to Async."}
{"text":"Welcome to Async."}
↑ (send) closeConnection — {"text":""}
{ "text": "" }
↓ (receive) audioOutput — {"audio":"Y3Vya...",...}
{
  "audio": "Y3VyaW91cyBtaW5kcyB0aGluayBhbGlrZSA6KQ==",
  "final": false
}
↓ (receive) finalOutput — {"audio":"", "final":true}
{ "audio": "", "final": true }
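
The flow above can be sketched as one coroutine that works with any connection object exposing async `send` and `recv` methods (the third-party `websockets` package provides such objects, though it is not used here). Everything named below is hypothetical: `synthesize`, the `text_key` parameter (defaulting to the `transcript` field from the sendText table, while the example above shows `text`), and `ScriptedSocket`, an in-memory stand-in included only so the sketch runs offline.

```python
import asyncio
import base64
import json


class ScriptedSocket:
    """In-memory stand-in for a WebSocket connection (for offline runs only)."""

    def __init__(self, replies):
        self.sent = []
        self._replies = list(replies)

    async def send(self, data):
        self.sent.append(data)

    async def recv(self):
        return self._replies.pop(0)


async def synthesize(ws, init_message: dict, chunks, text_key: str = "transcript") -> bytes:
    """Send initializeConnection, each text chunk, then closeConnection,
    and collect decoded audio until a message with final=true arrives."""
    await ws.send(json.dumps(init_message))
    for chunk in chunks:
        await ws.send(json.dumps({text_key: chunk}))
    await ws.send(json.dumps({"text": ""}))  # closeConnection: empty string
    audio = bytearray()
    while True:
        reply = json.loads(await ws.recv())
        if "error_code" in reply:
            raise RuntimeError(f"{reply['error_code']}: {reply.get('message', '')}")
        audio += base64.b64decode(reply.get("audio", ""))
        if reply.get("final"):
            return bytes(audio)
```

This version sends all text before reading any audio, which keeps the sketch short; a client that cares about latency would more likely read replies concurrently with sending.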

Request

Query Params

| Name | Type | Required | Example |
|---|---|---|---|
| api_key | string | Yes | <api-key> |
| version | string | Yes | v1 |
Modified at 2025-07-15 14:32:50