ojin/oris-voice
High-quality multilingual text-to-speech with voice cloning, preset speakers, and promptable voice design
Overview
Oris Voice (ojin/oris-voice) is Ojin's streaming text-to-speech product. You get natural-sounding speech at 24 kHz with three voice modes: clone from a short reference clip, pick a built-in speaker, or describe the voice you want in natural language.
Key Features
Voice Cloning — Clone any voice from a short reference audio clip and optional transcript
Built-in Voices — Choose from a library of built-in speaker identities
Voice Design — Describe the desired voice characteristics in natural language
Streaming Output — Audio chunks stream in real time as the model generates, enabling low-latency playback
Multilingual — Supports Chinese, English, Japanese, Korean, German, French, Russian, Portuguese, Spanish, Italian, with automatic language detection
High-Quality Audio — 24 kHz, 16-bit PCM mono output
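The output format listed above (24 kHz, 16-bit, mono PCM) is enough to wrap streamed audio in a WAV container and to estimate its duration from the byte count. The sketch below uses only the Python standard library. It assumes the chunks are raw little-endian PCM bytes with no per-chunk header, which this page does not state; the function names are illustrative and are not part of any Ojin SDK.

```python
import io
import wave

# Format constants taken from the feature list: 24 kHz, 16-bit, mono.
SAMPLE_RATE_HZ = 24_000
SAMPLE_WIDTH_BYTES = 2
CHANNELS = 1


def pcm_chunks_to_wav(chunks):
    """Wrap raw PCM chunks in a WAV container and return the file bytes.

    Assumption: each chunk is headerless little-endian PCM, as WAV expects.
    """
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH_BYTES)
        wav.setframerate(SAMPLE_RATE_HZ)
        for chunk in chunks:
            wav.writeframes(chunk)
    return buf.getvalue()


def pcm_duration_seconds(num_bytes):
    """Duration of a PCM payload: bytes / (rate * width * channels)."""
    return num_bytes / (SAMPLE_RATE_HZ * SAMPLE_WIDTH_BYTES * CHANNELS)
```

At this format, one second of audio is 24,000 samples × 2 bytes = 48,000 bytes, which is a handy figure when sizing playback buffers.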
Voice Modes
Clone
Reproduce a voice from a reference audio sample. Provide ref_audio and optionally ref_text.
Built-in Voices
Use a built-in speaker identity. Provide a speaker name and, optionally, instruct to give style instructions.
Voice Design
Generate a voice from a natural language description. Provide instruct (e.g., "a warm female voice with a British accent").
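The three modes differ only in which fields they require. As a rough sketch, the helper below validates a set of voice parameters before they are sent. The field names ref_audio, ref_text, speaker, and instruct come from the descriptions above. The mode labels ("clone", "builtin", "design"), the function name, and the rule that fields from other modes are rejected are assumptions for illustration, not the documented API.

```python
# Required and optional fields per mode, per the descriptions above.
# The mode labels themselves are hypothetical.
REQUIRED = {"clone": "ref_audio", "builtin": "speaker", "design": "instruct"}
OPTIONAL = {"clone": ("ref_text",), "builtin": ("instruct",), "design": ()}


def build_voice_params(mode, **fields):
    """Return a dict of voice parameters after checking them against the mode."""
    if mode not in REQUIRED:
        raise ValueError(f"unknown mode: {mode!r}")
    required = REQUIRED[mode]
    allowed = {required, *OPTIONAL[mode]}
    unknown = set(fields) - allowed
    if unknown:
        raise ValueError(f"{mode} mode does not use: {sorted(unknown)}")
    if not fields.get(required):
        raise ValueError(f"{mode} mode requires {required}")
    # Drop optional fields that were passed as None.
    return {k: v for k, v in fields.items() if v is not None}
```

For example, `build_voice_params("design", instruct="a warm female voice with a British accent")` returns a dict containing only instruct, while calling clone mode without ref_audio raises ValueError.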
Quick Start
Getting started with ojin/oris-voice takes three steps:
Create an API key — Set up authentication for the Ojin platform
Create a configuration — Set up a voice configuration in the dashboard
Integrate with your application — Use the WebSocket API
Use Cases
Conversational AI — Generate natural speech responses for chatbots and virtual assistants
Content Creation — Produce voiceovers for videos, podcasts, and audiobooks
Accessibility — Convert text content to speech for visually impaired users
Localization — Generate speech in multiple languages from the same text
Voice Cloning — Preserve and reproduce specific voice identities
Persona Pipelines — Feed generated audio into ojin/oris-portrait for lip-synced video personas
Supported Languages
Chinese, English, Japanese, Korean, German, French, Russian, Portuguese, Spanish, and Italian, plus Auto (automatic language detection).