# ojin/oris-voice

> High-quality multilingual text-to-speech with voice cloning, preset speakers, and promptable voice design

## Overview

**Oris Voice** (`ojin/oris-voice`) is Ojin's streaming text-to-speech product. You get natural-sounding speech at 24 kHz with three voice modes: clone from a short reference clip, pick a built-in speaker, or describe the voice you want in natural language.

## Key Features

* **Voice Cloning** — Clone any voice from a short reference audio clip and optional transcript
* **Built-in Voices** — Choose from a library of built-in speaker identities
* **Voice Design** — Describe the desired voice characteristics in natural language
* **Streaming Output** — Audio chunks stream in real time as the model generates, enabling low-latency playback
* **Multilingual** — Supports Chinese, English, Japanese, Korean, German, French, Russian, Portuguese, Spanish, Italian, with automatic language detection
* **High-Quality Audio** — 24 kHz, 16-bit PCM mono output
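
The audio format above (24 kHz, 16-bit, mono) fixes the byte rate at 24,000 × 2 × 1 = 48,000 bytes per second, which is useful for sizing playback buffers. The following is a minimal sketch of collecting streamed chunks into a WAV file with Python's standard `wave` module. It assumes each chunk is headerless PCM in the signed little-endian layout that WAV expects; the exact chunk framing is not described on this page, and the helper names `pcm_duration_seconds` and `write_wav` are hypothetical.

```python
import wave
from typing import Iterable

SAMPLE_RATE_HZ = 24_000   # 24 kHz, from the feature list
SAMPLE_WIDTH_BYTES = 2    # 16-bit PCM
CHANNELS = 1              # mono
BYTES_PER_SECOND = SAMPLE_RATE_HZ * SAMPLE_WIDTH_BYTES * CHANNELS


def pcm_duration_seconds(num_bytes: int) -> float:
    """Return the playback duration of a raw PCM buffer in this format."""
    return num_bytes / BYTES_PER_SECOND


def write_wav(path: str, chunks: Iterable[bytes]) -> int:
    """Append raw PCM chunks to a WAV file and return the bytes written.

    Assumption: every chunk is headerless PCM audio. Adapt this if the
    stream wraps audio in another envelope.
    """
    total_bytes = 0
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(CHANNELS)
        wav_file.setsampwidth(SAMPLE_WIDTH_BYTES)
        wav_file.setframerate(SAMPLE_RATE_HZ)
        for chunk in chunks:
            wav_file.writeframes(chunk)
            total_bytes += len(chunk)
    return total_bytes
```

Because `write_wav` accepts any iterable of bytes, it can consume chunks as they arrive rather than waiting for the full utterance.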

## Voice Modes

| Mode                | Description                                                                                                                   |
| ------------------- | ----------------------------------------------------------------------------------------------------------------------------- |
| **Clone**           | Reproduce a voice from a reference audio sample. Provide `ref_audio` and optionally `ref_text`.                               |
| **Built-in Voices** | Use a built-in speaker identity. Provide a `speaker` name and, optionally, `instruct` for style instructions.                |
| **Voice Design**    | Generate a voice from a natural language description. Provide `instruct` (e.g., "a warm female voice with a British accent"). |
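
The table above can be read as a set of required and optional fields per mode. The sketch below encodes those rules in a small validation helper. Only the field names `ref_audio`, `ref_text`, `speaker`, and `instruct` come from this page. The mode labels (`"clone"`, `"builtin"`, `"design"`), the base64 encoding of the reference audio, and the dictionary shape are assumptions made for illustration and are not the documented wire format.

```python
import base64
from typing import Optional


def build_voice_params(
    mode: str,
    *,
    ref_audio: Optional[bytes] = None,
    ref_text: Optional[str] = None,
    speaker: Optional[str] = None,
    instruct: Optional[str] = None,
) -> dict:
    """Return the fields relevant to one voice mode, rejecting missing ones.

    The mode labels are local to this helper (hypothetical), not API values.
    """
    if mode == "clone":
        if ref_audio is None:
            raise ValueError("clone mode requires ref_audio")
        # Assumption: binary audio is sent as base64 text.
        params = {"ref_audio": base64.b64encode(ref_audio).decode("ascii")}
        if ref_text:
            params["ref_text"] = ref_text  # optional transcript
        return params
    if mode == "builtin":
        if not speaker:
            raise ValueError("builtin mode requires speaker")
        params = {"speaker": speaker}
        if instruct:
            params["instruct"] = instruct  # optional style instructions
        return params
    if mode == "design":
        if not instruct:
            raise ValueError("design mode requires instruct")
        return {"instruct": instruct}
    raise ValueError(f"unknown mode: {mode!r}")
```

For example, `build_voice_params("design", instruct="a warm female voice with a British accent")` returns a dictionary containing only `instruct`, while calling the helper with `"clone"` and no `ref_audio` raises `ValueError` before any request is made.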

## Quick Start

To get started with `ojin/oris-voice`:

1. [**Create an API key**](/getting-started/authentication.md) — Set up authentication for the Ojin platform
2. [**Create a configuration**](/models/oris-voice/creating-configuration.md) — Set up a voice configuration in the dashboard
3. [**Integrate with your application**](/models/oris-voice/integrations.md) — Use the WebSocket API

## Use Cases

* **Conversational AI** — Generate natural speech responses for chatbots and virtual assistants
* **Content Creation** — Produce voiceovers for videos, podcasts, and audiobooks
* **Accessibility** — Convert text content to speech for visually impaired users
* **Localization** — Generate speech in multiple languages from the same text
* **Voice Cloning** — Preserve and reproduce specific voice identities
* **Persona Pipelines** — Feed generated audio into ojin/oris-portrait for lip-synced video personas

## Supported Languages

Chinese, English, Japanese, Korean, German, French, Russian, Portuguese, Spanish, and Italian. An Auto option detects the language automatically.

