API Reference
Technical reference for Rox AI API endpoints.
Base URL
https://Rox-Turbo-API.hf.space
Endpoints
POST /chat - Rox Core
POST /turbo - Rox 2.1 Turbo
POST /coder - Rox 3.5 Coder
POST /turbo45 - Rox 4.5 Turbo
POST /ultra - Rox 5 Ultra
POST /dyno - Rox 6 Dyno
POST /coder7 - Rox 7 Coder
POST /vision - Rox Vision Max
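POST /hf/generate - Hugging Face Compatible
All endpoints except /hf/generate use the same request/response format, documented below for /chat. /hf/generate has its own format, covered in its own section.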
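Because the model endpoints differ only in their path, a client can pick one by key. The sketch below is a minimal illustration: the short keys (`core`, `turbo`, and so on) and the `endpoint_url` helper are hypothetical names chosen for this example, while the paths are copied from the list above.

```python
BASE_URL = "https://Rox-Turbo-API.hf.space"

# Paths come from the endpoint list; the dictionary keys are hypothetical
# short names chosen for this example, not identifiers defined by the API.
ENDPOINTS = {
    "core": "/chat",
    "turbo": "/turbo",
    "coder": "/coder",
    "turbo45": "/turbo45",
    "ultra": "/ultra",
    "dyno": "/dyno",
    "coder7": "/coder7",
    "vision": "/vision",
}


def endpoint_url(model: str, base_url: str = BASE_URL) -> str:
    """Return the full URL for a model key, or raise ValueError if unknown."""
    try:
        path = ENDPOINTS[model]
    except KeyError:
        raise ValueError(f"unknown model key: {model!r}") from None
    # Strip a trailing slash so the result never contains "//" before the path.
    return base_url.rstrip("/") + path
```

For example, `endpoint_url("ultra")` builds the URL for the Rox 5 Ultra endpoint, and the same request body can then be posted to it.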
Request
URL: /chat
Method: POST
Content-Type: application/json
Body Parameters:
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| messages | Array | Yes | - | Array of conversation messages |
| temperature | Float | No | 0.7 | Controls randomness (0.0 - 2.0) |
| top_p | Float | No | 0.95 | Nucleus sampling parameter (0.0 - 1.0) |
| max_tokens | Integer | No | 8192 | Maximum tokens in response |
| stream | Boolean | No | false | Enable streaming responses |
Message Object:
{
  role: "user" | "assistant",
  content: string
}
Example Request:
{
  "messages": [
    {
      "role": "user",
      "content": "What is artificial intelligence?"
    }
  ],
  "temperature": 0.7,
  "top_p": 0.95,
  "max_tokens": 8192,
  "stream": false
}
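A request body like the one above can be assembled and checked on the client before it is sent. The following is a minimal sketch, assuming the documented defaults and ranges; `build_chat_payload` is a hypothetical helper, and the source does not say whether the server itself rejects out-of-range values or an empty message list.

```python
def build_chat_payload(messages, temperature=0.7, top_p=0.95,
                       max_tokens=8192, stream=False):
    """Build a /chat request body, checking values against the documented ranges."""
    # Assumption: an empty conversation is treated as a client-side error.
    if not messages:
        raise ValueError("messages must contain at least one message")
    for message in messages:
        if message.get("role") not in ("user", "assistant"):
            raise ValueError(f"invalid role: {message.get('role')!r}")
        if not isinstance(message.get("content"), str):
            raise ValueError("message content must be a string")
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be between 0.0 and 2.0")
    if not 0.0 <= top_p <= 1.0:
        raise ValueError("top_p must be between 0.0 and 1.0")
    if not 1 <= max_tokens <= 32768:
        raise ValueError("max_tokens must be between 1 and 32768")
    return {
        "messages": list(messages),
        "temperature": temperature,
        "top_p": top_p,
        "max_tokens": max_tokens,
        "stream": stream,
    }
```

The returned dictionary can be passed directly as the `json=` argument of `requests.post`.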
Response
Standard Response (200 OK):
{
  "content": "Artificial intelligence (AI) refers to..."
}
Streaming Response (200 OK):
When stream: true is set, the response is sent as Server-Sent Events:
data: {"content": "Artificial"}
data: {"content": " intelligence"}
data: {"content": " (AI)"}
data: {"content": " refers"}
data: {"content": " to"}
data: {"content": "..."}
data: [DONE]
Each line starts with data: followed by a JSON object containing a content field with the next token. The stream ends with data: [DONE].
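The line format described above can be parsed independently of the HTTP layer. This is a minimal sketch, assuming each event arrives as one complete `data:` line; `iter_stream_tokens` is a hypothetical helper name, not part of the API.

```python
import json
from typing import Iterable, Iterator


def iter_stream_tokens(lines: Iterable[str]) -> Iterator[str]:
    """Yield the content of each streamed event until the [DONE] marker."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything unexpected
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return  # end of stream; later lines are ignored
        try:
            event = json.loads(data)
        except json.JSONDecodeError:
            continue  # tolerate malformed events instead of failing
        if isinstance(event, dict) and isinstance(event.get("content"), str):
            yield event["content"]
```

Fed the decoded lines of a streaming response, `"".join(iter_stream_tokens(lines))` reassembles the full reply.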
Response Fields:
| Field | Type | Description |
|---|---|---|
| content | String | The generated response from Rox Core |
Error Responses:
500 Internal Server Error:
{
  "detail": "Internal server error while calling Rox Core."
}
502 Bad Gateway:
{
  "detail": "Bad response from upstream model provider."
}
422 Unprocessable Entity:
{
  "detail": [
    {
      "loc": ["body", "messages"],
      "msg": "field required",
      "type": "value_error.missing"
    }
  ]
}
Example Usage
Standard Request (cURL):
curl -X POST https://Rox-Turbo-API.hf.space/chat \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Hello!"}
    ],
    "temperature": 0.7,
    "max_tokens": 8192
  }'
Streaming Request (cURL):
curl -X POST https://Rox-Turbo-API.hf.space/chat \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Hello!"}
    ],
    "stream": true
  }'
Standard Request (JavaScript):
const response = await fetch('https://Rox-Turbo-API.hf.space/chat', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    messages: [{ role: 'user', content: 'Hello!' }],
    temperature: 0.7,
    max_tokens: 8192
  })
});
const data = await response.json();
console.log(data.content);
Streaming Request (JavaScript):
const response = await fetch('https://Rox-Turbo-API.hf.space/chat', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    messages: [{ role: 'user', content: 'Hello!' }],
    stream: true
  })
});
const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = '';      // holds a partial line that spans two network chunks
let finished = false; // set when [DONE] arrives so the outer loop stops too
while (!finished) {
  const { done, value } = await reader.read();
  if (done) break;
  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split('\n');
  buffer = lines.pop(); // keep the trailing incomplete line for the next read
  for (const line of lines) {
    if (!line.startsWith('data: ')) continue;
    const data = line.slice(6).trim();
    if (data === '[DONE]') {
      finished = true;
      break;
    }
    try {
      const parsed = JSON.parse(data);
      if (parsed.content) {
        // Node.js only; in a browser, append to the DOM instead
        process.stdout.write(parsed.content);
      }
    } catch (e) {
      // Ignore lines that are not valid JSON
    }
  }
}
Standard Request (Python):
import requests
response = requests.post('https://Rox-Turbo-API.hf.space/chat', json={
    'messages': [{'role': 'user', 'content': 'Hello!'}],
    'temperature': 0.7,
    'max_tokens': 8192
})
print(response.json()['content'])
Streaming Request (Python):
import requests
import json
response = requests.post(
    'https://Rox-Turbo-API.hf.space/chat',
    json={
        'messages': [{'role': 'user', 'content': 'Hello!'}],
        'stream': True
    },
    stream=True
)
for line in response.iter_lines():
    if line:
        line = line.decode('utf-8')
        if line.startswith('data: '):
            data = line[6:]
            if data == '[DONE]':
                break
            try:
                parsed = json.loads(data)
                if 'content' in parsed:
                    print(parsed['content'], end='', flush=True)
            except json.JSONDecodeError:
                pass
POST /hf/generate
Hugging Face compatible text generation endpoint for single-turn interactions.
Request
URL: /hf/generate
Method: POST
Content-Type: application/json
Body Parameters:
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| inputs | String | Yes | - | The input text/prompt |
| parameters | Object | No | {} | Generation parameters |
Parameters Object:
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| temperature | Float | No | 1.0 | Controls randomness (0.0 - 2.0) |
| top_p | Float | No | 0.95 | Nucleus sampling (0.0 - 1.0) |
| max_new_tokens | Integer | No | 8192 | Maximum tokens to generate |
Example Request:
{
  "inputs": "Write a haiku about technology",
  "parameters": {
    "temperature": 0.7,
    "top_p": 0.95,
    "max_new_tokens": 256
  }
}
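Note that this endpoint nests the sampling options under `parameters` and names the length limit `max_new_tokens` rather than `max_tokens`. A small adapter can hide that difference from callers who already use the /chat option names. The sketch below is illustrative only; `to_hf_payload` is a hypothetical helper.

```python
def to_hf_payload(prompt, temperature=None, top_p=None, max_tokens=None):
    """Build a /hf/generate body from /chat-style option names."""
    parameters = {}
    # Only include options the caller set, so the server defaults still apply.
    if temperature is not None:
        parameters["temperature"] = temperature
    if top_p is not None:
        parameters["top_p"] = top_p
    if max_tokens is not None:
        # /chat calls this max_tokens; /hf/generate calls it max_new_tokens.
        parameters["max_new_tokens"] = max_tokens
    return {"inputs": prompt, "parameters": parameters}
```

Calling `to_hf_payload("Write a haiku about technology", temperature=0.7, top_p=0.95, max_tokens=256)` reproduces the example request above.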
Response
Success Response (200 OK):
[
  {
    "generated_text": "Silicon dreams awake\nCircuits pulse with electric life\nFuture in our hands"
  }
]
Response Format:
Returns an array with a single object containing the generated text.
| Field | Type | Description |
|---|---|---|
| generated_text | String | The generated response |
Error Responses:
Same as /chat endpoint (500, 502, 422).
Example Usage
cURL:
curl -X POST https://Rox-Turbo-API.hf.space/hf/generate \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": "Explain quantum computing",
    "parameters": {
      "temperature": 0.7,
      "max_new_tokens": 256
    }
  }'
JavaScript:
const response = await fetch('https://Rox-Turbo-API.hf.space/hf/generate', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    inputs: 'Explain quantum computing',
    parameters: {
      temperature: 0.7,
      max_new_tokens: 256
    }
  })
});
const data = await response.json();
console.log(data[0].generated_text);
Python:
import requests
response = requests.post('https://Rox-Turbo-API.hf.space/hf/generate', json={
    'inputs': 'Explain quantum computing',
    'parameters': {
        'temperature': 0.7,
        'max_new_tokens': 256
    }
})
print(response.json()[0]['generated_text'])
Parameters
temperature
Controls output randomness.
- Range: 0.0 to 2.0
- Default: 0.7 (/chat), 1.0 (/hf/generate)
- Lower (0.1-0.5): Focused, deterministic
- Medium (0.6-1.0): Balanced
- Higher (1.1-2.0): Creative, varied
Examples:
- 0.3: Math problems, factual questions, code generation
- 0.7: General conversation, explanations
- 1.2: Creative writing, brainstorming, storytelling
Example:
{
  "messages": [{"role": "user", "content": "What is 2+2?"}],
  "temperature": 0.2
}
top_p
Nucleus sampling parameter.
- Range: 0.0 to 1.0
- Default: 0.95
- Lower: More focused
- Higher: More diverse
Example:
{
  "messages": [{"role": "user", "content": "Tell me a story"}],
  "top_p": 0.9
}
max_tokens / max_new_tokens
Maximum tokens in response.
- Range: 1 to 32768
- Default: 8192
Token estimation:
- 1 token ≈ 4 characters
- 1 token ≈ 0.75 words
Example:
{
  "messages": [{"role": "user", "content": "Brief summary of AI"}],
  "max_tokens": 150
}
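The token estimation rules of thumb above can be turned into a quick budget check before choosing `max_tokens`. This is a rough sketch, not a tokenizer: `estimate_tokens` is a hypothetical helper, and taking the larger of the two estimates is a conservative choice made for this example.

```python
import math


def estimate_tokens(text: str) -> int:
    """Estimate token count from the 4-characters and 0.75-words heuristics."""
    by_chars = len(text) / 4             # 1 token is roughly 4 characters
    by_words = len(text.split()) / 0.75  # 1 token is roughly 0.75 words
    # Use the larger estimate so the budget errs on the safe side.
    return math.ceil(max(by_chars, by_words))
```

Actual counts depend on the model's tokenizer, which the source does not specify, so treat the result as an approximation only.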
stream
Enable streaming responses for real-time token delivery.
- Type: Boolean
- Default: false
- Values: true or false
When enabled, responses are sent as Server-Sent Events instead of a single JSON response. This allows you to display tokens as they are generated rather than waiting for the complete response.
Benefits of streaming:
- Lower perceived latency
- Better user experience for long responses
- Ability to cancel generation early
- Real-time feedback
Example:
{
  "messages": [{"role": "user", "content": "Write a story"}],
  "stream": true
}
Response format when streaming is enabled:
data: {"content": "Once"}
data: {"content": " upon"}
data: {"content": " a"}
data: {"content": " time"}
data: [DONE]
Error Handling
Status Codes
| Code | Meaning | Description |
|---|---|---|
| 200 | OK | Request successful |
| 422 | Unprocessable Entity | Invalid request format |
| 500 | Internal Server Error | Server-side error |
| 502 | Bad Gateway | Upstream model error |
Error Response Format
{
  "detail": "Error message here"
}
Common Errors
Missing field:
{
  "detail": [
    {
      "loc": ["body", "messages"],
      "msg": "field required",
      "type": "value_error.missing"
    }
  ]
}
Invalid type:
{
  "detail": [
    {
      "loc": ["body", "temperature"],
      "msg": "value is not a valid float",
      "type": "type_error.float"
    }
  ]
}
Example error handler:
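async function safeRequest(endpoint, body) {
  try {
    const response = await fetch(endpoint, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(body)
    });
    if (!response.ok) {
      // Fall back to an empty object if the error body is not valid JSON
      const error = await response.json().catch(() => ({}));
      // For 422 responses, detail is an array of objects, so serialize it
      const detail = typeof error.detail === 'string'
        ? error.detail
        : JSON.stringify(error.detail);
      throw new Error(detail || `HTTP ${response.status}`);
    }
    return await response.json();
  } catch (error) {
    console.error('API Error:', error);
    throw error;
  }
}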
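The `detail` field takes two shapes in the examples above: a plain string for 500 and 502 responses, and a list of objects for 422 responses. A Python client may want to flatten both into one readable message. The sketch below assumes only those two shapes; `format_error_detail` is a hypothetical helper.

```python
def format_error_detail(body, status: int) -> str:
    """Turn an error response body into a single human-readable message."""
    detail = body.get("detail") if isinstance(body, dict) else None
    if isinstance(detail, str):
        return detail
    if isinstance(detail, list):
        parts = []
        for item in detail:
            if not isinstance(item, dict):
                parts.append(str(item))
                continue
            # Join the location path, e.g. ["body", "messages"] -> "body.messages".
            loc = ".".join(str(piece) for piece in item.get("loc", []))
            parts.append(f"{loc}: {item.get('msg', 'invalid value')}")
        return "; ".join(parts)
    # No usable detail field: fall back to the status code.
    return f"HTTP {status}"
```

With `requests`, this could be called as `format_error_detail(response.json(), response.status_code)` whenever `response.ok` is false.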
Rate Limiting
No enforced rate limits. Implement client-side limiting as needed.
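One simple form of client-side limiting is to enforce a minimum interval between requests. The sketch below is one possible approach, not a recommendation from the API itself; `MinIntervalLimiter` is a hypothetical class, and the clock and sleep functions are injectable so the behavior can be checked without real delays.

```python
import time


class MinIntervalLimiter:
    """Block so that consecutive calls are at least min_interval seconds apart."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, None before the first

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling `limiter.wait()` immediately before each request is enough for a single-threaded client. The class does no locking, so a multi-threaded client would need to add its own.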
Client Wrapper Example
class RoxAI {
  constructor(baseURL = 'https://Rox-Turbo-API.hf.space') {
    this.baseURL = baseURL;
  }

  async chat(messages, options = {}) {
    const response = await fetch(`${this.baseURL}/chat`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        messages,
        // ?? (not ||) keeps explicit zero values such as temperature: 0
        temperature: options.temperature ?? 0.7,
        top_p: options.top_p ?? 0.95,
        max_tokens: options.max_tokens ?? 8192,
        stream: false
      })
    });
    if (!response.ok) {
      throw new Error(`HTTP ${response.status}`);
    }
    const data = await response.json();
    return data.content;
  }

  async chatStream(messages, onToken, options = {}) {
    const response = await fetch(`${this.baseURL}/chat`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        messages,
        temperature: options.temperature ?? 0.7,
        top_p: options.top_p ?? 0.95,
        max_tokens: options.max_tokens ?? 8192,
        stream: true
      })
    });
    if (!response.ok) {
      throw new Error(`HTTP ${response.status}`);
    }
    const reader = response.body.getReader();
    const decoder = new TextDecoder();
    let fullContent = '';
    let buffer = '';      // holds a partial line that spans two network chunks
    let finished = false; // set when [DONE] arrives so the outer loop stops too
    while (!finished) {
      const { done, value } = await reader.read();
      if (done) break;
      buffer += decoder.decode(value, { stream: true });
      const lines = buffer.split('\n');
      buffer = lines.pop(); // keep the trailing incomplete line for the next read
      for (const line of lines) {
        if (!line.startsWith('data: ')) continue;
        const data = line.slice(6).trim();
        if (data === '[DONE]') {
          finished = true;
          break;
        }
        try {
          const parsed = JSON.parse(data);
          if (parsed.content) {
            fullContent += parsed.content;
            onToken(parsed.content);
          }
        } catch (e) {
          // Ignore lines that are not valid JSON
        }
      }
    }
    return fullContent;
  }

  async generate(text, options = {}) {
    const response = await fetch(`${this.baseURL}/hf/generate`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        inputs: text,
        parameters: options
      })
    });
    if (!response.ok) {
      throw new Error(`HTTP ${response.status}`);
    }
    const data = await response.json();
    return data[0].generated_text;
  }
}

// Standard usage
const rox = new RoxAI();
const response = await rox.chat([
  { role: 'user', content: 'Hello!' }
]);
console.log(response);

// Streaming usage (process.stdout is Node.js only)
await rox.chatStream(
  [{ role: 'user', content: 'Tell me a story' }],
  (token) => process.stdout.write(token)
);
Built by Mohammad Faiz