T-Lang works as a drop-in proxy for the OpenAI API. Change your base URL, and your prompts are automatically compressed before being sent to the upstream API.
```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_TLANG_API_KEY",
    base_url="https://your-worker.workers.dev/v1",
    default_headers={
        "X-Upstream-Api-Key": "YOUR_OPENAI_API_KEY"
    }
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "user", "content": "Summarize this article in 100 words..."}
    ]
)

print(response.choices[0].message.content)
```

```javascript
const response = await fetch('https://your-worker.workers.dev/v1/chat/completions', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_TLANG_API_KEY',
    'X-Upstream-Api-Key': 'YOUR_OPENAI_API_KEY',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'gpt-4',
    messages: [{ role: 'user', content: 'Summarize this article...' }],
  }),
});

const data = await response.json();

// Check compression stats in response headers
console.log('Tokens saved:', response.headers.get('X-TLang-Saved-Tokens'));
console.log('Compression rate:', response.headers.get('X-TLang-Compression-Rate'));
```

T-Lang CLI lets AI agents (Claude Code, Cursor, etc.) make token-compressed LLM calls. Zero dependencies beyond Node.js 18+. Your API keys never leave your machine.
```shell
# Step 1: Authorize (opens a browser; log in once)
npx tlang-cli auth login

# Step 2: Add your LLM provider key
npx tlang-cli provider add openai --key sk-YOUR_KEY
# Built-in providers: openai, gemini, grok, deepseek, groq

# Send a compressed prompt (JSON output)
npx tlang-cli chat "Summarize this article about AI" --model gpt-4o-mini

# Pipe input
echo "Long document..." | npx tlang-cli chat --provider gemini

# Compress only (no API call)
npx tlang-cli compress "Please summarize the key points in 100 words"
```

1. CLI sends messages to the T-Lang Worker for server-side compression
2. Worker compresses and returns optimized messages + subscription status
3. CLI calls Provider API directly with your local key
4. JSON output: LLM response + compression stats
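The privacy property of this flow is that step 3 happens locally: the provider key is attached on your machine only after the Worker has returned the compressed messages. A minimal sketch in Python of the two requests involved (the helper names and the Worker payload shape are illustrative assumptions, not the documented T-Lang API):

```python
# Illustrative sketch of the CLI flow; the worker payload shape and
# helper names are assumptions, not the documented T-Lang API.

def build_worker_payload(messages: list[dict]) -> dict:
    """Step 1: only the messages go to the T-Lang Worker -- no provider key."""
    return {"messages": messages}

def build_provider_request(compressed: list[dict], model: str, local_key: str) -> dict:
    """Step 3: the provider is called directly, so the key never leaves the machine."""
    return {
        "url": "https://api.openai.com/v1/chat/completions",
        "headers": {
            "Authorization": f"Bearer {local_key}",
            "Content-Type": "application/json",
        },
        "json": {"model": model, "messages": compressed},
    }
```

Note that the step-1 payload carries no provider credential at all; only the step-3 request does.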
If the free daily limit (20 per day) is reached, messages are sent uncompressed; the API still works, just without savings.
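The savings reported in the JSON output below are simple arithmetic over the token counts; a quick sketch to recompute the derived fields:

```python
def compression_stats(original_tokens: int, compressed_tokens: int) -> dict:
    """Recompute the derived fields of the CLI's `compression` block."""
    saved = original_tokens - compressed_tokens
    rate = f"{saved / original_tokens * 100:.1f}%"
    return {
        "original_tokens": original_tokens,
        "compressed_tokens": compressed_tokens,
        "saved_tokens": saved,
        "rate": rate,
    }

# compression_stats(45, 12) -> saved_tokens 33, rate "73.3%"
```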
```json
{
  "content": "LLM response text",
  "model": "gpt-4o-mini",
  "provider": "openai",
  "compression": {
    "original_tokens": 45,
    "compressed_tokens": 12,
    "saved_tokens": 33,
    "rate": "73.3%"
  },
  "subscription": { "active": true, "plan": "free", "dailyFreeUsed": 3, "dailyFreeLimit": 20 }
}
```

T-Lang uses a Domain-Specific Language (DSL) to compress natural-language prompts into compact tokens. Compression is applied transparently as your request passes through the proxy.
Original: `"Please summarize the following article, keeping it under 100 words"`
Compressed: `!Sum [article] ?Len<100`
Smart compression automatically skips messages that shouldn't be compressed, such as system prompts and code-generation requests.
`/v1/chat/completions`
OpenAI-compatible chat completions with automatic compression.
Headers:
- `Authorization: Bearer YOUR_TLANG_KEY`
- `X-Upstream-Api-Key: YOUR_OPENAI_KEY`

Response Headers:
- `X-TLang-Original-Tokens`: Original token count
- `X-TLang-Compressed-Tokens`: Compressed token count
- `X-TLang-Saved-Tokens`: Tokens saved
- `X-TLang-Compression-Rate`: Compression percentage
- `X-TLang-Saved-Cost`: Estimated cost saved

`/api/keys/generate`
Generate a new API key. Requires Firebase authentication.
Body:
```json
{ "name": "My Key" }
```

`/api/keys`
List all API keys for the authenticated user.
`/api/keys/:key`
Revoke an API key. Requires Firebase authentication.

`/api/usage`
Get monthly and daily usage statistics.

`/health`
Health check endpoint. No authentication required.
The compression engine maps natural-language verbs to compact operators:

| T-Lang | Meaning | Example |
|---|---|---|
| !Sum | Summarize | !Sum [article] ?Len<100 |
| !Ext | Extract | !Ext [dates] [names] |
| !Gen | Generate | !Gen [email] ?Style:formal |
| !Cvt | Convert | !Cvt [JSON] ->Table |
| !Trns | Translate | !Trns [text] ?Lang:en |
| !Exp | Explain | !Exp [concept] ?Len<50 |
| !Fix | Fix / Debug | !Fix [code] |
| !Opt | Optimize | !Opt [query] ?Style:perf |

| Symbol | Meaning |
|---|---|
| ?Len<N | Length limit |
| ?Fmt:X | Format constraint (List, Table, etc.) |
| ?Lang:X | Language constraint (zh, en, etc.) |
| ->JSON | Output as JSON |
| ->MD | Output as Markdown |
| [entity] | Entity / subject marker |
| + | And / combine |
| => | Then / pipe |
**Does T-Lang store my prompts?** No. T-Lang processes your prompts in-memory on Cloudflare's edge network and passes them through to the upstream API. We only store aggregate usage statistics (token counts), never prompt content.
**Does compression hurt response quality?** Modern LLMs understand compressed instructions well, and T-Lang's DSL was designed to preserve semantic meaning. Code generation and system prompts are automatically skipped to avoid any quality impact.
**Which providers are supported?** Currently OpenAI (GPT-4, GPT-3.5, etc.). The proxy is compatible with any provider that uses the OpenAI API format (Azure OpenAI, Groq, Together, etc.).
**Is streaming supported?** Yes. When you pass `stream: true`, the compressed request is proxied and the upstream SSE stream is passed through to your client.
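On the client side, the passed-through stream is consumed exactly as with the upstream API: each event carries a `choices[0].delta` fragment in the standard OpenAI streaming shape. A minimal sketch that reassembles the full text from already-parsed events (`assemble_stream` is a hypothetical helper, not part of T-Lang, and the chunks are plain dicts rather than SDK objects):

```python
def assemble_stream(chunks) -> str:
    """Concatenate the `delta.content` pieces of an OpenAI-style SSE stream."""
    parts = []
    for chunk in chunks:
        # Each streamed event has the standard OpenAI shape:
        # {"choices": [{"delta": {"content": "..."}}]}
        delta = chunk["choices"][0]["delta"]
        if delta.get("content"):
            parts.append(delta["content"])
    return "".join(parts)
```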