T-Lang works as a drop-in proxy for the OpenAI API. Change your base URL, and your prompts are automatically compressed before being sent to the upstream API.
```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_TLANG_API_KEY",
    base_url="https://your-worker.workers.dev/v1",
    default_headers={
        "X-Upstream-Api-Key": "YOUR_OPENAI_API_KEY"
    }
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "user", "content": "Summarize this article in 100 words..."}
    ]
)

print(response.choices[0].message.content)
```

```javascript
const response = await fetch('https://your-worker.workers.dev/v1/chat/completions', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_TLANG_API_KEY',
    'X-Upstream-Api-Key': 'YOUR_OPENAI_API_KEY',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'gpt-4',
    messages: [{ role: 'user', content: 'Summarize this article...' }],
  }),
});

const data = await response.json();

// Check compression stats in response headers
console.log('Tokens saved:', response.headers.get('X-TLang-Saved-Tokens'));
console.log('Compression rate:', response.headers.get('X-TLang-Compression-Rate'));
```

T-Lang CLI lets AI agents (Claude Code, Cursor, etc.) make token-compressed LLM calls. Zero dependencies beyond Node.js 18+. Your API keys never leave your machine.
```shell
# Step 1: Authorize (opens a browser; log in once)
npx tlang-cli auth login

# Step 2: Add your LLM provider key
npx tlang-cli provider add openai --key sk-YOUR_KEY
# Built-in providers: openai, gemini, grok, deepseek, groq

# Send a compressed prompt (JSON output)
npx tlang-cli chat "Summarize this article about AI" --model gpt-4o-mini

# Pipe input
echo "Long document..." | npx tlang-cli chat --provider gemini

# Compress only (no API call)
npx tlang-cli compress "Please summarize the key points in 100 words"
```

1. CLI sends messages to the T-Lang Worker for server-side compression
2. Worker compresses and returns optimized messages + subscription status
3. CLI calls Provider API directly with your local key
4. JSON output: LLM response + compression stats
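The privacy property of this flow is that step 3 happens locally: the provider key is attached on your machine only after the Worker has returned the compressed messages. A minimal sketch in Python of the two requests involved (the helper names and the Worker payload shape are illustrative assumptions, not the documented T-Lang API):

```python
# Illustrative sketch of the CLI flow; the worker payload shape and
# helper names are assumptions, not the documented T-Lang API.

def build_worker_payload(messages: list[dict]) -> dict:
    """Step 1: only the messages go to the T-Lang Worker -- no provider key."""
    return {"messages": messages}

def build_provider_request(compressed: list[dict], model: str, local_key: str) -> dict:
    """Step 3: the provider is called directly, so the key never leaves the machine."""
    return {
        "url": "https://api.openai.com/v1/chat/completions",
        "headers": {
            "Authorization": f"Bearer {local_key}",
            "Content-Type": "application/json",
        },
        "json": {"model": model, "messages": compressed},
    }
```

Note that the step-1 payload carries no provider credential at all; only the step-3 request does.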
If the free daily limit (20 per day) is reached, messages are sent uncompressed; the API still works, just without savings.
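The savings reported in the JSON output below are simple arithmetic over the token counts; a quick sketch to recompute the derived fields:

```python
def compression_stats(original_tokens: int, compressed_tokens: int) -> dict:
    """Recompute the derived fields of the CLI's `compression` block."""
    saved = original_tokens - compressed_tokens
    rate = f"{saved / original_tokens * 100:.1f}%"
    return {
        "original_tokens": original_tokens,
        "compressed_tokens": compressed_tokens,
        "saved_tokens": saved,
        "rate": rate,
    }

# compression_stats(45, 12) -> saved_tokens 33, rate "73.3%"
```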
```json
{
  "content": "LLM response text",
  "model": "gpt-4o-mini",
  "provider": "openai",
  "compression": {
    "original_tokens": 45,
    "compressed_tokens": 12,
    "saved_tokens": 33,
    "rate": "73.3%"
  },
  "subscription": { "active": true, "plan": "free", "dailyFreeUsed": 3, "dailyFreeLimit": 20 }
}
```

T-Lang uses a Domain-Specific Language (DSL) to compress natural-language prompts into compact tokens. Compression is applied transparently as your request passes through the proxy.
Original: `"Please summarize the following article, keeping it under 100 words"`
Compressed: `!Sum [article] ?Len<100`
Smart compression automatically skips messages that shouldn't be compressed, such as system prompts and code-generation requests.
`/v1/chat/completions`
OpenAI-compatible chat completions with automatic compression.
Headers:
- `Authorization: Bearer YOUR_TLANG_KEY`
- `X-Upstream-Api-Key: YOUR_OPENAI_KEY`

Response Headers:
- `X-TLang-Original-Tokens`: Original token count
- `X-TLang-Compressed-Tokens`: Compressed token count
- `X-TLang-Saved-Tokens`: Tokens saved
- `X-TLang-Compression-Rate`: Compression percentage
- `X-TLang-Saved-Cost`: Estimated cost saved

`/api/keys/generate`
Generate a new API key. Requires Firebase authentication.
Body:
```json
{ "name": "My Key" }
```

`/api/keys`
List all API keys for the authenticated user.
`/api/keys/:key`
Revoke an API key. Requires Firebase authentication.

`/api/usage`
Get monthly and daily usage statistics.

`/health`
Health check endpoint. No authentication required.
The compression engine maps natural-language verbs to compact operators:

| T-Lang | Meaning | Example |
|---|---|---|
| !Sum | Summarize | !Sum [article] ?Len<100 |
| !Ext | Extract | !Ext [dates] [names] |
| !Gen | Generate | !Gen [email] ?Style:formal |
| !Cvt | Convert | !Cvt [JSON] ->Table |
| !Trns | Translate | !Trns [text] ?Lang:en |
| !Exp | Explain | !Exp [concept] ?Len<50 |
| !Fix | Fix / Debug | !Fix [code] |
| !Opt | Optimize | !Opt [query] ?Style:perf |

| Symbol | Meaning |
|---|---|
| ?Len<N | Length limit |
| ?Fmt:X | Format constraint (List, Table, etc.) |
| ?Lang:X | Language constraint (zh, en, etc.) |
| ->JSON | Output as JSON |
| ->MD | Output as Markdown |
| [entity] | Entity / subject marker |
| + | And / combine |
| => | Then / pipe |
**Does T-Lang store my prompts?** No. T-Lang processes your prompts in-memory on Cloudflare's edge network and passes them through to the upstream API. We only store aggregate usage statistics (token counts), never prompt content.
**Does compression hurt response quality?** Modern LLMs understand compressed instructions well, and T-Lang's DSL was designed to preserve semantic meaning. Code generation and system prompts are automatically skipped to avoid any quality impact.
**Which providers are supported?** Currently OpenAI (GPT-4, GPT-3.5, etc.). The proxy is compatible with any provider that uses the OpenAI API format (Azure OpenAI, Groq, Together, etc.).
**Is streaming supported?** Yes. When you pass `stream: true`, the compressed request is proxied and the upstream SSE stream is passed through to your client.
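On the client side, the passed-through stream is consumed exactly as with the upstream API: each event carries a `choices[0].delta` fragment in the standard OpenAI streaming shape. A minimal sketch that reassembles the full text from already-parsed events (`assemble_stream` is a hypothetical helper, not part of T-Lang, and the chunks are plain dicts rather than SDK objects):

```python
def assemble_stream(chunks) -> str:
    """Concatenate the `delta.content` pieces of an OpenAI-style SSE stream."""
    parts = []
    for chunk in chunks:
        # Each streamed event has the standard OpenAI shape:
        # {"choices": [{"delta": {"content": "..."}}]}
        delta = chunk["choices"][0]["delta"]
        if delta.get("content"):
            parts.append(delta["content"])
    return "".join(parts)
```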