Highlight any text on any website and listen in premium AI voices, paid for with your own API keys. Serverless, no account, and your text goes only to the provider you choose, never through us.
Rongo never resells synthesis or takes a cut. Drop in a key and you're billed by the provider at their rate — or use the built-in offline engine for nothing.
No copy-paste, no separate app, no dashboard. Rongo lives in the margin of whatever you're reading and only shows up when you ask.
Select any text and a Play widget fades in by your cursor. Or just hover a paragraph — Rongo outlines the reading scope and offers to read it, no selection needed.
Before a single character is sent, you see a live ~$ estimate and the raw character count. Over your cap, the badge turns terracotta and asks before it spends.
The widget becomes a compact player — play, pause, scrub, stop. On Pro, words light up in karaoke as they're spoken, right on the page you're reading.
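For the technically curious, the pre-playback estimate and cap check described above can be sketched roughly as follows. This is a minimal illustration, not Rongo's actual code: the per-1,000-character rates, the `estimateCost` function, and the high-cost threshold are all hypothetical placeholders, and real provider pricing differs and changes over time.

```typescript
type Provider = "elevenlabs" | "openai" | "gemini" | "offline";

// Hypothetical USD rates per 1,000 characters, for illustration only.
const ASSUMED_RATE_PER_1K: Record<Provider, number> = {
  elevenlabs: 0.3,
  openai: 0.015,
  gemini: 0.016,
  offline: 0,
};

interface Estimate {
  characters: number;
  estimatedUsd: number;
  needsConfirmation: boolean;
}

function estimateCost(
  text: string,
  provider: Provider,
  spentThisMonthUsd: number,
  monthlyCapUsd: number | null,
  highCostThresholdUsd: number = 0.5, // assumed threshold for the confirmation gate
): Estimate {
  // Counts UTF-16 code units; a provider may count billable characters differently.
  const characters = text.length;
  const estimatedUsd = (characters / 1000) * ASSUMED_RATE_PER_1K[provider];

  // Ask first if this request alone is expensive, or if it would push the
  // month's running total past the optional cap.
  const overCap =
    monthlyCapUsd !== null && spentThisMonthUsd + estimatedUsd > monthlyCapUsd;
  const highCost = estimatedUsd >= highCostThresholdUsd;

  return { characters, estimatedUsd, needsConfirmation: overCap || highCost };
}
```

Under these assumptions, the offline engine always estimates to zero and never triggers the gate, while a premium request is held for confirmation when it is expensive on its own or would exceed the monthly cap.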
Select text on any page and hear it in a premium voice within seconds, or right-click for a quick native read.
Hover any block of text and a play icon appears with the reading scope outlined — zero selection required.
Swap between ElevenLabs, OpenAI, Google Gemini and the offline engine — with per-provider voice, speed and pitch.
A live ~$ estimate before playback, a high-cost confirmation gate, and an optional monthly spend cap.
Words on the page light up in sync as they're spoken, using ElevenLabs character alignment.
A second action: an English comprehension summary, an optional Polish translation, each playable on demand.
Audio, summaries and translations are cached in your browser's IndexedDB and merged into one History row per source.
API keys live in chrome.storage.local — never synced, never proxied, never seen by us.
A running lifetime cost estimate per provider, plus, for ElevenLabs, the authoritative characters-used count and next invoice.
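As an illustration of how the karaoke highlight can follow character-level alignment, here is a minimal sketch that groups per-character timings into words and finds the word to light up at a given playback time. The `CharAlignment` shape and both function names are assumptions made for this example; a real provider response would need to be mapped into this form first.

```typescript
// Assumed shape: parallel arrays, one entry per character of the spoken text.
interface CharAlignment {
  characters: string[];
  startTimes: number[]; // seconds
  endTimes: number[]; // seconds
}

interface WordTiming {
  word: string;
  start: number;
  end: number;
}

// Merge consecutive non-whitespace characters into words, keeping the start
// time of the first character and the end time of the last.
function toWordTimings(a: CharAlignment): WordTiming[] {
  const words: WordTiming[] = [];
  let current: WordTiming | null = null;

  for (let i = 0; i < a.characters.length; i++) {
    const ch = a.characters[i];
    if (/\s/.test(ch)) {
      // Whitespace closes the word in progress, if any.
      if (current !== null) {
        words.push(current);
        current = null;
      }
      continue;
    }
    if (current === null) {
      current = { word: ch, start: a.startTimes[i], end: a.endTimes[i] };
    } else {
      current.word += ch;
      current.end = a.endTimes[i];
    }
  }
  if (current !== null) words.push(current);
  return words;
}

// Index of the word being spoken at time t (seconds), or -1 between words.
function activeWordIndex(words: WordTiming[], t: number): number {
  for (let i = 0; i < words.length; i++) {
    if (t >= words[i].start && t < words[i].end) return i;
  }
  return -1;
}
```

A player could call `activeWordIndex` with the audio element's current time on each animation frame and move the highlight whenever the index changes.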
Rongo has no backend. Requests go straight from your browser to the provider you chose. There is no account, no telemetry, and nothing of ours to breach, because we have nowhere for your data to sit.
We never see your text or your keys. Synthesis is a direct call you can verify in the network tab.
The provider charges your account directly at their published rate. Rongo takes nothing per character, ever.
The native Web Speech engine needs no key and, with a locally installed voice, no connection. It never costs a cent.
No subscription. One license, yours for good. (You still pay your own providers for premium synthesis — Rongo just unlocks the software.)
Everything you need to listen to the web, with your own keys or the offline engine.
A single license — validated locally, no call home — that unlocks the comprehension layer for good.
Add Rongo to Chrome, drop in a key, highlight a sentence. You'll hear the difference in about three seconds.