“The difference is that our app is much worse than ChatGPT, for real. If you're subscribed to ChatGPT, please just use that. We're making this for people that can't afford $20/mo.” (Fikri Karim, creator of Bule AI, September 2025)
You know that feeling when you want to build a voice AI feature but cloud APIs cost $20/month per user and send all your data to someone else's servers? Or when you see OpenAI's multimodal demos and think 'that's exactly what I need' but realize it requires their infrastructure? Running real-time voice+vision AI locally used to demand a desktop GPU that costs more than a used car. Parlor exists because its creator runs a free English-learning voice AI service and needed to eliminate server costs entirely.
Your browser captures microphone audio and camera frames and sends them over a WebSocket to a local FastAPI server. The server feeds the audio and JPEG images into Gemma 4 E2B (Google's 2.3B-parameter multimodal model) via LiteRT-LM, so a single model handles both speech and vision. The model generates a text response, which Kokoro TTS converts to speech and streams back to your browser sentence by sentence. Silero VAD runs in the browser to detect when you're speaking, so you don't need push-to-talk, and barge-in lets you interrupt the AI mid-sentence.
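The sentence-by-sentence streaming step can be sketched roughly as follows. This is a minimal illustration, not Parlor's actual code: the function name `stream_sentences` and the regex-based boundary rule are assumptions made for this example. The idea is that the server buffers text as the model produces it and hands each complete sentence to the TTS engine as soon as it ends, so audio can start playing before the full reply exists.

```python
import re
from typing import Iterable, Iterator

# Naive sentence boundary (an assumption for this sketch): ".", "!" or "?"
# followed by whitespace. Abbreviations such as "e.g. " would be split too.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def stream_sentences(tokens: Iterable[str]) -> Iterator[str]:
    """Yield complete sentences as soon as they appear in a token stream.

    `tokens` stands in for the text fragments a language model emits;
    each yielded sentence could be passed to a TTS engine right away.
    """
    buffer = ""
    for token in tokens:
        buffer += token
        parts = _SENTENCE_END.split(buffer)
        # Every part except the last one ended at a sentence boundary.
        for sentence in parts[:-1]:
            if sentence:
                yield sentence
        # The last part may still be an unfinished sentence; keep buffering.
        buffer = parts[-1]
    # Flush whatever is left once the model stops generating.
    tail = buffer.strip()
    if tail:
        yield tail


if __name__ == "__main__":
    model_output = ["Hello", " there", ".", " How", " are", " you", "?", " Fine"]
    for sentence in stream_sentences(model_output):
        print(sentence)
```

A sentence is only released once the text after its closing punctuation arrives, which costs one token of latency but avoids cutting a sentence at a decimal point or an ellipsis that is still being generated. Barge-in would sit on top of this: when the browser's VAD reports new speech, the server would stop consuming the generator and discard any queued audio.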
If you're a developer curious about on-device AI and want to see what's now possible on laptop hardware, this is your demo. It's also relevant if you're building privacy-first applications or need zero-marginal-cost voice AI. It's not for you if you need production-ready code (security issues exist), Windows support (LiteRT-LM doesn't support Windows), or agentic coding capabilities (the creator explicitly says it can't do this).
It's worth trying, but strictly as an experiment. The project is 4 days old (April 3-6, 2026), carries a 'research preview' label, and has 3 open issues, including security vulnerabilities. What makes it worth your time is that it proves real-time multimodal AI now runs on laptop-class hardware. Its 705 GitHub stars within days of release show genuine developer interest. Try it to understand what's now possible locally, but don't build on it yet. Wait for security patches and broader platform support.