Soren_Molty/Knowledge

Soren_Molty b865575511 Initial commit

2026-05-05 09:40:28 +10:00

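iOS Push-to-Talk Web App Checkpoint (Apr 8, 2026)

User Goal: Web-based PTT in Safari on iPhone. Hold a button to record audio, see a live transcription preview, then tap Send to POST the audio and text to a remote server (Node.js or Python Flask OK).

Core Flow:
1. Hold (touchstart/pointerdown on iPhone, mousedown on desktop): MediaRecorder API captures audio in chunks.
2. Live STT: Web Speech API (SpeechRecognition) on the client, with `continuous=true, interimResults=true` for real-time text (lang en-AU; recognition goes through Apple's speech engine, on-device only where the iPhone and language support it).
3. Release: Stop recording, finalize text, enable Send → POST blob/text to server.

Options:
- Client-Only STT (Fastest): Web Speech, roughly sub-1s latency, no STT work on the server. Backend just stores. Offline and private only if recognition actually runs on-device, so verify on the target iPhone.
- Server Streaming: Chunks via WebSocket → Whisper (OpenAI/local), partials sent back.
- Hybrid: Client preview + server re-transcription for accuracy.

iOS Safari Fit: MediaRecorder and SpeechRecognition are both available from iOS 14.5. Caveats: SpeechRecognition is exposed only as the prefixed `webkitSpeechRecognition`, and Safari's MediaRecorder produces MP4/AAC rather than WebM, so the server must accept that container. Mic permission is prompted on first use; Safari may ask again in later sessions.

Backend Recs:
- Node/Socket.io: Real-time easy.
- Flask/FastAPI + SocketIO: Python ML (Whisper).

Frontend Snippet:
```js
// PTT button + live text. Safari only exposes the prefixed webkitSpeechRecognition.
const SpeechRec = window.SpeechRecognition || window.webkitSpeechRecognition;
const rec = new SpeechRec();
rec.continuous = true; rec.interimResults = true; rec.lang = 'en-AU';
// Pointer events cover touch (iPhone) and mouse; mousedown alone is unreliable on touch screens.
pttBtn.onpointerdown = () => rec.start();
pttBtn.onpointerup = pttBtn.onpointercancel = () => rec.stop();
rec.onresult = (e) => {
  // transcriptDiv is a placeholder for the preview element; rebuild final + interim text each event.
  transcriptDiv.textContent = Array.from(e.results, (r) => r[0].transcript).join('');
};
```

Backend (Node Ex): Socket.io streams chunks → Whisper → emit partials.

Next Steps: Prototype PWA? Custom Whisper server? Accents/privacy tweaks.

Status: Explored, ready to code/deploy.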
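
Release/Send Sketch: The release step in the core flow (stop, finalize text, POST blob/text) might look roughly like the helpers below. This is a sketch under assumptions, not a tested implementation: the function names, the `audio`/`text` form fields, and the `/api/ptt` URL are all hypothetical, and the format preference order is a guess that should be checked on a real iPhone.

```javascript
// Sketch only: helper names, form field names, and the /api/ptt URL are assumptions.

// Candidate formats in preference order. Safari typically records audio/mp4,
// Chromium-based browsers audio/webm.
const CANDIDATE_TYPES = ['audio/mp4', 'audio/webm;codecs=opus', 'audio/webm'];

// Pick the first container the browser reports it can record.
// `isSupported` is injected so this also runs outside a browser.
function pickMimeType(isSupported) {
  return CANDIDATE_TYPES.find((t) => isSupported(t)) || '';
}

// Split a SpeechRecognitionResultList-like sequence into confirmed and tentative text.
function splitTranscript(results) {
  let finalText = '';
  let interimText = '';
  for (const r of Array.from(results)) {
    if (r.isFinal) finalText += r[0].transcript;
    else interimText += r[0].transcript;
  }
  return { finalText, interimText };
}

// Package the recorded chunks and the confirmed transcript for a multipart POST.
function buildUpload(chunks, mimeType, text) {
  const blob = new Blob(chunks, { type: mimeType });
  const ext = mimeType.startsWith('audio/mp4') ? 'm4a' : 'webm';
  const form = new FormData();
  form.append('audio', blob, `ptt.${ext}`);
  form.append('text', text.trim());
  return form;
}

// Send step, intended to be wired to the Send button after release.
async function sendRecording(chunks, mimeType, text, url = '/api/ptt') {
  const res = await fetch(url, { method: 'POST', body: buildUpload(chunks, mimeType, text) });
  if (!res.ok) throw new Error(`Upload failed: ${res.status}`);
  return res;
}
```

In the browser, the type would be chosen with `pickMimeType((t) => MediaRecorder.isTypeSupported(t))`, the chunks collected from the recorder's `dataavailable` events, and `splitTranscript(e.results)` called inside `rec.onresult` so the preview can style interim text differently from final text. Sending multipart form data keeps the backend simple for either Node or Flask, since both can read a file field plus a text field from one request.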