Speech to Code

React, Python, FastAPI, LLM, Speech to Text, Tailwind CSS

About this project

A web application that uses Large Language Models to convert spoken language into executable code.

The Story Behind It

I originally developed this project in the summer of 2024, when Claude 3.5 Sonnet (Anthropic's leading coding model at the time) was released with strict rate limits. As an active user of LLMs for code generation, I needed a way to keep working after hitting those limits and to give the language model context about my codebase efficiently.

Through experimentation, I discovered that specific system prompts were crucial for generating complete, working code rather than partial solutions. I also found that speech-to-text was essential for rapidly providing context to the language models. This led me to create a tool that combines all three elements: prompt management, speech input, and codebase context injection.

While the landscape has evolved significantly since then, with improved rate limits and advances in AI-assisted IDEs, I still find this tool valuable for specific use cases. Today, I primarily use it for quick speech-to-text conversion, sophisticated prompt construction, and analyzing repository structure (including token counts per file). The ability to dump an entire codebase into a prompt and understand its structure remains particularly useful.

Note: This application requires local installation as it needs direct filesystem access to analyze and interact with your codebase.
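
To make the repository analysis mentioned above more concrete, here is a minimal sketch of per-file token counting. It is not the project's actual implementation: the function names, the skipped directory names, and the 4-characters-per-token rule of thumb are all assumptions made for illustration. A real tokenizer would give more accurate counts.

```python
from pathlib import Path

# Assumption: roughly 4 characters per token, a common rule of thumb.
CHARS_PER_TOKEN = 4
# Assumption: directories that are usually not worth sending to a model.
SKIP_DIRS = {".git", "node_modules", "__pycache__"}


def estimate_tokens(text: str) -> int:
    """Rough token estimate (ceiling of length / CHARS_PER_TOKEN)."""
    return (len(text) + CHARS_PER_TOKEN - 1) // CHARS_PER_TOKEN


def token_counts(root: str) -> dict[str, int]:
    """Map each readable text file under root to its estimated token count."""
    counts: dict[str, int] = {}
    root_path = Path(root)
    for path in sorted(root_path.rglob("*")):
        if not path.is_file():
            continue
        relative = path.relative_to(root_path)
        if any(part in SKIP_DIRS for part in relative.parts):
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        counts[relative.as_posix()] = estimate_tokens(text)
    return counts
```

Calling `token_counts(".")` from a repository root returns a dictionary keyed by relative path, which can be sorted by value to find the files that would consume most of a model's context window.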

Key Features

Advanced Prompt Composer: Combine speech, repository files, and manual text input
Multi-Model Support: Integration with multiple LLM providers (OpenAI, Anthropic, Google, xAI)
Repository Integration: Interactive file viewer for repository navigation
Transcription Management: Real-time speech-to-text conversion
System Prompt Management: Create and edit system prompts
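
As a rough illustration of what the prompt composer does, the sketch below joins a speech transcript, manually typed text, and selected repository files into a single prompt. The `PromptParts` class, the `compose_prompt` function, and the section labels are hypothetical names chosen for this example, not the application's real API.

```python
from dataclasses import dataclass, field


@dataclass
class PromptParts:
    """Hypothetical container for the pieces a prompt composer combines."""
    transcript: str = ""   # text produced by speech-to-text
    manual_text: str = ""  # text typed by the user
    files: dict[str, str] = field(default_factory=dict)  # path -> contents


def compose_prompt(parts: PromptParts) -> str:
    """Join the non-empty pieces into one prompt, with files last."""
    sections: list[str] = []
    if parts.transcript.strip():
        sections.append("Spoken instructions:\n" + parts.transcript.strip())
    if parts.manual_text.strip():
        sections.append("Additional notes:\n" + parts.manual_text.strip())
    for path, content in parts.files.items():
        sections.append(f"File: {path}\n{content.rstrip()}")
    return "\n\n".join(sections)
```

Empty inputs are skipped, so a prompt built from speech alone contains no stray headings. The system prompt is deliberately left out of this sketch, since most LLM APIs accept it as a separate parameter rather than as part of the user message.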

Tech Stack

Frontend: React, Tailwind CSS
Backend: Python, FastAPI
LLM Integration: OpenAI API, Anthropic API, Google API, xAI API
Speech to Text: OpenAI Whisper API
