I recently added voice support to aider, so now you can AI pair program with your voice. Aider uses the extremely accurate Whisper API to transcribe your voice and sends the resulting requests to GPT-4 along with relevant context about your codebase [0]. This combination is surprisingly effective.
GPT can better interpret casually phrased voice requests because aider sends along the relevant code context. When speaking about code, it's common to leave out many precise syntactic details that you might include when typing.
For example, you might type: "Use the Voice.record_and_transcribe() function to convert the audio to text"
But you would probably speak: "use the voice record and transcribe function to convert the audio to text".
Aider's code context allows GPT to recognize that the casually phrased spoken version is referring to a specific existing function.
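As a toy illustration of why that mapping is recoverable, a few lines of Python can normalize both an identifier and a spoken request so they become directly comparable. (This is just a sketch of the underlying idea, not aider's actual mechanism; aider includes the code context in the prompt and GPT does the matching. The `Voice.play_audio` identifier is invented for the example.)

```python
import re

def normalize(text: str) -> str:
    """Lower-case and split on punctuation/underscores so spoken and
    written forms of an identifier compare equal."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))

def find_mentioned_functions(spoken: str, identifiers: list[str]) -> list[str]:
    """Return identifiers whose normalized form appears as a phrase
    inside the normalized spoken request."""
    spoken_norm = normalize(spoken)
    return [ident for ident in identifiers if normalize(ident) in spoken_norm]

# Hypothetical identifiers from a codebase; only the first is real in the post.
identifiers = ["Voice.record_and_transcribe", "Voice.play_audio"]
request = "use the voice record and transcribe function to convert the audio to text"
print(find_mentioned_functions(request, identifiers))
# → ['Voice.record_and_transcribe']
```

Normalizing `"Voice.record_and_transcribe"` yields `"voice record and transcribe"`, which appears verbatim in the normalized spoken request, so the casual phrasing still pins down one specific function.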
[0] https://aider.chat/docs/ctags.html