Queuing multiple transcriptions with whisper.el speech recognition

I want to be able to talk out loud and have the ideas go into Emacs. I can do this in a number of different ways:

  1. I briefly demonstrated a step-by-step approach using natrys/whisper.el with a single file. I press one keyboard shortcut to start recording and another to stop, and whisper.el transcribes the recording in the background. However, if I press the shortcut to start recording again while a transcription is still running, whisper.el offers to interrupt the transcription, which is not what I want. I want to keep talking and have it process results as they come in.
  2. I’m also experimenting with the Web Speech API in Google Chrome to do continuous speech recognition, which I can get into Emacs using a WebSocket.
  3. What I’ve just figured out is how to layer a semi-continuous interface for speech recognition on top of whisper.el. While it’s processing in the background, I can press a keyboard shortcut (I’m using numpad 9 to call my-whisper-continue) to stop the previous recording, queue it for processing, and start the next recording. If I use this keyboard shortcut to separate my thoughts, Whisper has a much easier time making sense of the whole sentence or paragraph, instead of relying on the sliding 30-second context window that many streaming approaches to speech recognition use.
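
The stop, queue, and start-again flow from the third approach can be sketched roughly like this. This is a minimal illustration in Python, not the actual my-whisper-continue code: the `TranscriptionQueue` class, its method names, and the `recording-N.wav` file names are all made up for the example, and `transcribe` stands in for whatever really runs Whisper on a file.

```python
import queue
import threading


class TranscriptionQueue:
    """Queue finished recordings and transcribe them one at a time in the background.

    Hypothetical sketch: `transcribe` is any callable that takes a recording
    name and returns text. It stands in for a real call to Whisper.
    """

    def __init__(self, transcribe):
        self._transcribe = transcribe
        self._pending = queue.Queue()   # recordings waiting to be transcribed
        self.results = []               # transcripts, in the order recorded
        self._current = None            # name of the recording in progress
        self._counter = 0
        # A single worker keeps the transcripts in recording order.
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def _run(self):
        while True:
            recording = self._pending.get()
            try:
                self.results.append(self._transcribe(recording))
            finally:
                self._pending.task_done()

    def continue_recording(self):
        """Stop the current recording (if any), queue it, and start the next one."""
        if self._current is not None:
            self._pending.put(self._current)
        self._counter += 1
        self._current = f"recording-{self._counter}.wav"  # made-up file name
        return self._current

    def stop(self):
        """Stop the current recording and queue it without starting another."""
        if self._current is not None:
            self._pending.put(self._current)
            self._current = None

    def wait(self):
        """Block until every queued recording has been transcribed."""
        self._pending.join()
```

The point of the design is that pressing the shortcut never waits on or interrupts a running transcription. It only adds to the queue, and the background worker catches up on its own.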

Question: Did you fix the keyboard delay you’ve got while speech catches what you’re saying?

Sometimes, when the speech recognition kicks in, my computer gets busy. When my computer gets really busy, it doesn’t process my keystrokes in the right order, which is very annoying because then I have to delete the previous word and retype it. I haven’t sorted that out yet, but I probably have to lower the priority of some processes. On the plus side, as I mentioned, if I dictate things instead of typing them, I don’t run into that problem at all.
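
As an untested idea rather than a fix I have confirmed, lowering the priority could mean starting the transcription process with a higher niceness value. Here is a small Python sketch of that, assuming a POSIX system. The `run_low_priority` helper is hypothetical, and whisper.el itself would need the equivalent done on the Emacs side.

```python
import os
import subprocess
import sys


def run_low_priority(command, increment=10):
    """Run `command` with its niceness raised by `increment` (POSIX only).

    A higher niceness asks the scheduler to favor other processes,
    such as the one handling keystrokes, when the CPU is busy.
    """
    return subprocess.run(
        command,
        preexec_fn=lambda: os.nice(increment),  # runs in the child before exec
        capture_output=True,
        text=True,
        check=True,
    )


if __name__ == "__main__":
    # Demonstration: the child process reports its own niceness.
    result = run_low_priority(
        [sys.executable, "-c", "import os; print(os.nice(0))"]
    )
    print("child niceness:", result.stdout.strip())
```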
