Best current methods for fine-tuning Whisper on domain-specific vocabulary? [P]

Hey everyone,

I’m wondering whether there are any newer or more effective methods for fine-tuning Whisper on domain-specific speech. I’m working on a project where the model needs to reliably detect specific words and technical terms. The vocabulary and context are mostly in Spanish.

Does anyone have experience with a similar use case? Roughly how many hours of labeled audio would it take before the model converges?

I know about LoRA, QLoRA, and Spectrum, but I’m curious whether there are any newer or better ways to adapt Whisper to a specific vocabulary.

Any help is welcome!

submitted by /u/gothenjoyer_