[fpc-pascal] ideas for using a speech-to-text model with a fpc program

tsiegel at softcon.com tsiegel at softcon.com
Mon Sep 25 03:06:42 CEST 2023


I'm not exactly sure what the end goal is, but Microsoft has API calls 
for text-to-speech.  I don't know if they have any for the language 
you're using, but if they do, sending the text to the speech routines is 
fairly straightforward in FPC.  In fact, there was a discussion about 
that very thing here a couple of months ago; you can find it in the 
archives if you choose to go that route.
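
For the speech-to-text direction and the TProcess idea raised in the 
quoted message below, here is a rough sketch of what the script side 
could look like.  It is only an illustration under assumptions: the 
file name transcribe.py, the function names and the JSON reply format 
are all made up for this example, and the openai-whisper Python package 
is assumed to be installed.  The script takes the path of a saved wav 
file and prints a single line of JSON that the calling program can 
parse.

```python
import json
import sys
import wave


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file, using only the standard library."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def transcribe_with_whisper(path):
    """Transcribe one file with openai-whisper (assumed to be installed)."""
    import whisper  # imported lazily so the rest of the script loads without it

    model = whisper.load_model("medium")  # the model size mentioned in the thread
    result = model.transcribe(path, language="pt")
    return result["text"]


def build_reply(path, transcribe=transcribe_with_whisper):
    """Build the one-line JSON reply (hypothetical format) for the caller."""
    try:
        duration = wav_duration_seconds(path)
        text = transcribe(path).strip()
        return json.dumps({"ok": True, "seconds": duration, "text": text})
    except Exception as exc:
        # Report failures as data so the caller never has to parse a traceback.
        return json.dumps({"ok": False, "error": str(exc)})


if __name__ == "__main__":
    # Hypothetical usage: python transcribe.py recording.wav
    if len(sys.argv) > 1:
        print(build_reply(sys.argv[1]))
    else:
        print(json.dumps({"ok": False, "error": "usage: transcribe.py FILE.wav"}))
```

On the FPC side, something like RunCommand from the process unit (or a 
TProcess with piped output) should be able to start the script and 
capture that line, and fpjson could parse it.  Note that this sketch 
reloads the model on every call, which is likely to dominate the run 
time for 4-5s clips, so a long-running helper process may turn out to 
be the better design.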


On 9/25/2023 12:06 AM, Rafael Picanço via fpc-pascal wrote:
> Hi guys,
>
> I am looking for some advice on how to use a speech-to-text model with 
> an FPC program designed to teach reading of invented words 
> composed from 8 Brazilian Portuguese phonemes (four consonants and 
> four vowels).
>
> So, right now 
> (https://github.com/cpicanco/stimulus-control-sdl2/blob/hanna/src/sdl.app.audio.recorder.devices.pas) 
> the program uses SDL2 to record short 4-5s audio streams and save each 
> recording to a wav file using fpwavwriter. Each audio stream/file is 
> supposed to be a word spoken by a student during a recording/playback 
> session of a word presented on screen. The participant will click a 
> button to finish the session. Then, the program will start a 
> speech-to-text routine and give some feedback.
>
> There will be two speech-to-text routines. The first one will be a 
> human transcription (nothing new here for me). The second one will be 
> an AI transcription.
>
> I am looking for an approach to read the raw stream (or the saved file 
> if there is no direct stream support) and pass it to a speech AI model (for 
> example, Whisper) and then return some text output for further processing.
>
> Using Python and Whisper Medium (multilingual), I got some good 
> (although slow) results without any fine-tuning. However, I am 
> considering using Transformers if any fine-tuning turns out to be 
> necessary.
>
> So, in this context, what would be "the way to go" for using the final 
> model with Free Pascal? Calling a script with TProcess? Could you 
> please shed some light on this?
>
> Best regards,
> R
>
> _______________________________________________
> fpc-pascal maillist  -  fpc-pascal at lists.freepascal.org
> https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal

