Speech to Text / Voice Recognition - Processing Forum


florian-schulz


Speech to Text / Voice Recognition

in Library and Tool Development • 2 years ago

I’ve been working on a library that makes it easy to capture microphone input and transcribe it to text. It uses the same technology that Google Chrome uses for input fields where you can optionally use your voice instead of your keyboard. You can read more about it on the project page.

Please let me know if you have suggestions on how I could improve the design of the library or anything else!


Replies(19)

Florian Schulz


Re: Speech to Text / Voice Recognition

1 year ago

I just wanted to announce some improvements to the stability of the library. Please report any bugs that you come across! Also: I’m interested in projects that make use of my library.

luizap

Re: Speech to Text / Voice Recognition

1 year ago

I've been experimenting with this and hopefully I'll have some interesting results to share soon.

Thanks for the library, by the way. Nice documentation!

Florian Schulz


Re: Re: Speech to Text / Voice Recognition

1 year ago

I’m excited!

jeromesaintclair


Re: Speech to Text / Voice Recognition

1 year ago

Hey.

The Graffiti Research Lab France is using your library in its tagEULE project:

We use speech recognition to display graffiti using fonts based on GML files, along with the GML4U library.

It's been really useful and saved us some dev time.

However, we run into problems when using our installation in a fairly noisy environment (outside or in a crowd) and with "not as good as expected" wifi network coverage.

I tried different things, such as:

Case 1- Set the threshold manually

Case 2- Disable autorecord

Under those conditions, my first impression is that:

Case 1: the recording constantly starts and stops, and we either get partial results or queue too many recordings.

Case 2: it doesn't work very well (probably because of the noise around us).

We also noticed that deep voices are recognized better than high-pitched ones (in French).

Dunno if there is a way to filter voices so they fit the range of voices that the Google API can handle.

Is that something you are aware of, or do you have workarounds?

I'm going to debug it in Eclipse to get a better understanding of how it works exactly and try to spot where our issues come from.

Will let you know if I find anything useful to improve the lib.

Thanks for the lib and keep up the good work.

++

J

Florian Schulz


Re: Re: Speech to Text / Voice Recognition

1 year ago

Actually the service from Google is pretty limited regarding the length of the recording.

I’ve always tested under good conditions with very little background noise. If the recording itself is too bad (you could test it directly in Chrome), you may need to work on the hardware. On the other hand, I handle the auto-record really simply: if something is louder than the threshold value, I record until the volume stays below it for at least half a second.

I’m sure one could improve the auto-record and better distinguish between noise and actual speech by constantly averaging the volume or the like.
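To make the idea above concrete, here is a minimal sketch of a threshold gate with a half-second hold and a running-average noise floor. It is not the library's actual code: the class name `VoiceGate`, its parameters, and the choice to adapt the floor only while idle are all assumptions made for illustration.

```java
/** Hypothetical auto-record gate; names and defaults are illustrative, not from the STT source. */
class VoiceGate {
    private final float margin;     // how far above the noise floor a level must be to count as speech
    private final long holdMillis;  // how long the level must stay quiet before recording stops
    private final float smoothing;  // weight of each new reading in the running average (0..1)
    private float noiseFloor;       // running average of the level while nothing is being recorded
    private boolean recording;
    private long lastLoudAt;

    VoiceGate(float initialFloor, float margin, long holdMillis, float smoothing) {
        this.noiseFloor = initialFloor;
        this.margin = margin;
        this.holdMillis = holdMillis;
        this.smoothing = smoothing;
    }

    /** Feed one level reading; returns true while a recording should be running. */
    boolean update(float level, long nowMillis) {
        boolean loud = level > noiseFloor + margin;
        if (loud) {
            recording = true;
            lastLoudAt = nowMillis;
        } else {
            // Stop only after the level has stayed quiet for the whole hold time.
            if (recording && nowMillis - lastLoudAt >= holdMillis) {
                recording = false;
            }
            // Adapt the floor only while idle, so speech itself does not raise it.
            if (!recording) {
                noiseFloor += smoothing * (level - noiseFloor);
            }
        }
        return recording;
    }

    float noiseFloor() {
        return noiseFloor;
    }

    public static void main(String[] args) {
        VoiceGate gate = new VoiceGate(0.10f, 0.05f, 500, 0.1f);
        // Simulated level readings, one every 100 ms.
        float[] levels = {0.10f, 0.12f, 0.40f, 0.35f, 0.10f, 0.10f, 0.10f, 0.10f, 0.10f, 0.10f};
        for (int i = 0; i < levels.length; i++) {
            long now = i * 100L;
            System.out.println(now + " ms  level=" + levels[i] + "  recording=" + gate.update(levels[i], now));
        }
    }
}
```

In a sketch you would feed `update()` the current input level once per frame. Note that if the surrounding noise jumps suddenly and stays high, this gate keeps recording, so a maximum recording length would probably be needed on top of it.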

Right now I have no resources to work on the library, but there are a few other things that I need to work on (e.g. using the library to transcribe pre-recorded files).

I’ll keep you updated!

Manindra29


Re: Speech to Text / Voice Recognition

1 year ago

This is one of the coolest Processing libraries I've seen! Transcribing is so easy!

The results were decent enough for my use.

Great job! Thanks for sharing.

mimink21

Re: Re: Speech to Text / Voice Recognition

1 year ago

It says "Speech could not be interpreted". Any ideas why? I'm using the example from Florian's site.

dackdel

Re: Speech to Text / Voice Recognition

1 year ago

Is there a link to the updated library, or did you just replace the older one? Also, great work, by the way. I'll let you know when and where I might be using it. What do you think about the new dictation on Mountain Lion? And do you have a list of the languages this supports?

Florian Schulz


Re: Speech to Text / Voice Recognition

1 year ago

The most recent library is available at www.stt.getflourish.com

I haven't looked into Mountain Lion's dictation yet, but hopefully we can utilize that to get even better results!

STT supports all the major languages that Google supports:

en, de, fr, es, and even Chinese, which is zh.

nchandol

Re: Speech to Text / Voice Recognition

8 months ago

Hi guys,

I am trying to use this amazing library but I keep getting a Minim error. I have tried to fix it, but no luck so far... Has anybody else had the same problem?

So the error I am getting is this:

13:17:59 STT info: Manual mode enabled. Use begin() / end() to manage recording.

==== JavaSound Minim Error ====

==== AudioRecorder.save: Error attempting to save buffer to /Users/nikolaoschandolias/Documents/workspace/STT/bin/data2013-01-27-13-17-59/0.wav, the output file is empty.

==== JavaSound Minim Error ====

==== Unsupported Audio File: not a MPEG stream:null

Exception in thread "Animation Thread" java.lang.NullPointerException

at ddf.minim.javasound.JSBufferedSampleRecorder.save(JSBufferedSampleRecorder.java:173)

at ddf.minim.AudioRecorder.save(AudioRecorder.java:107)

at com.getflourish.stt.STT.startListening(STT.java:422)

at com.getflourish.stt.STT.onBegin(STT.java:367)

at com.getflourish.stt.STT.begin(STT.java:133)

at com.getflourish.stt.LibTest.keyPressed(LibTest.java:30)

at processing.core.PApplet.handleKeyEvent(PApplet.java:2931)

at processing.core.PApplet.dequeueEvents(PApplet.java:2466)

at processing.core.PApplet.handleDraw(PApplet.java:2153)

at processing.core.PGraphicsJava2D.requestDraw(PGraphicsJava2D.java:193)

at processing.core.PApplet.run(PApplet.java:2020)

at java.lang.Thread.run(Thread.java:680)

I am using Processing 2.07 (I have tried it with 1.51 too) and I am on Mountain Lion... Do you have any idea or a possible solution?

Regards,

Nikos

Florian Schulz


Re: Re: Speech to Text / Voice Recognition

8 months ago

Hey Nikos, it's working on my machine with the same version of Processing and OSX. Are you using it from Eclipse?

nchandol

Re: Speech to Text / Voice Recognition

8 months ago

Hey Florian,

Thanks for the answer! I have found the solution: there was a problem with the wav file encoding.

It works perfectly now!

I run it directly in Processing, but I might move the whole project to Eclipse.

I will keep you posted with further news on my project!

Thanks once again!

Florian Schulz


Re: Speech to Text / Voice Recognition

8 months ago

Awesome! I'm glad to hear that it works now. Good luck with your project and let me know when you've got anything to try or show :)

yackob

Re: Speech to Text / Voice Recognition

8 months ago

Hi Florian, I have been trying to use the library. I tried the example on the project page, but I keep getting an error message saying class STT not found.

Exception in thread "Animation Thread" java.lang.NoClassDefFoundError: ddf/minim/Recordable

at sketch_130204b.setup(sketch_130204b.java:33)

at processing.core.PApplet.handleDraw(PApplet.java:2103)

at processing.core.PGraphicsJava2D.requestDraw(PGraphicsJava2D.java:190)

at processing.core.PApplet.run(PApplet.java:2006)

at java.lang.Thread.run(Thread.java:662)

Caused by: java.lang.ClassNotFoundException: ddf.minim.Recordable

at java.net.URLClassLoader$1.run(URLClassLoader.java:202)

at java.security.AccessController.doPrivileged(Native Method)

at java.net.URLClassLoader.findClass(URLClassLoader.java:190)

at java.lang.ClassLoader.loadClass(ClassLoader.java:306)

at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)

at java.lang.ClassLoader.loadClass(ClassLoader.java:247)

... 5 more

Thanks,
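For anyone hitting the same trace: a NoClassDefFoundError for ddf/minim/Recordable means the JVM could not load the Minim classes at runtime, which the STT classes need. A small check like the one below can confirm whether Minim is visible to the sketch. The helper name `ClasspathCheck` is made up for this example, and the likely cause (Minim not imported or not on the classpath) is an assumption rather than something confirmed in this thread.

```java
/** Hypothetical helper for diagnosing a missing library; not part of STT or Minim. */
class ClasspathCheck {

    /** Returns true if the named class can be found by this class's class loader. */
    static boolean isAvailable(String className) {
        try {
            // initialize=false: only look the class up, do not run its static initializers.
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The class named in the stack trace above.
        System.out.println("Minim found: " + isAvailable("ddf.minim.Recordable"));
    }
}
```

If this reports that the class is missing, adding Minim to the sketch (for example with an `import ddf.minim.*;` line, or by putting the Minim jars on the classpath in Eclipse) would be the first thing to try.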

nchandol

Re: Speech to Text / Voice Recognition

7 months ago

Hi again everybody :)

I am interested to know if someone has attempted to use this library in plain Java instead of Processing... Does anybody have any idea where I should start?

Florian, did you build this in Java first and then turn it into a Processing lib?

Could you please let me know?

Thanks in advance!

Florian Schulz


Re: Speech to Text / Voice Recognition

7 months ago

Yes, I've built the library with Java in Eclipse and only the main class depends on PApplet to fire the transcription events. You should be able to extract that and get rid of the Processing part if you need to.

Check out the source code / java files: https://github.com/getflourish/STT/tree/master/src/com/getflourish/stt

And here is the full repository: https://github.com/getflourish/STT
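As a rough illustration of what "getting rid of the Processing part" could look like, the sketch below replaces the PApplet-based event with a plain Java listener. This is not the library's code: `PlainTranscriber`, `Listener`, and `publish` are hypothetical names, and the recognition result is simulated.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the recognition core, with no dependency on PApplet. */
class PlainTranscriber {

    /** Hypothetical callback type that takes the place of the Processing event. */
    interface Listener {
        void onTranscription(String utterance, float confidence);
    }

    private final List<Listener> listeners = new ArrayList<>();

    void addListener(Listener listener) {
        listeners.add(listener);
    }

    /** Would be called whenever a result comes back from the speech service. */
    void publish(String utterance, float confidence) {
        for (Listener listener : listeners) {
            listener.onTranscription(utterance, confidence);
        }
    }

    public static void main(String[] args) {
        PlainTranscriber transcriber = new PlainTranscriber();
        transcriber.addListener((utterance, confidence) ->
                System.out.println(utterance + " (" + confidence + ")"));
        // Simulated result; a real port would call publish() from the recognition code.
        transcriber.publish("hello world", 0.9f);
    }
}
```

The point of the design is that the recording and request code only knows about the listener interface, so the same core could be driven from a Processing sketch or from a plain Java application.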

ojik88

Re: Speech to Text / Voice Recognition

2 months ago

Thanks for the library, but when it runs for more than 20 minutes in autorecord mode, an OutOfMemoryError occurs (increasing the amount of available memory does not help).

wemperor

Re: Speech to Text / Voice Recognition

17 days ago

This is a super awesome library!

It finally brings me close to easy voice control of Arduinos driven from a Processing sketch.

I wonder if I can use it to control an indoor airship; it should be reasonably fast.

Florian Schulz


Re: Speech to Text / Voice Recognition

17 days ago

Thanks for your feedback! As others have already mentioned, you should check how the library performs over time. This is an ongoing issue that I haven't been able to fix yet. So better test before your airship drops ;)
