Question about streaming on Ruby

I have a problem and I don’t even know where to start.
My professor asked me to build a system in Ruby which uses streaming in
order to play voice.
It is a system where the server will give responses to what the user says.
So, basically if the user says “hello”, the system will reply saying “Fine,
and you?”.
At the moment I don’t have to care about the voice recognition. I just have
to build something that can save a sound file on the server, and something
where the user can press a button called “record my voice”; as soon as the
server finishes recording the user’s voice, it will play the response file.

My question is: how to start doing this with Ruby?
Can Ruby do this?

I asked my friends at my lab, and one friend said I had better forget
Ruby and use active-X.
But my professor said I should try to develop it using Ruby.

Rob

Hello Rob,

Rob wrote:

My question is: how to start doing this with Ruby?
Can Ruby do this?

Yes! In fact, GStreamer is probably able to do this, and you can call
GStreamer via Ruby using Ruby-GStreamer.

I have never tried this, but I think you could record audio from the
microphone using the osssrc element from the ossaudio plugin.

You will find more information about GStreamer on its homepage:

 http://www.gstreamer.net

And for Ruby bindings:

 http://www.freesoftware.fsf.org/ruby-gst

Current API (CVS):

 http://lrz.samika.net/ruby-gst/api/index.html

Ruby-GStreamer 0.1.1 will be released in a few days, featuring a lot of
improvements.

Cheers,

Laurent

“Rob” robson@magario.com wrote in news post
news:064801c34081$3c812000$a8c70b85@CPQ22906761423…

I have a problem and I don’t even know where to start.
My professor asked me to build a system in Ruby which uses streaming in
order to play voice.
It is a system where the server will give responses to what the user says.
So, basically if the user says “hello”, the system will reply saying “Fine,
and you?”.
At the moment I don’t have to care about the voice recognition. I just have
to build something that can save a sound file on the server, and something
where the user can press a button called “record my voice”; as soon as the
server finishes recording the user’s voice, it will play the response file.

My question is: how to start doing this with Ruby?
Can Ruby do this?

I asked my friends at my lab, and one friend said I had better forget
Ruby and use active-X.
But my professor said I should try to develop it using Ruby.

Apart from the sound recognition stuff this sounds like an ordinary web
application. You can have a look at eruby and I think there’s an Apache
mod_ruby.

The difficult part - independent of programming language used - will be
the sound recording of the user’s voice.

robert

Sure, at worst something like:

 system("wavrec -r 5 myfile.wav")

Or I guess you could even open /dev/audio or /dev/dsp and read/write from
them directly.
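
A minimal sketch of the "read the device directly" idea, in case it helps. It assumes an OSS-style /dev/dsp left at its usual defaults (8000 Hz, 8-bit unsigned, mono), which you should verify on your own machine. The method names `wav_header` and `record_wav` are made up for this example, and the recording part is untested here:

```ruby
# Build the 44-byte header of an uncompressed PCM WAV file.
def wav_header(data_size, sample_rate, channels, bits_per_sample)
  block_align = channels * bits_per_sample / 8
  byte_rate   = sample_rate * block_align
  ["RIFF", 36 + data_size, "WAVE",
   "fmt ", 16, 1, channels, sample_rate, byte_rate, block_align, bits_per_sample,
   "data", data_size].pack("a4Va4a4VvvVVvva4V")
end

# Read `seconds` of raw samples from the sound device and save them as a WAV.
# ASSUMPTION: the device delivers 8000 Hz, 8-bit unsigned, mono data.
def record_wav(path, seconds, device = "/dev/dsp")
  rate = 8000
  pcm = File.open(device, "rb") { |dsp| dsp.read(rate * seconds) }
  File.binwrite(path, wav_header(pcm.bytesize, rate, 1, 8) + pcm)
end
```

Something like `record_wav("myfile.wav", 5)` would then roughly correspond to the wavrec call above.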

Cheers,

Brian.

On Wed, Jul 02, 2003 at 07:03:57PM +0900, Rob wrote:

press a button called “record my voice” and that as soon as the server
finishes recording the user’s voice, the server will play the response file.

My question is: how to start doing this with Ruby?
Can Ruby do this?

Hi,

“Rob” robson@magario.com wrote:

At the moment I don’t have to care about the voice recognition, I just have to
build something that can save a sound file in the server and something where
the user can press a button called “record my voice” and that as soon as the
server finishes recording the user’s voice, the server will play the response file.

My question is: how to start doing this with Ruby?
Can Ruby do this?

I asked my friends at my lab, and one friend said I should better forget
Ruby and use active-X.
But my professor said I should try to develop it using Ruby.

Since you mention active-X I’m assuming you might be under Windows.

Here’s a way to play sounds in Ruby under Windows:
http://www.ruby-talk.com/55423

As for recording… I haven’t recorded under Windows, but there’s a high-level
API available for recording, that we should be able to call from Ruby.

This page has examples using Windows’ winmm.dll MCISendString function to
both play and record sounds. The examples are in VB, but translate easily
to Ruby using Ruby’s Win32API module.

(Hmm… the page seems to be down at the moment… but it’s still in Google’s
cache. If you search for “MCISendString record” [without the quotes] in
Google, this site should come up as one of the first hits, called “Visual
Basic WAV”… then click on the ‘Cached’ link. :-) )

Here’s a Ruby example using MCISendString to play a sound:

require 'Win32API'
MCISendString = Win32API.new("winmm", "mciSendString", ['P', 'P', 'L', 'L'], 'L')

MCISendString.call('open m:\download\snd\columbus.wav type waveaudio alias voice1', nil, 0, 0)
MCISendString.call('play voice1', nil, 0, 0)

…I’ve tried the above on my system and it worked.

I haven’t tried the following, but the command lines are translated from
the above web page:

# prepare to record in buffer called "capture"
MCISendString.call("open new type waveaudio alias capture", nil, 0, 0)

# set bits-per-sample to 8 or 16
MCISendString.call("set capture bitspersample 8", nil, 0, 0)

# set sample rate (11025, 22050, or 44100)
MCISendString.call("set capture samplespersec 11025", nil, 0, 0)

# set mono or stereo (1 or 2 channels)
MCISendString.call("set capture channels 1", nil, 0, 0)

# apparently this will record until the user presses "stop"
MCISendString.call("record capture", nil, 0, 0)

# alternately it seems you can specify a range in the buffer
# in which to record:
MCISendString.call("record capture from 2000 to 4000", nil, 0, 0)

# ...so presumably this would allow you to record for a few
# seconds and stop automatically without the user having to
# click anything.

# finally to save the buffer to a file
# (single quotes, so Ruby keeps the backslash in the path):
MCISendString.call('save capture c:\NewWave.wav', nil, 0, 0)

…Hmmm… In your case, if you really need to do streaming,
a lower-level API would probably be needed. Are you saying you’d
need to record a sound (what the user speaks) locally, stream that
to the server, have the server do (eventual) voice recognition, and
respond with an audio stream back to the user?

Since the phrases spoken are probably short, I wonder if you might
be able to just use a non-streamed API like MCISendString? And then
just transmit the recorded audio file to the server?
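
For the "just transmit the recorded file" route, here is a rough sketch using Ruby's standard net/http library. The URL, the `/upload` path and the `audio/x-wav` content type are assumptions for illustration, as are the method names; the server side would have to be written to accept a raw POST body like this:

```ruby
require "net/http"
require "uri"

# Build (but do not send) a POST request carrying the WAV bytes as its body.
def build_upload_request(url, wav_bytes)
  uri = URI.parse(url)
  request = Net::HTTP::Post.new(uri.path)
  request["Content-Type"] = "audio/x-wav" # assumed content type
  request.body = wav_bytes
  [uri, request]
end

# Read a recorded file from disk and send it to the server.
def upload_recording(url, wav_path)
  uri, request = build_upload_request(url, File.binread(wav_path))
  Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
end
```

A call such as `upload_recording("http://localhost:8080/upload", 'c:\NewWave.wav')` would return the server's response object, whose body could in turn be the reply audio.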

If you really need streaming, you might want to look into this
lower-level WAV API:

Here’s some info (on AllAPI.net) on calling waveInOpen() from VB, which should
be easy to translate to Ruby’s Win32API like the MCISendString
stuff above…

Hope this helps,

Bill

I forgot to mention that GStreamer is only available on POSIX
operating systems, such as GNU/Linux and FreeBSD for instance.

http://gstreamer.net/status/?category=7

Laurent

Hi, Bill.

Thanks a lot for the hints.
I am under Windows, but the problem is that my professor wants this
application to be compatible with any kind of server (Windows and Linux).
I was thinking that this application is similar to Yahoo voice chat, since
it deals with streaming.
But Yahoo voice chat also needs Windows to run.
Is it possible to make an application for this which runs on Windows,
Mac and Linux alike, and can be accessed from a browser like Microsoft
Internet Explorer, Mozilla or Netscape?

Rob

----- Original Message -----
From: “Bill Kelly” billk@cts.com
To: “ruby-talk ML” ruby-talk@ruby-lang.org
Sent: Wednesday, July 02, 2003 11:01 PM
Subject: Re: Question about streaming on Ruby

[...]

What you want to do is very easy to do in Squeak.

Ciao,
-A

so you have to

  • record sound on the client
  • send it to the server
  • analyze it on the server
  • figure out a reply on the server
  • send it back to the client
  • play the response on the client.

the stuff on the server, then, is a matter of

  • dealing with sound over the network
  • doing bunches of calculations on it.

the client is what needs to do all the recording and playing. i’m sure
SDL works across windows and unix and probably mac, to play sound, and
it has bindings to ruby, but i’m pretty sure it won’t get you far for
recording sound. you could use java and the java media framework on the
client; but the computer running it would require the right version of
the JRE and the JMF… >20MB download, plus hassle of installing.

on Linux you can use LinuxDevices for playing and recording and
ruby-audiofile for reading and writing .wav/.au/.aiff/etc. files. (both
are in RAA.) on Windows that funky MCI thingy seems more likely to work.
Neither of these happens inside the browser.

other than a java applet using JMF, an ActiveX control or a plugin of
your own i’m not sure how you would record from inside a browser. but
the server part should be easier: networking and math are better
supported across platforms than recording sound.
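
the network part of the steps above could be sketched roughly like this, using only ruby's standard socket library. the framing (a 4-byte big-endian length followed by the payload) and all the method names are assumptions made up for this example, not an existing protocol; recording, analysis and playback are left out entirely:

```ruby
require "socket"

# Write one frame: 4-byte big-endian length, then the payload bytes.
def write_frame(io, payload)
  io.write([payload.bytesize].pack("N"))
  io.write(payload)
end

# Read one frame; returns nil if the peer closed before sending a full header.
def read_frame(io)
  header = io.read(4)
  return nil if header.nil? || header.bytesize < 4
  io.read(header.unpack("N").first)
end

# Server side: accept one client, receive its recording, send a canned reply.
# Returns the received recording so the caller can save or analyze it.
def serve_once(server, reply_bytes)
  client = server.accept
  recording = read_frame(client)
  write_frame(client, reply_bytes)
  recording
ensure
  client.close if client
end

# Client side: send a recording and return the server's reply bytes.
def exchange(host, port, recording)
  TCPSocket.open(host, port) do |sock|
    write_frame(sock, recording)
    read_frame(sock)
  end
end
```

since this is plain TCP, the same code should run on windows and unix servers alike; only the recording and playing on the client stay platform-specific.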

if you’re concerned about bandwidth between the client and server, you
might compress the audio with Speex (speex.org). except there aren’t any
ruby bindings yet… hmmm… *ducks in workshop*
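
to get a feel for whether bandwidth matters at all, a quick back-of-the-envelope helper for uncompressed PCM. the method name is made up for this example, and the settings plugged in are the ones from the MCI post earlier in the thread (11025 Hz, 8 bits, mono):

```ruby
# Size in bytes of uncompressed PCM audio.
def pcm_bytes(seconds, sample_rate, bits_per_sample, channels)
  seconds * sample_rate * (bits_per_sample / 8) * channels
end

puts pcm_bytes(2, 11025, 8, 1)   # 2 s of low-quality mono: 22050 bytes
puts pcm_bytes(2, 44100, 16, 2)  # 2 s of CD-quality stereo: 352800 bytes
```

so a short phrase at the lowest settings is only a few tens of kilobytes, which may be small enough to send as-is.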

On Thu, Jul 03, 2003 at 07:22:48PM +0900, Rob wrote:

Hi, Bill.

Thanks a lot for the hints.
I am under Windows, but the problem is that my professor wants this
application to be compatible with any kinds of servers (windows and
linux). I was thinking that this application is similar to Yahoo
voice chat, since it deals with streaming. But Yahoo voice chat also
needs Windows to run. Is that possible to make an application for
this which runs in both windows, mac and linux and can be accessed
from a browser like microsoft internet explorer, mozilla or netscape?

<paq_> tarzeau: how long are you gonna be depraved of the net?
paq_: like a month, i’ll play nethack i guess
that’s “deprived”. He’s going to be “depraved” forever.

#debian, freenode