Linux Audio and Streaming

From srevilak.net
Revision as of 19:56, 8 August 2016 by SteveR (talk | contribs) (ffmpeg)

ALSA and PulseAudio

I know of two major audio systems for Linux: ALSA and PulseAudio. ALSA deals directly with the hardware, and PulseAudio is an abstraction layer that sits on top of ALSA. The process of getting audio into your computer varies according to which system you use.

Getting audio into ALSA

In ALSA, you'll have to find a hardware device number. To see a list of hardware devices:

 $ arecord --list-devices
 **** List of CAPTURE Hardware Devices ****
 card 0: Intel [HDA Intel], device 0: CX20561 Analog [CX20561 Analog]
   Subdevices: 1/1
   Subdevice #0: subdevice #0

Above, we see one audio card (numbered "card 0") with one capture device (numbered "device 0"), which in turn has a single subdevice. We'll specify this as "hw:0,0": hardware, card 0, device 0.
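On a machine with several cards, picking the numbers out by eye gets tedious. Here is a rough sketch of a shell function that does it; the name list_hw_devices is made up for this example, and it assumes the English "card N: ..., device M: ..." layout shown above.

```shell
# list_hw_devices: read `arecord --list-devices` output on stdin and
# print one "hw:CARD,DEVICE" string per capture device.
# Hypothetical helper; assumes the "card N: ..., device M: ..." layout.
list_hw_devices() {
    sed -n 's/^card \([0-9][0-9]*\):.*, device \([0-9][0-9]*\):.*/hw:\1,\2/p'
}

# Typical use, on a machine with ALSA installed:
#   arecord --list-devices | list_hw_devices
```

Lines that don't describe a device (the banner and the subdevice lines) are simply skipped.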

Now, try to record some audio:

 $ arecord -f cd --device hw:0,0 --vumeter=stereo sound.wav
 Recording WAVE 'sound.wav' : Signed 16 bit Little Endian, Rate 44100 Hz, Stereo

This creates a file named sound.wav, using input from device hw:0,0. The "--vumeter=stereo" option instructs arecord to display a textual VU meter. This is a useful way to get an idea of what the audio levels are.

When setting audio levels (via something like alsamixer), you'll probably see "microphone", and "microphone boost". Start with "microphone boost" all the way down, and increase only if needed. Adding too much gain with microphone boost tends to lead to distortion, at least in my experience.

After recording for a little while, press Ctrl-C to stop the recording.
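If you'd rather not babysit the recording, arecord can also stop by itself after a fixed number of seconds via its --duration option. A minimal sketch, wrapped in a function whose name (record_clip) and argument order are just assumptions for this example:

```shell
# record_clip: record CD-quality audio for a fixed number of seconds.
# Arguments: ALSA device, duration in seconds, output file.
# Hypothetical wrapper around arecord; not part of alsa-utils.
record_clip() {
    device=$1
    seconds=$2
    outfile=$3
    arecord -f cd --device="$device" --duration="$seconds" "$outfile"
}

# Typical use:
#   record_clip hw:0,0 10 sound.wav
```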

To play the file back:

 $ aplay sound.wav 
 Playing WAVE 'sound.wav' : Signed 16 bit Little Endian, Rate 44100 Hz, Stereo 

One thing to note about arecord: when it opens a hardware device such as hw:0,0 directly, it requires exclusive access to that device. This seems to be true of any ALSA-type input, and it means that only one program at a time can consume audio.

PulseAudio

PulseAudio is an abstraction layer that sits above the hardware, which gives it some advantages. PulseAudio can (exclusively) access the audio hardware, then distribute the digital audio signal to several programs at once. For example, one program can stream audio while another displays a VU meter. (In PulseAudio's own terminology, capture devices are "sources" and playback devices are "sinks".)

To see a list of sources:

 $ pactl list sources | grep Name
       Name: alsa_output.pci-0000_00_1b.0.analog-stereo.monitor
       Name: alsa_input.pci-0000_00_1b.0.analog-stereo
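The first name, ending in ".monitor", is PulseAudio's monitor of the output device (it captures whatever is being played back); the second is the actual input. As a sketch, a small filter can pick out the first non-monitor source. The function name first_input_source is invented for this example, and it assumes the "Name:" lines look like the ones above.

```shell
# first_input_source: read `pactl list sources` output on stdin and print
# the name of the first source that is not a ".monitor" source.
# Hypothetical helper; assumes "Name: ..." lines as printed by pactl.
first_input_source() {
    sed -n 's/^[[:space:]]*Name: //p' | grep -v '\.monitor$' | head -n 1
}

# Typical use:
#   pactl list sources | first_input_source
```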

To make a simple recording:

 $ parecord --device=alsa_input.pci-0000_00_1b.0.analog-stereo --record sound2.wav

While this is running,

 $ pavumeter --record

Now we've got two audio consumers: one is making a recording, and one is displaying a VU meter.

Press Ctrl-C to interrupt parecord. Then

 $ aplay sound2.wav

to play it back.

There: we've just identified audio sources, and used them to make a simple recording. This is the audio equivalent of "hello world", and it's the first step in getting ready to stream audio.


ffmpeg

ffmpeg can work very well for simple capture/encode/streaming scenarios ... depending on the Linux distribution. To explain this, we have to talk about the two flavors of ffmpeg: the "mainline" version, and the one which comes with Debian (at least as of Debian 8), which is built from the Libav fork. See https://answers.launchpad.net/ubuntu/+question/223855 and http://blog.pkh.me/p/13-the-ffmpeg-libav-situation.html for background.

Using ffmpeg on Debian will probably show you a warning:

 The ffmpeg program is only provided for script compatibility and will
 be removed in a future release. It has been deprecated in the Libav
 project to allow for incompatible command line syntax improvements in
 its replacement called avconv (see Changelog for details). Please use
 avconv instead.

Unfortunately, Libav doesn't work nearly as well as mainline ffmpeg (at least in my experience).
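If you're not sure which flavor you have, the version banner usually gives it away. The sketch below classifies that banner; the function name ffmpeg_flavor is made up, and it assumes the banner mentions either "the FFmpeg developers" or "the Libav developers", which is what I'd expect from the builds I know of.

```shell
# ffmpeg_flavor: read the output of `ffmpeg -version` (or `avconv -version`)
# on stdin and print "ffmpeg", "libav", or "unknown".
# Hypothetical helper; relies on the wording of the copyright banner.
ffmpeg_flavor() {
    banner=$(cat)
    case "$banner" in
        *"FFmpeg developers"*) echo ffmpeg ;;
        *"Libav developers"*)  echo libav ;;
        *)                     echo unknown ;;
    esac
}

# Typical use:
#   ffmpeg -version 2>/dev/null | ffmpeg_flavor
```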


Some (non-Debian) ffmpeg recipes for streaming to an Icecast server:

VP8 (libvpx) and Vorbis in a WebM container

 ffmpeg \
   -f v4l2 -video_size 640x480 -framerate 30 -i /dev/video1 \
   -f alsa -i hw:0,2 \
   -f webm -cluster_size_limit 2M -cluster_time_limit 5100 -content_type video/webm \
   -c:a libvorbis -b:a 96K \
   -c:v libvpx -b:v 1.5M -crf 30 -g 150 -deadline good -threads 4 \
   icecast://source:hackme@example.org:8000/demo.webm

Vorbis (audio-only) in a WebM container

 ffmpeg \
   -f alsa -sample_rate 44100 -channels 2 -thread_queue_size 64 -i hw:0,2 \
   -f webm  \
   -c:a libvorbis -b:a 96k \
   -ice_public 1 \
   -content_type audio/webm \
   -ice_name "my great stream" \
   -ice_description "this is a great stream" \
   icecast://source:hackme@example.org:8000/demo.webm
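Both recipes end with the same kind of icecast:// URL: user, password, host, port, and mount point. If you'd rather keep those in variables than retype them, a tiny helper can assemble the URL. This is only a sketch; the name icecast_url and its argument order are assumptions, and it does no escaping, so it only suits parts without special characters.

```shell
# icecast_url: build an icecast:// URL from its parts.
# Arguments: user, password, host, port, mount (without a leading slash).
# Hypothetical helper; assumes none of the parts need URL escaping.
icecast_url() {
    printf 'icecast://%s:%s@%s:%s/%s\n' "$1" "$2" "$3" "$4" "$5"
}

# Typical use, as the last argument of the ffmpeg commands above:
#   "$(icecast_url source hackme example.org 8000 demo.webm)"
```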


If you happen to be on Debian, you can try this:

 arecord -f cd -t wav -D hw:0,2  - |\
 avconv \
   -i pipe:0  \
   -c:a libvorbis -b:a 96k \
   -f webm  \
   -ice_public 1 \
   -content_type audio/webm \
   -ice_name "my great stream" \
   -ice_description "this is a great stream" \
   icecast://source:hackme@example.org:8000/demo.webm

Note that we're using arecord to capture audio, and piping it to avconv (the "replacement" for ffmpeg). Trying to both capture and stream with avconv alone will fill your screen with

 Non-monotonous DTS in output stream 0:0; previous: 88138, current: 87792; changing to 88139. This may result in incorrect timestamps in the output file.
 Non-monotonous DTS in output stream 0:0; previous: 88139, current: 88071; changing to 88140. This may result in incorrect timestamps in the output file.

In summary, if you've got basic requirements and a mainline ffmpeg, give it a try -- it may do everything you want.

If you have more elaborate requirements (or if you're trying to stream from Debian), then I'd suggest giving GStreamer a try.