I wanted to record audio from a web page. I had assumed this would
require some sort of Java or (shudder) ActiveX plug-in, but found that
HTML5 now includes a getUserMedia interface that allows the browser and
JavaScript to capture audio on the user's machine. I found an example
that pointed me in the right direction, and I was able to write an
audio capture and MP3 encoder using only JavaScript.
TL;DR
I wrote my own PCM audio recorder and joined it to the
lamejs
library. The finished code is here: mp3.htm.
A demo page is here.
WebRTC currently only works with Firefox.
Temasys
provides a plugin for IE and Safari, but I haven't tried it yet.
Gory Details: ImOk
I created a simple page to experiment with audio capture and
encoding. To keep things simple, I am assuming the audio is 1 channel
at a 44.1 kHz sample rate. I am using Firefox to view the page.
This code only works with Firefox. Browser-compatibility
fixes would be most welcome.
The first step is to create a simple page to provide a UI and to
contain the JavaScript. The lamejs docs suggest the encoder should be
able to encode in real time, so instead of the usual sequence (start
recording, capture, stop recording, export to WAV, encode) I will
encode each audio buffer as it arrives and capture only the MP3
stream. There will be no separate encode step.
Note that the audio control and download link do not have a "src"
or "href" attribute; those will be supplied later, once the output
stream has been created.
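The page body is not reproduced in this listing, but the element IDs the
scripts reference pin down its essentials. A minimal sketch of the markup
might look like this (only the IDs and handler names are taken from the
code; the layout and labels are my guesses):

```html
<!-- Minimal UI sketch; IDs match those the scripts query. -->
<button onclick="onPower(this)">Power</button>
<button id="btnRecord" onclick="onRecord(this)" disabled>Record</button>
<button id="btnStop" onclick="onStop(this)" disabled>Stop</button>
<audio id="AudioPlayer" controls></audio>
<a id="DownloadLink" download="audio.mp3">Download</a>
<div id="status"></div>
<div id="log"></div>
```

Note the audio control and download link start with no src/href; those are
filled in by showMp3() once recording stops.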
I saved this to webv6/sites/imok/pub/mp3.htm, and access it through
my local nginx web server using Firefox as http://localhost:8082/imok/mp3.htm
One of my goals is to avoid front-loading the page as much as
possible: I won't initialize any of the audio components or the encoder
library until the user requests them. The lame.min.js library is 156KB.
I am embedding it in the page for now; later I want to load it only
when the user clicks the Power button.
I need to have some sort of debug feedback, so I create a simple
log window and status update.
mp3.htm:
<script id="logger" type="text/javascript">
function log(fmt,args) {
    //Printf-style logging; relies on a global sprintf() helper.
    var text= this.sprintf.apply(this,arguments);
    var element= document.querySelector('#log');
    element.innerHTML+= text+"<br>\n";
}
function status(fmt,args) {
    var text= this.sprintf.apply(this,arguments);
    var element= document.getElementById('status');
    element.innerHTML= text;
}
(function(window) {
    log("Window loaded.");
})(window);
</script>
Clicking the Power button will create the audio interface.
mp3.htm:
<script id="taskUI" type="text/javascript">
var gAudio= null;       //Audio context
var gAudioSrc= null;    //Audio source
var gNode= null;        //The audio processor node
var gIsRecording= false;
var gPcmCt= 0;

function onPower(btn) {
    if(!gAudio) {
        PowerOn();
    } else {
        PowerOff();
    }
}
function PowerOn() {
    try {
        //Browser compatibility
        if(!window.AudioContext)
            window.AudioContext= window.webkitAudioContext;
        navigator.getUserMedia= navigator.getUserMedia
            || navigator.webkitGetUserMedia
            || navigator.mozGetUserMedia
            || navigator.msGetUserMedia;
        if(!window.AudioContext) {
            log("ERR: No AudioContext.");
        } else if(!navigator.getUserMedia) {
            log("ERR: No UserMedia interface.");
        } else if(!(gAudio= new AudioContext())) {
            log("ERR: Unable to create Audio interface.");
        } else {
            var caps= { audio: true };
            navigator.getUserMedia(caps,onUserMedia,onFail);
        }
    } catch(ex) {
        log("ERR: Unable to find any audio support.");
    }
    function onFail(ex) {
        log("ERR: getUserMedia failed: %s",ex);
    }
}
function onUserMedia(stream) {
    if(!(gAudioSrc= gAudio.createMediaStreamSource(stream))) {
        log("ERR: Unable to create audio source.");
    } else {
        gPcmCt= 0;
        log("Power ON.");
        document.getElementById('btnRecord').disabled= false;
        document.getElementById('btnStop').disabled= false;
    }
}
function PowerOff() {
    if(gIsRecording) {
        log("ERR: PowerOff: You need to stop recording first.");
    } else {
        gAudioSrc= null;
        gAudio= null;
        log("Power OFF.");
        document.getElementById('btnRecord').disabled= true;
        document.getElementById('btnStop').disabled= true;
    }
}
</script>
Clicking the Record button will create the audio processor node. The
term "node" refers to its role within the media "graph" that controls
the flow of the media data from source, through various filters, to
its final destination. The media data will arrive in sample buffers
through the
onaudioprocess
callback. As long as I am done processing each buffer before the next
one is ready, I can do everything synchronously in a single
thread. Later, I will move the encoder into a worker task that will
isolate the encoder from the UI task.
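To get a feel for that timing budget, here is a quick back-of-envelope
calculation using the sample rate and buffer size from the gCfg defaults:

```javascript
//Deadline for synchronous processing: each onaudioprocess buffer
//must be fully encoded before the next buffer arrives.
var sampleRate= 44100;  //samples per second (gCfg.sampleRate)
var bufSz= 4096;        //samples per callback (gCfg.bufSz)
var deadlineMs= (bufSz/sampleRate)*1000;
console.log(deadlineMs.toFixed(1)+" ms per buffer");  //~92.9 ms
```

So the encoder has roughly 93ms to process each 4096-sample buffer,
which is a generous margin for lamejs on a modern machine.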
mp3.htm:
var gNode= null;    //The audio processor node
var gPcmCt= 0;

function onRecord(btn) {
    var creator;
    if(!gAudio) {
        log("ERR: No Audio source.");
    } else if(gIsRecording) {
        log("ERR: Already recording.");
    } else {
        if(!gNode) {
            //createScriptProcessor is the current name; older browsers
            //exposed the same factory on the context as createJavaScriptNode.
            if(!(creator= gAudioSrc.context.createScriptProcessor || gAudioSrc.context.createJavaScriptNode)) {
                log("ERR: No processor creator?");
            } else if(!(gNode= creator.call(gAudioSrc.context,gCfg.bufSz,gCfg.chnlCt,gCfg.chnlCt))) {
                log("ERR: Unable to create processor node.");
            }
        }
        if(!gNode) {
            log("ERR: onRecord: No processor node.");
        } else {
            gNode.onaudioprocess= onAudioProcess;
            gAudioSrc.connect(gNode);
            gNode.connect(gAudioSrc.context.destination);
            gIsRecording= true;
            log("RECORD");
        }
    }
}
function onStop(btn) {
    if(!gAudio) {
        log("ERR: onStop: No audio.");
    } else if(!gAudioSrc) {
        log("ERR: onStop: No audio source.");
    } else if(!gIsRecording) {
        log("ERR: onStop: Not recording.");
    } else {
        gNode.onaudioprocess= null;
        gAudioSrc.disconnect(gNode);
        gNode.disconnect();
        gIsRecording= false;
        log("STOP");
    }
}
function onAudioProcess(e) {
    var inBuf= e.inputBuffer;
    var samples= inBuf.getChannelData(0);
    var sampleCt= samples.length;
    gPcmCt+= sampleCt;
    status("%d",gPcmCt);
}
I should now be able to click the Power button, the browser will
ask if the page can use the microphone, and the audio source will be
created. Clicking Record will create the audio capture node, and audio
buffers should start arriving until I click Stop. I am not doing
anything with the audio yet, except counting the samples.
It is time to deploy the lamejs library. I want to initialize
lamejs during the PowerOn sequence, after the audio source has been
created and I know the audio parameters, such as sample rate and
number of channels. At this point, I am assuming the audio matches my
default configuration. Later I will configure the audio source or
adjust my configuration to match.
mp3.htm:
var gLame= null;     //The LAME encoder library
var gEncoder= null;  //The MP3 encoder object
var gStrmMp3= [];    //Collection of MP3 buffers
var gCfg= {
    chnlCt: 1,
    bufSz: 4096,
    sampleRate: 44100,
    bitRate: 128
};
var gMp3Ct= 0;

function onUserMedia(stream) {
    if(!(gAudioSrc= gAudio.createMediaStreamSource(stream))) {
        log("ERR: Unable to create audio source.");
    } else if(!(gEncoder= Mp3Create())) {
        log("ERR: Unable to create MP3 encoder.");
    } else {
        gStrmMp3= [];
        gMp3Ct= 0;
        log("Power ON.");
    }
}
function PowerOff() {
    if(gIsRecording) {
        log("ERR: PowerOff: You need to stop recording first.");
    } else {
        gEncoder= null;
        gLame= null;
        gNode= null;
        gAudioSrc= null;
        gAudio= null;
        log("Power OFF.");
    }
}
function onRecord(btn) {
    var creator;
    if(!gAudio) {
        log("ERR: No Audio source.");
    } else if(!gEncoder) {
        log("ERR: No encoder.");
    } else if(gIsRecording) {
        log("ERR: Already recording.");
    } else {
        ...
    }
}
function Mp3Create() {
    if(!(gLame= new lamejs())) {
        log("ERR: Unable to create LAME object.");
    } else if(!(gEncoder= new gLame.Mp3Encoder(gCfg.chnlCt,gCfg.sampleRate,gCfg.bitRate))) {
        log("ERR: Unable to create MP3 encoder.");
    } else {
        log("MP3 encoder created.");
    }
    return(gEncoder);
}
Run this just to make sure the calls to the lamejs library are
working. I am still not doing anything with the audio yet.
It is finally time to encode the audio buffers. The browser
delivers the audio samples as floating-point values (which is nothing
but a tremendous waste of CPU cycles), so the first task is to convert
them back to 16-bit signed integers by scaling by 0x8000 (32768). I
need to clamp the values at ±32767 to avoid nasty audio pops due to
integer overflows. I submit the 16-bit samples to lamejs and add the
output MP3 buffer to my MP3 stream. Note that gStrmMp3 will be an
array of buffers, not just a simple data buffer.
mp3.htm:
function onAudioProcess(e) {
    var inBuf= e.inputBuffer;
    var samples= inBuf.getChannelData(0);
    var sampleCt= samples.length;
    var samples16= convertFloatToInt16(samples);
    if(samples16.length > 0) {
        gPcmCt+= samples16.length*2;
        var mp3buf= gEncoder.encodeBuffer(samples16);
        var mp3Ct= mp3buf.length;
        if(mp3Ct>0) {
            gStrmMp3.push(mp3buf);
            gMp3Ct+= mp3Ct;
        }
        status("%d / %d: %2.2f%%",gPcmCt,gMp3Ct,(gMp3Ct*100)/gPcmCt);
    }
}
function convertFloatToInt16(inFloat) {
    var sampleCt= inFloat.length;
    var outInt16= new Int16Array(sampleCt);
    for(var n1=0;n1<sampleCt;n1++) {
        //This is where I can apply waveform modifiers.
        var sample16= 0x8000*inFloat[n1];
        //Clamp to avoid integer overflow.
        sample16= (sample16 < -32767) ? -32767 : (sample16 > 32767) ? 32767 : sample16;
        outInt16[n1]= sample16;
    }
    return(outInt16);
}
Finally, I need to handle the stop operation by flushing out the
last MP3 buffer and presenting the final MP3 stream to the user. [TODO:
It would be better to stop the input source and wait for the
onend callback, but I could not find my way from gAudioSrc to a node
that had a stop interface. This code abruptly disconnects the audio
stream, dropping any buffer(s) that are in the pipeline, resulting in
the last half-second of audio being clipped.] I coalesce the MP3
stream buffers into a single data blob in memory (which will be a
complete MP3 file), create a local URL to the blob, then paste the URL
into the audio control and the download link.
mp3.htm:
function onStop(btn) {
    if(!gAudio) {
        log("ERR: onStop: No audio.");
    } else if(!gAudioSrc) {
        log("ERR: onStop: No audio source.");
    } else if(!gIsRecording) {
        log("ERR: onStop: Not recording.");
    } else {
        gNode.onaudioprocess= null;
        gAudioSrc.disconnect(gNode);
        gNode.disconnect();
        gIsRecording= false;
        var mp3= gEncoder.flush();
        if(mp3.length>0)
            gStrmMp3.push(mp3);
        showMp3(gStrmMp3);
        log("STOP");
    }
}
function showMp3(mp3) {
    //Consolidate the collection of MP3 buffers into a single data Blob.
    var blob= new Blob(mp3,{type: 'audio/mp3'});
    //Create a URL to the blob.
    var url= window.URL.createObjectURL(blob);
    //Paste the URL into the audio control and download link.
    var audio= document.getElementById('AudioPlayer');
    var download= document.getElementById('DownloadLink');
    audio.src= url;
    download.href= url;
}
The polishing touch is to remove the reference to lamejs from the
document head and load it on demand during the PowerOn operation. I
add a new script to provide a function to load an external script. By
calling it from onUserMedia(), the library will only be loaded after a
valid audio device has been created. Loading the script takes time,
and I will be notified when the script is available by way of a
callback, so I need to split onUserMedia() into pre- and post-load
steps.
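The loader itself is not shown in this listing. A minimal sketch of such
a helper (the name loadScript and its callback signature are my own
invention, not taken from the original page) could look like this:

```javascript
//Hypothetical on-demand script loader (names are illustrative).
//Appends a <script> tag to the document head and invokes the
//callback once the script has loaded (or failed to load).
function loadScript(url,onDone) {
    var script= document.createElement('script');
    script.type= 'text/javascript';
    script.src= url;
    script.onload= function() { onDone(null); };
    script.onerror= function() { onDone(new Error("Failed to load "+url)); };
    document.head.appendChild(script);
}
```

The pre-load half of onUserMedia() would call something like
loadScript('lame.min.js', ...) and defer the Mp3Create() step to the
post-load callback.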
I tried loading the page with IE11 and it failed; I could not find
the AudioContext or getUserMedia interfaces. It may be that my version
of IE11 (11.0) is too old. My focus is on achieving demo-level
usability -- chasing after full browser compatibility is well outside
the scope.