I'm working on automatically generating weekly podcasts for Magnatune. I wanted it to be automated, so that I could support a large variety of podcasts without regular labor. However, I'm competing with hand-made podcasts, so the quality of the end product needs to be quite good in order to be competitive.
The first challenge was selecting the songs to include. Most podcasts are either 30 or 60 minutes, so I chose the more-or-less-60-minutes format.
From various articles, it seems many podcasters are taking tips from radio production, so I thought I'd do the same.
Specifically, I'm imitating a "rotation" concept for our podcasts, with four different kinds of rotation lists, each one commanding 25% of the songs selected:
* Hot rotation (25%) - 1st song of the most recent 25% of albums in this genre
* High rotation (25%) - 1st song of all albums in this genre
* Medium rotation (25%) - other songs of most recent 25% of albums in this genre
* Low rotation (25%) - all other songs
What I've been told by many people is that they need to hear a song several times before it sinks in and they start to really enjoy it, and that this "hit making" is vital to selling them an album. On the genre-mix playlists, I've been randomly mixing all songs. That runs counter to current radio thinking, and may not be as effective at selling our albums as a "hit songs get played often" strategy.
With the rotation strategy above, the 1st song of an album is presumed to be the potential "hit" on the album (or at least one of the better songs), and that's why 50% of the rotation on our podcasts will be the 1st song from our albums. After the first songs come the other songs from our newest albums, and finally everything else.
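As a rough illustration of the rotation idea, here is a minimal shell sketch. It assumes four text files, one per rotation list, each holding one song path per line; the file names and the pick_rotation function are hypothetical and are not part of my actual script. It also assumes GNU shuf is installed.

```shell
#!/bin/sh
# Sketch only: take an equal share of songs from each of four rotation lists.
# The list files (one song path per line) and the function name are
# hypothetical. Because the hot list is a subset of the high list, the same
# song can be picked twice; filter duplicates afterwards if that matters.
pick_rotation() {
    total=$1; shift            # total number of songs wanted
    per_list=$((total / 4))    # each rotation list supplies 25%
    for list in "$@"; do
        # shuf -n prints that many randomly chosen lines, without repeats
        shuf -n "$per_list" "$list"
    done | shuf                # shuffle again so the four lists are interleaved
}
```

For example, pick_rotation 16 hot.txt high.txt medium.txt low.txt would print 16 song paths, four from each list, in random order.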
When it comes to the podcasts themselves, there seem to be two main styles of product:
1) mostly-music: a short introductory talking bit at the beginning, followed by 60 minutes of music, and at the end, a spoken list of all the songs that were played
2) mostly-DJ: a short intro at the beginning, and either between every song, or every other song, a spoken description of what was just played
For now, at least, I'm going to go with the "mostly-music" style, as that's what I prefer, and feedback from users seems to strongly weigh toward "less talk."
In terms of production values, a good podcast should:
1) trim off the silence at the beginning and end of songs
2) normalize the volume of all the included songs so that the perceived volume changes are minimal
3) cross-fade between songs, so that there is a smooth movement between songs
4) keep any talking at a volume that is comfortable relative to the music, and not uncomfortably louder
It took a bit of research and twiddling, but I did manage to do all these things in an automated fashion, and am fairly pleased with the result. The hardest thing to decide seems to be the cross-fade duration: if it's too short the transition feels sudden, but if it's too long the two songs play at the same time, which is annoying. I currently have it set at 5 seconds, which feels good, but I may tinker with this further.
The rest of this message contains technical details on the automation aspect of music production.
The first step (after deciding on the track listing) is extracting the WAV files from the wav.zip file for each album, e.g.:
unzip -d /tmp -j "/music/MRDC/Plethora/wav.zip" "*/*/01-Plethora-MRDC.wav"
This continues for all the WAV files for the podcast.
Next, I trim the silence off the beginning and end of the WAV file. The all-purpose sox program (http://sox.sourceforge.net/) has a function to do this, but there are many complaints that the documentation is obscure. I agree: I had trouble getting it to work right.
One source of confusion is that sox will only trim silence from the beginning of a WAV file, not the end. To work around this (documented at http://article.gmane.org/gmane.comp.audio.sox/502/match=silence), you trim the beginning and write the result reversed into a temporary file, so that the "end" is now the beginning. Then you trim the silence again and reverse once more. Amazingly, this works. The command is:
sox "/tmp/01-Plethora-MRDC.wav" /tmp/tmp.wav silence 1 0:0:0.01 -50d reverse
sox /tmp/tmp.wav "/tmp/01-Plethora-MRDC.wav" silence 1 0:0:0.01 -50d reverse
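Since this has to happen for every song, the two commands above can be wrapped in a small shell function. This is only a sketch: the trim_silence name and the temporary file naming are illustrative choices, and the sox arguments are copied unchanged from the commands above.

```shell
#!/bin/sh
# Sketch: trim leading and trailing silence from one WAV file, in place.
# trim_silence and the temporary file name are illustrative; the sox
# arguments are the same ones used in the two commands above.
trim_silence() {
    wav=$1
    tmp="${wav%.wav}.trim-tmp.wav"
    # Pass 1: trim silence from the start, then write the audio reversed.
    sox "$wav" "$tmp" silence 1 0:0:0.01 -50d reverse || return 1
    # Pass 2: the original end is now the start; trim it and reverse back.
    sox "$tmp" "$wav" silence 1 0:0:0.01 -50d reverse || return 1
    rm -f "$tmp"
}
```

It could then be called in a loop, for example: for f in /tmp/*.wav; do trim_silence "$f"; done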
You can see the before/after results in these graphics below. The first graphic zooms on the beginning of the WAV file, with the original wav on top, and the trimmed one on the bottom. This is a piece of classical music which both fades out and has a little background hiss as well.
while this graphic shows the trimming that occurred on the end:
Next comes the normalizing process. I use Chris Vaill's excellent normalize program, which is fairly intelligent about how it does things: it uses an RMS (perceived loudness) measurement rather than a peak measurement, and it takes all the WAV files you're going to play together as a single input so it can draw relative-loudness judgments. It also has a built-in limiter, to avoid creating digital distortion when volume is added.
The defaults for the "normalize" program worked well, so that's what I used, and you can see the volume changes it made to this rock podcast:
normalize "/tmp/01-Plethora-MRDC.wav" "/tmp/01-Modern Anguish-Norine Braun.wav" "/tmp/11-A whisper for the others-Minstrel Spirit.wav" "/tmp/08-Peel-Jade Leary.wav" "/tmp/01-Fluid - Headphones-Magnatune.wav" "/tmp/01-Fountain Street (instrumental mix)-Mercy Machine.wav" "/tmp/07-Only you-Lizzi.wav" "/tmp/05-Beggar-William Brooks.wav" "/tmp/01-Bad Bad Luck-Burnshee Thornside.wav" "/tmp/01-It's All Moving Faster-Electric Frankenstein.wav" "/tmp/02-Fossildawn-Jade Leary.wav" "/tmp/13-Hanna to Hollywood-Norine Braun.wav" "/tmp/01-Till My Cup Runs Over-Four Stones.wav" "/tmp/01-Mirrored image-Cargo Cult.wav" "/tmp/05-Simply Put - Shane Jackman.wav"
Normalizing all WAVs in podcast
The difference before and after normalizing is very noticeable while listening and shows up clearly in the waveforms.
Here is a graphic of the original (non-normalized) rock podcast:
and here is the normalized version of the same podcast:
In the 2nd version you can clearly see that songs 13 and 14 are significantly louder and closer in level to the rest of the podcast. The adjustments that normalize reported confirm this:
Applying adjustment of 1.85dB to /tmp/13-Hanna to Hollywood-Norine Braun.wav...
Applying adjustment of 2.20dB to /tmp/01-Till My Cup Runs Over-Four Stones.wav...
Finally, I need to glue all the songs together and cross-fade the transitions. The "crossfade_cat.sh" script included in the sox source distribution does this. Note that it only works if you compile sox from source code and run the script while the current directory is the ./sox/scripts directory. I had previously thought the script was broken because it gave strange errors, but it does work under those two conditions.
What I do is start with the first WAV file, copy that to podcast.wav and then cross-fade that with the second song. The crossfade_cat.sh creates a new file named "mix.wav" which has the two WAV files combined with a cross-fade, and I then replace podcast.wav with this mix.wav file, then repeat for every WAV file in the podcast.
Here is the command that runs to cross fade and add one WAV file:
./crossfade_cat.sh 5 /tmp/podcast.wav "/tmp/01-Modern Anguish-Norine Braun.wav"
mv -f mix.wav /tmp/podcast.wav
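Repeated for every song, the step above becomes a simple loop. The sketch below is illustrative rather than a copy of my script: build_podcast, CROSSFADE_CMD and FADE_SECONDS are hypothetical names, and it assumes, as described above, that it is run from the sox scripts directory and that crossfade_cat.sh writes its result to mix.wav in the current directory.

```shell
#!/bin/sh
# Sketch of the append-and-cross-fade loop. CROSSFADE_CMD, FADE_SECONDS and
# build_podcast are hypothetical names. Assumes the current directory is the
# sox scripts directory and that the cross-fade script writes ./mix.wav.
CROSSFADE_CMD=${CROSSFADE_CMD:-./crossfade_cat.sh}
FADE_SECONDS=5

build_podcast() {
    out=$1; shift
    cp "$1" "$out" || return 1      # the first song starts the podcast
    shift
    for wav in "$@"; do
        # Cross-fade the podcast so far with the next song...
        "$CROSSFADE_CMD" "$FADE_SECONDS" "$out" "$wav" || return 1
        # ...and make the mixed result the new podcast.
        mv -f mix.wav "$out" || return 1
    done
}
```

A call would look like: build_podcast /tmp/podcast.wav "/tmp/01-Plethora-MRDC.wav" "/tmp/01-Modern Anguish-Norine Braun.wav", with the remaining songs listed in playing order.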
The final step is to use "lame" to convert the WAV file into an mp3.
I also need to keep track of the start-time of each song as I go, subtracting the 5 seconds for the cross-fade, in order to build a human-readable playlist, which shows the order of songs, but more importantly, the start time for each song, so you can easily look up what's "currently playing."
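The start-time arithmetic can be sketched as follows. The input format (one line per song, duration in seconds, a tab, then the title) and the playlist_times function are assumptions made for this illustration; how the durations are obtained is left open. Each song after the first starts 5 seconds before the previous one ends, because the cross-fade overlaps the two songs.

```shell
#!/bin/sh
# Sketch: turn "duration<TAB>title" lines on stdin into a playlist with
# start times. The optional first argument is the cross-fade length in
# seconds (default 5). The input format is an assumption of this example.
playlist_times() {
    awk -F '\t' -v fade="${1:-5}" '
        {
            # Every song after the first overlaps the previous one by "fade".
            start = (NR == 1) ? 0 : elapsed - fade
            s = int(start)
            printf "%02d:%02d  %s\n", int(s / 60), s % 60, $2
            elapsed = start + $1    # podcast length once this song is added
        }'
}
```

For songs of 200, 180 and 65 seconds, this gives start times of 00:00, 03:15 and 06:10.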
Still for me to do is create the XML for these podcasts, so that people can subscribe and have the mp3 files downloaded automatically. That's the subject for another posting.
For those of you who are curious, I've made the script which makes a podcast .wav file available here: http://blogs.magnatune.com/buckman/mkpodcast.sh