Python Audio Tools History and Goals

These audio tools started as a small, ad-hoc collection of little C programs for audio conversion and tagging with the help of lots of external programs. If I needed a FLAC to MP3 converter, I'd build a 'flac2mp3' program for that purpose, and so forth.

It didn't take long before the C-based approach felt like more trouble than it was worth. Switching to Python made the collection easier to maintain with a negligible hit on performance. I transitioned the tools to an ad-hoc collection of little Python programs with an increasingly messy library of utility classes at its core. There were lots of programs like 'wav2mp3', 'flac2mp3', 'mp32wav' and so on. Each program did one thing reasonably well, and I stuck with this approach for a very long time.

It couldn't possibly scale, however. A lot of formats had to go through a 'wav' intermediary for conversion because supporting every possible combination of formats would take an impractically large number of conversion programs. For example, going from Monkey's Audio to FLAC meant using 'ape2wav' and then 'wav2flac' - not to mention an 'ape2xmcd' if you wanted to port over the metadata.
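To put rough numbers on that, here is a small sketch of the arithmetic. The function names are made up for illustration. It assumes one program per ordered (source, target) pair in the direct case, and one decoder plus one encoder per non-wav format in the intermediary case.

```python
def direct_converters(n_formats: int) -> int:
    """Programs needed if every format converts straight to every other one."""
    # One program per ordered (source, target) pair of distinct formats.
    return n_formats * (n_formats - 1)


def converters_via_wav(n_formats: int) -> int:
    """Programs needed if every conversion goes through wav."""
    # Each non-wav format needs one 'x2wav' and one 'wav2x' program.
    return 2 * (n_formats - 1)


for n in (4, 8, 16):
    print(n, direct_converters(n), converters_via_wav(n))
```

With 8 formats, that is 56 direct converters versus 14 that pass through wav, which is why the intermediary was tolerable for a while even though it doubled the work for each conversion.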

It was a mess and needed a complete rewrite.

Learning from my mistakes, I shifted support for audio formats from external programs to an internal interface in the core library. Before, a tool like 'flac2mp3' had to use FLAC helper functions and MP3 helper functions from the core in order to make the conversion happen. Now, I had FlacAudio and MP3Audio classes, each a subclass of AudioFile and implementing AudioFile's interface. Since everything was now an AudioFile as far as programming the tools was concerned, I no longer needed 'flac2*' tools, or 'wav2*' tools, or 'mp32*' tools; the source audio format could be figured out by the Audio Tools themselves and would be compatible with any target format.
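The shape of that design can be sketched roughly as follows. This is a toy illustration, not the library's actual code: only the class names AudioFile, FlacAudio and MP3Audio come from the text above, while the method names, the open_audio and convert helpers, and the byte-prefix "codecs" are assumptions invented for the example.

```python
from abc import ABC, abstractmethod


class AudioFile(ABC):
    """Hypothetical common interface that every format class implements."""

    def __init__(self, data: bytes):
        self.data = data

    @classmethod
    @abstractmethod
    def is_type(cls, data: bytes) -> bool:
        """Return True if the data looks like this format."""

    @abstractmethod
    def to_pcm(self) -> bytes:
        """Decode this file to raw PCM samples."""

    @classmethod
    @abstractmethod
    def from_pcm(cls, pcm: bytes) -> "AudioFile":
        """Encode raw PCM samples into this format."""


class FlacAudio(AudioFile):
    MAGIC = b"fLaC"  # real FLAC streams begin with these four bytes

    @classmethod
    def is_type(cls, data: bytes) -> bool:
        return data.startswith(cls.MAGIC)

    def to_pcm(self) -> bytes:
        # Placeholder "decoder": a real one would decompress the frames.
        return self.data[len(self.MAGIC):]

    @classmethod
    def from_pcm(cls, pcm: bytes) -> "FlacAudio":
        # Placeholder "encoder": just prefixes the magic bytes.
        return cls(cls.MAGIC + pcm)


class MP3Audio(AudioFile):
    # Simplification: many MP3 files start with an ID3v2 tag ("ID3");
    # real detection would also have to look for MPEG frame sync bytes.
    MAGIC = b"ID3"

    @classmethod
    def is_type(cls, data: bytes) -> bool:
        return data.startswith(cls.MAGIC)

    def to_pcm(self) -> bytes:
        return self.data[len(self.MAGIC):]

    @classmethod
    def from_pcm(cls, pcm: bytes) -> "MP3Audio":
        return cls(cls.MAGIC + pcm)


AVAILABLE_TYPES = (FlacAudio, MP3Audio)


def open_audio(data: bytes) -> AudioFile:
    """Work out the source format by asking each class in turn."""
    for audio_type in AVAILABLE_TYPES:
        if audio_type.is_type(data):
            return audio_type(data)
    raise ValueError("unsupported audio format")


def convert(source: AudioFile, target_type: type) -> AudioFile:
    """Any source works with any target, since both speak PCM."""
    return target_type.from_pcm(source.to_pcm())
```

The point of the sketch is that convert never mentions a specific format. Adding a new format means writing one new subclass and registering it, rather than writing a new program for every pairing.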

Since the targets were also nothing but AudioFile-compatible objects, I decided to take it one step further and eliminate the '*2flac' tools, the '*2wav' tools, the '*2mp3' tools and so forth; the target audio format could be a simple command-line option. 'wav2flac', 'wav2mp3', 'flac2wav', 'mp32wav', 'ape2wav', 'wav2ogg', 'ogg2wav' and more were all replaced by the 'track2track' program, which not only worked just as well but was also less fragile should something in the core change.
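A minimal sketch of what "target format as a command-line option" can look like, using Python's standard argparse module. The option name, the list of supported targets, and the plan_conversions helper are assumptions made for this example and are not taken from the real 'track2track' program.

```python
import argparse
from pathlib import Path

# Hypothetical list of target formats for the sketch.
SUPPORTED_TARGETS = ("flac", "mp3", "ogg", "wav")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="track2track-sketch")
    # The target format is an option instead of being baked into the
    # program's name, as it was with 'wav2flac', 'wav2mp3' and friends.
    parser.add_argument("-t", "--type", dest="target", required=True,
                        choices=SUPPORTED_TARGETS,
                        help="audio format to convert to")
    parser.add_argument("tracks", nargs="+", help="input audio files")
    return parser


def plan_conversions(argv):
    """Return (source, output name) pairs without touching any files."""
    args = build_parser().parse_args(argv)
    return [(track, f"{Path(track).stem}.{args.target}")
            for track in args.tracks]
```

For instance, plan_conversions(["-t", "flac", "song.ape"]) pairs 'song.ape' with 'song.flac'. The source format never appears on the command line because it is detected from the file itself.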

That led the way to seeking out and adding more audio formats. This, in turn, led to the formats documentation. As I've learned the basics of how audio formats are laid out internally, I've added that information to the Python Audio Tools documentation as typeset notes. My hope is that a consistent set of format documentation will be useful to others who wish to write their own programs to handle audio.

I would like to do more in the future, however.

In an effort to reduce external dependencies and improve the audio format documentation, I would like to implement complete audio codecs and document them. Such implementations are guaranteed to be slower than a bunch of optimized external programs (which is why I'll keep them optional), but I'd be happier knowing how the formats really work rather than treating them as data fed through "black boxes". I have no ETA on this implementation, but it remains a long-term goal.