Sound and music: hardware, software and file formats

By Jon: First published in Online Currents 2004 – 19(4) 6-87

Like computer graphics, sound and music files have developed outside the restraining influences of a single monolithic company, and as a result there are many different formats and almost as many ways to record them and play them back. In this article I will describe some of the most popular formats and software, and look at the hardware designed for playing sound, and especially music.


Sound cards

The earliest personal computers had a single monophonic speaker with a limited range which was mainly used for producing beeps during games or when the computer was in trouble. The first major development in sound technology came in the mid-1980s with the arrival of the SoundBlaster card, which allowed the user to attach external stereo speakers and to plug in a microphone and a ‘Line in’ plug from cassette players or other sources of sound. This made possible the use of sound in games and opened up the PC as a device for music fans and composers.

Sound cards today are usually included as part of a PC package. Most now come with a MIDI port and ‘wavetables’ (see below). Some can be linked up with modems to provide telephone answering services including voice mail and playing prerecorded music while a caller is on hold. Simple speakers and microphones are available for less than $10, while audiophiles can buy PC speakers as large and expensive as those of any hi-fi buff.

Sound controls

A volume control for your PC will normally appear in the tray at the bottom right of the screen. (If it is missing it can be made to appear through Control Panel/Sounds and Audio Devices/Volume.) Clicking once on this brings up a volume control slider. Double-clicking brings up volume control sliders for each possible source of sound or music output on your PC, usually including MIDI, CD playback, Wave (recorded sounds stored on your PC) and ‘Line in’.

Selecting Options/Properties/Recording will switch the display over to the sound and music input controls, including the microphone volume control. This determines the volume at which sound coming in is received by the computer. Most programs that use the microphone come with some sort of calibration system that sets this control to a suitable level.

MIDI files

There is a major division in sound file formats between recorded formats, which involve the storage for playback of an actual sound, and instructional formats, in which what is stored is not a sound itself but codes representing a sequence of predefined sounds. When an instructional format file is played back, the sounds are supplied by the computer (or other instrument) itself.

The best-known instructional format is MIDI (Musical Instrument Digital Interface) which has been used for over a decade to encode and store musical performances from electronic instruments, especially keyboards. MIDI is based on the Western musical scale and represents musical events (e.g. a keypress) in terms of the note played and its duration as a fraction of the bar length. This makes it very easy to play a melody by ear on a MIDI keyboard and have a computer display and print it as standard musical notation. The converse is also true; orchestral compositions on paper can be entered into a computer and played back musically either by the computer or by an attached MIDI instrument.
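To make the idea of an instructional format concrete, here is a minimal Python sketch of a melody stored as instructions rather than as sound. It is not the MIDI file format itself: the event layout (note number, duration as a fraction of a bar) simply follows the description above, and the function names are hypothetical. It assumes the widely used conventions that note number 60 is middle C (written C4 here) and that note 69 is the A tuned to 440 Hz.

```python
# A melody held as instructions: (note number, duration as a fraction of a bar).
# Illustrative only; a real MIDI file uses a more detailed binary encoding.
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def note_name(number):
    """Name a note number, assuming the convention that 60 is C4 (middle C)."""
    octave = number // 12 - 1
    return f"{NOTE_NAMES[number % 12]}{octave}"


def note_frequency(number):
    """Equal-tempered pitch in Hz, assuming note 69 (A4) is tuned to 440 Hz."""
    return 440.0 * 2 ** ((number - 69) / 12)


def transpose(events, semitones):
    """Shift every note by the same interval; durations are left untouched."""
    return [(number + semitones, duration) for number, duration in events]


# Five notes rising from middle C: four quarter-bar notes, then a whole bar.
melody = [(60, 0.25), (62, 0.25), (64, 0.25), (65, 0.25), (67, 1.0)]

if __name__ == "__main__":
    for number, duration in transpose(melody, 2):
        print(note_name(number), round(note_frequency(number), 2), duration)
```

Because only note numbers and durations are stored, transposing the melody is a matter of adding a constant to each note, which is part of what makes files of this kind so easy to rearrange.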

Nearly all new PCs have MIDI capabilities built into their sound cards. These will include a socket for the connection of a MIDI instrument cable, and one or more ‘wavetables’ of predefined sounds from various instruments which can be played back at appropriate lengths and pitches. The standard General MIDI wavetable includes 128 of these samples, among them Acoustic Grand Piano, Marimba, Ocarina, Bagpipe and Bird Tweet. Multiple MIDI channels make it possible to have up to 16 instruments playing simultaneously, including untuned percussion instruments such as woodblocks and triangles.

MIDI files are extremely compact: a 30-minute Bach sonata, for instance, takes up less than 150 KB. This, and the ease with which they can be transposed, arranged for other instruments and converted to musical notation, make them extremely useful for traditional composers and musicians. Most major classical works are now available on the Internet as free MIDI downloads, and MIDI files of music which is still in copyright can be purchased over the Internet and elsewhere. An excellent shareware MIDI/musical notation program is on the Web at http://www.noteworthysoftware.com.

Sequenced file formats

The idea behind MIDI can be applied to other sound collections as well. Many composers now have large libraries of samples or loops – small fragments of sound collected from instruments or other sources in the real world – which can be modified in length and pitch and strung together using a sequencing program. Sequencing programs can be complicated and expensive, but simple sequencers are readily available on the Internet and elsewhere for those who wish to experiment, including several designed for children to use.

There are many proprietary sequencer file formats corresponding to the large number of programs in this area. Sequenced files are normally converted into a recorded audio format like .WAV or .MP3 for distribution or sale.

Recording sounds on a PC

The simplest way to record sound is using the Windows Sound Recorder, a basic but free program that comes with Windows. You can usually find it under Programs/Accessories/Entertainment. This opens up a small panel with a display window showing an oscilloscope view of the incoming sounds, and containing record, stop and playback buttons. There are also menu commands allowing for some basic editing – adding echo or reversing the sound, for instance. The Windows Sound Recorder can be used to record via a microphone or via the Line in connection – e.g. by connecting a cassette player.

File properties and formats

Sampling rate

Sound in the real world is a continuous sequence of tones. Computers cannot store all the information in a sound, so they sample it, taking ‘snapshots’ of what the sound is like at that point. When the snapshots are played back in rapid succession the brain hears a continuously changing sound.
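The ‘snapshot’ idea can be sketched in a few lines of Python. This is an illustration only, not how any particular recording program works: the function name and the choice of a pure 440 Hz tone are assumptions made for the example.

```python
import math


def sample_tone(frequency_hz, sample_rate_hz, duration_s):
    """Take evenly spaced 'snapshots' of a pure tone.

    Returns a list of values between -1.0 and 1.0, one per sample.
    """
    count = round(sample_rate_hz * duration_s)
    return [
        math.sin(2 * math.pi * frequency_hz * i / sample_rate_hz)
        for i in range(count)
    ]


if __name__ == "__main__":
    low = sample_tone(440, 8000, 1.0)    # 8,000 snapshots for one second
    high = sample_tone(440, 48000, 1.0)  # 48,000 snapshots for the same second
    print(len(low), len(high))
```

The same second of sound produces six times as many values at the higher rate, which is why better quality costs more disk space.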

The quality of the sound is thus determined by the sampling rate, which can extend from 8 kHz (8 kilohertz, or 8,000 samples per second) to 48 kHz or more. Naturally, the higher the sampling rate, the more space the file will take up on disk. Sampling rates may also be indicated as ‘CD quality’, ‘Radio quality’, etc. Stereo recording on a computer will take up twice as much space as a mono recording of the same quality.

File sizes and compression

Until recently the de facto standard for recorded sound files was the WAV file format. This is adjustable with regard to mono and stereo and the sampling rate, so WAV files can vary considerably in size, but a high quality stereo musical recording typically takes about 10 MB of storage space per minute of its length. About six years ago, however, the compressed MP3 format was developed, which gives good sound quality while providing much smaller files. A saving of 90% or more is possible, so a computer CD storing MP3 files can contain up to 650 minutes of music. Most music files distributed through the Internet (legally or otherwise) are in .MP3 format.
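The arithmetic behind these figures can be checked with a short Python sketch. Two things here are assumptions rather than statements from this article: that ‘high quality’ means 44,100 samples per second at 16 bits (2 bytes) per sample, and that the CD holds 650 MB of data. The function names are hypothetical.

```python
def uncompressed_bytes(sample_rate_hz, channels, seconds, bits_per_sample=16):
    """Size of raw sampled sound: one value per channel per sample."""
    return sample_rate_hz * channels * (bits_per_sample // 8) * seconds


def minutes_on_disc(disc_mb, mb_per_minute, saving=0.0):
    """Minutes of music that fit on a disc, given a compression saving (0 to 1)."""
    return disc_mb / (mb_per_minute * (1 - saving))


if __name__ == "__main__":
    # One minute of stereo at the assumed 44,100 Hz, 16-bit setting.
    print(uncompressed_bytes(44100, 2, 60) / 1_000_000, "MB per minute")
    # An assumed 650 MB disc at roughly 10 MB per minute, before and after
    # a 90% saving from compression.
    print(minutes_on_disc(650, 10), "minutes uncompressed")
    print(round(minutes_on_disc(650, 10, saving=0.9)), "minutes compressed")
```

Under those assumptions a minute of stereo comes to a little over 10 MB, and a 90% saving multiplies the playing time of a disc by ten.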

Other developers have since produced their own compressed formats – notably Microsoft, with WMA (Windows Media Audio), and RealOne, with RMJ. Users are often reluctant to use these as they are tied to specific programs and sometimes to specific types of players, while MP3, being an open format, is more generic.

Speech production

Getting a computer to read text aloud has been difficult, but modern reading programs are very sophisticated and have a large vocabulary of words and sounds to draw from. Text-reading programs are now in use on many telephone systems, for instance, where they can hold a ‘conversation’ with a user about airline bookings or bank statement details. Reading programs for ordinary users offer a variety of voices and pitches and can be taught to learn from their mistakes. They are particularly useful for the visually impaired.

A free shareware reading program called ReadPlease is available on the Web (http://www.readplease.com) for those who would like to hear one in action.

Understanding speech

Speech understanding programs vary from simple systems which interpret a limited number of orders (‘close the program’, ‘start Word’) to dictation programs which will attempt to recognise anything the user might say. Extensive training is required for this, and even a fully-trained speech recognition program will not always correctly ‘hear’ or interpret what the user has said. Some users will get consistently better results than others, and it also requires a powerful computer system to carry out the processing required. Nonetheless, there are several speech-recognition programs on the market, notably those from Dragon and IBM, and Microsoft has incorporated limited speech-recognition facilities into Office XP and Office 2003.

Music tracks

The vast majority of sound files are copied (‘ripped’) from music CDs and converted into formats that are compatible with computer use. This generally involves some sort of compression, either with MP3 or into a proprietary format like .WMA or .RMJ.

The legal situation in Australia is more restrictive than in many other countries, and technically it is illegal to copy any music track. Music distributors have indicated that they are happy with copies being made for backups and personal use, but will take action where copies are given away or sold to other people.

Many free programs including the Windows Media Player and the popular RealOne system now allow users to ‘rip’ copies of CD tracks on to their own hard disks. Both of these use their own proprietary compression formats (WMA and RMJ respectively) and don’t support MP3, making the tracks they produce impossible to use on most external CD players and elsewhere. A reliable program which will produce MP3 files is Apple iTunes for Windows, but this will only run on a Windows XP or Macintosh system. MediaMonkey (http://www.mediamonkey.com) is a popular and user-friendly freeware ripper.

Information about the album and tracks being ripped can be typed in or – if an Internet connection is running – the program can look it up on a CDDB database. These databases are huge sets of information about albums, performers and track names, and nearly every CD ever made is on them.

Ripped files are usually stored in a subfolder representing the album within a folder representing the performer: e.g. Kraftwerk/The Mix/04 Autobahn.MP3. The ripping speed will depend on your equipment and the quality of the CD, but will usually be much faster than actually playing the track.
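The folder layout described above is easy to reproduce in code. The following Python sketch is only an illustration of the naming pattern in the Kraftwerk example; the function name and its parameters are hypothetical and do not correspond to any particular ripping program.

```python
from pathlib import PurePosixPath


def track_path(performer, album, track_number, title, extension="MP3"):
    """Build performer/album/NN title.ext, with a two-digit track number."""
    filename = f"{track_number:02d} {title}.{extension}"
    return PurePosixPath(performer) / album / filename


if __name__ == "__main__":
    print(track_path("Kraftwerk", "The Mix", 4, "Autobahn"))
```

Padding the track number to two digits keeps the files in album order when a folder is sorted by name.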

Tracks you rip from CDs will be automatically added to a Media Library. This appears in iTunes as a long list of tracks which can be sorted by name, album, performer or genre. Other files can be brought into this library with File/Add. Songs that belong together can be combined into a playlist. This can then be called up and played back at any time. Songs in a playlist can be played in their natural order or shuffled to play in a random order. Songs stored on external media (e.g. a CD or data DVD) can also be used in playlists.

Burning a CD

If you have a CD burner then there are two ways to produce CDs from ripped material. The first is to burn (create) a standard music CD which will work in music CD players (and DVD players) everywhere. However, because of the limitations of CD formats, you will only be able to fit about 65 minutes of music on each of these. The second is to copy your MP3 files directly across to a data CD. This means that they can be played back in other PCs, and in the increasing number of CD players and DVD players which are set up to play CDs containing MP3s (a few are also beginning to support Microsoft WMA format). Because MP3 files are compressed, you will be able to fit many times as much music on one of these CDs. They will not, however, play on older (pre-2003) music devices.

Portable MP3 players

These allow the user to carry hundreds or thousands of tracks around at once in MP3 format and replay them, usually through headphones. Portable MP3 players can be grouped into four classes:

  1. Portable CD players which play MP3 tracks. Newer CD players often have the capacity to play MP3s (and sometimes WMA files). Thus a user can carry ten or more hours of music with them on a single data CD. Prices for these have recently dropped below $AU100.
  2. PDAs (personal digital assistants) which play music files. Many standard PDAs like the Palm Pilot have the capacity to play music files, usually in MP3 format. These can be copied to the PDA like text and other files when it is connected to your PC. A large memory card will be needed to store any length of music. Some PDAs use proprietary formats in an attempt to prevent copyright violations; e.g. the Palm Tungsten requires RealPlayer.
  3. Specialised MP3 players with electronic (RAM) memory. These tiny devices may hold 128 MB or more of memory, and this can often be supplemented with memory cards. New music files can be copied across from a PC at any time.
  4. Specialised MP3 players with hard disks. The main representative here is the Apple iPod which can hold up to 10,000 MP3 files on a small inbuilt hard disk. Other manufacturers like BenQ are now also entering the market, and Apple has announced that a ‘mini iPod’ is on the way. Users can store an entire music library in a package the size of a mobile phone. Both these and the RAM-based players can be used to store and transport other kinds of computer files as well.

Tracks online

After many attempts to restrict illegal copying, music distribution companies have finally started to make their material legally available online. Enormous amounts of music are available legally for free through sites like Epitonic (http://www.epitonic.com), which acts as a showcase for new bands. Epitonic tracks are neatly organised into genres and ranked by popularity, making it relatively easy for users to find music they like. A similar site is at http://www.mp3.com.

Paid music downloads were launched in a big way by Apple iTunes (http://www.itunes.com), which allows users in the US to buy music tracks for $US0.99 each. These can be played on their computer, burnt on to an audio CD, or loaded on to an iPod device. Copying restrictions mean that the file can only be reproduced three times (although there are already indications that this limitation has been ‘cracked’). iTunes has attracted enormous interest and other companies have been quick to follow suit.

A similar service is available in Australia via a deal between Destra and CD sellers like Chaos Music (http://www.chaosmusic.com), though the number of tracks is limited and the price of $1.99 is 44% higher than the US equivalent. This should drop as other sellers enter the Australian market. Telstra BigPond (http://www.bigpond.com) is also offering music downloads, with tracks available for $1.99 to the public and $1.49 to its BigPond subscribers.

Internet radio

One interesting development has been the use of the Internet to broadcast the kind of programs that in the past have been transmitted by radio. These include interviews and other verbal material, but are mainly made up of music. Internet radio can be heard through a standard dial-up connection, but better quality is obtained through broadband. Unlike real radio, Internet radio can be heard around the world. Like real radio, it is usually provided free and often supported by advertising. Most music playing software includes access to Internet radio stations.

Internet radio arrives on the PC in the form of streaming audio: a file that is played back while it is being downloaded. Each station has one or more URLs to which the user can connect, corresponding to its broadcast streams. For streaming audio to work, the download speed should equal or exceed the rate at which it is played back; otherwise there will be gaps in the sound. The program will maintain a buffer of a few seconds to try to avoid gaps of this kind.
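The relationship between download speed, playback rate and buffering can be illustrated with a simple second-by-second simulation in Python. This is a deliberately crude model, not a description of how any real player works: the function name is hypothetical, rates are given in kilobits per second, and the download speed is assumed to be perfectly steady.

```python
def count_gaps(download_kbps, playback_kbps, prebuffer_s, duration_s):
    """Count the seconds in which playback stalls for lack of data.

    The player first downloads for `prebuffer_s` seconds without playing,
    then tries to play one second of audio in each of `duration_s` seconds.
    """
    buffered_kb = download_kbps * prebuffer_s  # data fetched before playback
    gaps = 0
    for _ in range(duration_s):
        buffered_kb += download_kbps           # this second's download
        if buffered_kb >= playback_kbps:
            buffered_kb -= playback_kbps       # enough data: play one second
        else:
            gaps += 1                          # not enough data: silence
    return gaps


if __name__ == "__main__":
    print(count_gaps(64, 64, 0, 10))  # download keeps pace with playback
    print(count_gaps(32, 64, 0, 10))  # download at half the playback rate
    print(count_gaps(32, 64, 5, 10))  # same, with five seconds of buffer
```

In this model a buffer only postpones the gaps when the connection is too slow on average; its real value is in smoothing over short dips in an otherwise adequate connection, which the steady-rate assumption here does not capture.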

Some Internet radio stations are also real radio broadcasters, like the ABC’s Radio National (http://www.abc.net.au/rn); others operate on the Internet only. Of these, some archive their programs so that users can play them back later. Some popular radio shows like the Prairie Home Companion (http://www.prairiehome.org) maintain archives going back over several years.

Once radio stations begin to collect their material and make it available at any time rather than only at the time of broadcast, the distinction between ‘Internet radio’ and ‘sound files for download’ becomes unclear. Internet music radio may go out of fashion as downloaded music becomes more readily available. There may still be a market, however, for news, topical talks and documentary programs.

The Future

The main driving force in the MP3 revolution has been music, and especially popular music. In the last year or two MP3-playing capabilities have found their way into CD players, DVD players, PDAs and other entertainment devices. The enormous advantages of compressed music formats must ultimately force the music industry into massive changes. Meanwhile other audio material – old radio shows, for instance – is quietly appearing in the same portable formats. New technology in this area often benefits disabled people; for instance, it is possible now for anyone to make a ‘talking book’ by scanning in an existing printed book, having the PC read it back with a speech program and recording the output for conversion to a CD. In general terms we can expect things to get smaller, lighter and cheaper all the time.