|
Audio system measurements are made for several purposes. Designers take measurements so that they can specify the performance of a piece of equipment. Maintenance engineers make them to ensure equipment is still working to specification, or to ensure that the cumulative defects of an audio path are within limits considered acceptable. Some aspects of measurement and specification relate only to intended usage, for example magnetic tape speeds and types, interface specifications, or power output.
Others are intended as an index of the quality, or 'fidelity', of reproduction perceivable by a human. It is important that such measurements accommodate psychoacoustic principles, so that they truly measure the system in a way that is 'subjectively valid'. Humans don't hear very low levels of sound, so there is less reason to be concerned about the precise nature of noise at very low levels than at higher levels.
Subjectivity and frequency weighting
Measurements based on psychoacoustics, such as the measurement of noise, often use a weighting filter. It is well established that human hearing is more sensitive to some frequencies than others, as demonstrated by equal-loudness contours, but it is not well appreciated that these contours vary depending on the type of sound. The measured curves for pure tones, for instance, are different from those for random noise. The ear also responds less to short bursts, below 100 to 200 ms, than to continuous sounds [1], such that a quasi-peak detector has been found to give the most representative results when noise contains clicks or bursts, as is often the case for noise in digital systems. [2] For these reasons a set of subjectively valid measurement techniques has been devised and incorporated into BS, IEC, EBU and ITU standards. These newer methods of audio quality measurement are used by broadcast engineers throughout most of the world, as well as by some audio professionals, though the older A-weighting standard for continuous tones is still commonly used by others. [1] Subjectively valid methods came to prominence in consumer audio in the UK and Europe in the 1970s, when the introduction of compact cassette tape and DBX and Dolby noise reduction techniques revealed the unsatisfactory nature of many basic engineering measurements. The specification of weighted CCIR-468 quasi-peak noise and weighted quasi-peak wow and flutter became particularly widely used, and attempts were made to find more valid methods for distortion measurement.
No single measurement can assess audio quality. Instead, it is usual to take a series of measurements to test for the various types of degradation that can reduce fidelity. Thus, when testing an analogue tape machine it is necessary to test for wow and flutter and tape speed variations over longer periods, as well as for distortion and noise. When testing a digital system, testing for speed variations is normally considered unnecessary given the nearly ubiquitous accurate clocks in digital circuitry, but testing for aliasing and timing jitter is often desirable, as these have caused audible degradation in many systems. The often-made claim that different methods of measuring noise, or distortion, are better suited to different items of equipment is not widely believed among professional audio engineers.
Once subjectively valid methods have been shown to correlate well with listening tests over a wide range of conditions, such methods are generally adopted as preferred. But it's important to realise that engineering methods are not always sufficient when comparing like with like. One CD player, for example, might have higher measured noise than another CD player when measured RMS, or even A-weighted RMS, yet sound quieter and measure lower when 468-weighting is used. This could be because it has more noise at high frequencies, or even at frequencies beyond 20 kHz, both of which are less important since human ears are less sensitive to them. (See noise shaping.) This effect is how Dolby B works and why it was introduced. Cassette noise, which was predominantly high frequency and unavoidable given the small size and speed of the recorded track, could be made subjectively much less important. The noise sounded 10 dB quieter, but failed to measure much better unless 468-weighting was used rather than A-weighting.
Measurable performance
Analog electrical
Frequency response
The signal should be passed at least over the audible range (usually quoted as 20 Hz to 20 kHz) with no significant peaks or dips. The human ear can discern differences in level of about 3 dB in some frequency ranges, so peaks and troughs must be less than this. Much modern equipment is capable of less than ±1 dB variation over the entire audible frequency range. Rapid variations over a small frequency range (ripple), or very steep rolloffs are considered undesirable as they can correspond to resonances associated with energy storage which produce delayed echoes and hence colouration, or decreased quality, of the sound.
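A flatness check of this kind can be sketched in a few lines. The example below is only an illustration: the function names, the ±1 dB default tolerance and the sample readings are all hypothetical, and levels are assumed to be expressed in dB relative to a reference frequency such as 1 kHz.

```python
def within_tolerance(response_db, tolerance_db=1.0):
    """Return True if every measured level lies within +/- tolerance_db.

    response_db maps frequency in Hz to level in dB relative to the reference.
    """
    return all(abs(level) <= tolerance_db for level in response_db.values())


def worst_deviation(response_db):
    """Return (frequency, level) for the point furthest from 0 dB."""
    freq = max(response_db, key=lambda f: abs(response_db[f]))
    return freq, response_db[freq]


# Hypothetical readings, in dB relative to the 1 kHz level.
measured = {20: -0.8, 100: -0.1, 1000: 0.0, 10000: 0.3, 20000: -1.4}

print("Within +/-1 dB:", within_tolerance(measured))
print("Worst point:", worst_deviation(measured))
```

A real measurement would use many more frequency points, since narrow ripple can fall between widely spaced test frequencies.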
Total harmonic distortion (THD)
In music material, there are distinct tones, and some kinds of distortion introduce spurious tones at double or triple the frequencies of those tones. Such harmonically related distortion is called harmonic distortion. For high fidelity, this is usually expected to be < 1% for electronic devices; mechanical elements such as loudspeakers usually have inescapably higher levels. Low distortion is relatively easy to achieve in electronics with use of negative feedback, but the use of high levels of feedback in this manner has been the topic of much controversy among audiophiles (see electronic amplifier). Essentially all loudspeakers produce more distortion than electronics, and 1–5% distortion is not unheard of at moderately loud listening levels. Human ears are less sensitive to distortion in the bass frequencies, and levels are usually expected to be under 10% at loud playback. Distortion which creates only even-order harmonics for a sine wave input is sometimes considered less bothersome than odd-order distortion.
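As a rough illustration of how a THD percentage is arrived at, the sketch below uses the common definition of THD as the root-sum-square of the harmonic amplitudes divided by the amplitude of the fundamental. The function name and the example voltages are hypothetical, and a real analyser would first have to extract these amplitudes from the measured spectrum.

```python
import math


def thd_percent(fundamental_rms, harmonic_rms):
    """THD in percent: RMS sum of the harmonics relative to the fundamental."""
    if fundamental_rms <= 0:
        raise ValueError("fundamental level must be positive")
    harmonic_sum = math.sqrt(sum(h * h for h in harmonic_rms))
    return 100.0 * harmonic_sum / fundamental_rms


# Hypothetical readings: 1 V fundamental, 2nd and 3rd harmonics at 3 mV and 4 mV.
example = thd_percent(1.0, [0.003, 0.004])
print(f"THD = {example:.2f} %")
```

With these made-up readings the result falls under the 1% figure quoted above for electronic devices.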
Output power
Output power for amplifiers is ideally measured and quoted as maximum sine wave (i.e., RMS) power output per channel, at a specified distortion level into a particular load. By convention and government regulation, this is considered the most meaningful measure of power available on music signals, though real, non-clipping music has a high peak-to-average ratio and usually averages well below the maximum possible. The commonly given measurement of PMPO (peak music power output) is largely meaningless and often used in marketing literature; in the late 1960s there was much controversy over this point and the US Government (FTC) required that RMS figures be quoted for all high fidelity equipment. Music power has been making a comeback in recent years. See also Audio power.
Power specifications require the load impedance to be specified, and in some cases two figures will be given (for instance, a power amplifier for loudspeakers will be typically measured at 4 and 8 ohms). Any amplifier will drive more current to a lower impedance load. For example, it will deliver more power into a 4-ohm load, as compared to 8-ohm, but it must not be assumed that it is capable of sustaining the extra current unless it is specified so. Power supply limitations may limit high current performance.
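The relationship between load impedance, current and power can be sketched as follows. This assumes an idealised amplifier that holds the same RMS output voltage into any purely resistive load, which is exactly the assumption the paragraph above warns may not hold in practice; the function name and the 20 V figure are hypothetical.

```python
def load_power_and_current(v_rms, load_ohms):
    """Return (power in watts, current in amperes) for a resistive load."""
    if load_ohms <= 0:
        raise ValueError("load impedance must be positive")
    current = v_rms / load_ohms      # Ohm's law: I = V / R
    power = v_rms * current          # P = V * I = V^2 / R
    return power, current


# Hypothetical amplifier holding 20 V RMS at its output terminals.
for ohms in (8.0, 4.0):
    watts, amps = load_power_and_current(20.0, ohms)
    print(f"{ohms:.0f} ohm load: {watts:.0f} W, {amps:.1f} A")
```

Halving the load doubles both the current and the power in this ideal model; a real power supply may not sustain the extra current.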
Intermodulation distortion (IMD)
Distortion which is not harmonically related to the signal being amplified is intermodulation distortion. It is a measure of the level of spurious signals resulting from unwanted combination of different frequency input signals. This effect results from non-linearities in the system. Again, sufficiently high levels of negative feedback can reduce this effect, as for instance in an amplifier. Many believe it is better to design electronics to minimise feedback levels. Low intermodulation equipment is difficult to design while meeting other high accuracy requirements. Intermodulation in speaker drivers is, as with harmonic distortion, almost always larger than in most electronics. Reducing cone excursion is one way to reduce intermodulation distortion, as is designing and building crossovers so that out-of-band signals are reduced quickly. This raises other problems related to crossover designs and is an example of the tradeoffs which must be made in high quality audio design.
Noise
The level of unwanted noise generated by the system itself, or by interference from external sources added to the signal. Hum usually refers to noise only at power line frequencies (as opposed to broadband white noise), which is introduced through induction of power line signals into the inputs of gain stages, or from inadequately regulated power supplies.
Crosstalk
The introduction of noise (from another signal channel) caused by stray inductance or capacitance between components or lines. Crosstalk reduces, sometimes noticeably, separation between channels (e.g., in a stereo system). It is given in dB relative to a nominal level of signal in the path receiving interference. Crosstalk is normally only a problem in equipment in which several channels are handled in the same chassis.
Common-mode rejection ratio (CMRR)
All electronic equipment with inputs is susceptible to this problem. In balanced audio systems, there are equal and opposite signals (difference-mode) on the inputs, and any interference imposed equally on both leads (i.e., the common-mode) will be subtracted, cancelling out that interference. CMRR is a measure of a system's ability to ignore any such interference, and especially hum, which arises at its input. It is generally only significant with long lines on an input, or when some kinds of ground loop problems exist. Unbalanced inputs do not have common-mode rejection; induced noise on their inputs appears directly as noise or hum.
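CMRR is conventionally expressed in dB as the ratio of the gain for the wanted differential signal to the gain for the common-mode signal. The sketch below applies that definition; the function name and the gain figures are hypothetical values chosen for illustration.

```python
import math


def cmrr_db(differential_gain, common_mode_gain):
    """CMRR in dB: 20*log10 of differential gain over common-mode gain."""
    if differential_gain <= 0 or common_mode_gain <= 0:
        raise ValueError("gains must be positive magnitudes")
    return 20.0 * math.log10(differential_gain / common_mode_gain)


# Hypothetical balanced input: unity gain for the wanted signal,
# while common-mode hum leaks through at one thousandth of its level.
balanced_input = cmrr_db(1.0, 0.001)
print(f"CMRR = {balanced_input:.0f} dB")
```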
Dynamic range and Signal-to-noise ratio (SNR)
The difference between the maximum level a component can accommodate and the noise level it produces. Input noise is not counted in this measurement. It is measured in dB.
Dynamic range refers to the ratio of maximum to minimum loudness in a given signal source (e.g., music or programme material), and this measurement also quantifies the maximum dynamic range an audio system can carry. This is the ratio (usually expressed in dB) between the noise floor of the device with no signal and the maximum signal (usually a sine wave) that can be output at a specified (low) distortion level.
Since the early 1990s it has been recommended by several authorities, including the Audio Engineering Society, that measurements of dynamic range be made with an audio signal present. This avoids questionable measurements based on the use of blank media, or muting circuits.
Signal-to-noise ratio (SNR), however, is the ratio between the noise floor and an arbitrary reference level or alignment level. In "professional" recording equipment, this reference level is usually +4 dBu (IEC 60268-17), though sometimes 0 dBu (UK and Europe - EBU standard Alignment level). 'Test level', 'measurement level' and 'line-up level' mean different things, often leading to confusion. In "consumer" equipment, no standard exists, though −10 dBV and −6 dBu are common.
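Because these reference levels use different units, comparing them means converting to volts first. The sketch below relies on the standard definitions that 0 dBu corresponds to about 0.775 V RMS (the voltage that dissipates 1 mW in 600 ohms) and 0 dBV to 1 V RMS; the function names are hypothetical.

```python
import math

DBU_REFERENCE_VOLTS = math.sqrt(0.6)  # about 0.7746 V RMS (1 mW into 600 ohms)
DBV_REFERENCE_VOLTS = 1.0             # 1 V RMS


def dbu_to_volts(level_dbu):
    """Convert a level in dBu to volts RMS."""
    return DBU_REFERENCE_VOLTS * 10 ** (level_dbu / 20.0)


def dbv_to_volts(level_dbv):
    """Convert a level in dBV to volts RMS."""
    return DBV_REFERENCE_VOLTS * 10 ** (level_dbv / 20.0)


def level_difference_db(volts_a, volts_b):
    """Difference in dB between two RMS voltages."""
    return 20.0 * math.log10(volts_a / volts_b)


professional = dbu_to_volts(4.0)    # the +4 dBu reference level
consumer = dbv_to_volts(-10.0)      # the -10 dBV reference level
print(f"+4 dBu  = {professional:.3f} V RMS")
print(f"-10 dBV = {consumer:.3f} V RMS")
print(f"Difference = {level_difference_db(professional, consumer):.1f} dB")
```

The two reference levels therefore differ by somewhat less than the 14 dB that a naive subtraction of the figures would suggest.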
Different media characteristically exhibit different amounts of noise and headroom. Though the values vary widely between units, a typical analogue cassette might give 60 dB, a CD almost 100 dB. Most modern quality amplifiers have >110 dB dynamic range, which approaches that of the human ear, usually taken as around 130 dB. See Programme levels.
Phase distortion, Group delay, and Phase delay
A perfect audio component will maintain the phase coherency of a signal over the full range of frequencies. Phase distortion can be extremely difficult to reduce or eliminate. The human ear is largely insensitive to phase distortion, though it is exquisitely sensitive to relative phase relationships within heard sounds. For many this figure lacks importance; however, there are many who argue its significance. Multi-driver loudspeaker systems have complex phase distortions, caused by crossovers, by driver placement relative to other drivers, and by internal driver characteristics.
Transient distortion
A system may have low distortion for a steady-state signal, but not on sudden transients. This problem can be traced to amplifier power supplies in some instances, to insufficient high frequency performance in amplifiers, to negative feedback in amplifiers, or in loudspeakers to the mass and resonances of drivers and enclosures. Related measurements are slew rate and rise time. Transient distortion can be hard to measure. Many otherwise good power amplifier designs have been found to have inadequate slew rates by modern standards. Most loudspeakers generate significant amounts of transient distortion, though some designs are less prone to this (e.g. electrostatic loudspeakers, plasma arc tweeters, ribbon tweeters).
Damping factor
A higher number is generally thought better. This is a measure of how well a power amplifier can control the undesired motion of a loudspeaker driver due largely to mechanical reactance. The amplifier must be able to damp out resonances caused by the mechanical motions (e.g., inertia) of the moving parts of the speaker. For the common voice coil drivers, this essentially involves ensuring that the output impedance of the amplifier is close to zero. Damping factor is actually a relative way of specifying the output impedance of an amplifier with a particular load. It is affected by the cables used to connect the speakers to the amplifier, and by the amount of negative feedback, especially in solid state amplifiers.
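Damping factor is commonly computed as the load impedance divided by the source impedance seen by the loudspeaker. The sketch below treats the cable as a simple series resistance added to the amplifier's output impedance; the function name and all impedance values are hypothetical.

```python
def damping_factor(load_ohms, output_ohms, cable_ohms=0.0):
    """Load impedance divided by the total source impedance driving it."""
    source_ohms = output_ohms + cable_ohms
    if source_ohms <= 0:
        raise ValueError("total source impedance must be positive")
    return load_ohms / source_ohms


# Hypothetical amplifier with 0.04 ohm output impedance driving 8 ohms.
at_terminals = damping_factor(8.0, 0.04)
with_cable = damping_factor(8.0, 0.04, cable_ohms=0.1)
print(f"At the amplifier terminals: {at_terminals:.0f}")
print(f"Including 0.1 ohm of cable: {with_cable:.0f}")
```

In this example a modest cable resistance cuts the effective figure to well under a third of the value at the amplifier terminals.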
Mechanical
Wow and flutter
These measurements are related to physical motion in a component, largely the drive mechanism of analogue media, such as vinyl records and magnetic tape. "Wow" is slow speed (a few Hz) variation, caused by longer term drift of the drive motor speed, whereas "flutter" is faster speed (a few tens of Hz) variations, usually caused by mechanical defects such as out-of-roundness of the capstan of a tape transport mechanism. The measurement is given in % and a lower number is better.
Rumble
The measure of the low frequency (many tens of Hz) noise contributed by the turntable of an analogue playback system. It is caused by imperfect bearings, by uneven motor windings, by vibrations in driving bands in some turntables, and by room vibrations (e.g., from traffic) which are transmitted by the turntable mounting and so to the phono cartridge. A lower number is better.
Digital
Note that digital systems do not suffer from many of these effects at a signal level, though the same processes occur in the circuitry, since the data being handled is symbolic. As long as the symbol survives the transfer between components, and can be perfectly regenerated (e.g., by pulse shaping techniques), the data itself is perfectly maintained. The data is typically buffered in a memory, and is clocked out by a very precise crystal oscillator. The data usually does not degenerate as it passes through many stages, because each stage regenerates new symbols for transmission.
But digital systems have their own problems. Digitizing adds noise which is measurable, and which depends on the resolution ("number of bits") of the system, regardless of other quality issues. Clock timing errors (jitter) result in non-linear distortion of the signal. The quality measurement for a digital system centers on the probability of an error in transmission or reception. Otherwise the quality of the system is defined more by design intent (i.e., specifications), such as the sample rate and bit depth, than by measurements. In general, digital systems are much less prone to error than analog systems. However, nearly all digital systems contain analog inputs and/or outputs, and certainly all of those which interact with the analog world do so. These analog components of the digital system can suffer analog effects and potentially compromise the integrity of a well designed digital system.
Jitter
A measurement of the variation in period between clock cycles, which should theoretically be exactly the same period. Less jitter is better.
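One simple way to put a number on this variation is to take the timestamps of successive clock edges, derive the individual periods, and report their spread. The sketch below does this with the standard deviation (RMS period jitter) and the peak-to-peak range; the function names and the edge timestamps are hypothetical, and practical jitter measurements are considerably more involved.

```python
import statistics


def clock_periods(edge_times):
    """Periods between consecutive clock edges."""
    return [later - earlier for earlier, later in zip(edge_times, edge_times[1:])]


def period_jitter_rms(edge_times):
    """RMS period jitter: standard deviation of the measured periods."""
    periods = clock_periods(edge_times)
    if len(periods) < 2:
        raise ValueError("need at least three edges")
    return statistics.pstdev(periods)


def period_jitter_peak_to_peak(edge_times):
    """Difference between the longest and shortest measured period."""
    periods = clock_periods(edge_times)
    return max(periods) - min(periods)


# Hypothetical edge timestamps in nanoseconds for a nominal 100 ns period.
edges_ns = [0, 100, 201, 300, 400]
print(f"RMS period jitter: {period_jitter_rms(edges_ns):.3f} ns")
print(f"Peak-to-peak jitter: {period_jitter_peak_to_peak(edges_ns)} ns")
```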
Sample rate
A specification of the rate at which measurements are taken of the analog signal. This is measured in samples per second, or hertz. A higher sampling rate allows a greater total bandwidth or flatband frequency response. It can also reduce the effects of jitter.
Bit depth
A specification of the accuracy of each measurement. For example, a 3-bit system would be able to measure 2^3 = 8 different levels, so it would round the actual level at each point to the nearest representable level. Typical values for audio are 8-bit, 16-bit, 24-bit, and 32-bit. The bit depth determines the theoretical maximum signal-to-noise ratio or dynamic range for the system. It is common for devices to create more noise than the minimum possible noise floor, however. Sometimes this is done intentionally; dither noise is added to decrease the negative effects of quantization noise by converting it into a higher level of uncorrelated noise.
To calculate the maximum theoretical dynamic range of a digital system, find the total number of levels in the system and take the ratio of the largest to the smallest non-zero signal. Dynamic Range = 20·log(largest signal / smallest signal). Note: the log function has a base of 10. Example: An 8-bit system has 256 different possibilities, from 0 to 255. The smallest signal is 1 and the largest is 255. Dynamic Range = 20·log(255) = 48 dB.
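The same calculation can be repeated for the other common bit depths. The sketch below follows the worked example above, taking the largest code as 2^bits - 1 and the smallest non-zero code as 1; the function name is hypothetical.

```python
import math


def dynamic_range_db(bits):
    """Theoretical dynamic range: 20*log10 of largest over smallest code."""
    if bits < 1:
        raise ValueError("bit depth must be at least 1")
    largest_code = 2 ** bits - 1   # e.g. 255 for an 8-bit system
    smallest_code = 1
    return 20.0 * math.log10(largest_code / smallest_code)


for bits in (8, 16, 24):
    print(f"{bits}-bit: {dynamic_range_db(bits):.1f} dB")
```

Each additional bit doubles the number of levels and so adds roughly 6 dB to the theoretical figure.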
Sample accuracy/synchronization
Not as much a specification as an ability. Since independent digital audio devices are each run by their own crystal oscillator, and no two crystals are exactly the same, the sample rate will be slightly different. This will cause the devices to drift apart over time. The effects of this can vary. If one digital device is used to monitor another digital device, this will cause dropouts in the audio, as one device will be producing more or less data than the other per unit time. If two independent devices record at the same time, one will lag the other more and more over time. This effect can be circumvented with a wordclock synchronization.
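The size of this drift can be estimated from the frequency error of each crystal, usually quoted in parts per million (ppm). The sketch below is a rough estimate only: the function name, the 48 kHz nominal rate and the ±10 ppm errors are hypothetical, and the errors are assumed to stay constant over time.

```python
def drift_in_samples(nominal_rate_hz, ppm_a, ppm_b, seconds):
    """Samples by which device A runs ahead of device B after a given time."""
    rate_difference_hz = nominal_rate_hz * (ppm_a - ppm_b) * 1e-6
    return rate_difference_hz * seconds


# Hypothetical pair of 48 kHz devices, one 10 ppm fast and one 10 ppm slow,
# left running independently for one hour.
nominal_rate = 48_000
samples_apart = drift_in_samples(nominal_rate, 10.0, -10.0, 3600)
milliseconds_apart = 1000.0 * samples_apart / nominal_rate
print(f"Drift after one hour: {samples_apart:.0f} samples "
      f"({milliseconds_apart:.0f} ms)")
```

Even small crystal tolerances therefore add up to an audible offset over a long recording unless the devices share a common clock.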
Linearity
Differential non-linearity and integral non-linearity are two measurements of the accuracy of an analog-to-digital converter. Basically, they measure how close the threshold levels for each bit are to the theoretical equally-spaced levels. |
|