Scherz & Monk, Chapter 16

Audio Electronics

Audio is a chain of energy conversions: pressure waves become tiny voltages, amplifiers boost them, and a speaker pushes air again. Because the ear spans twelve orders of magnitude in intensity, the whole field runs on the logarithmic decibel and on one humble fact — a speaker is just an 8-ohm resistor that obeys Ohm's law.

Prerequisites: Ohm's law & series/parallel R + AC amplitude/RMS + op-amp gain (Ch 8) + RC filters fc=1/(2πRC) (Ch 9)

Chapter 0: The Quiet Megaphone

You wire up a weekend megaphone. An electret microphone catches your voice, an LM386 chip on a 9-volt battery boosts the signal, and an 8Ω speaker turns it back into sound. You flip the switch. It works — but it is disappointingly quiet. People three meters away cup their ears.

So you try an "obvious" fix. You reach for a second speaker, a little tweeter, and wire it in parallel with the first to add crispness and volume. Within a minute the amplifier chip is uncomfortably hot, the battery sags, and the sound turns to mush. What went wrong?

Every symptom here is a door into this chapter. The quietness is about gain: the LM386 has a default voltage gain of 20, and most builders never learn it can be cranked to 200. The overheating is about impedance: two 8Ω speakers in parallel present 4Ω, which at the same voltage doubles the current and doubles the power the amp must source. And the mush is about filtering: you cannot just parallel a woofer and a tweeter, because each needs its own slice of the frequency band, carved out by a crossover.

One load, three lessons. The speaker is the crux. Treat it as a plain resistor (its nominal impedance, usually 8Ω) and Ohm's law tells you everything: the current the amp delivers, the power dumped into the cone, and the heat. Halving the impedance from 8Ω to 4Ω at a fixed 4 V output takes power from 2 W to 4 W and current from 0.5 A to 1 A. The amp was never designed for that. The megaphone did not break — physics did exactly what it always does.
The Megaphone Signal Chain

The full path from voice to sound. Move the gain slider to scale the boosted signal, and the load slider to halve the speaker impedance by paralleling a tweeter. Watch the current and heat climb as impedance drops.

By the end of this chapter you will be able to look at that hot speaker and predict the exact current before touching it — and design the megaphone so it never happens.

When you add a second 8Ω speaker in parallel with the first, why does the amplifier get hot?

Chapter 1: Sound as a Signal

Sound is a pressure wave traveling through air — regions of compression and rarefaction marching outward from a vibrating source at about 343 m/s. Audio electronics never touches the air directly; it works with an electrical analog of that wave: a voltage that wiggles up and down in lockstep with the pressure. The whole discipline is the art of capturing, reshaping, and re-emitting that wiggle faithfully.

Three properties of the waveform map onto three things you hear. The frequency (cycles per second, Hz) is pitch — how high or low the note. The amplitude (the height of the swing) is loudness. And the shape of the wave — the mix of overtones riding on the fundamental — is timbre, the quality that distinguishes a violin from a flute playing the same note.

The human ear responds from roughly 20 Hz to 20 kHz, a thousand-to-one span, and it is most sensitive around 2–5 kHz, in the upper part of the speech band. A pure tone is a single sine. Real sounds are sums of sines (Fourier's insight), so a 200 Hz note from a trumpet is a 200 Hz fundamental plus partials at 400, 600, 800 Hz and beyond. An amplifier that boosts all those partials by the same factor preserves timbre; one that favors some over others colors the sound.

Worked example: counting cycles

Middle C is about 262 Hz. In the time a 20 kHz cymbal shimmer completes one cycle (50 μs), middle C has barely moved — it needs 1/262 s ≈ 3.8 ms per cycle, about 76 times longer. This is why bass and treble demand different speakers: a woofer's heavy cone can shove enough air for a slow 3.8 ms swing, but it physically cannot reverse direction every 50 μs. Mass that helps the bass kills the treble.

One number, two worlds. A 1 kHz sine and a 1 kHz square wave have the same pitch but wildly different timbre, because the square wave carries odd harmonics at 3 kHz, 5 kHz, 7 kHz… The ear hears the fundamental as pitch and the harmonic stack as "brightness." Distortion in an amplifier is exactly the unwanted manufacture of new harmonics, which is why a clipped sine sounds harsh: clipping pushes a pure tone toward a square wave.
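The odd-harmonic claim can be checked numerically. The sketch below is illustrative only: the function names, the unit-amplitude square wave, and the sample count are assumptions, not anything from the book. It estimates the sine Fourier coefficients of a square wave and shows that the even harmonics vanish while the odd ones fall off as 1/n.

```python
import math

def sine_coefficient(wave, n, samples=20000):
    """Estimate the n-th sine Fourier coefficient of a wave with period 1."""
    total = 0.0
    for k in range(samples):
        t = (k + 0.5) / samples  # midpoint rule over one period
        total += wave(t) * math.sin(2 * math.pi * n * t)
    return 2.0 * total / samples

def square(t):
    """Unit square wave: +1 for the first half period, -1 for the second."""
    return 1.0 if (t % 1.0) < 0.5 else -1.0

fundamental_hz = 1000  # the 1 kHz square wave from the text
for n in range(1, 8):
    amplitude = abs(sine_coefficient(square, n))
    print(f"{n * fundamental_hz:5d} Hz  relative amplitude {amplitude:.3f}")
```

The odd harmonics come out close to 4/(nπ), so the 3 kHz partial is about a third of the fundamental and the 5 kHz partial about a fifth.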
Sound-Wave Shaper

Set the frequency (pitch) and amplitude (loudness). The sine on the left drives the speaker cone on the right — bigger amplitude pushes the cone farther; higher frequency packs more cycles into the window. Readouts name the pitch register and the band.

Two waveforms have identical frequency but different shapes (one sine, one with strong harmonics). What differs to your ear?

Chapter 2: Decibels & Loudness

The ear is logarithmic. The quietest sound you can detect carries about 10⁻¹² W/m²; a sound on the edge of pain carries about 1 W/m². That is a ratio of a trillion. Writing loudness on a linear scale would be hopeless, so audio uses the decibel, a logarithm of a ratio, which compresses that trillion-fold range into a tidy 0 to 120.

For sound intensity (a power-like quantity), the sound-pressure-level decibel is:

dB SPL = 10 · log₁₀(I / I₀),   I₀ = 10⁻¹² W/m²

The reference I0 is the threshold of hearing, so 0 dB is "barely audible," 60 dB is conversation, 90 dB is a lawnmower, and 120 dB is a jet at takeoff. Every 10 dB is a ten-fold jump in intensity that the ear perceives as merely "about twice as loud."
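As a quick check of the scale, here is a minimal sketch of the conversion in both directions. The function names are illustrative assumptions; only the formula and the reference intensity come from the text.

```python
import math

I0 = 1e-12  # threshold of hearing in W/m^2, the reference used in the text

def db_spl(intensity_w_per_m2):
    """Sound level in dB SPL for a given intensity."""
    return 10.0 * math.log10(intensity_w_per_m2 / I0)

def intensity_from_db(db):
    """Inverse: intensity in W/m^2 for a given dB SPL value."""
    return I0 * 10.0 ** (db / 10.0)

for label, intensity in [("threshold", 1e-12), ("conversation", 1e-6), ("edge of pain", 1.0)]:
    print(f"{label:13s} {intensity:8.0e} W/m^2 -> {db_spl(intensity):5.1f} dB SPL")

# Every +10 dB step multiplies the intensity by ten.
print(intensity_from_db(70) / intensity_from_db(60))
```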

Power form vs voltage form

In a circuit you compare two powers or two voltages. For power, dB = 10·log₁₀(P₂/P₁). But power goes as the square of voltage (P = V²/R), so when you compare voltages the factor of 2 from the square comes out of the log and you get a leading 20:

dB = 10 · log₁₀(P₂/P₁) = 20 · log₁₀(V₂/V₁)

Two milestones to memorize: +3 dB is twice the power (10·log 2 = 3.01), and +6 dB is twice the voltage (20·log 2 = 6.02) — which is also four times the power. Doubling your amplifier's output voltage feels like a solid step up; doubling its power barely registers.

Worked example: the LM386's gain in dB

An LM386 at its default gain of 20 turns a 10 mV input into 0.2 V out. In decibels that voltage gain is 20·log10(20) = 20·(1.301) ≈ 26 dB. Strap the boost components across pins 1 and 8 to raise the gain to 200, and the dB becomes 20·log10(200) = 20·(2.301) ≈ 46 dB. The 10× jump in gain is exactly a +20 dB step — logs turn multiplication into addition, so ×10 always adds 20 dB of voltage gain.
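The same arithmetic in code, as a small sketch (the helper names are assumptions made for illustration). It reproduces the 26 dB and 46 dB figures and the two milestones worth memorizing.

```python
import math

def db_from_voltage_ratio(ratio):
    """Decibels for a voltage ratio (leading factor 20)."""
    return 20.0 * math.log10(ratio)

def db_from_power_ratio(ratio):
    """Decibels for a power ratio (leading factor 10)."""
    return 10.0 * math.log10(ratio)

gain_default_db = db_from_voltage_ratio(20)   # LM386 default gain
gain_boosted_db = db_from_voltage_ratio(200)  # with the pins 1-8 boost
print(f"gain 20  -> {gain_default_db:.1f} dB")
print(f"gain 200 -> {gain_boosted_db:.1f} dB")
print(f"step     -> {gain_boosted_db - gain_default_db:.1f} dB")
print(f"2x power   -> {db_from_power_ratio(2):.2f} dB")
print(f"2x voltage -> {db_from_voltage_ratio(2):.2f} dB")
```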

Why log saves your sanity. A 100 W amp is not "ten times louder" than a 10 W amp — it is 10·log(10) = 10 dB louder, perceived as roughly twice as loud. Going from 10 W to 20 W (a 3 dB step, a doubling of power) is barely a noticeable nudge. This is the cruel arithmetic of loudspeakers: chasing the last few decibels of loudness costs enormous power, which is why efficiency and impedance matter more than raw wattage.
Decibel Calculator & Comparator

Enter two voltages (V2 vs reference V1). The gauge computes 20·log10(V2/V1), and the bar places the result on the real-world loudness scale from a whisper (20 dB) to a jet (120 dB). Toggle to compare powers instead (factor 10).

You double the output voltage of your amplifier. By how many decibels does the signal rise?

Chapter 3: Microphones

A microphone is a transducer that runs the speaker in reverse: incoming pressure waves move a membrane, and that motion is converted into a tiny AC voltage — typically a few millivolts. There are two dominant ways to do the conversion, and the difference decides whether your mic needs a power supply.

Dynamic microphones

A dynamic mic glues a small coil to the back of a diaphragm and suspends it in a permanent magnet's field. When sound moves the diaphragm, the coil moves through the field and generates a voltage by electromagnetic induction (Faraday's law) — exactly like a speaker used backwards. Because it makes its own voltage, a dynamic mic needs no power supply. It is rugged, handles loud sources, and has a low output impedance (often < 600Ω), which is why it dominates live stages.

Condenser and electret microphones

A condenser mic makes the diaphragm one plate of a capacitor. As sound flexes the plate, the spacing, and thus the capacitance C, changes. Hold the charge Q fixed, and since V = Q/C, the voltage across the plates wiggles with the sound. But you must first put charge on the plates: a condenser mic needs a DC bias (phantom power, often 48 V). An electret is a clever condenser whose plate is permanently charged at the factory, so it needs only a trickle of current to run a built-in FET buffer. It is the small capsule, usually with just two terminals, in nearly every phone, laptop, and the megaphone of Chapter 0.

Condenser: V = Q / C  →  C changes with sound  →  V changes (needs bias)

Worked example: why the electret needs a resistor

The electret capsule's internal FET acts like a voltage-controlled current source: the diaphragm voltage on its gate modulates the current it draws. Feed it through a load resistor, say 2.2 kΩ to the 9 V rail, and the audio appears as a voltage swing across that resistor. If the capsule idles at 0.5 mA, the resistor drops 0.5 mA × 2.2 kΩ = 1.1 V, biasing the output near 9 − 1.1 = 7.9 V with room to swing both ways. Pick the resistor too small and the swing is tiny; too large and it drops so much voltage that the FET starves. This bias resistor, and the supply behind it, is what an electret needs and a self-powered dynamic mic does not.
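The bias arithmetic is just Ohm's law applied to the load resistor. A minimal sketch, with a hypothetical helper name and the worked example's values (9 V rail, 2.2 kΩ, 0.5 mA idle current):

```python
def electret_bias(supply_v, load_ohms, idle_current_a):
    """Return (drop across the load resistor, DC level at the capsule output)."""
    drop_v = idle_current_a * load_ohms
    output_dc_v = supply_v - drop_v
    return drop_v, output_dc_v

drop_v, output_dc_v = electret_bias(supply_v=9.0, load_ohms=2200, idle_current_a=0.5e-3)
print(f"resistor drop {drop_v:.2f} V, output bias {output_dc_v:.2f} V")

# Trying other resistor values shows the trade-off: a small resistor leaves
# little voltage swing per unit of current, a large one eats the headroom.
for load_ohms in (470, 2200, 10000):
    drop_v_alt, out_v_alt = electret_bias(9.0, load_ohms, 0.5e-3)
    print(f"{load_ohms:6d} ohm: drop {drop_v_alt:.2f} V, output at {out_v_alt:.2f} V")
```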

Impedance, not just voltage. Mics are classed as low impedance (< 600Ω) or high impedance (> 10 kΩ). Low-impedance mics drive long cables without picking up hum, which is why pro audio is low-Z and balanced. The output is feeble either way — millivolts — so the very next stage must be a high-gain, high-input-impedance preamp that loads the mic lightly while boosting it a hundredfold. That is Chapter 4.

Dynamic vs Electret: Capsule Comparator

Switch between a dynamic (coil + magnet, self-powered) and an electret (biased capacitor) capsule. The animation shows the diaphragm vibrating and the resulting output. Slide the sound level to see the millivolt output grow.

Why does a dynamic microphone need no power supply, while a condenser/electret does?

Chapter 4: Preamps & Voltage Gain

A few millivolts from a microphone is far too small to drive anything. The preamplifier raises that signal to a comfortable "line level" of around 1 V while loading the source gently. Its job is voltage gain, and the op-amp from the book's Chapter 8 (see Prerequisites) is the natural tool: high input impedance, low output impedance, and a gain you set with two resistors.

Non-inverting and inverting gain

The non-inverting configuration feeds the signal into the + input and sets gain with a feedback divider. The inverting configuration feeds the signal through an input resistor into the − input:

A_non-inv = 1 + R₂/R₁      A_inv = −R₂/R₁

The non-inverting circuit is favored for mic preamps because its input impedance is essentially the op-amp's own — megohms — so it barely loads the mic. The minus sign on the inverting gain just means the output is flipped top-to-bottom; for audio, a 180° flip is inaudible, so designers choose the topology for impedance reasons, not polarity.

Worked example: a 100× mic preamp

You want to lift a 10 mV mic signal to 1 V — a gain of 100. Using the non-inverting form, 1 + R2/R1 = 100 means R2/R1 = 99. Pick R1 = 1 kΩ and R2 = 100 kΩ (giving 1 + 100 = 101, close enough). In decibels that is 20·log10(100) = 40 dB. The 10 mV input becomes 10 mV × 101 ≈ 1.01 V — right at line level.
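A short sketch of the same design step, assuming an ideal op-amp; the function names are illustrative. It solves for the exact feedback resistor and then shows what the rounded 100 kΩ choice actually delivers.

```python
import math

def noninverting_gain(r1_ohms, r2_ohms):
    """Ideal non-inverting gain: 1 + R2/R1."""
    return 1.0 + r2_ohms / r1_ohms

def feedback_resistor_for_gain(target_gain, r1_ohms):
    """R2 needed for a target non-inverting gain with a chosen R1."""
    return (target_gain - 1.0) * r1_ohms

exact_r2 = feedback_resistor_for_gain(100, 1_000)  # exact value for a gain of 100
gain = noninverting_gain(1_000, 100_000)           # the rounded 100 k choice
print(f"exact R2 for gain 100: {exact_r2:.0f} ohm")
print(f"gain with 100 k: {gain:.0f} ({20 * math.log10(gain):.2f} dB)")
print(f"10 mV in -> {0.010 * gain:.2f} V out")
```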

The AC-coupling capacitor and single-supply bias

Battery gear has one supply rail, not the ±15 V a textbook op-amp assumes. To swing an AC audio signal on a single 9 V rail, you bias the input to the midpoint (4.5 V) with a resistor divider, so the signal can swing up and down around it. But the mic's signal must ride on that 4.5 V without dragging its own DC level in — so you place a coupling capacitor in series. The cap blocks DC (its reactance is infinite at 0 Hz) while passing AC audio (low reactance at kilohertz). It also sets a low-frequency cutoff fc = 1/(2πRC) — too small a cap and you lose the bass.

The capacitor is a high-pass filter you forgot you built. A 1 μF coupling cap into a 10 kΩ input gives fc = 1/(2π · 10 000 · 1×10⁻⁶) ≈ 16 Hz, just below the audible band, perfect. Use 0.1 μF instead and fc jumps to 160 Hz, and your voice loses its body. Every coupling cap in the signal chain is silently rolling off bass; the art is keeping every corner frequency well below 20 Hz.
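To size a coupling cap, it helps to run the formula both ways. This is a minimal sketch with assumed helper names; the 10 kΩ input resistance is the value from the example.

```python
import math

def highpass_cutoff_hz(r_ohms, c_farads):
    """Corner frequency of an RC high-pass: fc = 1 / (2*pi*R*C)."""
    return 1.0 / (2.0 * math.pi * r_ohms * c_farads)

def min_capacitor_for_cutoff(r_ohms, fc_hz):
    """Smallest capacitor that keeps the corner at or below fc_hz."""
    return 1.0 / (2.0 * math.pi * fc_hz * r_ohms)

for c_uf in (1.0, 0.1):
    fc = highpass_cutoff_hz(10_000, c_uf * 1e-6)
    print(f"{c_uf:4.1f} uF into 10 k -> fc = {fc:6.1f} Hz")

c_min = min_capacitor_for_cutoff(10_000, 20.0)
print(f"for fc <= 20 Hz into 10 k, C >= {c_min * 1e6:.2f} uF")
```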
Op-Amp Preamp Gain

Set R1 and R2 for a non-inverting preamp. The display computes the gain 1 + R2/R1, its dB value, and shows a 10 mV input being lifted to line level — clipping at the rail if you push too far.

A non-inverting op-amp preamp uses R1 = 1 kΩ and R2 = 100 kΩ. What is its approximate voltage gain?

Chapter 5: Power Amplifiers & the LM386

The preamp gave you a clean 1 V signal, but try to drive an 8Ω speaker with it and nothing moves — the op-amp can supply only milliamps, and a speaker wants amps. You need a power amplifier: a stage with modest voltage gain but the muscle to source large current into a low-impedance load. The classic hobby chip is the LM386.

The LM386, by the numbers

The LM386 runs from a single supply of roughly 4 V to 12 V in its common variants (the LM386N-4 is rated higher; 9 V is typical), draws a few milliamps quiescent, and internally biases its output to half the supply so the signal can swing both ways. Its default voltage gain is 20. Bridge a 10 μF capacitor across pins 1 and 8 and the internal 1.35 kΩ gain-setting resistor is bypassed for AC, raising the gain all the way to 200. A resistor in series with that cap sets any gain between 20 and 200. That single capacitor is the answer to the "too quiet" half of the Chapter 0 megaphone.

Worked example: gain 20 to gain 200

Feed the LM386 a 10 mV signal. At gain 20 the output is 10 mV × 20 = 0.2 Vpk — that is 20·log(20) = 26 dB. Strap pins 1–8 to reach gain 200 and the output wants to be 10 mV × 200 = 2 Vpk — 46 dB, a 20 dB jump. But on a 9 V supply the output can swing only to about ±4 V around the 4.5 V bias, so a 2 Vpk swing fits, while a 20 mV input at gain 200 (wanting 4 Vpk) sits right at the edge of clipping. Push past the rail and the sine's tops flatten into a square — new harmonics, harsh distortion.
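The headroom check above can be sketched as a hard limiter. This is an idealised model, not the chip's real behaviour: the function name is hypothetical and the 4 V swing limit is the approximate figure quoted in the text for a 9 V supply.

```python
def lm386_output_peak(input_peak_v, gain, swing_limit_v=4.0):
    """Return (actual output peak in volts, clipped flag) for a hard swing limit."""
    wanted_peak_v = input_peak_v * gain
    clipped = wanted_peak_v > swing_limit_v
    return min(wanted_peak_v, swing_limit_v), clipped

for input_mv, gain in [(10, 20), (10, 200), (30, 200)]:
    peak_v, clipped = lm386_output_peak(input_mv / 1000, gain)
    state = "CLIPPING" if clipped else "clean"
    print(f"{input_mv:3d} mV x {gain:3d} -> {peak_v:.2f} V peak ({state})")
```

A 30 mV input at gain 200 asks for 6 V peak, so the model flattens it at 4 V, which is the flat-topped waveform described in the text.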

Efficiency: class A/B vs class D

A traditional class-A/B output stage (what the LM386 is) burns power as heat in its transistors whenever they are partly on — theoretical best is about 78% efficiency, often far less in practice. A class-D amp instead chops the signal into a high-frequency PWM stream (comparing the audio against a triangle wave), switches MOSFETs fully on or fully off, then low-pass filters the result back to audio. Because a fully-on or fully-off switch dissipates almost nothing, class-D reaches 90%+ efficiency — which is why every phone, laptop, and battery speaker uses it.
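Why does filtering a two-level switching waveform give back the audio? A minimal sketch, under stated assumptions: the input is normalised to the range −1 to +1, it is treated as constant over one carrier period, and a plain average over that period stands in for the low-pass filter. All names are illustrative.

```python
def triangle(phase):
    """Triangle carrier between -1 and +1; phase is the fraction of one carrier period."""
    phase %= 1.0
    return 4.0 * phase - 1.0 if phase < 0.5 else 3.0 - 4.0 * phase

def pwm_average(level, steps=10_000):
    """Average comparator output over one carrier period for a constant input level."""
    total = 0.0
    for k in range(steps):
        phase = (k + 0.5) / steps
        # Comparator: switch fully high while the audio is above the carrier, else fully low.
        total += 1.0 if level > triangle(phase) else -1.0
    return total / steps

for level in (-0.8, -0.25, 0.0, 0.5, 0.9):
    print(f"input {level:+.2f} -> averaged PWM {pwm_average(level):+.3f}")
```

The averaged output tracks the input level, because the fraction of each period spent high grows linearly with the input. The output devices only ever sit fully on or fully off, which is where the efficiency comes from.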

Clipping is distortion you can hear coming. An amplifier cannot output more voltage than its supply rails. Demand more and it simply stops following the input at the peaks — the sine's caps go flat. A flat-topped sine is mathematically a sine plus a stack of odd harmonics, so clipping manufactures high-frequency content. Worse, that content can fry a tweeter not built to absorb it. "Turn it down before it clips" is not just about taste — it protects the speakers.
LM386 Gain Knob (Showcase — clipping)

Toggle the pins-1&8 capacitor between gain 20 and gain 200, and set the input amplitude. The output waveform scales and clips at the roughly ±4 V swing limit set by the 9 V supply. Watch the dB readout and the harsh flat tops appear.

The LM386's voltage gain is fixed at 20 by default. How do you raise it, and to what maximum?

Chapter 6: Speakers as Loads

Here is the single most useful simplification in audio: a speaker is just a resistor. Yes, a real driver has inductance and a resonance, so its impedance wiggles with frequency, but its nominal impedance — the number printed on the magnet, almost always 8Ω (sometimes 4Ω or 16Ω) — lets you treat it as a plain resistive load and apply Ohm's law to find current, power, and heat.

P = V²/Z = I²·Z = V·I,    I = V_out / Z_speaker

Worked example: 8Ω vs 4Ω at the same voltage

Your amp puts out 4 V RMS. Into the 8Ω speaker that is P = V²/Z = 16/8 = 2 W, drawing I = 4/8 = 0.5 A. Now you parallel a second 8Ω speaker. Two 8Ω in parallel make 4Ω, so at the same 4 V the power becomes 16/4 = 4 W and the current 4/4 = 1 A. You did not turn anything up. You halved the impedance, and that alone doubled both the current and the power the amplifier must source. That extra half-ampere is exactly the heat that cooked the Chapter 0 megaphone.
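The comparison is easy to tabulate. A small sketch, treating the speaker as a pure resistance at its nominal impedance as the text does; the function name is an assumption.

```python
def speaker_load(v_rms, z_ohms):
    """Return (power in watts, current in amps) for a resistive load."""
    current_a = v_rms / z_ohms
    power_w = v_rms ** 2 / z_ohms
    return power_w, current_a

for z_ohms in (8, 4, 2):
    power_w, current_a = speaker_load(4.0, z_ohms)
    print(f"4 V RMS into {z_ohms} ohm: {power_w:.1f} W, {current_a:.2f} A")
```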

Series and parallel combinations

Speakers combine like resistors. Two 8Ω in parallel give 8/2 = 4Ω (lower — more current, riskier for the amp). Two 4Ω in series give 4 + 4 = 8Ω (safe). If you must run multiple drivers, series wiring keeps the impedance up and the amp happy; parallel wiring is the trap. Always know the combined impedance before you connect, and never take it below the amp's rated minimum (often 4Ω).
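A sketch of the check to run before connecting anything, again treating drivers as plain resistances. The helper names are illustrative, and the 4Ω default minimum is the "often 4Ω" figure from the text; use the value from your amplifier's own rating.

```python
def series(*impedances):
    """Combined impedance of loads wired in series."""
    return sum(impedances)

def parallel(*impedances):
    """Combined impedance of loads wired in parallel."""
    return 1.0 / sum(1.0 / z for z in impedances)

def is_safe_load(z_ohms, amp_min_ohms=4.0):
    """True if the combined load is at or above the amp's rated minimum."""
    return z_ohms >= amp_min_ohms

for label, z in [("two 8 ohm in parallel", parallel(8, 8)),
                 ("two 4 ohm in series", series(4, 4)),
                 ("three 8 ohm in parallel", parallel(8, 8, 8))]:
    verdict = "ok" if is_safe_load(z) else "below the amp's minimum"
    print(f"{label}: {z:.2f} ohm ({verdict})")
```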

The frequency bands

No single cone covers 20 Hz to 20 kHz well, so speakers specialize. A woofer (heavy, large) handles bass below ~200 Hz; a midrange covers ~500–3000 Hz where voices live; a tweeter (light, small) handles the highs above. Send a tweeter the bass and its tiny cone tries to make huge excursions and tears itself apart; send a woofer the treble and its mass simply cannot keep up. Each driver must get only its band — which is the crossover, Chapter 7.

The amp does not "push" power — the load "pulls" it. An amplifier is a voltage source. It sets a voltage; the load's impedance then decides the current via I = V/Z, and the power follows. Lower the impedance and you demand more current for the same voltage. This is why specs read "50 W into 8Ω, 80 W into 4Ω" — the same amp delivers more power into a lower load because the load draws it, until the amp's current limit (or its heatsink) gives out.
Speaker Power Dial

Set the amp's output voltage and pick the speaker impedance (2 / 4 / 8Ω). Bars show power P = V²/Z, the current I = V/Z, and an amplifier "heat" gauge that climbs with current; watch it redline as you drop to 2Ω.

An amplifier outputs 4 V RMS into an 8Ω speaker. What power does it deliver?

Chapter 7: Crossovers & Filters

Now we solve the last megaphone failure: you cannot just parallel a woofer and a tweeter. You must split the spectrum — send the lows to the woofer and the highs to the tweeter — with a crossover, a pair of simple filters built from a capacitor and an inductor.

The two halves of the split

A capacitor's reactance falls as frequency rises (X_C = 1/(2πfC)), so a series capacitor passes highs and blocks lows: a high-pass filter for the tweeter. An inductor does the opposite (X_L = 2πfL), passing lows and blocking highs: a low-pass filter for the woofer. Choose the components so both filters reach their −3 dB point at the same crossover frequency fc, and the two response curves cross exactly there, handing the signal off cleanly.

fc = 1/(2πRC)  →  C = 1/(2π fc R),   L = R/(2π fc)

Worked example: a 1.8 kHz crossover into 8Ω

You want to cross over at fc = 1.8 kHz with 8Ω drivers. The tweeter's series capacitor is

C = 1/(2π · 1800 · 8) = 1/(90478) ≈ 11 μF

(in practice you fit a nearby standard value; the book's practical pick is about 8–10 μF). The woofer's series inductor is

L = 8/(2π · 1800) = 8/(11310) ≈ 0.71 mH

So a single 11 μF cap in series with the tweeter and a 0.71 mH coil in series with the woofer turn one input into a clean two-way split. Below 1.8 kHz the woofer dominates; above it the tweeter takes over; right at 1.8 kHz they share equally. The tweeter is now protected from the bass that would have destroyed it.
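The two formulas can be wrapped into one helper for trying other crossover points. This is a sketch for first-order filters into purely resistive drivers; the function name is hypothetical.

```python
import math

def crossover_values(fc_hz, z_ohms):
    """Return (series capacitor in farads, series inductor in henries) for a first-order two-way split."""
    c_farads = 1.0 / (2.0 * math.pi * fc_hz * z_ohms)
    l_henries = z_ohms / (2.0 * math.pi * fc_hz)
    return c_farads, l_henries

c_farads, l_henries = crossover_values(1800, 8)
print(f"tweeter cap:  {c_farads * 1e6:.2f} uF")
print(f"woofer coil:  {l_henries * 1e3:.3f} mH")

# Raising the crossover frequency shrinks both components.
for fc_hz in (1000, 1800, 3000):
    c, l = crossover_values(fc_hz, 8)
    print(f"fc = {fc_hz:4d} Hz: C = {c * 1e6:5.2f} uF, L = {l * 1e3:.3f} mH")
```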

Why the curves cross at fc. At the cutoff, each filter is down 3 dB — passing 1/√2 ≈ 0.707 of the voltage, which is half the power. The woofer is fading out as the tweeter fades in, and at fc they contribute equally. Their powers sum back to the full signal, so the listener hears a seamless handoff with no dip and no bump. That −3 dB crossing point is the whole design target — matching C and L to land both filters there.
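The handoff can be checked with the standard first-order magnitude responses. A minimal sketch, assuming ideal first-order filters and resistive loads; the function names are illustrative.

```python
import math

def lowpass_gain(f_hz, fc_hz):
    """Voltage gain magnitude of a first-order low-pass (the woofer branch)."""
    x = f_hz / fc_hz
    return 1.0 / math.sqrt(1.0 + x * x)

def highpass_gain(f_hz, fc_hz):
    """Voltage gain magnitude of a first-order high-pass (the tweeter branch)."""
    x = f_hz / fc_hz
    return x / math.sqrt(1.0 + x * x)

fc_hz = 1800
for f_hz in (200, 500, 1800, 5000, 15000):
    lp = lowpass_gain(f_hz, fc_hz)
    hp = highpass_gain(f_hz, fc_hz)
    print(f"{f_hz:6d} Hz  woofer {20 * math.log10(lp):6.2f} dB  "
          f"tweeter {20 * math.log10(hp):6.2f} dB  power sum {lp * lp + hp * hp:.3f}")
```

At fc both branches sit at about −3 dB, and the two power fractions add to 1 at every frequency, which is the "no dip, no bump" property described above.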
2-Way Crossover Splitter (Showcase)

Drag the crossover frequency to retune the cap and coil. The blue curve is the woofer's low-pass response, the warm curve is the tweeter's high-pass; they cross at fc at −3 dB. The input frequency slider drops a test tone and lights up which driver receives it. Live values of C and L are computed for 8Ω.

For an 8Ω tweeter crossed over at 1.8 kHz, what series capacitor does the crossover need?

Chapter 8: Impedance, Connections & Summary

One last principle ties the chain together: how to connect stages so the signal survives. Old audio (and RF) obsessed over impedance matching — making source and load equal to transfer maximum power. Modern audio does the opposite: impedance bridging, where the load is made far larger than the source (the rule of thumb is load ≥ 10× source) to transfer maximum voltage.

Why bridging, not matching

When source resistance equals load resistance, the two form an equal voltage divider: the load sees only half the source voltage, a loss of 6 dB, and half the power is burned as heat inside the source. Since every stage after a mic cares about voltage, not power, you instead make each input impedance at least ten times the previous output impedance. A 600Ω mic into a 10 kΩ preamp loses almost nothing (about 0.5 dB). The speaker connection follows the same voltage-source logic: a power amp's output impedance is far below the speaker's 8Ω, and what you match there is the speaker's nominal impedance to the load the amp is rated for.
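The loss at each connection is a voltage divider between source and load impedance. A small sketch with an assumed helper name, treating both impedances as resistive:

```python
import math

def loading_loss_db(z_source_ohms, z_load_ohms):
    """Voltage loss in dB caused by the divider formed by source and load impedance."""
    fraction = z_load_ohms / (z_source_ohms + z_load_ohms)
    return 20.0 * math.log10(fraction)

cases = [("matched, 600 into 600", 600, 600),
         ("10x rule, 600 into 6 k", 600, 6_000),
         ("mic into preamp, 600 into 10 k", 600, 10_000)]
for label, z_source, z_load in cases:
    print(f"{label}: {loading_loss_db(z_source, z_load):.2f} dB")
```

Matching costs the full 6 dB, while the ten-to-one rule keeps the loss under 1 dB.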

Bridging: Z_load ≥ 10 × Z_source  (max voltage)
Matching: Z_load = Z_source  (max power, −6 dB)

The whole chain, end to end

Mic (mV, low-Z): Pressure → voltage. Dynamic self-powers; electret needs a bias resistor.
Preamp (×100, 40 dB): Op-amp voltage gain; AC-coupled, high input-Z to bridge the mic.
Power amp (LM386, 20→200): Modest voltage gain, big current. Biased to ½ supply; clips at the rails.
Crossover (C & L): Splits the band: lows to woofer, highs to tweeter, crossing at fc.
Speakers (8Ω loads): P = V²/Z. Keep the combined load at or above the amp's rated minimum impedance.

Equation cheat-sheet

Quantity | Formula | Worked value
Sound level | dB SPL = 10·log₁₀(I/I₀) | I₀ = 10⁻¹² W/m²; range 0–120 dB
dB (power) | 10·log₁₀(P₂/P₁) | +3 dB = 2× power
dB (voltage) | 20·log₁₀(V₂/V₁) | +6 dB = 2× voltage = 4× power
Power into load | P = V²/Z = I²Z = VI | 4 V into 8Ω → 2 W, 0.5 A
Halved load | 2×8Ω parallel = 4Ω | 4 V into 4Ω → 4 W, 1 A
Non-inv. gain | 1 + R₂/R₁ | 1 + 100k/1k = 101 (40 dB)
Inverting gain | −R₂/R₁ | polarity flip, same magnitude
LM386 gain | 20 default → 200 (pins 1–8 cap) | 26 dB → 46 dB
RC / crossover cap | C = 1/(2π fc R) | 1.8 kHz, 8Ω → ≈11 μF
Crossover inductor | L = R/(2π fc) | 1.8 kHz, 8Ω → ≈0.71 mH
Bridging rule | Z_load ≥ 10 × Z_source | matched costs −6 dB

The megaphone, solved. Too quiet? Strap the pins-1–8 capacitor on the LM386 for gain 200 (+20 dB). Adding a tweeter cooked the amp? Do not wire speakers directly in parallel: that drops the load to 4Ω and doubles the current. Instead give the tweeter its own ~11 μF series cap (high-pass) and the woofer its ~0.71 mH series coil (low-pass), so each driver draws current only in its own band and the amp still sees roughly 8Ω. Every fix is one equation from this chapter.

"A speaker is an 8-ohm resistor that happens to move air. Respect the ohm and the watt, and the music takes care of itself."
— the one rule of audio electronics

You can now trace any audio path from microphone to cone, predict the current before you connect, and split a spectrum without frying a tweeter.

Modern audio uses impedance bridging rather than matching. What is the rule and why?