Audio is a chain of energy conversions: pressure waves become tiny voltages, amplifiers boost them, and a speaker pushes air again. Because the ear spans twelve orders of magnitude in intensity, the whole field runs on the logarithmic decibel and on one humble fact — a speaker is just an 8-ohm resistor that obeys Ohm's law.
You wire up a weekend megaphone. An electret microphone catches your voice, an LM386 chip on a 9-volt battery boosts the signal, and an 8Ω speaker turns it back into sound. You flip the switch. It works — but it is disappointingly quiet. People three meters away cup their ears.
So you try an "obvious" fix: you reach for a second speaker (a little tweeter) and wire it in parallel with the first to add crispness and volume. Within a minute the bass speaker is uncomfortably hot, the battery sags, and the sound turns to mush. What went wrong?
Every symptom here is a door into this chapter. The quietness is about gain: the LM386 has a default voltage gain of 20, and most builders never learn it can be cranked to 200. The overheating is about impedance: two 8Ω speakers in parallel present 4Ω, which at the same voltage doubles the current and doubles the power the amp must source. And the mush is about filtering: you cannot just parallel a woofer and a tweeter; each needs its own slice of the frequency band, carved out by a crossover.
Interactive demo: the full path from voice to sound. Move the gain slider to scale the boosted signal, and the load slider to halve the speaker impedance by paralleling a tweeter. Watch the current and heat climb as impedance drops.
By the end of this chapter you will be able to look at that hot speaker and predict the exact current before touching it — and design the megaphone so it never happens.
Sound is a pressure wave traveling through air — regions of compression and rarefaction marching outward from a vibrating source at about 343 m/s. Audio electronics never touches the air directly; it works with an electrical analog of that wave: a voltage that wiggles up and down in lockstep with the pressure. The whole discipline is the art of capturing, reshaping, and re-emitting that wiggle faithfully.
Three properties of the waveform map onto three things you hear. The frequency (cycles per second, Hz) is pitch — how high or low the note. The amplitude (the height of the swing) is loudness. And the shape of the wave — the mix of overtones riding on the fundamental — is timbre, the quality that distinguishes a violin from a flute playing the same note.
The human ear responds from roughly 20 Hz to 20 kHz, a thousand-to-one span, and it is most sensitive around 2–5 kHz, a band that carries much of what makes speech intelligible. A pure tone is a single sine. Real sounds are sums of sines (Fourier's insight), so a 200 Hz note from a trumpet is a 200 Hz fundamental plus partials at 400, 600, 800 Hz and beyond. An amplifier that boosts all those partials by the same factor preserves timbre; one that favors some over others colors the sound.
Middle C is about 262 Hz. In the time a 20 kHz cymbal shimmer completes one cycle (50 μs), middle C has barely moved — it needs 1/262 s ≈ 3.8 ms per cycle, about 76 times longer. This is why bass and treble demand different speakers: a woofer's heavy cone can shove enough air for a slow 3.8 ms swing, but it physically cannot reverse direction every 50 μs. Mass that helps the bass kills the treble.
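The period arithmetic above is easy to check numerically. Here is a minimal Python sketch; the function name `period_seconds` is illustrative and not from the text.

```python
def period_seconds(frequency_hz: float) -> float:
    """Return the duration of one cycle, T = 1/f, in seconds."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return 1.0 / frequency_hz


middle_c = period_seconds(262.0)    # roughly 3.8 ms
cymbal = period_seconds(20_000.0)   # 50 microseconds

print(f"middle C period: {middle_c * 1e3:.2f} ms")
print(f"20 kHz period:   {cymbal * 1e6:.0f} us")
print(f"ratio:           {middle_c / cymbal:.0f}x")
```

The ratio of the two periods is the same as the ratio of the two frequencies, which is why the woofer has about 76 times longer to complete each swing.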
Interactive demo: set the frequency (pitch) and amplitude (loudness). The sine on the left drives the speaker cone on the right; bigger amplitude pushes the cone farther, and higher frequency packs more cycles into the window. Readouts name the pitch register and the band.
The ear is logarithmic. The quietest sound you can detect carries about 10⁻¹² W/m²; a sound on the edge of pain carries about 1 W/m². That is a ratio of a trillion. Writing loudness on a linear scale would be hopeless, so audio uses the decibel, a logarithm of a ratio, which compresses that trillion-fold range into a tidy 0 to 120 dB.
For sound intensity (a power-like quantity), the sound-pressure-level decibel is:

$$\mathrm{dB\ SPL} = 10 \log_{10}\frac{I}{I_0}$$
The reference I0 is the threshold of hearing, so 0 dB is "barely audible," 60 dB is conversation, 90 dB is a lawnmower, and 120 dB is a jet at takeoff. Every 10 dB is a ten-fold jump in intensity that the ear perceives as merely "about twice as loud."
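The intensity-to-decibel conversion can be sketched in a few lines of Python. The function names are illustrative; the reference intensity is the 10⁻¹² W/m² threshold of hearing quoted in the text.

```python
import math

I0_W_PER_M2 = 1e-12  # threshold of hearing, the 0 dB reference


def intensity_to_db_spl(intensity_w_per_m2: float) -> float:
    """dB SPL = 10 * log10(I / I0) for a sound intensity in W/m^2."""
    return 10.0 * math.log10(intensity_w_per_m2 / I0_W_PER_M2)


def db_spl_to_intensity(db_spl: float) -> float:
    """Invert the formula: I = I0 * 10^(dB / 10)."""
    return I0_W_PER_M2 * 10.0 ** (db_spl / 10.0)


for label, intensity in [("threshold", 1e-12), ("conversation", 1e-6), ("pain", 1.0)]:
    print(f"{label:>12}: {intensity_to_db_spl(intensity):6.1f} dB SPL")
```

Each factor of ten in intensity adds 10 dB, so the trillion-fold range lands on 0 to 120 dB.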
In a circuit you compare two powers or two voltages. For power, dB = 10·log10(P2/P1). But power goes as the square of voltage (P = V²/R), so when you compare voltages across the same resistance the factor of 2 from the square comes out of the log and you get a leading 20:

$$\mathrm{dB} = 10 \log_{10}\frac{V_2^2/R}{V_1^2/R} = 20 \log_{10}\frac{V_2}{V_1}$$
Two milestones to memorize: +3 dB is twice the power (10·log 2 = 3.01), and +6 dB is twice the voltage (20·log 2 = 6.02) — which is also four times the power. Doubling your amplifier's output voltage feels like a solid step up; doubling its power barely registers.
An LM386 at its default gain of 20 turns a 10 mV input into 0.2 V out. In decibels that voltage gain is 20·log10(20) = 20·(1.301) ≈ 26 dB. Strap the boost components across pins 1 and 8 to raise the gain to 200, and the dB becomes 20·log10(200) = 20·(2.301) ≈ 46 dB. The 10× jump in gain is exactly a +20 dB step — logs turn multiplication into addition, so ×10 always adds 20 dB of voltage gain.
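The voltage and power forms of the decibel can be checked side by side. A short Python sketch follows; the function names are illustrative.

```python
import math


def voltage_gain_db(v_out: float, v_in: float) -> float:
    """Voltage ratio in dB: 20 * log10(V2 / V1)."""
    return 20.0 * math.log10(v_out / v_in)


def power_gain_db(p_out: float, p_in: float) -> float:
    """Power ratio in dB: 10 * log10(P2 / P1)."""
    return 10.0 * math.log10(p_out / p_in)


# LM386 examples from the text: 10 mV in, gain 20 and gain 200.
print(f"gain 20:  {voltage_gain_db(0.2, 0.010):.1f} dB")
print(f"gain 200: {voltage_gain_db(2.0, 0.010):.1f} dB")
# The two milestones: twice the power, twice the voltage.
print(f"2x power:   {power_gain_db(2.0, 1.0):.2f} dB")
print(f"2x voltage: {voltage_gain_db(2.0, 1.0):.2f} dB")
```

Multiplying the gain by ten moves the result by 20 dB, which is the additive behaviour the paragraph describes.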
Interactive demo: enter two voltages (V2 vs reference V1). The gauge computes 20·log10(V2/V1), and the bar places the result on the real-world loudness scale from a whisper (20 dB) to a jet (120 dB). Toggle to compare powers instead (factor 10).
A microphone is a transducer that runs the speaker in reverse: incoming pressure waves move a membrane, and that motion is converted into a tiny AC voltage — typically a few millivolts. There are two dominant ways to do the conversion, and the difference decides whether your mic needs a power supply.
A dynamic mic glues a small coil to the back of a diaphragm and suspends it in a permanent magnet's field. When sound moves the diaphragm, the coil moves through the field and generates a voltage by electromagnetic induction (Faraday's law) — exactly like a speaker used backwards. Because it makes its own voltage, a dynamic mic needs no power supply. It is rugged, handles loud sources, and has a low output impedance (often < 600Ω), which is why it dominates live stages.
A condenser mic makes the diaphragm one plate of a capacitor. As sound flexes the plate, the spacing, and thus the capacitance C, changes. Hold the charge Q fixed, and since V = Q/C, the voltage across the plates wiggles with the sound. But you must first put charge on the plates: a condenser mic needs a DC bias (phantom power, often 48 V). An electret is a clever condenser whose plate is permanently charged at the factory, so it needs only a trickle of current to run a built-in FET buffer. It is the small capsule found in countless phones and laptops, and in the megaphone of Chapter 0.
The electret capsule's internal FET acts as a voltage-controlled current source: the tiny voltage on the diaphragm modulates the current the capsule draws. Feed it through a load resistor, say 2.2 kΩ to the 9 V rail, and the audio appears as a voltage swing across that resistor. If the capsule idles at 0.5 mA, the resistor drops 0.5 mA × 2.2 kΩ = 1.1 V, biasing the output near 7.9 V with room to swing both ways. Pick the resistor too small and there is almost no swing; too large and the FET starves. This bias resistor is the reason an electret needs a supply connection where a dynamic mic needs none.
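The bias-point arithmetic can be written out as a small Python sketch. The function name and return layout are illustrative; the 9 V, 2.2 kΩ and 0.5 mA figures are the ones used in the text.

```python
def electret_bias(v_supply: float, r_load_ohms: float, i_idle_amps: float) -> tuple[float, float]:
    """Return (resistor_drop_volts, output_dc_volts) for a capsule fed through a load resistor."""
    drop = i_idle_amps * r_load_ohms   # Ohm's law across the load resistor
    v_out = v_supply - drop            # DC level at the capsule output
    if v_out <= 0:
        raise ValueError("resistor too large: the capsule has no voltage left to work with")
    return drop, v_out


drop, v_out = electret_bias(v_supply=9.0, r_load_ohms=2200.0, i_idle_amps=0.5e-3)
print(f"resistor drop: {drop:.2f} V, output bias: {v_out:.2f} V")
```

Trying a much larger resistor in the same call shows the "FET starves" case: the drop approaches the supply and the function raises an error.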
Interactive demo: switch between a dynamic (coil + magnet, self-powered) and an electret (biased capacitor) capsule. The animation shows the diaphragm vibrating and the resulting output. Slide the sound level to see the millivolt output grow.
A few millivolts from a microphone is far too small to drive anything. The preamplifier raises that signal to a comfortable "line level" of around 1 V while loading the source gently. Its job is voltage gain, and the op-amp from Chapter 8 is the natural tool: high input impedance, low output impedance, and a gain you set with two resistors.
The non-inverting configuration feeds the signal into the + input and sets gain with a feedback divider. The inverting configuration feeds the signal through an input resistor into the − input:

$$G_{\text{non-inv}} = 1 + \frac{R_2}{R_1}, \qquad G_{\text{inv}} = -\frac{R_2}{R_1}$$
The non-inverting circuit is favored for mic preamps because its input impedance is essentially the op-amp's own — megohms — so it barely loads the mic. The minus sign on the inverting gain just means the output is flipped top-to-bottom; for audio, a 180° flip is inaudible, so designers choose the topology for impedance reasons, not polarity.
You want to lift a 10 mV mic signal to 1 V — a gain of 100. Using the non-inverting form, 1 + R2/R1 = 100 means R2/R1 = 99. Pick R1 = 1 kΩ and R2 = 100 kΩ (giving 1 + 100 = 101, close enough). In decibels that is 20·log10(100) = 40 dB. The 10 mV input becomes 10 mV × 101 ≈ 1.01 V — right at line level.
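The resistor choice in this example can be reproduced with a short Python sketch. The function names are illustrative; R1 and R2 follow the text's convention, with R2 as the feedback resistor.

```python
import math


def non_inverting_gain(r1_ohms: float, r2_ohms: float) -> float:
    """Gain of a non-inverting op-amp stage: 1 + R2/R1."""
    return 1.0 + r2_ohms / r1_ohms


def inverting_gain(r1_ohms: float, r2_ohms: float) -> float:
    """Gain of an inverting op-amp stage: -R2/R1."""
    return -r2_ohms / r1_ohms


def feedback_resistor_for_gain(target_gain: float, r1_ohms: float) -> float:
    """Solve 1 + R2/R1 = gain for R2 (non-inverting stage)."""
    return (target_gain - 1.0) * r1_ohms


gain = non_inverting_gain(1e3, 100e3)
print(f"exact R2 for gain 100: {feedback_resistor_for_gain(100.0, 1e3):.0f} ohms")
print(f"gain with 100k/1k:     {gain:.0f} ({20 * math.log10(gain):.1f} dB)")
print(f"10 mV in gives:        {0.010 * gain:.2f} V out")
```

The exact answer is 99 kΩ; rounding up to the standard 100 kΩ value costs less than 0.1 dB.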
Battery gear has one supply rail, not the ±15 V a textbook op-amp assumes. To swing an AC audio signal on a single 9 V rail, you bias the input to the midpoint (4.5 V) with a resistor divider, so the signal can swing up and down around it. But the mic's signal must ride on that 4.5 V without dragging its own DC level in — so you place a coupling capacitor in series. The cap blocks DC (its reactance is infinite at 0 Hz) while passing AC audio (low reactance at kilohertz). It also sets a low-frequency cutoff fc = 1/(2πRC) — too small a cap and you lose the bass.
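To see how the coupling capacitor sets the bass limit, here is a hedged Python sketch of fc = 1/(2πRC). The 10 kΩ input resistance and 1 μF capacitor are assumed example values, not taken from the text, and the function names are illustrative.

```python
import math


def cutoff_hz(r_ohms: float, c_farads: float) -> float:
    """Low-frequency cutoff of a series coupling cap into resistance R: fc = 1/(2*pi*R*C)."""
    return 1.0 / (2.0 * math.pi * r_ohms * c_farads)


def capacitor_for_cutoff(fc_hz: float, r_ohms: float) -> float:
    """Solve the same formula for C given a target cutoff."""
    return 1.0 / (2.0 * math.pi * fc_hz * r_ohms)


R_IN = 10e3  # assumed input resistance seen by the capacitor

print(f"1 uF into 10k:  fc = {cutoff_hz(R_IN, 1e-6):.1f} Hz")
print(f"10 nF into 10k: fc = {cutoff_hz(R_IN, 10e-9):.0f} Hz")
print(f"C for 20 Hz:    {capacitor_for_cutoff(20.0, R_IN) * 1e6:.2f} uF")
```

With these assumed values, the 1 μF capacitor keeps the cutoff below the audible range, while the 10 nF one pushes it well into the bass and midrange.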
Interactive demo: set R1 and R2 for a non-inverting preamp. The display computes the gain 1 + R2/R1, its dB value, and shows a 10 mV input being lifted to line level, clipping at the rail if you push too far.
The preamp gave you a clean 1 V signal, but try to drive an 8Ω speaker with it and nothing moves — the op-amp can supply only milliamps, and a speaker wants amps. You need a power amplifier: a stage with modest voltage gain but the muscle to source large current into a low-impedance load. The classic hobby chip is the LM386.
The LM386 runs from a single supply of roughly 4 V to 12 V in its common variants (the LM386N-4 is rated for higher voltages); 9 V is typical. It draws a few milliamps quiescent and internally biases its output to half the supply so the signal can swing both ways. Its default voltage gain is 20. Bridge a 10 μF capacitor across pins 1 and 8 and the internal gain-setting resistor is bypassed for AC, raising the gain all the way to 200. A resistor in series with that cap sets any gain between 20 and 200. That single capacitor is the answer to the "too quiet" half of the Chapter 0 megaphone.
Feed the LM386 a 10 mV signal. At gain 20 the output is 10 mV × 20 = 0.2 Vpk — that is 20·log(20) = 26 dB. Strap pins 1–8 to reach gain 200 and the output wants to be 10 mV × 200 = 2 Vpk — 46 dB, a 20 dB jump. But on a 9 V supply the output can swing only to about ±4 V around the 4.5 V bias, so a 2 Vpk swing fits, while a 20 mV input at gain 200 (wanting 4 Vpk) sits right at the edge of clipping. Push past the rail and the sine's tops flatten into a square — new harmonics, harsh distortion.
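The headroom check in this example can be sketched in Python. This is a simplified hard-clipping model, not a simulation of the chip; the ±4 V swing limit is the approximate figure from the text for a 9 V supply, and the function name is illustrative.

```python
def amplified_peak(v_in_peak: float, gain: float, max_swing_v: float = 4.0) -> tuple[float, bool]:
    """Return (output_peak_volts, clipped) using an ideal hard limit at max_swing_v."""
    wanted = v_in_peak * gain
    clipped = wanted > max_swing_v
    return min(wanted, max_swing_v), clipped


for v_in_mv, gain in [(10, 20), (10, 200), (30, 200)]:
    peak, clipped = amplified_peak(v_in_mv / 1000.0, gain)
    status = "CLIPS" if clipped else "clean"
    print(f"{v_in_mv} mV x {gain}: {peak:.1f} Vpk ({status})")
```

A real output stage rounds into clipping rather than cutting off sharply, but the hard limit is enough to predict which input levels stay clean.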
A traditional class-A/B output stage (what the LM386 is) burns power as heat in its transistors whenever they are partly on — theoretical best is about 78% efficiency, often far less in practice. A class-D amp instead chops the signal into a high-frequency PWM stream (comparing the audio against a triangle wave), switches MOSFETs fully on or fully off, then low-pass filters the result back to audio. Because a fully-on or fully-off switch dissipates almost nothing, class-D reaches 90%+ efficiency — which is why every phone, laptop, and battery speaker uses it.
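The class-D idea, comparing the audio against a triangle wave and then averaging the switched output, can be illustrated with a toy Python sketch. It treats the audio as constant over one carrier period and uses an ideal ±1 switch; the function names and the step count are illustrative assumptions.

```python
def triangle(phase_cycles: float) -> float:
    """Triangle carrier in [-1, 1]; phase is measured in carrier cycles."""
    frac = phase_cycles % 1.0
    return 4.0 * abs(frac - 0.5) - 1.0


def pwm_duty(audio_level: float, steps: int = 1000) -> float:
    """Fraction of one carrier period the switch is ON for a given audio level in [-1, 1]."""
    on_steps = sum(
        1
        for k in range(steps)
        if audio_level > triangle((k + 0.5) / steps)  # comparator: audio vs carrier
    )
    return on_steps / steps


for level in (-0.4, 0.0, 0.5):
    duty = pwm_duty(level)
    recovered = 2.0 * duty - 1.0  # average of a +1/-1 switched output (ideal low-pass)
    print(f"audio {level:+.1f} -> duty {duty:.2f} -> filtered {recovered:+.2f}")
```

The averaged output tracks the input level, which is the job the low-pass filter does in a real class-D stage while the switches themselves stay fully on or fully off.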
Interactive demo: toggle the pins-1&8 capacitor between gain 20 and gain 200, and set the input amplitude. The output waveform scales and clips at about ±4 V around the 4.5 V bias point. Watch the dB readout and the harsh flat tops appear.
Here is the single most useful simplification in audio: a speaker is just a resistor. Yes, a real driver has inductance and a resonance, so its impedance wiggles with frequency, but its nominal impedance — the number printed on the magnet, almost always 8Ω (sometimes 4Ω or 16Ω) — lets you treat it as a plain resistive load and apply Ohm's law to find current, power, and heat.
Your amp puts out 4 V RMS. Into the 8Ω speaker that is P = V²/Z = 16/8 = 2 W, drawing I = 4/8 = 0.5 A. Now you parallel a second 8Ω speaker. Two 8Ω in parallel make 4Ω, so at the same 4 V the power becomes 16/4 = 4 W and the current 4/4 = 1 A. You did not turn anything up; you halved the impedance, and that alone doubled both the current and the power the amplifier must source. That extra half-ampere is exactly the heat that cooked the Chapter 0 megaphone.
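Treating the speaker as a resistor makes this prediction a two-line calculation. A minimal Python sketch follows; the function name is illustrative.

```python
def speaker_load(v_rms: float, z_ohms: float) -> tuple[float, float]:
    """Return (power_watts, current_amps) for an RMS voltage across a nominal impedance."""
    current = v_rms / z_ohms          # I = V / Z
    power = v_rms ** 2 / z_ohms       # P = V^2 / Z
    return power, current


for z in (8.0, 4.0):
    power, current = speaker_load(4.0, z)
    print(f"4 V into {z:.0f} ohms: {power:.1f} W, {current:.2f} A")
```

Running it with a 2 Ω load shows the same trend continuing: every halving of impedance doubles what the amplifier must deliver.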
Speakers combine like resistors. Two 8Ω in parallel give 8/2 = 4Ω (lower — more current, riskier for the amp). Two 4Ω in series give 4 + 4 = 8Ω (safe). If you must run multiple drivers, series wiring keeps the impedance up and the amp happy; parallel wiring is the trap. Always know the combined impedance before you connect, and never take it below the amp's rated minimum (often 4Ω).
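The "know the combined impedance before you connect" rule can be sketched as a small Python check. The helper names are illustrative, and the 4 Ω minimum is the typical figure mentioned in the text rather than a rating for any particular amplifier.

```python
def series(*impedances_ohms: float) -> float:
    """Series combination: impedances add."""
    return sum(impedances_ohms)


def parallel(*impedances_ohms: float) -> float:
    """Parallel combination: reciprocals add."""
    return 1.0 / sum(1.0 / z for z in impedances_ohms)


def safe_for_amp(load_ohms: float, amp_min_ohms: float = 4.0) -> bool:
    """True if the combined load is not below the amp's rated minimum."""
    return load_ohms >= amp_min_ohms


for label, load in [
    ("two 8-ohm in parallel", parallel(8.0, 8.0)),
    ("two 4-ohm in series", series(4.0, 4.0)),
    ("two 4-ohm in parallel", parallel(4.0, 4.0)),
]:
    print(f"{label}: {load:.1f} ohms, safe={safe_for_amp(load)}")
```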
No single cone covers 20 Hz to 20 kHz well, so speakers specialize. A woofer (heavy, large) handles bass below ~200 Hz; a midrange covers ~500–3000 Hz where voices live; a tweeter (light, small) handles the highs above. Send a tweeter the bass and its tiny cone tries to make huge excursions and tears itself apart; send a woofer the treble and its mass simply cannot keep up. Each driver must get only its band, which is the job of the crossover (Chapter 7).
Interactive demo: set the amp's output voltage and pick the speaker impedance (2 / 4 / 8Ω). Bars show power P = V²/Z, the current I = V/Z, and an amplifier "heat" gauge that climbs with current; watch it redline as you drop to 2Ω.
Now we solve the last megaphone failure: you cannot just parallel a woofer and a tweeter. You must split the spectrum — send the lows to the woofer and the highs to the tweeter — with a crossover, a pair of simple filters built from a capacitor and an inductor.
A capacitor's reactance falls as frequency rises (XC = 1/(2πfC)), so a series capacitor passes highs and blocks lows — a high-pass filter for the tweeter. An inductor does the opposite (XL = 2πfL), passing lows and blocking highs — a low-pass filter for the woofer. Choose the components so both filters reach their −3 dB point at the same crossover frequency fc, and the two response curves cross exactly there, handing the signal off cleanly.
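You want to cross over at fc = 1.8 kHz with 8Ω drivers. The tweeter's series capacitor is

$$C = \frac{1}{2\pi f_c R} = \frac{1}{2\pi \times 1800 \times 8} \approx 11\ \mu\text{F}$$

(the nearest standard value is about 8–10 μF, the book's practical pick). The woofer's series inductor is

$$L = \frac{R}{2\pi f_c} = \frac{8}{2\pi \times 1800} \approx 0.71\ \text{mH}$$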
So a single 11 μF cap in series with the tweeter and a 0.71 mH coil in series with the woofer turn one input into a clean two-way split. Below 1.8 kHz the woofer dominates; above it the tweeter takes over; right at 1.8 kHz they share equally. The tweeter is now protected from the bass that would have destroyed it.
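The component values for this first-order crossover follow directly from the two formulas. Here is a minimal Python sketch, treating each driver as a pure resistance as the text does; the function names are illustrative.

```python
import math


def crossover_capacitor(fc_hz: float, r_ohms: float) -> float:
    """Series capacitor for the tweeter's high-pass: C = 1 / (2*pi*fc*R)."""
    return 1.0 / (2.0 * math.pi * fc_hz * r_ohms)


def crossover_inductor(fc_hz: float, r_ohms: float) -> float:
    """Series inductor for the woofer's low-pass: L = R / (2*pi*fc)."""
    return r_ohms / (2.0 * math.pi * fc_hz)


FC_HZ, R_OHMS = 1800.0, 8.0
c = crossover_capacitor(FC_HZ, R_OHMS)
l = crossover_inductor(FC_HZ, R_OHMS)
print(f"C = {c * 1e6:.1f} uF, L = {l * 1e3:.2f} mH")

# At fc both reactances equal the driver resistance, which is the -3 dB condition.
x_c = 1.0 / (2.0 * math.pi * FC_HZ * c)
x_l = 2.0 * math.pi * FC_HZ * l
print(f"XC = {x_c:.1f} ohms, XL = {x_l:.1f} ohms at {FC_HZ:.0f} Hz")
```

Checking that both reactances come out at 8 Ω at the crossover frequency is a quick way to confirm the two filters hand off at the same point.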
Interactive demo: drag the crossover frequency to retune the cap and coil. The blue curve is the woofer's low-pass response, the warm curve is the tweeter's high-pass; they cross at fc at −3 dB. The input frequency slider drops a test tone and lights up which driver receives it. Live values of C and L are computed for 8Ω.
One last principle ties the chain together: how to connect stages so the signal survives. Old audio (and RF) obsessed over impedance matching — making source and load equal to transfer maximum power. Modern audio does the opposite: impedance bridging, where the load is made far larger than the source (the rule of thumb is load ≥ 10× source) to transfer maximum voltage.
When source resistance equals load resistance, the source's own internal resistance drops half the voltage: the load only ever sees half, a loss of 6 dB, and as much power is burned as heat inside the source as reaches the load. Since every stage after a mic cares about voltage, not power, you instead make each input impedance at least ten times the previous output impedance. A 600Ω mic into a 10 kΩ preamp loses almost nothing. The speaker is the one place where you genuinely want power delivery, but even there a modern amplifier is not matched in the equal-impedance sense: its output impedance is kept far below the speaker's, and what you match is the speaker's nominal impedance to the load the amp is rated to drive.
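The cost of matching versus bridging is just a voltage divider expressed in decibels. A short Python sketch, with an illustrative function name and purely resistive source and load assumed:

```python
import math


def divider_loss_db(z_source_ohms: float, z_load_ohms: float) -> float:
    """Voltage reaching the load, in dB relative to the source's open-circuit voltage."""
    fraction = z_load_ohms / (z_source_ohms + z_load_ohms)
    return 20.0 * math.log10(fraction)


print(f"matched 600 into 600:  {divider_loss_db(600.0, 600.0):.2f} dB")
print(f"bridged 600 into 10k:  {divider_loss_db(600.0, 10e3):.2f} dB")
print(f"10x rule, 600 into 6k: {divider_loss_db(600.0, 6e3):.2f} dB")
```

Under these assumptions the ten-times rule keeps the loss under 1 dB, which is why it is a comfortable rule of thumb.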
| Quantity | Formula | Worked value |
|---|---|---|
| Sound level | dB SPL = 10·log10(I/I0) | I0 = 10⁻¹² W/m²; range 0–120 dB |
| dB (power) | 10·log10(P2/P1) | +3 dB = 2× power |
| dB (voltage) | 20·log10(V2/V1) | +6 dB = 2× voltage = 4× power |
| Power into load | P = V²/Z = I²Z = VI | 4 V into 8Ω → 2 W, 0.5 A |
| Halved load | 2×8Ω parallel = 4Ω | 4 V into 4Ω → 4 W, 1 A |
| Non-inv. gain | 1 + R2/R1 | 1+100k/1k = 101 (40 dB) |
| Inverting gain | −R2/R1 | polarity flip, same magnitude |
| LM386 gain | 20 default → 200 (pins 1–8 cap) | 26 dB → 46 dB |
| RC / crossover cap | C = 1/(2π fc R) | 1.8 kHz, 8Ω → ≈11 μF |
| Crossover inductor | L = R/(2π fc) | 1.8 kHz, 8Ω → ≈0.71 mH |
| Bridging rule | Zload ≥ 10×Zsource | matched costs −6 dB |
You can now trace any audio path from microphone to cone, predict the current before you connect, and split a spectrum without frying a tweeter.