Visual Navigation for Autonomous Vehicles · MIT 16.485 · Lecture 15

RANSAC: Robust Estimation Amid Outliers

Your feature tracker returns 200 matches. Forty percent are wrong — mismatched windows, repeated textures, reflections. Feed all 200 into the 8-point algorithm and it returns garbage: a single outlier pair can drag the entire least-squares fit off the true solution. You need a method that ignores the bad matches. RANSAC does exactly that: repeatedly guess a model from the tiniest possible sample, count who agrees, keep the best guess. MIT 16.485 by Luca Carlone, Lecture 15.

Prerequisites: VNAV L6 (feature correspondences, descriptor matching) · VNAV L7 (8-point algorithm, E/F matrix) · linear algebra (least squares, SVD). No new math tools needed.
10
Chapters
5
Live Canvases
Derived
From First Principles

Chapter 0: Why Least-Squares Fails

Imagine you're building a visual odometry pipeline. Your feature tracker hands you 200 matched pixel pairs between two frames. Most of them are real: a corner on a building, a pole edge, a window latch. But 80 of them are rubbish — the descriptor matched a wrong window, a reflected puddle, or a repeated brick pattern. You don't know which 80 are bad. You just have 200 pairs labeled "probably a match."

The 8-point algorithm (Lecture 14) does the right thing when all matches are good: it stacks them into a matrix A and finds the vector e that minimizes ||Ae||. Add a handful of wrong correspondences and the minimum still exists — but it's the wrong minimum. The correct essential matrix E satisfies ỹ2T E ỹ1 = 0 for all true matches. An outlier pair violates this equation badly. The least-squares solver treats it as gospel evidence and shifts the fit toward it.

The core problem with squared error: outliers hurt quadratically. If a true inlier residual is r ≈ 0.01 and an outlier residual is R ≈ 5.0, the outlier contributes R2/r2 = 250,000× as much to the loss. The entire fit is distorted to partially satisfy those 80 bad pairs. One gross outlier can ruin a fit even when 99% of points are perfect.

A concrete line-fitting example

Before going to cameras and epipolar geometry, let's make the problem tangible. You have 20 points that genuinely lie on a line y = 2x + 1 (with small Gaussian noise). You also have 6 points whose y-values were randomly corrupted — they're nowhere near the true line. These are your outliers: they exist in your dataset but don't conform to the model you're trying to fit.

Run ordinary least-squares on all 26 points. The fit minimizes the sum of squared vertical distances. Because the 6 outliers are far from the true line, they pull the fitted line toward themselves — the LS slope and intercept are visibly wrong. The RANSAC line, by contrast, ignores those 6 points and recovers the true line almost exactly. Both fits use the same 26 data points; only the fitting strategy differs.

Least-Squares vs RANSAC on a line with outliers

Drag the Outlier Fraction slider. Watch the green LS line get dragged off the true model as outliers increase. The teal RANSAC line stays correct. The "pull" on LS is proportional to the squared distance — a single far outlier dominates.

Outlier fraction 25%
You fit a line through 100 points, 5 of which are outliers at distance 10 from the true line. The remaining 95 have noise σ ≈ 0.1. Roughly how much does one outlier inflate the least-squares loss compared to one inlier?

Chapter 1: The RANSAC Idea

RANSAC (Random Sample Consensus) is a probabilistic algorithm introduced by Fischler and Bolles in 1981. The insight is brutally simple: if you randomly pick the tiniest possible sample of points and it happens to contain only inliers, the model you fit to that sample will be correct. Then you can count how many of the remaining N points agree with that model — that count is your evidence for correctness.

The word "random" is load-bearing. You can't efficiently search all possible subsets — for 200 points choosing 8 there are C(200,8) ≈ 1.4 × 1014 subsets. Exhaustive search is hopeless. But if the inlier fraction is w = 0.6, then a random sample of 8 is all-inlier with probability w8 = 0.68 ≈ 1.7%. That's small — but 100 independent random trials give you a 1 − (1 − 0.017)100 ≈ 82% chance of seeing at least one all-inlier sample. 500 trials give 99.98%. Randomness plus repetition replaces exhaustive search.

Why "consensus"? A model is correct if it commands consensus — a large fraction of the data agrees with it. Wrong models, built from outlier-contaminated samples, will be disagreed with by most of the data. The inlier count is a voting mechanism: the winning model is the one that wins the most votes.

The four-step loop

1. Sample
Randomly draw the minimal number of points needed to define a model (e.g., 2 for a line, 4 for a homography, 8 for E)
2. Fit
Compute the unique model that exactly passes through your sampled points — no minimization needed yet
3. Count
For every other data point, compute the residual under your model. Count how many are within threshold ε — the consensus set
4. Keep
If this consensus set is larger than the current best, update the best model. Repeat from Step 1 up to kmax iterations.
↻ after kmax iterations ↓
5. Refit
Fit the final model using ALL points in the best consensus set (not just the minimal sample) — this is where quality counts
Misconception: RANSAC "removes" outliers. It does not. It finds the largest agreeing subset — the biggest group that is consistent with a single model. Points outside the consensus set are not discarded globally; they are simply not used for the final model fit. If the scene genuinely has two separate groups of inliers (e.g., two planes visible), RANSAC will find one of them depending on which gets voted in first.
In the RANSAC loop, Step 2 says "compute the unique model that exactly passes through your sampled points." For a 2D line fit, how many points do you sample in Step 1, and why exactly that many?

Chapter 2: Minimal Sample Sizes

The number of points you sample in each RANSAC iteration is called the minimal sample size, denoted s. "Minimal" means: the fewest points that uniquely determine your model. Use too few and the model is underdetermined (infinitely many solutions). Use too many and you're likely to include outliers, defeating the whole purpose.

Why minimal matters: probability math

The probability that a random sample of s points is all inliers is ws, where w is the inlier fraction. For w = 0.6:

Modelsws (w=0.6)ws (w=0.4)
2D line20.360.16
2D homography40.130.026
3D plane30.220.064
Essential matrix (8-pt)80.0170.00065
Essential matrix (5-pt)50.0780.010
Fundamental matrix (7-pt)70.0280.0016

Each extra point you add cuts the probability of an all-inlier sample by a factor of w. At w = 0.4, the 8-point sampler succeeds only 0.065% of the time — you need thousands of iterations. The 5-point algorithm (the minimal solver for the essential matrix) is not just faster per iteration; it drastically reduces the required iteration count.

The connection to L7. In the last lecture you learned the 8-point algorithm: given 8 correspondences, fit E by solving Ae = 0 via SVD. In RANSAC, we call this exact subroutine inside the inner loop — but we call it on only 8 (or 5) randomly chosen pairs, not all N. The same SVD machinery, now inside a probabilistic wrapper that tolerates bad pairs.

Degeneracy: when minimal samples fail

Even an all-inlier sample can give a garbage model if the points are geometrically degenerate. For a line fit, sampling two coincident points gives no slope information. For the essential matrix, sampling 8 points that all lie on a plane leads to a degenerate configuration (the "planar case" where E can't be recovered from the linear system). RANSAC has no built-in protection against degenerate samples — you need to check and re-sample if the model is degenerate (e.g., rank of A is too low, or the fitted model has very small singular values).

Misconception: larger minimal sample = better model in Step 2. False. The model from 2 points is not worse than the model from 8 points in terms of fitting those specific points — it's determined by the same constraints. The difference is noise robustness: with more points the overdetermined system averages out noise. But in Step 2 you want uniqueness and outlier-free-ness. Overdetermination (using all inliers) comes in Step 5 (refit), not Step 2.
You're fitting a homography H (4×4 DOF, 8 constraints, so s=4 point correspondences). Your inlier fraction drops from w=0.6 to w=0.3. By what factor does the required number of RANSAC iterations grow (approximately)?

Chapter 3: How Many Iterations?

You want to run RANSAC until you're confident you've seen at least one all-inlier sample. Let p be your desired success probability (e.g., 0.99). Let w be the inlier ratio — fraction of the N points that are genuine inliers. Let s be the minimal sample size. On each iteration, the probability of drawing an all-inlier sample is ws. The probability of failing on all N iterations is (1 − ws)N. We want this failure probability to be at most 1 − p:

(1 − ws)N ≤ 1 − p

Taking logarithms of both sides (both are less than 1, so logs are negative — inequality flips twice and cancels):

N ≥ log(1 − p) ⁄ log(1 − ws)

This is the RANSAC iteration formula. Since both log(1−p) and log(1−ws) are negative, their ratio is positive. The formula gives the minimum number of iterations to guarantee success probability p.

Worked numbers

With p = 0.99 (99% confidence), here are the required iterations for various (w, s) pairs:

Inlier ratio wSample s=2Sample s=4Sample s=8
0.9 (10% outliers)235
0.7 (30% outliers)621265
0.5 (50% outliers)17721177
0.3 (70% outliers)6569276,518
0.1 (90% outliers)1050110,3501.2×109

The table reveals two key facts. First: iterations grow exponentially with s — the minimal solver matters enormously. Second: below about w = 0.3, RANSAC with large s becomes computationally infeasible. This is why the 5-point algorithm for E was such a landmark — it cut s from 8 to 5, reducing required iterations by 0.63 ≈ 5× at w=0.6.

Misconception: more iterations is always safe. Up to a point, yes — but if the inlier ratio is genuinely very low (w < 0.2) with a large model (s = 8), no practical number of iterations will reliably succeed. The formula blows up. In that regime, you need to either reduce s (use a better minimal solver), improve your data (better feature matching), or accept probabilistic failure.

Adaptive iteration count

You don't have to fix kmax in advance. After each successful iteration you can update your estimate of w from the current best inlier count: ŵ = |S*| / N. Then recompute kmax using the formula above. This adaptive RANSAC terminates early when a very good model is found early (high inlier count → updated ŵ is large → required iterations drops → algorithm terminates). It's the standard implementation in OpenCV.

Required iterations N as a function of inlier ratio and sample size

Move the sliders. The readout shows N = log(1−p) / log(1−ws) for p=0.99. Notice how N explodes when w drops or s grows. The red zone marks impractical iteration counts (>10,000).

Inlier ratio w 50%
Sample size s 4
You run RANSAC with s=4, w=0.5, p=0.99. How many iterations does the formula predict? (log(1−0.99) / log(1−0.54) = log(0.01) / log(0.9375))

Chapter 4: The Inlier Threshold ε

After fitting a candidate model in Step 2, you classify each data point as inlier or outlier by computing its residual — how far it is from the model prediction — and comparing it to a threshold ε. A point is an inlier if its residual is below ε.

For a 2D line, the residual of a point (xi, yi) from line ax + by + c = 0 is the perpendicular distance |axi + byi + c| / √(a2 + b2). For an essential matrix E and a correspondence (ỹ1, ỹ2), the standard residual is the Sampson distance: |ỹ2TEỹ1|2 / (||Eỹ1||2 + ||ET2||2), which approximates the squared reprojection error.

Too tight vs too loose

Setting ε is the RANSAC practitioner's hardest judgment call:

The threshold is not universal. In pixel coordinates, ε might be 1.5 pixels. In normalized coordinates (after multiplying by K−1), the scale changes. In the Sampson distance for E, the units are not pixels at all — they depend on the focal length. Always match your threshold to the units of your residual. A common mistake is copying ε = 1.0 from a tutorial and forgetting to adapt to your camera's resolution or calibration.
Inlier threshold explorer

Adjust the threshold ε. Green = inlier, red = outlier. Watch the inlier count and the verdict change. The true line is shown in teal; the noisy points are drawn from the actual data distribution. The blue bar shows the decision boundary.

Threshold ε 0.5
Your camera has pixel noise σ = 0.5 pixels. You want to set RANSAC's inlier threshold ε for a line fit in pixel space. Which value is most appropriate?

Chapter 5: Refinement — Refitting on the Consensus Set

After kmax iterations, you have a best consensus set S* — all points that agreed with the best candidate model within threshold ε. That candidate model was fit to only s points (your minimal sample). Now you refit: run the least-squares estimator on all |S*| points. This step is crucial and easily overlooked by beginners.

Why does it matter? Your candidate model from s points is the exact-fit model — it passes perfectly through those s points. With only 2 points you can fit a line, but any Gaussian noise in those 2 points means the model carries that noise. When you refit on 50 inliers, the least-squares solution averages out the noise from all 50 — variance of the estimate drops by a factor of 50/2 = 25 relative to the 2-point fit. The refit is where you actually get a quality estimate, not just a rough direction.

Don't confuse finding with fitting. RANSAC's loop (Steps 1–4) is about finding the right inlier set. The loop doesn't need to produce a high-quality fit — it only needs to identify which points are inliers. The high-quality fit comes from least-squares on S* in Step 5. This separation is intentional: use a simple, fast criterion (inlier count) to find the right group, then use the expensive estimator (SVD, DLT) once on the final group.

Iterative refinement (IRLS-style)

You can iterate further. After refitting on S*, your new model may classify slightly different points as inliers (some borderline points may flip). Re-classify all N points against the new model, refit again, and repeat until the consensus set stabilizes. This is called iterative reweighted least squares refinement and is used in production RANSAC implementations (e.g., LO-RANSAC, "Locally Optimized RANSAC"). The overhead is small — you converge in 2–5 outer iterations — and the quality improvement is significant, especially near the inlier/outlier boundary.

Worked numbers: noise reduction from refit

Suppose you have N=100 points, w=0.6, so |S*|=60 inliers. Your per-point angular noise is σ = 0.01 rad. Fitting from s=2 points: variance of slope estimate ∝ σ2/2. Refitting from 60 inliers: variance ∝ σ2/60. That's a 30× variance reduction — a factor of √30 ≈ 5.5 in standard deviation. The refit step is not optional; it's where the precision comes from.

python
import numpy as np

def ransac_line(pts, n_iter=200, eps=0.3):
    """RANSAC line fit: pts is (N,2) array."""
    best_inliers, best_model = [], None
    N = len(pts)
    for _ in range(n_iter):
        # Step 1: minimal sample — 2 points
        idx = np.random.choice(N, 2, replace=False)
        p1, p2 = pts[idx[0]], pts[idx[1]]
        # Step 2: fit line through 2 points (ax+by+c=0, ||[a,b]||=1)
        dx, dy = p2[0]-p1[0], p2[1]-p1[1]
        if abs(dx) + abs(dy) < 1e-9: continue  # degenerate
        norm = np.hypot(dx, dy)
        a, b = -dy/norm, dx/norm  # unit normal
        c = -(a*p1[0] + b*p1[1])
        # Step 3: count inliers (perpendicular distance)
        dists = abs(a*pts[:,0] + b*pts[:,1] + c)
        inliers = np.where(dists < eps)[0]
        # Step 4: keep best
        if len(inliers) > len(best_inliers):
            best_inliers, best_model = inliers, (a, b, c)
    # Step 5: refit on full consensus set using least squares
    if len(best_inliers) >= 2:
        in_pts = pts[best_inliers]
        A = np.column_stack([in_pts[:,0], np.ones(len(in_pts))])
        slope, intercept = np.linalg.lstsq(A, in_pts[:,1], rcond=None)[0]
        best_model = (slope, intercept, len(best_inliers))
    return best_model, best_inliers

def ransac_n_iters(w, s, p=0.99):
    """Minimum iterations for success prob p given inlier rate w and sample size s."""
    import math
    if w >= 1.0: return 1
    return int(np.ceil(math.log(1 - p) / math.log(1 - w**s)))

def adaptive_ransac(pts, s, fit_fn, residual_fn, eps=0.3, p=0.99):
    """RANSAC with adaptive iteration count update."""
    N = len(pts); k_max = 10000; k = 0; best = []
    while k < k_max:
        idx = np.random.choice(N, s, replace=False)
        model = fit_fn(pts[idx])
        residuals = residual_fn(model, pts)
        inliers = np.where(residuals < eps)[0]
        if len(inliers) > len(best):
            best = inliers
            w_hat = len(best) / N  # update inlier estimate
            k_max = ransac_n_iters(w_hat, s, p)  # re-compute bound
        k += 1
    return best
You run RANSAC and find a consensus set of 60 inliers. You refit the line using those 60 points via least squares. Compared to the initial 2-point fit, the standard deviation of the estimated slope is reduced by approximately what factor?

Chapter 6: Step-Through Simulation

Watch RANSAC iterate step by step. Each click on "Next Iteration" draws a fresh random 2-point sample, fits a candidate line, draws its ε-band, counts inliers, and updates the best model if the count improves. The running best-model is shown in teal; the current candidate in warm orange. After a few iterations you'll see the algorithm zero in on the true line despite the outlier cloud.

What to notice: Most iterations produce bad candidate lines (sample hit at least one outlier → line tilts away from the true direction → inlier count is low). Occasionally a lucky all-inlier sample produces a near-perfect line with high inlier count — and that's the one that gets kept. RANSAC's efficiency comes from the asymmetry: a correct model collects many votes; a wrong model collects few.
RANSAC step-by-step line fitter

Click Next Iteration to advance one RANSAC step. The orange line is the current candidate; teal is the all-time best. Colored dots show inlier (teal) / outlier (red) classification for the current candidate. The counter shows best inlier count so far.

Outlier fraction 40%
Threshold ε 0.35
In the step-through sim, you observe that even after 30 iterations the best inlier count is still suspiciously low (say, 12 out of 30 inlier points). What is the most likely cause?

Chapter 7: Variants & Limitations

The basic RANSAC from 1981 works well but can be improved. Several variants address specific weaknesses.

MSAC — M-estimator SAC

Basic RANSAC scores a model by counting inliers: every point either counts 1 (inside ε) or 0 (outside). MSAC replaces this with a soft scoring function: inliers contribute their squared residual (r2) while outliers contribute a constant penalty (ε2). This is equivalent to an M-estimator with a Huber-like truncation. The key benefit: two models with the same inlier count can be ranked by their inlier quality. The model whose inliers have smaller residuals wins.

PROSAC — Progressive SAC

PROSAC exploits quality ordering in your data. Feature descriptors come with a match confidence score — better matches are more likely to be inliers. PROSAC samples from the top-k best matches first, expanding the pool as iterations progress. This dramatically reduces iterations when the inlier fraction is higher among top-ranked matches (which is usually the case with good descriptor distances).

LO-RANSAC — Locally Optimized RANSAC

When RANSAC finds a new best model, LO-RANSAC immediately runs a local optimization: re-classify all points under the new model, refit on the inliers, and repeat until convergence. This catches cases where a slightly wrong initial model misses borderline inliers that, once included, improve the fit enough to classify even more inliers — a positive feedback loop toward the true model.

Limitation: degenerate configurations. RANSAC has no awareness of model degeneracy. If every random sample happens to be degenerate (e.g., all sampled points are on the same plane in a setting where RANSAC is trying to fit a general homography), the algorithm produces garbage. Solutions include: (1) explicitly check for degeneracy after sampling and re-draw; (2) use a validated minimal solver; (3) DEGENSAC — a specialized variant that detects the planar degeneracy for the fundamental matrix specifically.

Limitations summary

LimitationWhen it bitesMitigation
Iteration blowupw < 0.3, s largeUse smaller minimal solver (5-pt vs 8-pt)
Threshold sensitivityUnknown noise scaleUse MSAC; calibrate ε from noise model
Single structureTwo competing modelsRun RANSAC twice; use multi-model RANSAC
Degenerate configsPure rotation, planar sceneDEGENSAC, explicit degeneracy check
No ordering of matchesPoor matches mixed with goodPROSAC (exploit descriptor confidence)
MSAC differs from basic RANSAC in its scoring. Which statement correctly describes the difference?

Chapter 8: Showcase — Full RANSAC on a Noisy Point Cloud

This showcase runs the complete RANSAC line-fitting algorithm on a 2D point cloud with adjustable outlier ratio and threshold. You can watch the algorithm find the consensus set in real time, see the final refit line, and break it by cranking up the outlier fraction or narrowing the threshold. The live trial counter shows the required iteration count from the formula N = log(1−p)/log(1−ws).

Full RANSAC showcase — live consensus & refit

Hit Run RANSAC to execute. Adjust outlier ratio and threshold to explore the tradeoffs. Teal dots = final inliers; red = outliers. The teal line = RANSAC refit. The dashed warm line = least-squares on all points (note how it's dragged off). The iteration counter shows how many trials the formula requires.

Outlier ratio 40%
Threshold ε 0.4
N points 60

Chapter 9: Connections & Cheat Sheet

RANSAC is not isolated — it is a wrapper that makes any model-fitting procedure robust to outliers. In VNAV it wraps the 8-point (or 5-point) algorithm from Lecture 14, transforming a noise-sensitive linear solver into a correspondence-robust estimator.

RANSAC cheat sheet

ConceptFormula / Rule
Minimal sample sizesLine: s=2   Homography: s=4   F (7-pt): s=7   E (8-pt): s=8   E (5-pt): s=5
Required iterationsN = log(1−p) / log(1−ws)    (p=desired confidence, w=inlier ratio)
Inlier thresholdε ≈ 3σ where σ is per-point noise (match units of residual!)
Inlier count threshold TOften set to T > N/2; theoretical minimum T = s + 5
Adaptive stoppingAfter each iteration: ŵ = |S*|/N; recompute N from formula
Residual for line|ax + by + c| / √(a2+b2) — perpendicular distance
Residual for ESampson: |y2TEy1|2 / (||Ey1||2 + ||ETy2||2)
Full algorithmSample → Fit minimal → Count inliers → Keep best → Refit on S*

How RANSAC wraps the 8-point algorithm

In Lecture 14 (VNAV L7), you learned the 8-point algorithm: given 8 correspondences, solve the linear system Ae = 0 for the essential matrix, enforce rank-2 and unit norm. In RANSAC for camera pose, you call this same subroutine in Step 2 — but on a random subset of only 8 correspondences. The consensus set in Step 3 uses the Sampson distance against the candidate E. After kmax iterations you refit E via the full least-squares solver (15.2) on all inliers in S*. Then you recover R and t via SVD decomposition of the best E, using the cheirality test from L7 to resolve the 4-fold ambiguity. RANSAC doesn't change the estimator — it changes which points the estimator is applied to.

RANSAC and SLAM. In visual odometry and SLAM (Lecture 14+), RANSAC runs every frame: your feature tracker produces N candidate correspondences, RANSAC filters out mismatches, and the surviving inlier set goes into the pose graph. The inlier fraction is a runtime signal: if it drops below 30%, you may be in a texture-poor region, undergoing rapid motion, or facing a loop-closure failure. Tracking that fraction is one of the cheapest robustness indicators in any SLAM system.

What comes next

RANSAC gives you a robust relative pose from a single frame pair. But a vehicle moves through a sequence of frames. How do you accumulate consistent pose estimates over time without drift? That's the subject of SLAM — Simultaneous Localization and Mapping (Lecture 14+), which fuses RANSAC-filtered odometry with loop closure to bound long-term error.

Builds on:
  • VNAV L6 — Feature correspondences (the N input pairs)
  • VNAV L7 — 8-point algorithm inside the RANSAC loop
Leads to:
"What I cannot create, I do not understand." — Richard Feynman. Implement ransac_line() from scratch. Then swap in a homography estimator (4 points, DLT). Then wrap the 8-point algorithm. Each swap requires only changing s and the fit/residual functions. That's the power of the RANSAC abstraction.
In a visual odometry pipeline, RANSAC's inlier fraction drops from a healthy 70% to 25% between two consecutive frames. What is the most likely explanation?