DOE vs OFAT: Why One-Factor-at-a-Time Fails

Ask a biologist how they optimized a process and the honest answer is usually one-factor-at-a-time: hold everything constant, change temperature until yield peaks, lock it in, then change pH, and so on. It feels rigorous and controlled. It is also the reason so many "optimized" processes sit well below their real ceiling. This is the heart of the DOE vs OFAT debate, and it is not close. This guide shows exactly why one-factor-at-a-time fails, where the runs go to waste, and how a designed experiment finds a better optimum from the same bench effort. If you want to try the alternative as you read, you can build a design in a free design of experiments calculator and compare it against your current OFAT plan.

What OFAT is (and why it feels safe)

OFAT, or one-factor-at-a-time, is a strategy where you vary a single input while holding all other inputs fixed, find its best level, freeze it, and repeat for the next input. It is the intuitive scientific method most of us are taught: change one thing so you know what caused the change.

The appeal is real. Each result is easy to attribute, the runs are simple to set up, and a controlled comparison feels like good science. The problem is hidden in the phrase "holding all other inputs fixed." That single assumption, that a factor's best level does not depend on where the other factors sit, is exactly what fails in biological systems, where temperature, pH, dissolved oxygen, and feed rate all push on the same cellular machinery at once.

Design of experiments (DOE) takes the opposite stance: it deliberately changes several factors together in a structured pattern, so a single set of runs reveals each factor's main effect and how factors modify each other. That structure is what OFAT throws away. New to the terminology? Start with our plain-language primer, "Design of Experiments: Where Do I Start?", then come back here for the head-to-head.

The interaction OFAT can't see

An interaction means the effect of one factor depends on the level of another, and one-factor-at-a-time is structurally incapable of detecting it. Because OFAT fixes every other factor while it moves one, it only ever measures a factor's effect at a single setting of the rest, never learning that the effect would flip or grow somewhere else.

Picture a response surface as a hillside. When factors are independent, the hill is a simple dome and OFAT climbs straight to the top: optimize x, then optimize y, done. But when factors interact, the dome tilts into a diagonal ridge. OFAT moves east until it hits the ridge, then moves north along a line that runs off the ridge crest, and stops. It has found a local high point, not the summit, and it has no way of knowing the summit lies in the diagonal direction it never tried.

Figure 1. OFAT (red) climbs one axis, then the other, and halts on the flank of a diagonal ridge. The factorial DOE grid (blue) samples the corners and centre, so the interaction, and the direction to the true top-right optimum, is visible in a single set of runs.

Interactions are not a corner case in cell culture and fermentation; they are the norm. Temperature shifts change how strongly an inducer or feed acts; pH changes the response to dissolved oxygen. Miss them and you optimize a caricature of your process rather than the process itself.

Runs vs information: the efficiency gap

The efficiency gap between DOE and OFAT is not mainly about run count; it is about information per run. A designed experiment extracts main effects, interactions, and an error estimate from the same points, while OFAT spends its runs learning main effects only.

Count it out. For k factors, OFAT needs roughly 2k+1 runs to estimate all main effects: a baseline, plus a high and low probe for each factor. A two-level full factorial needs 2^k runs but returns every main effect and every interaction. At small k the run counts are close, and DOE wins outright on information; at larger k you switch to a fractional factorial or a Plackett-Burman screening design, which estimates the main effects of many factors in a fraction of the runs OFAT would burn.
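The arithmetic above can be tabulated in a few lines. This is a minimal sketch: the function names are illustrative, and the counts are simply the 2k+1 and 2^k formulas from the text, with the number of estimable effects in a full factorial (2^k − 1) added for comparison.

```python
from math import comb

def ofat_runs(k: int) -> int:
    """Baseline run plus one low and one high probe per factor."""
    return 2 * k + 1

def full_factorial_runs(k: int) -> int:
    """Every combination of k two-level factors."""
    return 2 ** k

def two_factor_interactions(k: int) -> int:
    """Number of distinct factor pairs, i.e. two-factor interactions."""
    return comb(k, 2)

print("k  OFAT  2^k  main effects  2-factor interactions")
for k in range(2, 8):
    print(f"{k}  {ofat_runs(k):>4}  {full_factorial_runs(k):>3}  "
          f"{k:>12}  {two_factor_interactions(k):>21}")
```

The table makes the crossover visible: the full factorial is cheaper or comparable up to three factors, then grows quickly, which is where fractional and screening designs take over.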

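Figure 2. Runs by strategy and factor count. OFAT runs (red) buy main effects only; full factorial designs (blue) also buy every interaction, while fractional and screening designs (teal) trade some or all of that interaction detail for far fewer runs. At 5 factors the counts are comparable (11 for OFAT, 12–16 for a screen or fraction); from 6 factors up, a 12-run Plackett-Burman screen needs fewer runs than OFAT's 2k+1.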

Two more advantages rarely show up on a run-count spreadsheet. First, because every DOE run varies several factors, each data point contributes to several effect estimates at once, a property called hidden replication that tightens precision for free. Second, replicated centre points give DOE an honest estimate of experimental error and a check for curvature, neither of which OFAT provides unless you deliberately repeat runs. To see how run count scales with the design you pick, work through how many experiments a DOE needs.
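To make hidden replication and centre points concrete, here is a minimal sketch that builds a two-level full factorial in coded units, appends replicated centre points, and randomizes the run order. The function name, the factor names, the default of three centre points, and the seed are all illustrative assumptions, not part of any particular tool.

```python
import itertools
import random

def two_level_design(factors, n_center=3, seed=0):
    """Return a randomized list of runs (coded -1/0/+1) for a 2^k
    full factorial plus n_center replicated centre points."""
    corners = [dict(zip(factors, levels))
               for levels in itertools.product((-1, 1), repeat=len(factors))]
    centers = [dict.fromkeys(factors, 0) for _ in range(n_center)]
    runs = corners + centers
    random.Random(seed).shuffle(runs)  # fixed seed so the order is reproducible
    return runs

design = two_level_design(["temperature", "pH", "feed_rate"])
for i, run in enumerate(design, start=1):
    print(i, run)

# Hidden replication: every factor sits at each level in half of the
# corner runs, so all corner runs feed every main-effect estimate.
for name in ["temperature", "pH", "feed_rate"]:
    highs = sum(1 for run in design if run[name] == 1)
    lows = sum(1 for run in design if run[name] == -1)
    print(f"{name}: {highs} runs at +1, {lows} runs at -1")
```

With three factors, each main effect is estimated from four high runs against four low runs, whereas an OFAT probe compares single runs. The centre points are the only replicated condition here, which is what supplies the error estimate.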

Table 1. DOE vs OFAT across the dimensions that decide the outcome.
Dimension	OFAT	DOE
Detects interactions	Never	Yes (factorial & RSM)
Factor-space coverage	Thin cross-shaped path	Corners + centre of the region
Runs for 5 factors (main effects)	11	12–16
Error / noise estimate	Only if runs replicated	Built in (centre points)
Optimum quality when factors interact	Local / false optimum	True optimum
Robustness & design space info	None	Yes (map the surface)

Swap OFAT for a real design in 5 minutes

Enter your factors and levels, and the free DOE generator builds the full factorial, fractional, or screening design with a randomized run order, ready to run at the bench.

Open the free DOE generator →

Side-by-side worked example

Here is the failure made concrete: a two-factor induction study where temperature and inducer concentration interact, run once by OFAT and once by DOE. The underlying truth (which the experimenter does not know) is that titer rises most when both temperature and inducer are high together, a positive interaction.

Suppose the true response in g/L over the tested ranges is approximately titer = 1.0 + 0.6·A + 0.4·B + 0.8·A·B, where A and B are temperature and inducer coded from −1 (low) to +1 (high). The A·B term is the interaction that OFAT will never see.

OFAT run, then DOE run

OFAT approach. Start at low inducer (B = −1). Vary temperature: at A = −1, titer = 1.0 − 0.6 − 0.4 + 0.8 = 0.8; at A = +1, titer = 1.0 + 0.6 − 0.4 − 0.8 = 0.4. Temperature "looks" better low, so OFAT freezes A = −1. Now vary inducer at A = −1: at B = +1, titer = 1.0 − 0.6 + 0.4 − 0.8 = 0.0; at B = −1, titer = 0.8 (the run already in hand). Inducer "looks" better low too. OFAT declares the optimum at low temperature, low inducer, with an observed titer of 0.8 g/L.
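The OFAT walk can be replayed in code. This is a sketch under the article's assumed model, with no experimental noise; the function names and the `start_b` parameter are illustrative.

```python
def titer(a: float, b: float) -> float:
    """Assumed true response in g/L, with A and B in coded units."""
    return 1.0 + 0.6 * a + 0.4 * b + 0.8 * a * b

LEVELS = (-1, +1)

def ofat(start_b: int = -1):
    """Tune A at a fixed B, freeze the winner, then tune B."""
    best_a = max(LEVELS, key=lambda a: titer(a, start_b))
    best_b = max(LEVELS, key=lambda b: titer(best_a, b))
    return best_a, best_b, titer(best_a, best_b)

found_a, found_b, found_y = ofat(start_b=-1)
print(f"OFAT from low inducer stops at A={found_a:+d}, B={found_b:+d}, "
      f"titer={found_y:.1f} g/L")

corners = [(a, b) for a in LEVELS for b in LEVELS]
top = max(corners, key=lambda ab: titer(*ab))
print(f"Best corner is A={top[0]:+d}, B={top[1]:+d}, "
      f"titer={titer(*top):.1f} g/L")
```

Calling `ofat(start_b=+1)` instead lands on the high-high corner, which shows that the OFAT answer here depends on the arbitrary baseline it started from.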

DOE approach (2² factorial, 4 corner runs). Run all four corners. The high-high corner (A = +1, B = +1) gives titer = 1.0 + 0.6 + 0.4 + 0.8 = 2.8 g/L, 3.5 times OFAT's answer. Fitting the four corners recovers the +0.8 interaction term, so the analysis explicitly reports that temperature and inducer must be raised together. Add a replicated centre point and you also get a curvature check and an error estimate, still inside a handful of runs.
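Recovering the coefficients from the four corners takes only sign-weighted averages. The sketch below assumes the noise-free corner values from the worked example; the variable names are illustrative. With four runs and four coefficients the model is saturated, so there is no error estimate without the centre points or replicates mentioned above.

```python
# Four corner runs of the 2^2 design as (A, B, titer in g/L),
# using the noise-free values from the worked example.
runs = [(-1, -1, 0.8), (+1, -1, 0.4), (-1, +1, 0.0), (+1, +1, 2.8)]

n = len(runs)

# The coded columns are orthogonal, so each coefficient is the
# average of (column value * response) over all runs.
intercept = sum(y for _, _, y in runs) / n
coef_a = sum(a * y for a, _, y in runs) / n
coef_b = sum(b * y for _, b, y in runs) / n
coef_ab = sum(a * b * y for a, b, y in runs) / n

print(f"titer = {intercept:.1f} + {coef_a:.1f}*A "
      f"+ {coef_b:.1f}*B + {coef_ab:.1f}*A*B")
```

Each coefficient is half of the classical "effect" (the change in response from low to high), and every one of them uses all four runs.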

The verdict. OFAT did not merely find a slightly worse condition; it was actively misled. Because it probed temperature only at low inducer, it concluded temperature should be low, the exact opposite of the truth. The interaction inverted the apparent main effect. The factorial DOE found the 2.8 g/L corner and explained why, using four runs, one of which (the high-high corner) OFAT's path never visits.

This inversion, where a factor's apparent direction flips depending on where you sample the others, is the signature failure of one-factor-at-a-time. It is why processes tuned by OFAT so often plateau, and why re-running them as a designed study frequently uncovers a substantially higher optimum. For a full bioprocess walk-through from screening to confirmation, see "DOE for Bioprocess Optimization".
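The inversion can be read straight off the assumed model from the worked example: the change in titer when temperature goes from low to high is 2·(0.6 + 0.8·B), which depends on where inducer sits. A minimal sketch, with illustrative function names:

```python
def temperature_effect(b: float) -> float:
    """Change in titer (g/L) when A moves from -1 to +1 at fixed B,
    derived from titer = 1.0 + 0.6*A + 0.4*B + 0.8*A*B."""
    return 2 * (0.6 + 0.8 * b)

# Inducer level (coded) at which the temperature effect changes sign.
crossover_b = -0.6 / 0.8

for b in (-1.0, 0.0, 1.0):
    print(f"B = {b:+.1f}: raising temperature changes titer by "
          f"{temperature_effect(b):+.1f} g/L")
print(f"Sign flips near B = {crossover_b:.2f}")
```

Under this model, temperature only looks harmful in the narrow band of inducer levels below about −0.75 in coded units, which is exactly where the OFAT baseline happened to sit.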

When OFAT is actually fine

One-factor-at-a-time is acceptable in a few narrow situations, but each comes with a caveat. Knowing them keeps the DOE vs OFAT choice honest rather than dogmatic.

A single dominant factor. If prior knowledge says one input overwhelms the rest and the others genuinely do not interact with it, OFAT on that one factor is reasonable, though a two-factor confirmation is cheap insurance.
Feasibility and range-finding. Before a formal design you often need to know whether a factor even moves the response and what range is safe. A quick OFAT sweep to bracket levels is a fine preliminary, feeding sensible low and high settings into the real design.
Hard safety or equipment limits. When running two factors at their extremes together is unsafe or impossible (a pressure and temperature combination the vessel cannot take), OFAT may be forced. Use a constrained design instead where you can.
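For the safety-limit case, the first step toward a constrained design is to list the combinations that are actually runnable. The sketch below does only that: it filters a candidate grid with a feasibility rule. The factor names, levels, and the vessel limit are hypothetical, and selecting the final runs from the feasible set (for example with an optimal-design routine in statistical software) is not shown.

```python
import itertools

def feasible_runs(levels, is_feasible):
    """Enumerate every level combination and keep the runnable ones."""
    names = list(levels)
    grid = (dict(zip(names, combo))
            for combo in itertools.product(*levels.values()))
    return [run for run in grid if is_feasible(run)]

# Hypothetical factors and levels, for illustration only.
levels = {
    "temperature_C": [30, 37, 44],
    "pressure_bar": [1.0, 2.0, 3.0],
}

def vessel_ok(run):
    """Hypothetical rule: the vessel cannot take both extremes at once."""
    return not (run["temperature_C"] >= 44 and run["pressure_bar"] >= 3.0)

candidates = feasible_runs(levels, vessel_ok)
print(f"{len(candidates)} of 9 combinations are runnable")
for run in candidates:
    print(run)
```

Unlike OFAT, the remaining candidates still include combinations where both factors move together, so interaction information is not thrown away just because one corner is off limits.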

In every one of these cases the honest move is the same: treat OFAT as a scout, not the strategy. Once the region of interest is bracketed, switch to a designed experiment so the interactions in the multi-factor space, invisible to one-factor-at-a-time, finally become visible. If you are unsure which design to reach for next, our "Which DOE Design Should I Use?" decision guide narrows it down by goal, factor count, and run budget.

Frequently Asked Questions

What is the difference between DOE and OFAT?

OFAT (one-factor-at-a-time) changes one variable while holding all others fixed, then moves to the next variable. DOE (design of experiments) varies several factors together in a structured pattern so their effects and interactions can be estimated simultaneously. The key difference is that OFAT cannot detect interactions between factors and explores only a thin path through the factor space, whereas DOE samples the region systematically, estimates interactions, and can locate a better optimum, usually in a comparable number of runs.

Why does one-factor-at-a-time fail?

One-factor-at-a-time fails because it assumes factors act independently. When two factors interact, the best setting of one depends on the level of the other, and OFAT, having fixed the others, walks up a ridge and stops at a false optimum. It also gives no interaction estimate, no measure of experimental error unless runs are replicated, and poor coverage of the factor space. In systems with interactions, OFAT reliably lands short of the real optimum.

Is DOE always better than OFAT?

DOE is better whenever factors might interact, which is almost always true in biology. OFAT is defensible only in narrow cases: a single factor of interest, a quick feasibility check, or a hard safety constraint that forbids combining extreme settings. For any real optimization with two or more factors, DOE estimates interactions, quantifies error, and finds a better optimum in comparable or fewer runs, so it should be the default.

How many runs does OFAT need compared to DOE?

To estimate main effects, OFAT needs about 2k+1 runs for k factors (a baseline plus a high and low for each factor), but it still learns nothing about interactions. A two-level full factorial needs 2^k runs and estimates every main effect and interaction; a fractional factorial or Plackett-Burman screening design needs far fewer. For 5 factors, OFAT uses 11 runs for main effects only, while a resolution-V half-fraction (2^(5−1)) estimates all main effects and all two-factor interactions in 16 runs, assuming higher-order interactions are negligible, and a Plackett-Burman screen does main effects in 12.

When is OFAT actually acceptable?

OFAT is acceptable when you truly have one dominant factor, when a preliminary run is just checking feasibility or ranges before a formal design, or when safety or equipment limits forbid running factors at their extremes simultaneously. Even then, plan to follow up with a designed experiment once the region of interest is bracketed, because interactions in the multi-factor region will still be invisible to OFAT.

Related Tools

DOE Experiment Generator — Replace an OFAT plan with a full factorial, fractional, or screening design, free in the browser.
Media & Feed Estimator — Cost the runs in your OFAT-vs-DOE comparison before you book bench time.
Growth Curve Fitter — Fit the responses your designed experiment produces and read model quality.

References

Czitrom, V. (1999). One-Factor-at-a-Time versus Designed Experiments. The American Statistician, 53(2), 126–131. DOI: 10.1080/00031305.1999.10474445
Mandenius, C.-F. & Brundin, A. (2008). Bioprocess optimization using design-of-experiments methodology. Biotechnology Progress, 24(6), 1191–1203. DOI: 10.1002/btpr.67
Montgomery, D.C. (2017). Design and Analysis of Experiments, 9th ed. Wiley. ISBN 978-1119113478.
NIST/SEMATECH (2012). e-Handbook of Statistical Methods, Section 5.1: Introduction to DOE. itl.nist.gov
