DOE for Cell Culture & Fermentation: Does DOE Work in Biology?

July 2026 12 min read Bioprocess Engineering

Contents

  1. Why biology breaks textbook DOE
  2. Does DOE work in biology?
  3. Hard-to-change factors (split-plot)
  4. Biological replicates & variability
  5. Choosing factors (T, pH, DO, feed)
  6. Intensified DoE (iDoE)
  7. Worked example: CHO process factors
  8. Frequently Asked Questions

Every bioprocess scientist who reaches for a statistics textbook on design of experiments hits the same wall: the examples are about injection moulding and chemical yields, where a run is cheap, fast, and repeatable. A cell culture run takes two weeks, costs a bioreactor slot, and never quite repeats. So the fair question is practical, not academic: does DOE work in biology? Yes, and this guide shows how to make DOE for cell culture and fermentation survive the three things biology throws at it: variability, hard-to-change setpoints, and time-course responses. When you are ready to lay one out, the free design of experiments calculator builds the screening and response-surface designs referenced throughout.

Why biology breaks textbook DOE

Textbook DOE assumes cheap, fast, independent, and repeatable runs, and biological processes violate all four. That does not make DOE wrong for biology; it means you adapt the design to the reality rather than copying an industrial example.

Three properties of living systems cause the friction. First, variability: two "identical" bioreactors seeded from the same vial can differ in final titer by 10–20%, so a real factor effect must clear a noisy baseline. Second, hard-to-change factors: you cannot freely re-randomize the temperature of a 2,000 L reactor between runs the way you re-randomize a bench chemistry setpoint. Third, time-course responses: the "result" of a fed-batch is not one number but a trajectory of viable cell density, titer, and metabolites over 14 days, and when you measure matters.

None of these is fatal. Each maps to a specific design choice covered below. The mistake is to ignore them and run a naive fully randomized factorial, then blame DOE when the noise swamps the effects. This is the same lesson as the broader DOE vs OFAT comparison: the method is sound, but only if the design respects the system.

Does DOE work in biology?

DOE works in biology, and the strong factor interactions typical of cell metabolism make it more valuable there than in many engineering settings. Temperature changes how a cell responds to a pH shift; feed rate changes how dissolved oxygen limits growth. These interactions are exactly what a designed experiment captures and one-factor-at-a-time cannot.

The published record is unambiguous. Applications of DOE to bioprocess development, from media formulation to fed-batch optimization, routinely find higher optima than sequential tuning, and regulators actively encourage the approach under Quality by Design. Mandenius and Brundin's widely cited review lays out the full DOE workflow for bioprocess optimization; more recent work applies it to specifics such as trace-metal optimization in CHO culture. The evidence is not that DOE might work in biology; it is that teams who adopt it consistently out-develop teams who do not.

Figure 1. The bioreactor factor map: temperature (and shift day), pH setpoint, dissolved oxygen (DO), and feed strategy are the controllable DOE factors; viable cell density, product titer (g/L), and glycosylation/product quality are the responses (CQAs). Each input can move several responses, and the best level of one depends on the others, which is precisely why a designed experiment beats tuning them one at a time.

Hard-to-change factors (split-plot)

A hard-to-change factor is one you cannot reset independently for every run, and forcing it into a fully randomized design either breaks execution or hides its true error, so you use a split-plot design instead. In practice this is the single most common way a biology DOE goes wrong.

Consider a study of temperature and three feed factors across 16 runs. If temperature is set at the incubator or reactor level and you share incubators across several vessels, you physically cannot randomize temperature run by run. A split-plot design acknowledges this: temperature becomes the whole-plot factor (changed rarely, applied to a group of reactors), and the feed factors become sub-plot factors (randomized within each temperature group). The analysis then estimates two error terms, whole-plot error for temperature and sub-plot error for the feeds, giving each factor an honest test.

The payoff is correct statistics: a naive analysis that ignores the split-plot structure typically over-states the significance of the hard-to-change factor, because it credits temperature with more independent replication than it actually had. If setpoints in your process are genuinely hard to change, design for it up front rather than patching it in analysis.
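The restricted randomization described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a recommended design: the factor names `feed_1` to `feed_3` are hypothetical, the 16 runs are assumed to sit in four whole plots (say, incubators) of four vessels each, and the way the feed combinations are divided between whole plots is one choice among several.

```python
import itertools
import random


def split_plot_layout(seed=0):
    """Return 16 runs: temperature is the whole-plot factor, three feed
    factors (hypothetical names feed_1..feed_3) are sub-plot factors."""
    rng = random.Random(seed)

    # All 8 coded (-1/+1) combinations of the three feed factors.
    feeds = list(itertools.product((-1, 1), repeat=3))
    # Assumption: split them into two complementary halves by the sign of
    # feed_1 * feed_2 * feed_3, so each whole plot holds 4 vessels.
    half_pos = [f for f in feeds if f[0] * f[1] * f[2] == 1]
    half_neg = [f for f in feeds if f[0] * f[1] * f[2] == -1]

    # Four whole plots (e.g. incubators), two at each temperature level.
    whole_plots = [(-1, half_pos), (-1, half_neg), (1, half_pos), (1, half_neg)]
    rng.shuffle(whole_plots)  # randomize the order of the whole plots

    layout = []
    for wp_id, (temperature, sub_runs) in enumerate(whole_plots, start=1):
        sub_runs = list(sub_runs)
        rng.shuffle(sub_runs)  # feeds are randomized only within a whole plot
        for f1, f2, f3 in sub_runs:
            layout.append({"whole_plot": wp_id, "temperature": temperature,
                           "feed_1": f1, "feed_2": f2, "feed_3": f3})
    return layout


for run in split_plot_layout(seed=42):
    print(run)
```

The `whole_plot` column is the piece a naive layout lacks: it records which runs shared a temperature setting, which is what the analysis needs in order to separate whole-plot error from sub-plot error.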

Biological replicates & variability

Biological variability is handled by measuring it, not wishing it away: add replicated center points for a pure-error estimate, and use biological replicates rather than only analytical repeats. The distinction between the two kinds of replicate is where many cell culture DOEs quietly fail.

An analytical replicate is the same sample assayed twice; it tells you about assay precision. A biological replicate is an independent culture run at the same conditions; it captures the real process variability that matters for detecting a factor effect. Only biological replicates give you the pure error a DOE needs. Replicated center points do double duty: they estimate this pure error and test for curvature, telling you whether a simple linear model is adequate or you need a response surface.
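The double duty of center points comes down to two small calculations, sketched here with the Python standard library. The function name and the numbers in the usage lines are made up for illustration. The sketch assumes a two-level design whose corner runs and center points are all independent cultures, and it uses the standard comparison of the corner-run mean with the center-point mean, scaled by the pure-error standard deviation.

```python
import math
import statistics


def center_point_check(factorial_y, center_y):
    """Return (pure_error_sd, curvature, t_stat).

    factorial_y: responses from the corner (-1/+1) runs
    center_y:    responses from the replicated center points
    """
    n_f, n_c = len(factorial_y), len(center_y)
    # Pure error: spread of independent cultures run at identical conditions.
    pure_error_sd = statistics.stdev(center_y)
    # Curvature: how far the center sits from the average of the corners.
    curvature = statistics.mean(factorial_y) - statistics.mean(center_y)
    # Standard error of that difference, based on pure error only.
    se = pure_error_sd * math.sqrt(1 / n_f + 1 / n_c)
    return pure_error_sd, curvature, curvature / se


# Made-up titers (g/L), for illustration only.
corners = [2.1, 2.9, 2.4, 3.3, 2.2, 3.0, 2.6, 3.5]
centers = [3.1, 3.3, 3.2]
sd, curv, t = center_point_check(corners, centers)
print(f"pure-error SD = {sd:.3f}, curvature = {curv:.3f}, t = {t:.2f}")
```

The t statistic is compared against a t distribution with `len(center_y) - 1` degrees of freedom. With only three center points that reference distribution is wide, which is one reason to add more replicates when the response is noisy.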

Figure 2. Illustrative between-replicate variability (coefficient of variation) for common responses. Noisier responses like titer need larger effects or more replication to detect. Values are typical ranges, not universal constants, so estimate your own from center points.

Three more habits keep biological noise from masquerading as signal. Randomize run order so that drift in an incubator or a reagent lot does not align with a factor. Block by known nuisance sources (operator, medium lot, or reactor) so their variation is removed from the error. And size the design by power, not habit: the smallest effect you care about must be detectable above the CV you measured. Our guide to how many experiments a DOE needs works through the power calculation for exactly this situation.
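A rough version of "size the design by power" can be written with the standard library alone. This is a sketch under assumptions, not a replacement for a full power analysis: it uses a normal approximation, assumes a balanced two-level design in which each effect is the difference between the mean of the high runs and the mean of the low runs, and the function name is hypothetical. Because it ignores the degrees of freedom spent on estimating error, it is optimistic for small designs.

```python
import math
from statistics import NormalDist


def runs_for_effect(effect, sd, alpha=0.05, power=0.80):
    """Approximate total runs needed to detect `effect` (same units as `sd`).

    In a balanced two-level design with N runs, an effect estimate has
    variance 4 * sd**2 / N, which gives N >= 4 * ((z_a + z_b) * sd / effect)**2.
    """
    z_a = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_b = NormalDist().inv_cdf(power)          # desired power
    return math.ceil(4 * ((z_a + z_b) * sd / effect) ** 2)


# Example: effect and SD both as a percentage of mean titer.
print(runs_for_effect(effect=10, sd=10))  # effect equal to the run-to-run SD
print(runs_for_effect(effect=20, sd=10))  # effect twice the run-to-run SD
```

The required run count scales with the square of the noise-to-effect ratio, so halving the effect you want to detect quadruples the runs. That is why measuring your own CV from center points matters before committing reactor slots.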

Choosing factors (T, pH, DO, feed)

The highest-value factors in a mammalian cell culture DOE are temperature (and the day of any temperature shift), pH setpoint, dissolved oxygen, and the feed strategy, because these dominate growth, productivity, and product quality. Choosing factors well matters more than any statistical nicety: a design built on the wrong factors is wasted, no matter how elegant.

Screen broad, then optimize narrow. Put 4–8 candidate factors into a Plackett-Burman screening design or a fractional factorial to find the vital few, then take the two or three that dominate into a response surface methodology design to map the optimum. When the factors are medium components that must sum to a fixed total, switch to a mixture design instead of a factorial. For media specifically, the companion media optimization with DOE guide walks the full screening-to-formulation path.
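For the screening stage, a 12-run Plackett-Burman design can be built from its standard cyclic generator row. The sketch below is illustrative: the function name is hypothetical, and it assigns the first `n_factors` columns to your factors and leaves the rest unused. A 12-run design accommodates up to 11 two-level factors, so it covers the 4 to 8 candidates discussed above.

```python
def plackett_burman_12(n_factors):
    """Return a 12-run coded (-1/+1) screening design for up to 11 factors."""
    if not 1 <= n_factors <= 11:
        raise ValueError("a 12-run Plackett-Burman design holds 1 to 11 factors")
    # Standard generator row for the 12-run Plackett-Burman design.
    gen = [1, 1, -1, 1, 1, 1, -1, -1, -1, 1, -1]
    # Rows 1-11 are cyclic shifts of the generator; row 12 is all low levels.
    rows = [gen[-i:] + gen[:-i] for i in range(11)]
    rows.append([-1] * 11)
    # Keep only as many columns as there are factors.
    return [row[:n_factors] for row in rows]


design = plackett_burman_12(6)
for run in design:
    print(run)
```

Every column is balanced (six high, six low) and orthogonal to every other column, which is what lets main effects be estimated independently. Two-factor interactions are partially confounded with main effects in this design, so treat it as a screen and randomize the run order before executing it.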

Table 1. Common factors for a cell culture / fermentation DOE, with typical ranges and hard-to-change status.

| Factor | Typical screening range | Primarily affects | Hard to change? |
|---|---|---|---|
| Temperature | 36.5–37°C, shift to 30–33°C | Growth vs productivity balance | Often (whole-plot) |
| Temperature-shift day | Day 3–7 | Titer, product quality | No |
| pH setpoint | 6.8–7.2 | Lactate, growth, glycosylation | Sometimes |
| Dissolved oxygen | 20–60% air sat. | Metabolism, oxidative stress | No |
| Feed rate | 2–6% culture vol./day | Nutrient supply, osmolality | No |
| Glucose target | 2–8 g/L | Lactate, cell health | No |
| Seeding density | 0.3–1.0 × 10⁶ cells/mL | Lag, peak VCD, titer | No |

Deciding which of these to fix and which to vary is half the battle. If a factor is tightly controlled by process constraints or already well understood, hold it constant and spend your runs on the uncertain ones. Troubleshooting a specific misbehaving culture first? The CHO troubleshooting guide and the fed-batch feeding strategies guide help you shortlist factors worth designing around.

Intensified DoE (iDoE)

Intensified DoE (iDoE) changes factor setpoints within a single culture run, so one bioreactor delivers several factor conditions instead of one, dramatically cutting the number of reactors a study needs. It turns the time-course nature of a fed-batch, usually a nuisance, into an advantage.

In a conventional design each reactor holds one condition for the whole run. In iDoE you might hold pH at one setpoint for the first phase, then shift it for the second, treating each phase as a separate design point in time. Because a fed-batch already passes through distinct growth, stationary, and production phases, you are sampling factor effects across the culture's own trajectory. The trade-off is analytical: the data must be fitted with a dynamic (often hybrid mechanistic) model rather than a simple regression, because the response depends on the history of setpoints, not just their current value.

iDoE shines when reactor slots, not analytics, are the bottleneck, which is the usual situation in a busy development lab with a fixed number of parallel mini-bioreactors. Used well, it can compress a study that would need dozens of reactors into a handful. It is an advanced technique, so most teams reach for it only after the standard screen-then-optimize workflow is comfortable.

Worked example: CHO process factors

Here is a concrete screen: a CHO fed-batch where four process factors are tested for their effect on day-14 titer, using an 8-run fractional factorial before committing to a response surface. The goal of this stage is not to find the optimum; it is to find the vital few factors worth optimizing.

CHO fed-batch screen (2^(4−1) fractional factorial, 8 runs + 3 center points)

Factors (coded −1 / +1): A = temperature-shift day (day 4 / day 6), B = pH setpoint (6.9 / 7.1), C = feed rate (3% / 5% vol./day), D = glucose target (3 / 6 g/L).

Design: a resolution-IV 2^(4−1) fraction gives 8 runs in which main effects are clear of two-factor interactions (the two-factor interactions are aliased with one another, not with main effects). Add 3 replicated center points for pure error and a curvature check, for 11 reactors total, feasible in two parallel mini-bioreactor blocks.
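The 11-run layout can be generated directly. The sketch below assumes the usual generator for this half-fraction, D = ABC, which the text above does not specify. The factor levels come from the factor list above, with each center level taken as the midpoint. The function and dictionary names are hypothetical, and assigning runs to the two reactor blocks is left out.

```python
import itertools
import random

# Low / center / high settings from the factor list above (center = midpoint).
LEVELS = {
    "A_shift_day": {-1: 4, 0: 5, 1: 6},
    "B_pH": {-1: 6.9, 0: 7.0, 1: 7.1},
    "C_feed_pct_per_day": {-1: 3, 0: 4, 1: 5},
    "D_glucose_g_per_L": {-1: 3, 0: 4.5, 1: 6},
}


def half_fraction_screen(n_center=3, seed=0):
    """Return coded runs for a 4-factor half-fraction plus center points."""
    runs = []
    for a, b, c in itertools.product((-1, 1), repeat=3):
        # Assumed generator D = ABC (defining relation I = ABCD, resolution IV).
        runs.append({"A": a, "B": b, "C": c, "D": a * b * c})
    runs += [{"A": 0, "B": 0, "C": 0, "D": 0} for _ in range(n_center)]
    random.Random(seed).shuffle(runs)  # randomized run order
    return runs


def decode(run):
    """Translate one coded run into real setpoints."""
    return {name: LEVELS[name][run[name[0]]] for name in LEVELS}


for run in half_fraction_screen():
    print(run, decode(run))
```

With D = ABC, the product A×B equals C×D in every corner run, which is the aliasing of two-factor interactions with one another. No main-effect column coincides with any two-factor product.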

Result (illustrative): the effects analysis shows feed rate (C) and temperature-shift day (A) dominate titer, pH (B) has a modest effect, and glucose target (D) is negligible over this range. The center points show significant curvature, signalling that a linear model is not enough near the optimum.

Next step: drop D, carry A, B, and C into a central composite design to map the curved optimum, then confirm the predicted best condition in fresh reactors. Screening first meant the response-surface stage optimizes 3 factors, not 4, saving a full block of runs.
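To see what dropping a factor saves, here is a minimal sketch that builds a central composite design in coded units and counts the runs for 3 versus 4 factors. The function name is hypothetical, the 3 center points are an assumption, and the default axial distance is the rotatable choice, (2^k)^(1/4).

```python
import itertools


def central_composite(k, n_center=3, alpha=None):
    """Return coded runs: 2^k corners, 2k axial points, n_center centers."""
    if alpha is None:
        alpha = (2 ** k) ** 0.25  # rotatable axial distance
    corners = [list(p) for p in itertools.product((-1.0, 1.0), repeat=k)]
    axial = []
    for i in range(k):
        for level in (-alpha, alpha):
            point = [0.0] * k
            point[i] = level  # move one factor out along its axis
            axial.append(point)
    centers = [[0.0] * k for _ in range(n_center)]
    return corners + axial + centers


for k in (3, 4):
    print(k, "factors:", len(central_composite(k)), "runs")
```

Axial points sit outside the ±1 range (about ±1.68 coded units for three factors), which may not be practical for a factor like temperature-shift day that only takes whole days. Passing `alpha=1.0` gives a face-centered variant that stays within the screened range.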

This is the pattern that makes DOE for cell culture tractable: a small, cheap screen kills the factors that do not matter, so the expensive optimization runs are spent only where the response actually moves. For the end-to-end version with the analysis and confirmation stages, see DOE for bioprocess optimization.

Frequently Asked Questions

Does DOE work in biology?

Yes. DOE works in biology and is arguably more valuable there than in engineering, because biological factors interact strongly and one-factor-at-a-time misses those interactions. The catch is that biology adds three complications: high run-to-run variability, factors that are hard to change between runs (like temperature setpoint on a large bioreactor), and responses that unfold over time rather than at a single endpoint. A well-designed experiment handles all three with replication, center points, split-plot layouts, and clear response definitions.

How do you handle biological variability in a DOE?

Handle biological variability by estimating it directly and designing against it. Add replicated center points to quantify pure error, use biological replicates (independent cultures) rather than only analytical replicates, randomize run order to spread nuisance effects, and block by known nuisance sources such as operator, medium lot, or incubator. Then size the design so the smallest effect you care about is larger than the noise, using a power calculation rather than a fixed run count.

What factors should I put in a cell culture DOE?

For a mammalian cell culture DOE, the highest-value factors are usually temperature (and the day of any temperature shift), pH setpoint, dissolved oxygen, and the feed strategy (feed rate, feed start day, and glucose or key nutrient targets). Seeding density and osmolality are common additions. Screen 4 to 8 of these with a fractional factorial or Plackett-Burman design first, then optimize the two or three that dominate with a response-surface design.

What is a hard-to-change factor and why does it matter?

A hard-to-change factor is one you cannot easily reset between runs, such as the temperature or gassing configuration of a large bioreactor, or a medium lot shared across a run group. If you force a fully randomized design onto hard-to-change factors, you either cannot execute it or you underestimate that factor's error. The fix is a split-plot design, which groups runs by the hard-to-change factor and analyzes the whole-plot and sub-plot errors separately.

What is intensified DoE (iDoE) in cell culture?

Intensified DoE (iDoE) deliberately changes factor setpoints within a single culture run, for example shifting pH or temperature partway through, so that one bioreactor contributes several factor conditions instead of one. It exploits the time-course nature of a fed-batch to gather more design information from fewer reactors, at the cost of a more complex dynamic model to analyze the data. It is powerful when reactor slots are the limiting resource.

References

  1. Mandenius, C.-F. & Brundin, A. (2008). Bioprocess optimization using design-of-experiments methodology. Biotechnology Progress, 24(6), 1191–1203. DOI: 10.1002/btpr.67
  2. Polanco, A., Liang, H., Park, S. & Wang, S. (2023). Trace metal optimization in CHO cell culture through statistical design of experiments. Biotechnology Progress. DOI: 10.1002/btpr.3368
  3. Montgomery, D.C. (2017). Design and Analysis of Experiments, 9th ed. Wiley (see the split-plot chapter). ISBN 978-1119113478.
  4. NIST/SEMATECH (2012). e-Handbook of Statistical Methods, Section 5.3.3.3: Split-plot designs. itl.nist.gov
