How to Design a DOE for Bioprocess Optimization

What Is Design of Experiments (DOE) in Bioprocessing?

Design of experiments (DOE) is a systematic statistical approach to planning experiments that varies multiple process parameters simultaneously to identify which factors, and which factor interactions, most strongly affect a bioprocess response such as titer, yield, or product quality. Unlike traditional one-factor-at-a-time optimization, DOE covers far more of the design space for a comparable number of experiments.

In bioprocess development, DOE is used across upstream and downstream operations: optimizing media composition, fermentation conditions (temperature, pH, dissolved oxygen, agitation), induction parameters, and purification steps. The ICH Q8(R2) guideline explicitly recommends DOE as part of Quality by Design (QbD) for establishing design spaces in biopharmaceutical manufacturing.

A typical DOE workflow for bioprocess optimization follows three stages: screening to identify critical process parameters (CPPs) from a large initial set, optimization using response surface methodology (RSM) to find the optimal operating conditions, and validation through confirmation runs to verify the model predictions hold.

Figure 1. DOE workflow for bioprocess optimization showing the three-stage progression from screening through optimization to validation, with common bioprocess factors at each stage. The DSD shortcut (dashed purple line) allows combining screening and optimization in a single step.

DOE vs. OFAT: Why Traditional Optimization Fails

One-factor-at-a-time (OFAT) optimization—where you fix all variables except one and vary it systematically—misses factor interactions and typically finds a local optimum rather than the global optimum. DOE overcomes both limitations by varying multiple factors simultaneously according to a mathematical design matrix.

The practical difference is significant. Published bioprocess studies report that DOE typically achieves 1.3–2× higher yields than OFAT approaches. In a 4-factor optimization, OFAT explores only the factor axes while DOE explores the full design space, including the corner points that reveal interactions and the center points that reveal curvature.

Table 1. DOE vs. OFAT comparison for bioprocess optimization
Criterion	OFAT	DOE
Factor interactions	Not detected	Estimated (extent depends on design resolution)
Curvature (quadratic effects)	Not detected	Estimated in RSM designs
Experiments for 4 factors	20–25 (sequential)	27–30 (CCD with center-point replicates)
Design space coverage	< 10% of space	> 80% of space
Optimum quality	Local (axis-bound)	Global (multivariate)
Statistical model	None	Polynomial regression (R², ANOVA)
Reproducibility evidence	Anecdotal	Prediction intervals, confirmation runs
Regulatory acceptance (QbD)	Not recommended	ICH Q8(R2) recommended

Table 1. Comparison of OFAT and DOE approaches for bioprocess optimization. DOE provides statistically rigorous factor interaction estimation and is recommended by ICH Q8(R2) for Quality by Design.

Consider a temperature × pH interaction in CHO cell culture: at pH 7.0, lowering temperature from 37 °C to 33 °C increases titer by 40%. But at pH 6.8, the same temperature shift increases titer by only 10%. OFAT at a single pH would report a misleading temperature effect. DOE detects this interaction and maps the true response surface.
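The arithmetic behind this example can be sketched in a few lines. The sketch assumes a hypothetical baseline titer of 1.0 (relative units) at 37 °C for both pH values, which the text does not state; only the 40% and 10% increases come from the example. It computes what an OFAT series would report at each pH and the interaction contrast that a 2×2 factorial would estimate.

```python
# Relative titers for a 2x2 factorial (temperature x pH).
# ASSUMPTION: baseline titer of 1.0 at 37 C for both pH levels (not given in the text).
titer = {
    (37, 7.0): 1.00,
    (33, 7.0): 1.40,  # +40% at pH 7.0
    (37, 6.8): 1.00,
    (33, 6.8): 1.10,  # +10% at pH 6.8
}

def coded(value, low):
    """Map a two-level factor setting to -1 (low) or +1 (high)."""
    return -1 if value == low else +1

def contrast(sign_fn):
    """Effect = mean response where the sign is +1 minus mean where it is -1."""
    plus = [y for run, y in titer.items() if sign_fn(run) == +1]
    minus = [y for run, y in titer.items() if sign_fn(run) == -1]
    return sum(plus) / len(plus) - sum(minus) / len(minus)

def t_sign(run):
    return coded(run[0], 33)

def ph_sign(run):
    return coded(run[1], 6.8)

ofat_at_ph70 = titer[(33, 7.0)] - titer[(37, 7.0)]  # gain from cooling, seen at pH 7.0
ofat_at_ph68 = titer[(33, 6.8)] - titer[(37, 6.8)]  # gain from cooling, seen at pH 6.8
temp_effect = contrast(t_sign)                       # 33 -> 37 C, averaged over pH
interaction = contrast(lambda run: t_sign(run) * ph_sign(run))

print(f"OFAT at pH 7.0: {ofat_at_ph70:+.2f}, OFAT at pH 6.8: {ofat_at_ph68:+.2f}")
print(f"Factorial main effect of temperature: {temp_effect:+.2f}")
print(f"Temperature x pH interaction: {interaction:+.2f}")
```

A nonzero interaction contrast means the temperature effect depends on pH, which neither OFAT series can show on its own.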

Stage 1: Screening Designs — Identify Critical Factors

Screening is the first stage of DOE where you test 6–15 candidate factors in a small number of runs to identify the 3–5 factors that most significantly affect your response. The goal is factor elimination, not optimization—you only need to detect main effects.

Plackett-Burman Design

Plackett-Burman (PB) designs are Resolution III screening designs that test each factor at two levels (high and low) in a number of runs equal to a multiple of 4. For 7 factors, a 12-run PB design is a common choice and leaves four unused columns for estimating error, compared with 128 runs for a two-level full factorial.

PB designs assume all factor interactions are negligible, which is a reasonable starting assumption when you have many factors. The tradeoff is that any interaction effects get aliased with main effects, so significant factors should be validated in a follow-up design.
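As a rough illustration, the 12-run PB design can be built from its standard first row by cyclic shifting. This is a dependency-free sketch rather than a replacement for DOE software; the factor names are illustrative labels, and in practice the run order should be randomized before execution.

```python
from itertools import combinations

# Standard first row of the 12-run Plackett-Burman design (11 columns).
GENERATOR = [+1, +1, -1, +1, +1, +1, -1, -1, -1, +1, -1]

def plackett_burman_12():
    """Build the design: 11 cyclic shifts of the generator plus one row of -1."""
    n = len(GENERATOR)
    rows = [GENERATOR[-shift:] + GENERATOR[:-shift] for shift in range(n)]
    rows.append([-1] * n)
    return rows

design = plackett_burman_12()

# Illustrative factor labels; zip() keeps only the first 7 of the 11 columns.
factors = ["temperature", "IPTG", "OD600", "pH", "DO", "feed_rate", "duration"]
runs = [dict(zip(factors, row)) for row in design]

def column(j):
    return [row[j] for row in design]

# Each column has six +1 and six -1, and every pair of columns is orthogonal.
balanced = all(sum(column(j)) == 0 for j in range(11))
orthogonal = all(
    sum(a * b for a, b in zip(column(i), column(j))) == 0
    for i, j in combinations(range(11), 2)
)
print(len(runs), balanced, orthogonal)
```

The four unassigned columns act as dummy factors whose apparent effects give a rough estimate of experimental noise.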

Fractional Factorial Design

Fractional factorial designs are more flexible than PB and come in different resolutions. A Resolution IV design (e.g., 2^(7−3), 16 runs) confounds two-factor interactions with each other but not with main effects, giving cleaner main effect estimates at the cost of more runs.
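The Resolution IV property can be checked numerically. The sketch below builds a 16-run 2^(7−3) design from a full factorial in four base factors and checks the aliasing pattern. The generators E = ABC, F = BCD, G = ACD are one common choice and are an assumption here; DOE software may pick different ones.

```python
from itertools import combinations, product

BASE = "ABCD"
# ASSUMPTION: one common generator set for a 2^(7-3) design; other choices exist.
GENERATORS = {"E": "ABC", "F": "BCD", "G": "ACD"}

def fractional_factorial_7_3():
    """Full factorial in A-D; E, F, G are products of base columns."""
    runs = []
    for levels in product([-1, +1], repeat=len(BASE)):
        run = dict(zip(BASE, levels))
        for name, word in GENERATORS.items():
            value = 1
            for letter in word:
                value *= run[letter]
            run[name] = value
        runs.append(run)
    return runs

runs = fractional_factorial_7_3()
names = list(runs[0])  # ['A', 'B', 'C', 'D', 'E', 'F', 'G']

def col(*letters):
    """Column for a main effect, col('A'), or an interaction, col('A', 'B')."""
    out = []
    for run in runs:
        value = 1
        for letter in letters:
            value *= run[letter]
        out.append(value)
    return out

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

pairs = list(combinations(names, 2))
# Resolution IV: no main effect is aliased with any two-factor interaction...
main_clear_of_2fi = all(dot(col(m), col(a, b)) == 0 for m in names for a, b in pairs)
# ...but some two-factor interactions are aliased with each other (AB = CE here).
ab_aliased_with_ce = dot(col("A", "B"), col("C", "E")) == len(runs)
print(len(runs), main_clear_of_2fi, ab_aliased_with_ce)
```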

Table 2. Screening design selection guide for bioprocess DOE
Design	Factors	Levels	Runs	Resolution	Best For
Plackett-Burman	6–15	2	12–20	III	Maximum factor screening, minimal runs
2^(k−p) Fractional Factorial	4–8	2	8–32	III–V	When some interactions are expected
Definitive Screening (DSD)	5–16	3	2k+1	—	Combined screening + optimization
Full Factorial 2^k	2–4	2	4–16	Full	Small factor sets, complete interaction info

Table 2. Guide for selecting screening designs based on the number of factors and desired resolution. DSD designs provide three-level estimation in fewer runs than the traditional two-stage approach.

Stage 2: Optimization Designs — Build the Response Surface

Response surface methodology (RSM) designs test 2–5 significant factors at 3–5 levels to build a second-order polynomial model that maps the relationship between factors and responses. The model includes linear terms, quadratic terms, and two-factor interaction terms, enabling prediction of the optimum and construction of a design space.

Central Composite Design (CCD)

Central Composite Design is the most widely used RSM design in bioprocessing. It combines a full factorial (or fractional factorial) with axial (star) points and center-point replicates. For 3 factors, a CCD with 6 center points requires 20 runs: 8 factorial points + 6 axial points + 6 center points.

A circumscribed CCD tests each factor at 5 levels (−α, −1, 0, +1, +α), where α = (2^k)^(1/4) gives rotatability with a full factorial core (α ≈ 1.68 for k = 3). A face-centered CCD (α = 1) uses only 3 levels per factor and stays within the factorial range, but sacrifices rotatability. Choose face-centered CCD when factor ranges have hard constraints (e.g., pH cannot go below 6.0).
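A minimal sketch of the CCD geometry in coded units follows. It computes the rotatable axial distance as the fourth root of the number of factorial runs and lists the design points; the function names are illustrative and not taken from any DOE package.

```python
from itertools import product

def rotatable_alpha(k):
    """Axial distance for rotatability with a full 2^k factorial core."""
    return (2 ** k) ** 0.25

def central_composite(k, alpha, n_center):
    """Coded CCD points: 2^k corners, 2k axial points, n_center center replicates."""
    corners = [list(p) for p in product([-1.0, 1.0], repeat=k)]
    axial = []
    for i in range(k):
        for distance in (-alpha, alpha):
            point = [0.0] * k
            point[i] = distance
            axial.append(point)
    center = [[0.0] * k for _ in range(n_center)]
    return corners + axial + center

alpha3 = rotatable_alpha(3)
face_centered = central_composite(3, alpha=1.0, n_center=6)
rotatable = central_composite(3, alpha=alpha3, n_center=6)

# Distinct coded levels per factor: 3 for face-centered, 5 for rotatable.
levels_fc = sorted({x for p in face_centered for x in p})
levels_rot = sorted({round(x, 3) for p in rotatable for x in p})
print(round(alpha3, 3), len(face_centered), levels_fc, levels_rot)
```

Both variants have the same 20 runs for 3 factors; only the position of the axial points changes.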

Box-Behnken Design (BBD)

Box-Behnken designs test factors at only 3 levels and never run all factors at their extreme values simultaneously, making them safer for bioprocess applications where extreme combinations might kill cells or damage equipment. For 3 factors, BBD requires 15 runs (12 edge midpoints + 3 center points).

The tradeoff: BBD does not include factorial corner points, so it may not predict as accurately near the edges of the design space. Use BBD when factor range extremes are costly or risky to run.
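The 3-factor BBD layout can be sketched the same way. The pairwise construction below (±1 on two factors, the remaining factor at 0) reproduces the 12 edge midpoints for 3 factors; designs for larger numbers of factors may use different blocks, so treat this as illustrative only.

```python
from itertools import combinations, product

def box_behnken(k, n_center):
    """Coded BBD points: +/-1 on each pair of factors, others at 0, plus centers."""
    points = []
    for i, j in combinations(range(k), 2):
        for a, b in product([-1, 1], repeat=2):
            point = [0] * k
            point[i], point[j] = a, b
            points.append(point)
    points += [[0] * k for _ in range(n_center)]
    return points

design = box_behnken(3, n_center=3)

# No run puts all three factors at an extreme level simultaneously.
has_corner = any(all(abs(x) == 1 for x in p) for p in design)
print(len(design), has_corner)
```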

The second-order polynomial model for 3 factors takes the form:

RSM Model Equation

Y = β₀ + β₁X₁ + β₂X₂ + β₃X₃ + β₁₂X₁X₂ + β₁₃X₁X₃ + β₂₃X₂X₃ + β₁₁X₁² + β₂₂X₂² + β₃₃X₃²

Where Y is the response (e.g., titer in g/L), X₁, X₂, X₃ are coded factor levels (−1 to +1), β₀ is the intercept, β_i are linear coefficients, β_ij are interaction coefficients, and β_ii are quadratic coefficients.
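For reference, the same model can be written for k factors in compact form, together with the number of coefficients p it contains. The residual error term ε is added here as the usual regression assumption and is not part of the equation above.

```latex
Y = \beta_0 + \sum_{i=1}^{k} \beta_i X_i
      + \sum_{i<j} \beta_{ij} X_i X_j
      + \sum_{i=1}^{k} \beta_{ii} X_i^2 + \varepsilon,
\qquad
p = 1 + k + \binom{k}{2} + k = \frac{(k+1)(k+2)}{2}
```

For k = 3 this gives p = 10, matching the ten terms written out above, so a design needs more than 10 distinct runs to leave degrees of freedom for error.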

Figure 2. Response surface contour plot showing the interaction between temperature and pH on E. coli recombinant protein titer (g/L). The contour lines represent predicted titer values from a CCD model. The optimum region (dark teal) is centered around 32–34 °C and pH 6.9–7.1. Data represents typical DOE results from fed-batch E. coli expression optimization.

Definitive Screening Designs: The Modern Shortcut

Definitive Screening Designs (DSDs), introduced by Jones and Nachtsheim in 2011, are one of the most significant recent advances in DOE methodology for bioprocess applications. DSDs combine the screening and optimization stages into a single design that requires only 2k+1 runs for k factors: for example, 13 runs for 6 factors versus 44+ runs for a traditional PB screening followed by CCD optimization.

DSDs achieve this efficiency through three key properties:

Three levels per factor — enabling estimation of quadratic (curvature) effects, unlike PB or fractional factorial.
Main effects are orthogonal to two-factor interactions — main effect estimates are unbiased regardless of which interactions are active.
Main effects are orthogonal to quadratic effects — curvature detection is built in, not confounded with linear terms.

For bioprocess optimization with 6–8 candidate factors, DSDs have been adopted by major biopharmaceutical companies as of 2025 for mAb process characterization, vaccine development, and gene therapy manufacturing. The reduction from ~45 experiments (PB + CCD) to ~15 experiments translates directly to weeks of saved bioreactor time and tens of thousands of dollars in media costs.

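The limitation: with only 2k+1 runs, a DSD cannot fit the full second-order model in all k factors at once. It relies on effect sparsity: with 6 or more factors, it can fit a full quadratic model in any three of them. If more than about three factors turn out to be active, augment the design with additional runs. With fewer than 6 factors, use a traditional CCD or BBD.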
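A quick count shows why a DSD leans on effect sparsity. The sketch compares the 2k+1 runs quoted above with the number of coefficients in a full second-order model; the helper names are illustrative.

```python
def dsd_runs(k):
    """Minimum DSD size quoted in the text: 2k + 1 runs."""
    return 2 * k + 1

def full_quadratic_terms(k):
    """Intercept + k linear + k quadratic + k(k-1)/2 interaction terms."""
    return (k + 1) * (k + 2) // 2

for k in (6, 8):
    print(
        f"k={k}: DSD runs={dsd_runs(k)}, "
        f"full model in all {k} factors={full_quadratic_terms(k)} terms, "
        f"full model in 3 active factors={full_quadratic_terms(3)} terms"
    )
```

With 6 factors, 13 runs cannot support the 28 coefficients of a full quadratic in every factor, but they can support the 10 coefficients needed once the active set has been narrowed to three factors.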

Worked Example: DOE for E. coli Fed-Batch Optimization

This worked example walks through a complete DOE optimization of recombinant protein expression in E. coli BL21(DE3) fed-batch culture, targeting maximum soluble titer (g/L).

Worked Example — Stage 1: Plackett-Burman Screening

Objective: Identify the most significant factors affecting soluble titer from an initial set of 7 factors.

Factors (7):

Induction temperature: 25 °C (low) / 37 °C (high)
IPTG concentration: 0.1 mM (low) / 1.0 mM (high)
Induction OD₆₀₀: 0.6 (low) / 2.0 (high)
Post-induction pH: 6.5 (low) / 7.5 (high)
Dissolved oxygen: 20% (low) / 60% (high)
Glucose feed rate: 2 g/L/h (low) / 8 g/L/h (high)
Induction duration: 4 h (low) / 16 h (high)

Design: Plackett-Burman, 12 runs + 3 center points = 15 experiments

Response: Soluble protein titer (g/L)

Results (ANOVA, p < 0.05):
• Induction temperature: p = 0.001 *** (most significant)
• IPTG concentration: p = 0.008 **
• Post-induction pH: p = 0.023 *
• Induction OD₆₀₀: p = 0.089 (borderline)
• DO, feed rate, duration: p > 0.10 (not significant)

Conclusion: Three factors (temperature, IPTG, pH) carry forward to optimization. OD₆₀₀ is borderline, so fix it at the center of its screened range (1.3) and monitor.

Worked Example — Stage 2: CCD Optimization

Factors (3): Temperature (25–37 °C), IPTG (0.1–1.0 mM), pH (6.5–7.5)

Design: Face-centered CCD (α = 1), 20 runs (8 factorial + 6 axial + 6 center points)

Coded levels:
• Temperature: −1 = 25 °C, 0 = 31 °C, +1 = 37 °C
• IPTG: −1 = 0.1 mM, 0 = 0.55 mM, +1 = 1.0 mM
• pH: −1 = 6.5, 0 = 7.0, +1 = 7.5

Fitted model in coded units (R² = 0.94, Adj R² = 0.91, Pred R² = 0.84):
Titer = 3.42 − 0.87·T + 0.31·IPTG + 0.22·pH
− 0.45·T·IPTG + 0.18·T·pH
− 0.68·T² − 0.29·IPTG² − 0.14·pH²

Predicted optimum: Temperature = 28.5 °C, IPTG = 0.4 mM, pH = 7.05

Predicted titer: 4.12 g/L (95% PI: 3.6–4.6 g/L)

OFAT baseline titer: 2.1 g/L → DOE improvement: 1.96×
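To use the fitted equation, natural-unit settings must first be converted to coded levels. The sketch below does that conversion with the ranges from this example and evaluates the model with the coefficients as printed; it assumes those coefficients apply to coded units, consistent with the coded levels listed above.

```python
# Factor ranges from the worked example: (low, high) in natural units.
RANGES = {"T": (25.0, 37.0), "IPTG": (0.1, 1.0), "pH": (6.5, 7.5)}

def to_coded(name, value):
    """Convert a natural-unit setting to the coded -1..+1 scale."""
    low, high = RANGES[name]
    center = (low + high) / 2
    half_range = (high - low) / 2
    return (value - center) / half_range

def predicted_titer(temp_c, iptg_mm, ph):
    """Evaluate the printed second-order model (ASSUMED to be in coded units)."""
    t = to_coded("T", temp_c)
    i = to_coded("IPTG", iptg_mm)
    p = to_coded("pH", ph)
    return (3.42 - 0.87 * t + 0.31 * i + 0.22 * p
            - 0.45 * t * i + 0.18 * t * p
            - 0.68 * t**2 - 0.29 * i**2 - 0.14 * p**2)

center = predicted_titer(31.0, 0.55, 7.0)
at_reported_optimum = predicted_titer(28.5, 0.4, 7.05)
print(f"coded T at 28.5 C: {to_coded('T', 28.5):.3f}")
print(f"center point: {center:.2f} g/L, reported optimum: {at_reported_optimum:.2f} g/L")
```

With the rounded coefficients as printed, this evaluation gives about 3.48 g/L at the reported optimum rather than 4.12 g/L, so the equation is best read as illustrative. In practice, re-evaluate the full-precision model exported from your DOE software before planning confirmation runs.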

Figure 3. Main effects and interaction coefficients from the CCD model for E. coli fed-batch titer optimization. Negative coefficients for temperature (T) indicate that lower temperatures favor soluble expression. The T×IPTG interaction is the strongest interaction term, confirming that optimal IPTG concentration depends on induction temperature.

Stage 3: Model Validation and Confirmation Runs

A DOE model is only as reliable as its validation. Run 3–5 independent experiments at the predicted optimum conditions and compare actual vs. predicted responses. The actual values must fall within the 95% prediction interval for the model to be considered validated.

Model Diagnostic Checklist

Before running confirmation experiments, verify these model diagnostics:

R² > 0.85 — the model explains at least 85% of response variance.
Adjusted R² − Predicted R² < 0.20 — if this gap exceeds 0.20, the model may be overfitting. Consider removing non-significant terms.
Adequate precision > 4 — the signal-to-noise ratio is sufficient for the model to navigate the design space.
Lack of fit: p > 0.05 — the model adequately captures the data structure (non-significant lack of fit is good).
Normal probability plot — residuals should fall on a straight line with no outliers or patterns.
Residuals vs. predicted — residuals should scatter randomly with no funnel shape (heteroscedasticity) or curvature.
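The numeric items in this checklist can be screened automatically before anyone looks at the residual plots. The function below is a hypothetical helper, not part of any DOE package. The R² values come from the worked example, while the adequate precision and lack-of-fit p-value are placeholder inputs that the example does not report.

```python
def check_diagnostics(r2, adj_r2, pred_r2, adequate_precision, lack_of_fit_p):
    """Return pass/fail flags for the numeric checklist thresholds."""
    return {
        "R2 > 0.85": r2 > 0.85,
        "Adj R2 - Pred R2 < 0.20": (adj_r2 - pred_r2) < 0.20,
        "Adequate precision > 4": adequate_precision > 4,
        "Lack of fit p > 0.05": lack_of_fit_p > 0.05,
    }

report = check_diagnostics(
    r2=0.94, adj_r2=0.91, pred_r2=0.84,   # from the worked example
    adequate_precision=12.0,              # HYPOTHETICAL placeholder
    lack_of_fit_p=0.30,                   # HYPOTHETICAL placeholder
)
for name, passed in report.items():
    print(f"{name}: {'pass' if passed else 'FAIL'}")

ready_for_confirmation = all(report.values())
print("Proceed to confirmation runs:", ready_for_confirmation)
```

The two plot-based checks (normal probability and residuals vs. predicted) still need visual review; a passing numeric screen does not replace them.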

Worked Example — Stage 3: Confirmation Runs

Predicted optimum: T = 28.5 °C, IPTG = 0.4 mM, pH = 7.05

Predicted titer: 4.12 g/L (95% PI: 3.6–4.6 g/L)

Confirmation run results (n = 5):
Run 1: 4.08 g/L ✓
Run 2: 3.91 g/L ✓
Run 3: 4.22 g/L ✓
Run 4: 3.85 g/L ✓
Run 5: 4.15 g/L ✓

Mean: 4.04 ± 0.16 g/L (mean ± sample SD, n = 5)
All values within 95% PI (3.6–4.6) → Model validated ✓
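The confirmation check is simple enough to script. This sketch recomputes the summary statistics from the five runs and tests each run against the 95% prediction interval; all numbers are taken from the worked example.

```python
from statistics import mean, stdev

confirmation_runs = [4.08, 3.91, 4.22, 3.85, 4.15]  # g/L
pi_low, pi_high = 3.6, 4.6                           # 95% prediction interval, g/L
predicted = 4.12                                     # g/L

run_mean = mean(confirmation_runs)
run_sd = stdev(confirmation_runs)                    # sample SD (n - 1 denominator)
all_within_pi = all(pi_low <= y <= pi_high for y in confirmation_runs)
relative_bias = (run_mean - predicted) / predicted

print(f"mean = {run_mean:.2f} g/L, SD = {run_sd:.2f} g/L")
print(f"all within PI: {all_within_pi}, bias vs. prediction: {relative_bias:+.1%}")
```

Each run is compared individually against the interval because a prediction interval describes where a single future observation is expected to fall.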

From Optimum to Design Space

For regulatory submissions under QbD (ICH Q8(R2)), the validated model defines a design space: the multidimensional region of factor combinations where product quality is assured. The design space is typically narrower than the experimental range and is established by setting acceptance criteria on all critical quality attributes (CQAs) simultaneously.

Edge-of-failure experiments at the design space boundaries provide evidence that the process is robust. Regulatory agencies expect evidence that the process performs acceptably throughout the entire design space, not just at the optimum point.

Figure 4. Model validation decision flow for DOE in bioprocessing. Each stage has pass/fail criteria with corrective actions for failures. Confirmation runs at the predicted optimum are the final validation step before defining the design space.

DOE Software and Tools for Bioprocess Engineers

Selecting the right DOE software depends on your organization’s needs and budget. All major platforms support the design types discussed in this guide (PB, fractional factorial, CCD, BBD, DSD). The differences lie in ease of use, visualization capabilities, and bioprocess-specific features.

Table 3. DOE software comparison for bioprocess applications (as of 2026)
Software	License Type	Strengths	Bioprocess Features
JMP (SAS)	Commercial (~$1,800/yr)	Visual design builder, DSD support, contour profiler	Bioprocess DOE tutorials, pharma templates
Design-Expert (Stat-Ease)	Commercial (~$1,500/yr)	RSM specialist, 3D surface plots, ANOVA diagnostics	Mixture designs for media optimization
Minitab	Commercial (~$1,600/yr)	Wide statistical toolset, SPC integration	QC/validation workflows, control charts
MODDE (Sartorius)	Commercial	Built for bioprocess, MVDA integration	Ambr/bioreactor data import, design space explorer
Python (pyDOE2, statsmodels)	Open source	Flexible, scriptable, CI/CD integration	Custom models, Bayesian optimization integration
R (rsm, FrF2, AlgDesign)	Open source	Publication-quality plots, extensive packages	Custom optimal designs, mixed models

Table 3. Comparison of DOE software platforms commonly used in bioprocess development. Commercial tools offer better visualization and support, while open-source options provide flexibility and scriptability.

An emerging trend as of 2025–2026 is the integration of Bayesian optimization with DOE. Tools like BayBE (Merck KGaA), Obsidian (Merck & Co.), and ProcessOptimizer (Novo Nordisk) use Gaussian process models to adaptively select the next experiment based on all previous results, reducing total experiments by approximately 50–69% compared to classical RSM for equivalent optimization performance.

Frequently Asked Questions

How many experiments do I need for DOE optimization in bioprocessing?

The number of experiments depends on the design type and number of factors. A Plackett-Burman screening design for 7 factors requires only 12 runs. A Central Composite Design (CCD) for 3 factors needs 20 runs (8 factorial + 6 axial + 6 center points). Definitive Screening Designs need only 2k+1 runs (e.g., 13 runs for 6 factors) and can estimate all main effects plus, when only a few factors prove active, the interactions and quadratic terms among those factors in a single step.

What is the difference between screening and optimization DOE designs?

Screening designs (Plackett-Burman, fractional factorial) test many factors (6–15) in few runs to identify which 3–5 factors significantly affect the response. They detect main effects but not interactions or curvature. Optimization designs (CCD, Box-Behnken) test fewer factors (2–5) at more levels to build a response surface model with quadratic terms, enabling precise identification of optimum conditions.

How does DOE compare to one-factor-at-a-time (OFAT) in bioprocessing?

DOE typically achieves 1.3–2× higher yield improvements compared to OFAT because it detects factor interactions that OFAT misses entirely. OFAT uses a comparable number of experiments while exploring far less of the design space. For a 4-factor optimization, OFAT might need 20+ sequential experiments and still miss the true optimum, whereas a CCD covers the design space in 27–30 runs including center-point replicates.

What software is best for DOE in bioprocessing?

JMP (SAS) and Design-Expert (Stat-Ease) are the most widely used DOE software in bioprocessing due to their visual design builders and response surface tools. Minitab is popular in QC environments. For open-source options, Python’s pyDOE2 and R’s rsm and FrF2 packages provide full DOE capability. MODDE (Sartorius) is specifically designed for bioprocess applications with built-in templates.

What is a Definitive Screening Design and when should I use it?

A Definitive Screening Design (DSD) is a modern DOE approach introduced by Jones and Nachtsheim (2011) that estimates all main effects and, provided only a few factors are active, their two-factor interactions and quadratic effects in a single design requiring only 2k+1 runs for k factors. Use DSDs when you have 5–16 factors and want to combine screening and optimization into one step, reducing total experiments by over 50% compared to the traditional two-stage approach.

How do you validate a DOE model for bioprocess optimization?

Validate a DOE model by running 3–5 confirmation runs at the predicted optimum conditions and comparing actual vs. predicted responses. The actual values should fall within the prediction interval (typically 95% PI). Also check model diagnostics: R² should exceed 0.85, predicted R² should be within 0.20 of adjusted R², adequate precision should exceed 4, and residuals should show no patterns in normal probability and residuals vs. predicted plots.

References

Mandenius, C.F. & Brundin, A. (2008). Bioprocess optimization using design-of-experiments methodology. Biotechnology Progress, 24(6), 1191–1203. DOI: 10.1002/btpr.67
Jones, B. & Nachtsheim, C.J. (2011). A class of three-level designs for definitive screening in the presence of second-order effects. Journal of Quality Technology, 43(1), 1–15.
Politis, S.N. et al. (2021). Design of experiments and design space approaches in pharmaceutical bioprocess optimization. European Journal of Pharmaceutics and Biopharmaceutics, 166, 208–221. DOI: 10.1016/j.ejpb.2021.06.004
Papathanasiou, M.M. & Experiment review (2023). A review of algorithmic approaches for cell culture media optimization. Frontiers in Bioengineering and Biotechnology, 11, 1195294. DOI: 10.3389/fbioe.2023.1195294
Gisperg, G.F. et al. (2025). Bayesian Optimization in Bioprocess Engineering — Where Do We Stand Today? Biotechnology and Bioengineering. DOI: 10.1002/bit.28960
