
RFdiffusion3: All-Atom Protein Design at 10x Speed

RFdiffusion3 from the Institute for Protein Design at UW enables de novo design of all-atom biomolecular interactions, operating approximately 10x faster than its predecessor and outperforming on 37 of 41 enzyme scaffold benchmarks by inverting AlphaFold3's prediction framework into a generative model.

By Sean K.S. Shin
This blog summarizes research trends based on published paper abstracts. Specific numbers or findings may contain inaccuracies. For scholarly rigor, always consult the original papers cited in each post.

Protein design has historically operated at two levels of resolution. At the backbone level, designers specify the overall fold: the arrangement of alpha-helices, beta-sheets, and loops that define a protein's shape. At the all-atom level, every atom's position matters: the precise geometry of a catalytic site, the hydrogen bonding network at a protein-protein interface, the placement of water molecules that mediate binding. Most computational design tools have worked at the backbone level, leaving all-atom detail to subsequent refinement steps that are slow, approximate, and often require manual intervention.

RFdiffusion3, from David Baker's Institute for Protein Design at the University of Washington, closes this gap. It is a generative model that designs proteins at all-atom resolution from the outset, producing complete atomic structures rather than backbone traces that must be elaborated. It does so approximately 10x faster than RFdiffusion2, and it outperforms its predecessor on 37 of 41 enzyme scaffold benchmarks.

From Prediction to Generation: Inverting AlphaFold3

The conceptual architecture of RFdiffusion3 is best understood as an inversion of AlphaFold3. AlphaFold3 takes a biomolecular system (protein sequences, nucleic acids, small molecules, ions) and predicts their three-dimensional arrangement. It is a structure prediction model: given components, output structure.

RFdiffusion3 runs this logic in reverse. It starts from a desired structural outcome (a binding interface, a catalytic geometry, a scaffolded functional site) and generates the protein sequence and structure that would produce it. This inversion leverages the same learned representations of biomolecular physics that make AlphaFold3 accurate at prediction, repurposed for generation through a diffusion framework.

The diffusion process works by iteratively denoising a random atomic cloud into a coherent structure. At each step, the model applies its understanding of physical constraints (bond angles, van der Waals radii, hydrogen bonding geometries, hydrophobic packing) to move atoms toward chemically and physically plausible positions. The "all-atom" designation is critical: unlike backbone-only diffusion, RFdiffusion3 models sidechain atoms, ligand atoms, and solvent-exposed surfaces simultaneously, producing designs that are closer to experimentally realizable structures without post-hoc refinement.
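To make the idea of iterative denoising concrete, here is a deliberately toy sketch in pure Python. It is not RFdiffusion3's network or sampler. The `toy_denoiser` function is a hypothetical stand-in for the learned model and simply returns a known target geometry, and the target coordinates, step count, and noise schedule are all assumptions chosen for illustration.

```python
import math
import random

def rmsd(a, b):
    """Root-mean-square deviation between two equal-length lists of 3D points."""
    return math.sqrt(sum(math.dist(p, q) ** 2 for p, q in zip(a, b)) / len(a))

def toy_denoiser(noisy_atoms, target):
    # Hypothetical stand-in for a trained network. A real model would predict
    # the clean structure from the noisy input and the design constraints;
    # this toy version just returns the known target.
    return [tuple(p) for p in target]

def sample(target, steps=50, start_sigma=10.0, seed=0):
    rng = random.Random(seed)
    # Start from a random "atomic cloud" drawn from a wide Gaussian.
    atoms = [tuple(rng.gauss(0.0, start_sigma) for _ in range(3)) for _ in target]
    history = [rmsd(atoms, target)]
    for step in range(steps):
        predicted = toy_denoiser(atoms, target)
        # Move a growing fraction of the way toward the prediction
        # (the fraction reaches 1.0 on the final step).
        blend = 1.0 / (steps - step)
        # Re-inject a shrinking amount of noise (zero on the final step).
        sigma = 0.1 * start_sigma * (1.0 - (step + 1) / steps)
        atoms = [
            tuple(a + blend * (p - a) + rng.gauss(0.0, sigma)
                  for a, p in zip(atom, pred))
            for atom, pred in zip(atoms, predicted)
        ]
        history.append(rmsd(atoms, target))
    return atoms, history

# Invented three-atom target geometry, in angstroms.
target = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
final_atoms, history = sample(target)
```

The loop keeps only the general shape of the procedure: start from noise, repeatedly ask a model for its best guess of the clean structure, step toward that guess, and reduce the injected noise until the structure settles. Everything that makes the real method work lives inside the learned denoiser, which this sketch does not attempt to reproduce.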

Performance: 37 of 41 Benchmarks

The preprint (bioRxiv 2025.09.18.676967) reports that RFdiffusion3 outperforms RFdiffusion2 on 37 of 41 enzyme scaffold design benchmarks. These benchmarks evaluate the model's ability to generate protein scaffolds that position catalytic residues in geometries compatible with enzymatic function.

| Claim | Source | Confidence | Status |
| --- | --- | --- | --- |
| RFdiffusion3 enables de novo design of all-atom biomolecular interactions | bioRxiv 2025.09.18.676967 abstract | High | Stated in abstract |
| Approximately 10x faster than RFdiffusion2 | bioRxiv 2025.09.18.676967 abstract | High | Stated in abstract |
| Outperforms on 37 of 41 enzyme scaffold benchmarks | bioRxiv 2025.09.18.676967 abstract | High | Stated in abstract |
| Inverts AlphaFold3 prediction framework into generative model | bioRxiv 2025.09.18.676967 abstract | High | Stated in abstract |

Enzyme design is a demanding test case because catalytic function depends on sub-angstrom positioning of key residues. A designed enzyme that places a catalytic triad's residues even 0.5 angstroms from their optimal positions may show dramatically reduced activity. The fact that RFdiffusion3 outperforms on the vast majority of these benchmarks while operating at all-atom resolution suggests that the model has learned meaningful representations of the geometric requirements for enzyme function.
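One simple way to express this kind of tolerance is a root-mean-square deviation (RMSD) check between designed and reference catalytic atom positions. The sketch below is illustrative only. The coordinates are invented, the 0.5 angstrom cutoff mirrors the figure used above rather than any threshold from the preprint, and the code assumes the two structures have already been superimposed in the same coordinate frame.

```python
import math

def rmsd(designed, reference):
    """RMSD in angstroms between matched atoms of two pre-aligned structures."""
    if not designed or len(designed) != len(reference):
        raise ValueError("need two non-empty, equal-length coordinate lists")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(designed, reference))
    return math.sqrt(squared / len(designed))

def within_tolerance(designed, reference, cutoff=0.5):
    # cutoff is an assumed tolerance in angstroms, not a published value.
    return rmsd(designed, reference) <= cutoff

# Hypothetical reference positions for three catalytic atoms (angstroms).
reference = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]

# Two made-up designs: every atom displaced by 0.3 or by 0.6 angstroms along x.
close_design = [(x + 0.3, y, z) for x, y, z in reference]
far_design = [(x + 0.6, y, z) for x, y, z in reference]

close_ok = within_tolerance(close_design, reference)
far_ok = within_tolerance(far_design, reference)
```

With these invented inputs, the first design passes the check and the second fails it. Real benchmark metrics are typically richer than a single RMSD value, so this should be read as a picture of the tolerance argument, not as the evaluation used in the preprint.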

The four benchmarks where RFdiffusion2 still outperforms RFdiffusion3 deserve attention. Without detailed analysis of which specific enzyme geometries are involved, it is difficult to determine whether these represent systematic weaknesses in the new architecture or statistical noise in a comparison across 41 test cases.
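A rough way to frame the "statistical noise" question is a one-sided sign test: if the two models were equally good, how often would one of them win at least 37 of 41 head-to-head comparisons by chance? The sketch below is a back-of-the-envelope calculation, not the analysis used in the preprint. It assumes the 41 benchmarks are independent, that there are no ties, and that each comparison is a fair coin flip under the null hypothesis.

```python
from math import comb

def sign_test_p_value(wins, total):
    """One-sided P(X >= wins) for X ~ Binomial(total, 0.5)."""
    tail = sum(comb(total, k) for k in range(wins, total + 1))
    return tail / 2 ** total

# 37 wins out of 41 benchmarks, as reported in the abstract.
p_value = sign_test_p_value(37, 41)
print(f"one-sided sign-test p-value: {p_value:.2e}")
```

Under these assumptions the p-value is on the order of 10^-8, so the overall lead is very unlikely to be a coincidence. That says nothing about the four losses themselves. Whether they share a common cause can only be answered by looking at which enzyme geometries they involve.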

The 10x Speed Improvement

The approximate 10x speed improvement over RFdiffusion2 has practical consequences that extend beyond convenience. Protein design workflows are iterative: designers generate many candidates, filter them computationally, synthesize the most promising ones, and test them experimentally. The experimental steps (gene synthesis, protein expression, purification, and functional assays) are slow and expensive. Computational design speed determines how many candidates can be generated and filtered before committing to experimental resources.

A 10x speedup means that the same computational budget produces an order of magnitude more candidate designs. The success rate of designed proteins, meaning the fraction that fold correctly and function as intended when actually synthesized, remains well below 100%. If that per-design success rate holds steady, generating ten times more candidates per design cycle raises the expected number of functional designs tenfold and sharply increases the chance that at least one candidate works.
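The effect of a larger candidate pool can be made concrete with a small calculation. The sketch below assumes each design succeeds independently with the same probability. The 1% success rate and the pool sizes of 100 and 1,000 are hypothetical numbers chosen for illustration, not figures from the preprint.

```python
def expected_hits(success_rate, n_candidates):
    """Expected number of functional designs in a pool of candidates."""
    return success_rate * n_candidates

def prob_at_least_one_hit(success_rate, n_candidates):
    """Chance that at least one candidate works, assuming independent trials."""
    return 1.0 - (1.0 - success_rate) ** n_candidates

success_rate = 0.01   # hypothetical per-design success rate
baseline_pool = 100   # hypothetical number of designs at the old speed
faster_pool = 10 * baseline_pool  # same budget at 10x the throughput

for pool in (baseline_pool, faster_pool):
    print(
        f"{pool:5d} designs: expected hits = {expected_hits(success_rate, pool):.1f}, "
        f"P(at least one) = {prob_at_least_one_hit(success_rate, pool):.4f}"
    )
```

With these made-up inputs, the expected number of hits scales linearly from 1 to 10, while the chance of at least one hit rises from roughly 0.63 to above 0.999. The probability saturates as it approaches 1, so "multiplier" is strictly accurate for expected hits and not for the probability itself.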

The speed improvement also lowers the barrier to applying RFdiffusion3 to larger and more complex design problems. All-atom design of a 500-residue protein interacting with a small-molecule ligand, a metal cofactor, and a partner protein involves positioning thousands of atoms simultaneously. At RFdiffusion2 speeds, such problems might require days of GPU time per design; at 10x faster, they become tractable as routine design tasks.

All-Atom Design: Why It Matters

The transition from backbone-level to all-atom design addresses a persistent gap in the protein design pipeline. Previous workflows required a two-step process: first, design a backbone fold using RFdiffusion or similar tools; second, place sidechains and optimize their conformations using tools like Rosetta's packer or other sidechain placement algorithms. Each step introduced approximations, and errors in backbone design could not always be corrected by sidechain optimization.

All-atom design integrates these steps, allowing the model to jointly optimize backbone geometry and sidechain positioning. This is particularly important for designing protein-small molecule interactions, where the binding pocket geometry depends on precise sidechain arrangements, and for protein-protein interfaces, where complementarity extends to the atomic level.

Open Questions

  • Experimental validation rate: Computational outperformance on benchmarks does not guarantee improved experimental success rates. What fraction of RFdiffusion3's top-ranked designs fold correctly and function as intended when synthesized, and how does this compare to RFdiffusion2?
  • Dynamics and flexibility: All-atom design produces static structures, but proteins are dynamic. Does RFdiffusion3's all-atom accuracy extend to designing proteins whose function depends on conformational changes, allosteric regulation, or intrinsically disordered regions?
  • Small molecule generalization: The enzyme scaffold benchmarks test catalytic site geometry, but therapeutic protein design often involves designing interactions with drug-like small molecules that are not enzyme substrates. How well does the model generalize to these chemically diverse targets?
  • Accessibility and compute requirements: At what computational cost does RFdiffusion3 operate, and is the 10x speedup sufficient to make all-atom design accessible to academic labs without large GPU clusters?

The inversion of structure prediction into structure generation represents a conceptual advance in how the field uses deep learning for molecular design. By repurposing the physics learned through prediction tasks, RFdiffusion3 brings the protein design community closer to a workflow where atomic-level design intent can be directly expressed and realized.


References

Institute for Protein Design, University of Washington. (2025). De novo design of all-atom biomolecular interactions with RFdiffusion3. bioRxiv 2025.09.18.676967.
