Paper Review · Computer Systems · Experimental Design

Gigapixel Pathology at Scale: Distributed Computing for Whole-Slide Image Analysis

A single whole-slide pathology image can exceed 10 gigapixels, far too large for any single GPU to process. ComPRePS 2.0 demonstrates how HPC clusters can process these images in parallel, enabling computational pathology at the scale needed for population-level cancer screening.

By Sean K.S. Shin
This blog summarizes research trends based on published paper abstracts. Specific numbers or findings may contain inaccuracies. For scholarly rigor, always consult the original papers cited in each post.

The digitization of pathology has created a data challenge unlike anything else in medicine. A single whole-slide image (WSI), a microscopy scan of a tissue sample at diagnostic resolution, typically contains 1 to 10 gigapixels, with large specimens at 40× magnification reaching 10 gigapixels or more. A moderate-sized hospital generates thousands of WSIs daily. A national cancer screening program processes millions annually.

Processing these images with AI (detecting cancer cells, grading tumors, quantifying biomarkers) requires computational resources that no single machine can provide. A single WSI may take minutes to process on a high-end GPU; multiplied by thousands or millions of images, the total computation is enormous. The bottleneck is not the AI model (which is relatively compact) but the data pipeline: reading multi-gigabyte image files, tiling them into processable patches, distributing patches across compute nodes, running inference, and aggregating results back into slide-level predictions.
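
To make the scale concrete, here is a rough back-of-envelope estimate. Both inputs are illustrative assumptions, not figures from the paper:

```python
# Back-of-envelope throughput estimate. Both inputs are illustrative
# assumptions, not numbers reported by Kumar et al.
SECONDS_PER_SLIDE = 120      # assume ~2 minutes of GPU time per WSI
SLIDES_PER_DAY = 5_000       # assume a large hospital's daily volume

gpu_hours_per_day = SECONDS_PER_SLIDE * SLIDES_PER_DAY / 3600
print(f"GPU-hours per day: {gpu_hours_per_day:.0f}")   # -> 167

# Finishing a day's slides within 24 hours needs this many GPUs busy
# around the clock, before any I/O or coordination overhead:
print(f"GPUs needed: {gpu_hours_per_day / 24:.1f}")    # -> 6.9
```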

Kumar et al.'s ComPRePS 2.0 tackles this pipeline challenge head-on, demonstrating how HPC clusters can be organized to process histopathological data at the scale required for clinical deployment.

The Data Pipeline Challenge

Processing a WSI with AI involves a pipeline that is I/O-intensive, compute-intensive, and coordination-intensive in roughly equal measure:

Reading: WSIs are stored in pyramidal formats (SVS, NDPI, MRXS) where the full-resolution image is accompanied by lower-resolution overview layers. Reading the full-resolution layer requires streaming gigabytes of compressed image data from storage, a process that is often storage-bandwidth-limited rather than compute-limited.
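
As a concrete illustration of that access pattern, here is a minimal sketch using the openslide-python bindings. The file name is a placeholder, and the paper's actual reader implementation is not specified in the abstract:

```python
# Minimal sketch of pyramidal WSI reading with openslide-python.
import openslide

slide = openslide.OpenSlide("specimen.svs")  # hypothetical file

# The pyramid: level 0 is full resolution, higher levels are downsampled.
for level, (w, h) in enumerate(slide.level_dimensions):
    print(f"level {level}: {w} x {h} pixels")

# Reading a region decompresses only the tiles that cover it, so the
# multi-gigapixel level-0 image never has to fit in memory at once.
region = slide.read_region(location=(0, 0), level=0, size=(512, 512))
slide.close()
```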

Tiling: The full-resolution image is divided into overlapping tiles (typically 256×256 or 512×512 pixels) that can be processed independently by the AI model. A single WSI may produce 50,000 to 200,000 tiles. Managing this tile set requires careful bookkeeping: tracking coordinates, handling overlap regions, and maintaining tissue masks to skip background tiles.
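
A sketch of how background skipping might work, assuming a thresholded low-resolution tissue mask; the function and its parameters are illustrative, not ComPRePS 2.0's implementation:

```python
# Background-skipping tile enumeration (assumed logic). A tile origin is
# kept only if its footprint in a low-resolution tissue mask has tissue.
import numpy as np

def tile_coords(slide_w, slide_h, mask, tile=512, overlap=64):
    """Yield (x, y) origins of tiles whose footprint intersects tissue."""
    scale = slide_w / mask.shape[1]       # slide pixels per mask pixel
    step = tile - overlap
    ms = max(1, int(tile / scale))        # tile footprint in mask pixels
    for y in range(0, slide_h - tile + 1, step):
        for x in range(0, slide_w - tile + 1, step):
            mx, my = int(x / scale), int(y / scale)
            if mask[my:my + ms, mx:mx + ms].any():
                yield x, y

# Toy example: a fake 1/64-scale mask for a 100k x 80k pixel slide.
mask = np.zeros((80_000 // 64, 100_000 // 64), dtype=bool)
mask[200:600, 300:900] = True             # pretend tissue blob
coords = list(tile_coords(100_000, 80_000, mask))
print(f"{len(coords)} foreground tiles kept; background skipped")
```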

Inference: Each tile is processed by a deep learning model that classifies tissue type, detects cellular features, or segments structures. Individual tile inference is fast (milliseconds on a GPU), but the volume of tiles makes total inference time substantial.
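
The inference stage, sketched with a stand-in PyTorch classifier; the tile tensor and model are placeholders, since the paper's models are not specified:

```python
# Batched tile inference with a stand-in classifier (placeholder model
# and data; real tiles would be 256x256 or 512x512 RGB patches).
import torch

model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 8, 3), torch.nn.AdaptiveAvgPool2d(1),
    torch.nn.Flatten(), torch.nn.Linear(8, 2),
).eval()

tiles = torch.rand(2_048, 3, 64, 64)   # small stand-in tile stack
scores = []
with torch.no_grad():
    for batch in tiles.split(256):     # batching amortizes per-call overhead
        scores.append(model(batch).softmax(dim=1)[:, 1])
scores = torch.cat(scores)             # one "tumor" probability per tile
print(scores.shape)                    # torch.Size([2048])
```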

Aggregation: Tile-level predictions must be aggregated into slide-level results, combining thousands of tile predictions into a single diagnosis, tumor grade, or biomarker quantification. The aggregation logic must handle tile boundaries (features that span multiple tiles) and spatial context (a cluster of positive tiles is more significant than isolated positives).
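
One common aggregation strategy, sketched below under the assumption of a grid layout and connected-component clustering; this is not necessarily the paper's rule:

```python
# Spatial aggregation sketch: paint tile probabilities onto a slide-level
# grid, then require a contiguous cluster of positive tiles rather than
# isolated hits. Thresholds are illustrative.
import numpy as np
from scipy import ndimage

def slide_positive(coords, probs, grid_shape, step, thresh=0.5, min_cluster=5):
    grid = np.zeros(grid_shape)
    for (x, y), p in zip(coords, probs):
        grid[y // step, x // step] = p
    positive = grid > thresh
    labels, n = ndimage.label(positive)       # connected positive regions
    if n == 0:
        return False
    sizes = ndimage.sum(positive, labels, range(1, n + 1))
    return bool(sizes.max() >= min_cluster)   # demand a real cluster
```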

The ComPRePS 2.0 Architecture

ComPRePS 2.0 distributes this pipeline across an HPC cluster with three key design decisions:

Task-level parallelism: Each WSI is processed as an independent task. Tasks are distributed across cluster nodes by a job scheduler that balances load and respects resource constraints (GPU memory, storage bandwidth).
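
In spirit, each slide maps to one scheduler job; a local process pool gives a single-machine approximation of the same pattern. The scheduler details below are assumptions, not the paper's configuration:

```python
# One-WSI-per-task scheduling, approximated with a local process pool.
# On a real cluster each call would be a scheduler job (e.g. one SLURM
# array task per slide); paths and the worker body are placeholders.
from concurrent.futures import ProcessPoolExecutor

def process_slide(path):
    # read -> tile -> infer -> aggregate for one slide (stubbed out)
    return path, "benign"

if __name__ == "__main__":
    slides = [f"slide_{i:04d}.svs" for i in range(64)]
    with ProcessPoolExecutor(max_workers=8) as pool:  # 8 stand-in "nodes"
        for path, verdict in pool.map(process_slide, slides):
            print(path, verdict)
```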

Pipeline parallelism within tasks: Within a single WSI, the read-tile-infer-aggregate stages overlap: while one batch of tiles is being processed by the GPU, the next batch is being read and tiled by the CPU, and the previous batch's results are being aggregated. This pipelining hides I/O latency behind compute time.
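
A minimal sketch of that overlap using a bounded queue between a reader thread and the inference loop; the structure is assumed, since ComPRePS 2.0's internals are not detailed in the abstract:

```python
# Intra-slide pipeline parallelism: a CPU thread reads and tiles the next
# batch while the main thread runs inference on the current one.
import queue, threading

def reader(batches, q):
    for batch in batches:          # each batch: tiles already decoded
        q.put(batch)               # blocks if inference is falling behind
    q.put(None)                    # sentinel: no more batches

def run_pipeline(batches, infer):
    q = queue.Queue(maxsize=4)     # bounded buffer between stages
    t = threading.Thread(target=reader, args=(batches, q))
    t.start()
    results = []
    while (batch := q.get()) is not None:
        results.append(infer(batch))   # compute overlaps the next read
    t.join()
    return results

# Toy usage: "reading" is a generator, "inference" is sum().
print(run_pipeline(([i] * 3 for i in range(10)), infer=sum))
```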

Storage optimization: WSIs are pre-processed to extract tissue masks and tile coordinates before the inference phase begins. This pre-processing can be done once and cached, avoiding redundant computation when the same slide is processed with different AI models.
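
The caching idea in miniature; the cache directory and JSON layout are assumptions for illustration:

```python
# Cache-once preprocessing sketch: tile coordinates are keyed by slide ID
# so re-running with a different AI model skips the expensive pass.
import json, pathlib

CACHE = pathlib.Path("preproc_cache")         # hypothetical cache directory

def get_tile_coords(slide_id, compute_fn):
    entry = CACHE / f"{slide_id}.json"
    if entry.exists():                        # cache hit: skip preprocessing
        return json.loads(entry.read_text())
    coords = compute_fn(slide_id)             # expensive mask + tiling pass
    CACHE.mkdir(exist_ok=True)
    entry.write_text(json.dumps(coords))
    return coords
```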

Claims and Evidence

| Claim | Evidence | Verdict |
| --- | --- | --- |
| WSI processing is computationally intensive at clinical scale | Data volume analysis for hospital and population-scale pathology | ✅ Well-documented |
| HPC clusters can parallelize WSI processing effectively | ComPRePS 2.0 demonstrates parallel processing across cluster nodes | ✅ Supported |
| Pipeline parallelism hides I/O latency | Overlapping read/compute/aggregate stages demonstrated | ✅ Supported |
| Current computational pathology systems meet clinical throughput requirements | Processing speed depends on cluster size and model complexity | ⚠️ Achievable but resource-intensive |

Open Questions

  • Real-time pathology: Can computational pathology achieve fast enough turnaround for intra-operative consultation, where a surgeon waits for a diagnosis while the patient is on the operating table? This requires processing in minutes, not hours.
  • Cloud vs. on-premises: Clinical data governance often requires processing within the hospital network. Should computational pathology run on cloud HPC (scalable but raises data sovereignty concerns) or on-premises clusters (secure but limited in scale)?
  • Multi-stain integration: Pathologists use multiple staining techniques (H&E, IHC, special stains) on consecutive tissue sections. AI systems that integrate multi-stain information require cross-slide registration (aligning images from different stains of adjacent tissue sections), which adds geometric complexity to the processing pipeline.
  • Quality control: Not all WSIs are suitable for AI analysis: out-of-focus regions, tissue folds, air bubbles, and staining artifacts can cause incorrect predictions. Automated quality control that detects and flags problematic regions before inference improves reliability but adds processing overhead.
What This Means for Your Research

For computational pathology researchers, the systems infrastructure for processing WSIs at scale is as important as the AI models that analyze them. A model that achieves 99% accuracy on a curated benchmark but cannot process a hospital's daily slide volume is not clinically useful. ComPRePS 2.0 demonstrates that the infrastructure challenge is solvable with careful pipeline design.

For HPC researchers, digital pathology provides a domain with clear throughput requirements, well-defined pipeline stages, and enormous data volumes, characteristics that map well onto HPC cluster architectures. The challenge of processing petabytes of image data with AI models is shared with satellite remote sensing, genomics, and other data-intensive scientific domains.

References (1)

[1] Katari Chaluva Kumar, S., Paul, A. S., Abdelazim, H., Dunklin, W., Manthey, D., Moskalenko, O., et al. (2025). ComPRePS 2.0: enabling massive-scale distributed computing on high-performance computing cluster for histopathological data processing. Medical Imaging 2025: Digital and Computational Pathology, SPIE, 44.
