What a Cold War Spy Satellite and AI Found Beneath the Iraqi Desert

By Sean K.S. Shin
This blog summarizes research trends based on published paper abstracts. Specific numbers or findings may contain inaccuracies. For scholarly rigor, always consult the original papers cited in each post.

The Photographs Nobody Was Supposed to See

In the early 1960s, at the height of the Cold War, the United States launched a series of reconnaissance satellites under the codename CORONA. Their mission was straightforward espionage: photograph Soviet military installations, Chinese nuclear facilities, and Middle Eastern infrastructure from orbit. The cameras captured thousands of images of Iraq's landscape, including the flat agricultural plains west of Baghdad around Abu Ghraib, before the film canisters were ejected from orbit, snagged mid-air by military aircraft, and locked away in classified vaults.

For decades, those photographs gathered dust. Then, in 1995, President Clinton ordered the CORONA archive declassified. Historians and geographers suddenly had access to high-resolution images of landscapes that no longer existed, terrain that had been bulldozed, irrigated, urbanized, or bombed into unrecognizability in the intervening decades.

Sixty years after those spy satellites photographed Iraq, a team of archaeologists and computer scientists trained an AI on the declassified images. What the AI found was both a discovery and an elegy: four previously unknown archaeological sites, along with the confirmation that 31 known sites had been completely destroyed, erased from the modern landscape with no surface trace remaining. The ancient Mesopotamian tells, the accumulated mounds of millennia of human habitation, had simply vanished.

This is the story of how Cold War espionage, artificial intelligence, and the race against destruction are converging to reshape our understanding of the ancient world.

Seeing What Has Been Lost

The term "tell" refers to an artificial mound formed by the accumulated remains of ancient settlements, layer upon layer of mud-brick architecture, refuse, and rebuilding over centuries or millennia. In Mesopotamia, tells are the primary archaeological signature of ancient habitation. They dot the landscape of modern Iraq, Syria, and Turkey, each one a compressed archive of human history. But tells are vulnerable. Modern agriculture, urban expansion, military operations, and looting have destroyed thousands of them across the Middle East, often before they could be surveyed, let alone excavated.

Pistola, Orru, Marchetti, and Roccetti (2025) recognized that the CORONA satellite photographs, taken before much of this destruction occurred, offered a unique temporal window. The images captured the Iraqi landscape in a state that is now irrecoverable. The challenge was scale: manually scanning thousands of square kilometers of satellite imagery for subtle topographic anomalies is prohibitively slow. The solution was a convolutional neural network.

The team developed a model they called BingCORONA, built on the MANet (Multi-scale Attention Network) architecture, and trained it on paired datasets: declassified CORONA photographs from the 1960s and modern Bing satellite imagery. The AI learned to identify the characteristic signatures of tells (subtle circular or oval mounds, slight elevation changes, discoloration patterns) across both historical and contemporary images.

The results were striking. BingCORONA achieved 90% accuracy and 88% recall in detecting tells across the Abu Ghraib district west of Baghdad. More importantly, it identified eight candidate sites that were not in any existing archaeological database. The team then conducted ground-truthing fieldwork, and four of those eight AI-suggested sites were confirmed as genuine archaeological sites. The AI had found traces of ancient settlements that had eluded decades of conventional survey.
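For readers less familiar with the two metrics quoted above, here is a minimal sketch of how accuracy and recall are computed from a confusion matrix, together with the field confirmation rate. The confusion-matrix counts are hypothetical, chosen only so the arithmetic lands on the reported figures; they are not taken from Pistola et al., and the paper may compute its metrics at a different granularity (for example per pixel rather than per site). Only the 4-of-8 confirmation count comes from the text.

```python
def accuracy(tp: int, fp: int, tn: int, fn: int) -> float:
    """Share of all predictions (site or non-site) that were correct."""
    return (tp + tn) / (tp + fp + tn + fn)


def recall(tp: int, fn: int) -> float:
    """Share of true sites that the model actually flagged."""
    return tp / (tp + fn)


# HYPOTHETICAL counts for illustration only (not from the paper).
tp, fp, tn, fn = 44, 4, 46, 6

print(f"accuracy = {accuracy(tp, fp, tn, fn):.2f}")
print(f"recall   = {recall(tp, fn):.2f}")

# From the text: 8 AI-suggested candidates, 4 confirmed by fieldwork.
confirmation_rate = 4 / 8
print(f"field confirmation rate = {confirmation_rate:.0%}")
```

The distinction matters for survey work: recall measures how many real tells the model would miss, which is arguably the costlier error when the sites in question are disappearing.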

But the study also documented loss on an enormous scale. Of the 81 archaeological sites identified across the study area, 31 had been completely destroyed since the CORONA photographs were taken. They were invisible on modern satellite imagery, detectable only because the spy satellites had captured them before they vanished. Without the CORONA archive, these sites would be not merely unexcavated but unknown, as if the settlements and the people who built them had never existed.
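The logic behind this kind of loss tracking can be sketched as a set difference: sites visible in the historical imagery but absent from the modern imagery are candidates for "destroyed". The site identifiers below are hypothetical placeholders, not entries from the study's catalogue; only the 31 and 81 counts come from the text.

```python
# HYPOTHETICAL site identifiers, for illustration only.
visible_in_corona = {"site_A", "site_B", "site_C", "site_D"}
visible_today = {"site_A", "site_C"}

# Present in the 1960s imagery, gone from the modern imagery.
destroyed = visible_in_corona - visible_today
print(sorted(destroyed))  # ['site_B', 'site_D']

# Counts reported in the text for the study area.
total_sites, destroyed_sites = 81, 31
loss_share = destroyed_sites / total_sites
print(f"share of sites destroyed = {loss_share:.1%}")  # 38.3%
```

Put as a proportion, the reported counts mean that well over a third of the catalogued sites survive only as images.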

Teaching Machines to Read the Ancient World

The Iraqi desert is not the only place where AI is recovering what time has erased. At the other end of the archaeological spectrum, from landscape to language, a parallel revolution is unfolding in the decipherment and restoration of ancient texts.

Assael, Sommerschield, and colleagues (2022) developed Ithaca, a deep neural network designed to restore, date, and attribute ancient Greek inscriptions. The system was trained on the Packard Humanities Institute (I.PHI) dataset, a corpus of 78,608 Greek inscriptions spanning over a millennium of Mediterranean history. These inscriptions, carved into stone, scratched into pottery, and pressed into lead, are among the most important primary sources for understanding ancient Greek politics, religion, economics, and daily life. But many are damaged: letters worn away by weather, sections broken off, text obscured by centuries of erosion.

Restoring these texts has traditionally been the work of epigraphy, the study of inscriptions, and it demands years of specialized training. An expert epigraphist examines the surviving letters, considers the grammatical and historical context, and proposes restorations for the missing portions. It is painstaking, slow, and inherently limited by the individual scholar's knowledge and experience.

Ithaca changed the equation. The system uses a transformer architecture, the same family of models underlying modern language AI, but adapted for the specific challenges of fragmentary ancient text. When evaluated on its own, Ithaca achieved 62% accuracy in restoring damaged inscriptions. That figure alone would be impressive, but the real breakthrough emerged when Ithaca was used as a collaborative tool: historians working with Ithaca achieved 72% accuracy, nearly a 2.9-fold improvement over the 25% baseline that historians achieved without AI assistance.
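The comparison between the three conditions is simple arithmetic, worked through here as a quick check. The variable names are illustrative; the three percentages are the ones quoted in the paragraph above.

```python
# Restoration accuracy in percent, as quoted in the text.
historians_alone = 25
ithaca_alone = 62
historians_with_ithaca = 72

# Ratio of assisted to unassisted historian accuracy.
fold_improvement = historians_with_ithaca / historians_alone
print(f"fold improvement over unassisted historians = {fold_improvement:.2f}")

# Percentage points gained by pairing historians with the model,
# relative to the model working alone.
gain_over_model = historians_with_ithaca - ithaca_alone
print(f"gain over Ithaca alone = {gain_over_model} percentage points")
```

The second figure is the more telling one: the human and the model together outperform either alone, which is what makes the collaborative framing more than a courtesy.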

Beyond restoration, Ithaca demonstrated two additional capabilities that transformed the utility of the system. It could attribute inscriptions to their likely region of origin with 70.8% accuracy, and it could date inscriptions with a median error of only 3 years. For a field where dating debates often span decades or centuries, this precision was remarkable. The team used Ithaca to revisit a long-standing controversy over the dating of a set of Athenian decrees, and the AI's analysis supported a re-dating that had been proposed on independent historical grounds.
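A note on reading the dating figure: a median error describes the typical prediction, not the worst one. The sketch below uses hypothetical error values (not data from Assael et al.) to show how a median of a few years can coexist with a much larger mean when a minority of predictions are far off.

```python
from statistics import mean, median

# HYPOTHETICAL absolute dating errors in years, for illustration only.
errors_years = [0, 1, 2, 3, 3, 5, 40, 90, 120]

print(f"median error = {median(errors_years)} years")
print(f"mean error   = {mean(errors_years):.1f} years")
```

For that reason, a median error is best read alongside the original paper's fuller error distribution before drawing conclusions about any single inscription.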

The Landscape of Archaeological AI

Sommerschield and colleagues (2023) provided the most comprehensive mapping of this emerging field in their survey of machine learning for ancient languages. Reviewing over 230 works, they proposed a six-task taxonomy that captures the full pipeline from physical artifact to historical interpretation: digitization, the conversion of physical inscriptions to digital format; restoration, the prediction of missing or damaged text; attribution, determining the origin and authorship of texts; linguistic analysis, parsing grammar, syntax, and semantics; textual criticism, comparing manuscript traditions to reconstruct original compositions; and translation and decipherment, rendering ancient languages into modern ones or cracking undeciphered scripts.
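The six tasks can be written down as a small data structure, which makes it easier to see which tasks a given system covers. This is only a sketch: the class name `AncientTextTask` and the `SYSTEM_TASKS` mapping are hypothetical, and placing Ithaca under restoration and attribution (with dating treated as a form of attribution) is an assumption based on the description in this post rather than a classification quoted from the survey.

```python
from enum import Enum


class AncientTextTask(Enum):
    """The six tasks named in the survey's taxonomy (class name is hypothetical)."""
    DIGITIZATION = "convert physical inscriptions to digital format"
    RESTORATION = "predict missing or damaged text"
    ATTRIBUTION = "determine the origin and authorship of texts"
    LINGUISTIC_ANALYSIS = "parse grammar, syntax, and semantics"
    TEXTUAL_CRITICISM = "compare manuscript traditions to reconstruct originals"
    TRANSLATION_AND_DECIPHERMENT = "render into modern languages or crack scripts"


# ASSUMPTION: illustrative mapping, not taken from the survey itself.
SYSTEM_TASKS = {
    "Ithaca": {AncientTextTask.RESTORATION, AncientTextTask.ATTRIBUTION},
}

for task in AncientTextTask:
    covered_by = sorted(name for name, tasks in SYSTEM_TASKS.items() if task in tasks)
    print(f"{task.name:<28} {', '.join(covered_by) or '-'}")
```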

The survey revealed clear technological trends. Early computational approaches relied on rule-based systems and statistical models. The deep learning era brought recurrent neural networks, then convolutional architectures, and now transformer models that dominate the state of the art. Each generation expanded the scale and accuracy of what was computationally possible.

But Sommerschield et al. also raised a critical concern. Of the 15 ancient languages covered by existing ML research, Greek dominates by a wide margin, followed by Latin, Chinese, and a handful of others. Dozens of ancient languages, many from the Global South, have received little or no computational attention. The authors warned of a form of "digital colonialism" in which the ancient languages of well-funded Western academic traditions receive sophisticated AI tools while others are left behind. The concern is not merely academic: the languages that receive computational investment shape which civilizations receive attention, whose histories get told, and whose past remains accessible to future generations.

Archaeological AI: A Capabilities Map

| Capability | Representative System | Data Scale | Key Achievement | Limitation |
| --- | --- | --- | --- | --- |
| Text restoration | Ithaca (Assael et al., 2022) | 78,608 inscriptions | 72% accuracy with historian collaboration | Greek-only; limited to inscriptional text |
| Text dating | Ithaca (Assael et al., 2022) | 78,608 inscriptions | Median error 3 years | Requires substantial training corpus |
| Site detection | BingCORONA (Pistola et al., 2025) | CORONA + Bing imagery | 90% accuracy, 4 new sites confirmed | Region-specific training needed |
| Heritage loss tracking | BingCORONA (Pistola et al., 2025) | 81 sites catalogued | 31 destroyed sites documented | Dependent on availability of historical imagery |
| Multi-language support | Survey (Sommerschield et al., 2023) | 230+ works, 15 languages | Six-task taxonomy established | Greek dominance; most languages unserved |

The Race Against Destruction

What connects the spy satellite photographs of Iraq and the transformer models restoring Greek inscriptions is a shared urgency. The archaeological record is not a stable archive waiting patiently for scholars to examine it. It is actively being destroyed, by conflict, by development, by climate change, by looting, and by neglect. Every tell bulldozed for farmland, every inscription weathered beyond legibility, every manuscript consumed by fire represents an irreversible loss of human knowledge.

The 31 destroyed sites in Pistola et al.'s study area are not an anomaly. Across the Middle East, thousands of archaeological sites have been damaged or destroyed in the past two decades alone, casualties of the Iraq War, the Syrian civil war, ISIS's deliberate campaign of cultural destruction, and the relentless pressure of urbanization and agricultural intensification. The CORONA archive is invaluable precisely because it captured a world that no longer exists, but the archive itself is finite. It covers specific regions at specific moments, and there are vast areas of the ancient world for which no such historical baseline exists.

AI does not solve this problem. It accelerates the rate at which we can extract information from what remains, and it extends our ability to detect what we might otherwise miss. BingCORONA found four sites that human surveyors had overlooked. Ithaca restored inscriptions that individual epigraphists might have spent years puzzling over. The survey by Sommerschield et al. laid the intellectual groundwork for extending these tools to dozens of underserved languages and traditions. But none of these tools can recreate what has been physically destroyed.

The deeper lesson of this research is temporal. The CORONA photographs are valuable because they were taken at a moment that can never be repeated. The inscriptions Ithaca restores are valuable because the stones they were carved on are crumbling in real time. The computational tools are powerful, but their power is bounded by the survival of the evidence they analyze. The race is not between AI and human scholars. It is between the rate of discovery and the rate of destruction.

What To Watch

Three developments will define the next phase of archaeological AI. First, the extension of satellite-based site detection beyond the Middle East to regions with comparable heritage threats: sub-Saharan Africa, South and Southeast Asia, and Central America, where deforestation and development are exposing and destroying sites simultaneously. Second, the expansion of ancient language models beyond Greek and Latin to encompass cuneiform, hieratic, Mayan glyphs, and the dozens of undeciphered or under-studied scripts that could reshape our understanding of the ancient world. Third, and most critically, the integration of these tools into heritage management workflows, so that AI-detected sites receive legal protection before they are bulldozed, not after.

The Cold War satellites that photographed Iraq were instruments of geopolitical competition. The AI systems analyzing those photographs are instruments of cultural memory. The transition from one purpose to the other is itself a kind of archaeology: excavating meaning from data that was created for an entirely different reason, and discovering, beneath the surface of the image, traces of a world that is otherwise gone.

References

[1] Assael, Y., Sommerschield, T., Shillingford, B., Bordbar, M., Pavlopoulos, J., Chatzipanagiotou, M., Androutsopoulos, I., Prag, J., & de Freitas, N. (2022). Restoring and attributing ancient texts using deep neural networks. Nature, 603, 280-283.
[2] Sommerschield, T., Assael, Y., Pavlopoulos, J., Stefanak, V., Senior, A., Dyer, C., Bodel, J., Prag, J., Androutsopoulos, I., & de Freitas, N. (2023). Machine Learning for Ancient Languages: A Survey. Computational Linguistics, 49(3), 703-747.
[3] Pistola, A., Orru, V., Marchetti, N., & Roccetti, M. (2025). AI-ming backwards: Vanishing archaeological landscapes in Mesopotamia and automatic detection of sites on CORONA imagery. PLOS ONE.
