📐 Mathematics & Statistics

36 articles in Mathematics & Statistics

Trend Analysis
Optimal transport—the mathematical theory of moving distributions efficiently—is providing new tools for machine learning. From domain adaptation to graph neural networks to training optimization, Wasserstein geometry offers a principled way to compare, align, and transform probability distributions.
optimal transport, Wasserstein distance, Riemannian geometry
Trend Analysis
Conformal prediction offers distribution-free prediction intervals with guaranteed coverage—but the guarantees assume exchangeable data. Time series violates this assumption. Recent work develops adaptive conformal methods that maintain validity under temporal dependence and distribution shift.
conformal prediction, uncertainty quantification, time series
Trend Analysis
LLM-based theorem provers are achieving results that would have been considered impossible two years ago. Goedel-Prover (87 citations) sets a new state of the art for open-source formal proof generation, while multi-agent systems extend theorem proving to quantum physics, raising questions about the nature of mathematical understanding.
automated theorem proving, Lean, formal proof
Paper Review
Most ATP benchmarks test undergraduate or competition mathematics. RLMEval evaluates neural theorem provers on research-level mathematics from real publications—revealing that the gap between solving competition problems and advancing mathematical research remains substantial.
neural theorem proving, research mathematics, evaluation
Paper Review
Optimal transport theory faces a computational wall in high dimensions. Rigollet and Stromme prove that entropic regularization breaks through it, establishing dimension-free convergence rates for plug-in estimators—with implications for transfer learning.
optimal transport, entropy regularization, dimensionality
Methodology Guide
Deep learning finds correlations. Causal inference finds causes. Jiao et al. survey the growing intersection where neural networks learn not just to predict, but to reason about interventions, counterfactuals, and structural mechanisms.
causal inference, deep learning, machine learning
Paper Review
Goedel-Prover is the leading open-source automated theorem prover—achieving state-of-the-art performance in Lean 4 through a bootstrapping strategy that generates its own training data. Seed-Prover complements it with reinforcement learning for deeper reasoning chains.
automated theorem proving, Lean 4, Goedel-Prover
Trend Analysis
Optimal transport theory—measuring the most efficient way to move one probability distribution to another—has become a powerful tool in machine learning. The 2025-2026 frontier extends OT to curved Riemannian manifolds, enabling geometric neural network training and operator learning on complex domains.
optimal transport, Wasserstein distance, Riemannian geometry
Methodology Guide
Conformal prediction provides distribution-free coverage guarantees—but only when calibration and test data are exchangeable. Three 2025 papers extend CP to the real world: adaptive methods for drifting time series, optimal transport for distribution shift, and robust calibration under label corruption.
conformal prediction, uncertainty quantification, distribution shift
Paper Review
Estimating causal effects from observational data is the central challenge of evidence-based medicine, policy, and social science. When confounders are high-dimensional—hundreds of dietary components, thousands of EHR variables—standard methods fail. Bayesian semiparametric approaches offer a principled path through this complexity.
Bayesian inference, causal inference, high-dimensional
Field Map
Algebraic number theory—the study of number systems beyond the integers—underpins both the deepest results in pure mathematics (Fermat's Last Theorem) and the most critical infrastructure of the digital economy (elliptic curve cryptography). As quantum computing threatens current cryptosystems, this ancient-modern connection becomes urgent.
algebraic number theory, Fermat last theorem, elliptic curve cryptography
Paper Review
Stochastic PDEs model physical systems under uncertainty—from turbulent flows to financial markets. Score-based diffusion models, originally designed for image generation, are being repurposed to learn SPDE solutions adaptively, enabling Bayesian inference over complex physical systems without traditional MCMC.
score-based diffusion, stochastic PDE, generative model
Trend Analysis
Persistent homology has been TDA's workhorse for a decade, extracting topological features (loops, voids, connected components) from data. But 2025's research frontier moves beyond it: topological deep learning, Euler characteristic methods, and Reeb graphs are enabling shape-aware AI for molecules, cells, and complex networks.
topological data analysis, persistent homology, TDA
Deep Dive
Aristotle solves 2025 International Mathematical Olympiad problems at gold-medal level by combining informal mathematical intuition (LLM reasoning) with formal proof verification (Lean 4). MATP-BENCH extends the challenge to multimodal problems requiring diagram understanding.
IMO, mathematical olympiad, Aristotle
Paper Review
Multivariate time series—financial markets, brain signals, climate systems—are governed by causal relationships encoded in directed acyclic graphs. Learning these causal structures from high-dimensional data is one of the hardest problems in modern statistics, and Bayesian methods offer principled uncertainty quantification over possible causal graphs.
directed acyclic graph, causal discovery, time series
Paper Review
The Max-Cut problem—partitioning a graph to maximize edges between partitions—is NP-hard and a proving ground for quantum advantage. Tate & Gupta identify small graph families where QAOA outperforms the classical Goemans-Williamson algorithm, providing concrete instances for near-term quantum benchmarking.
Max-Cut, QAOA, quantum optimization
Deep Dive
Covariance matrices are not just arrays of numbers—they live on a curved geometric space where the natural distance is the Bures-Wasserstein metric. Marconi develops the fiber bundle geometry of this space, while Khesin & Modin extend optimal transport to vector and matrix densities through gauge theory.
Bures-Wasserstein, covariance matrix, information geometry
Paper Review
In quantum mechanics, observation changes the system—a parallel to the statistical concept of missing-not-at-random (MNAR) data. Kang proposes a unified framework for robust causal directionality inference that bridges quantum measurement theory and statistical causal inference under informative missingness.
quantum inference, causal directionality, MNAR
Paper Review
Traditional causal discovery requires large datasets and strong statistical assumptions. LLMs bring a new ingredient: domain knowledge encoded in pre-training. Susanti & Färber test whether LLMs can use observational data for causal discovery, while REX integrates explainable AI with causal structure learning.
causal discovery, LLM, observational data
Paper Review
Standard time series methods treat observations as discrete points. Functional data analysis treats them as samples from continuous curves—unlocking mathematical tools from functional analysis (Hilbert spaces, basis expansions, functional PCA) that capture temporal structure more faithfully.
functional data analysis, FDA, time series
Paper Review
Graph coloring, which assigns colors to vertices so that no two adjacent vertices share a color, is NP-hard in general but has practical algorithms that optimize real distributed systems. Švarcmajer et al. apply greedy coloring variants to blockchain P2P networks, while Cheng et al. combine graph theory with deep RL for container terminal scheduling.
graph coloring, NP-hard, combinatorial optimization
Deep Dive
A remarkable mathematical equivalence: the thermodynamic friction that governs energy dissipation in slowly driven systems is the same as resistance distance in electrical networks and commute time in random walks. Sawchuk & Sivak unify these independently developed mathematical geometries.
thermodynamic geometry, friction metric, graph theory
Deep Dive
Classical statistics says a model with more parameters than data points should memorize training data and fail on new data. Modern neural networks violate this prediction spectacularly—generalizing well despite massive overparameterization. Four 2025 papers advance our theoretical understanding of why.
generalization theory, overparameterization, implicit bias
Paper Review
Lean 4 is the rising proof assistant for formalizing mathematics—but users must manually construct proofs tactic by tactic. Lean-auto connects Lean to external automated theorem provers, enabling one-click proof of routine goals that would otherwise require tedious manual construction.
Lean 4, automated theorem proving, proof automation
Paper Review
Combinatorics—the mathematics of counting, arrangement, and discrete structures—is essential across computer science and probability theory. Xiong et al. create a benchmark of combinatorial identities for automated theorem proving, then show how AI can generate *new* identities alongside their proofs.
combinatorics, combinatorial identities, automated theorem generation
Paper Review
Implementing a quantum gate requires driving a quantum system along a path through the special unitary group SU(n). The energetically optimal path is a geodesic—but in the sub-Riemannian geometry where only certain control directions are available, geodesics are fundamentally different from Riemannian ones.
sub-Riemannian geometry, quantum computing, gate optimization
Paper Review
The edge-isoperimetric number measures how well a graph expands—a property critical for network design, error-correcting codes, and randomized algorithms. Abiad et al. obtain sharp spectral bounds for graph powers and distance-regular graphs, connecting algebraic graph theory with combinatorial optimization.
spectral graph theory, isoperimetric number, edge expansion
Paper Review
Graphs are discrete, unordered structures—fundamentally different from the continuous data that standard diffusion models handle. Petersen et al. develop a Bayesian framework for discrete graph generation that combines diffusion and flow matching models with principled posterior inference.
graph generation, discrete diffusion, Bayesian inference
Deep Dive
Microtubules—the protein scaffolding of every cell—have a lattice structure with mathematical properties that connect to number theory and adelic topology. Planat explores whether these algebraic structures could support quantum coherence, touching on one of science's most controversial hypotheses.
microtubules, parametric resonance, arithmetic geometry
Field Map
Hecke modifications—a technique for modifying vector bundles at specific points—appear across number theory, complex geometry, and mathematical physics. Alvarenga et al. provide an accessible introduction to this cross-disciplinary tool that connects the Langlands program to quantum field theory.
Hecke modifications, vector bundles, algebraic geometry
Paper Review
Self-dual codes—codes that are their own dual—possess optimal error-correcting properties and deep algebraic structure. Fang & Liu construct new families of self-dual codes from algebraic curves, expanding the toolkit for designing codes with guaranteed minimum distance.
algebraic geometry codes, self-dual codes, error correction
Paper Review
The Möbius function—central to number theory's inclusion-exclusion principle—extends naturally to algebraic structures (lattices, posets, group rings) with applications in combinatorics, cryptography, and data analysis. Sharma et al. survey these extensions and their computational implications.
Möbius function, combinatorial number theory, algebraic structures
Methodology Guide
When potential confounders outnumber observations—common in genomics, EHR data, and social media studies—standard causal adjustment fails. Cha et al. and Kong develop debiased estimators that provide valid causal inference in the high-dimensional regime where classical methods break down.
confounding adjustment, high-dimensional, causal inference
Paper Review
Multi-omics data (genomics + proteomics + metabolomics) reveals thousands of biological associations—but associations are not causes. Mishra et al. develop instrumental factor models that use genetic variants as natural experiments to distinguish causal mechanisms from confounded correlations in high-dimensional biological data.
multi-omics, causal inference, instrumental variables
Paper Review
Attribute-based signatures allow users to sign messages based on their attributes (role, department, clearance level) without revealing their identity. Goel et al. improve the efficiency of ABS using elliptic curve cryptography—achieving smaller signatures and faster verification while maintaining anonymity.
elliptic curve cryptography, ECC, attribute-based signature
Correlation is not causation. Everyone knows that. But how do you actually discover causation in complex dynamical systems? A new framework borrowed from weather forecasting does it by running the clock backwards, tracing effects back to their causes through Bayesian data assimilation.
causal inference, data assimilation, Bayesian