Communications in Mathematical Sciences

Volume 22 (2024)

Number 6

The conditional barycenter problem, its data-driven formulation and its solution through normalizing flows

Pages: 1635 – 1656

DOI: https://dx.doi.org/10.4310/CMS.2024.v22.n6.a8

Authors

Esteban G. Tabak (Courant Institute of Mathematical Sciences, New York University, New York, NY, USA)

Giulio Trigila (Weissman School of Arts and Sciences, Baruch College, City University of New York, NY, USA)

Wenjun Zhao (Division of Applied Mathematics, Brown University, Providence, RI, USA)

Abstract

A family of normalizing flows is introduced for selectively removing from a data set the variability attributable to a specific set of cofactors, while preserving the dependence on others. This is achieved by extending the barycenter problem of optimal transport theory to the newly introduced conditional barycenter problem. Rather than summarizing the data with a single probability distribution, as in the classical barycenter problem, the conditional barycenter is represented by a family of distributions labeled by the cofactors kept. The use of the conditional barycenter and its differences from the classical barycenter are illustrated on synthetic and real data, with applications to treatment effect estimation, super-resolution, anomaly detection and lightness transfer in image analysis.
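The idea of removing the variability due to some cofactors while keeping the dependence on others can be illustrated with a toy sketch. The code below is not the normalizing-flow method of the paper. It makes strong simplifying assumptions: the data are one-dimensional, both cofactors are discrete, and within each value of the kept cofactor the groups differ only by location and scale, so that the optimal transport map to the barycenter is affine. The function names `affine_barycenter_map` and `conditional_barycenter_map` and the synthetic data are hypothetical, introduced only for this illustration.

```python
import numpy as np

def affine_barycenter_map(x, groups):
    """Push 1-D samples to the Wasserstein-2 barycenter of their groups.

    Assumption: each group is a location-scale transform of a common shape,
    so the barycenter has mean sum_k w_k m_k and standard deviation
    sum_k w_k s_k, and the optimal map for each group is affine.
    """
    x = np.asarray(x, dtype=float)
    groups = np.asarray(groups)
    labels, counts = np.unique(groups, return_counts=True)
    weights = counts / counts.sum()
    means = np.array([x[groups == g].mean() for g in labels])
    stds = np.array([x[groups == g].std() for g in labels])
    m_bar = weights @ means   # barycenter mean
    s_bar = weights @ stds    # barycenter standard deviation
    y = np.empty_like(x)
    for g, m, s in zip(labels, means, stds):
        mask = groups == g
        y[mask] = m_bar + (s_bar / s) * (x[mask] - m)
    return y

def conditional_barycenter_map(x, z_remove, z_keep):
    """Remove the dependence on z_remove separately for each value of z_keep.

    The result is a family of barycenters labeled by z_keep, instead of
    the single distribution produced by the classical barycenter.
    """
    x = np.asarray(x, dtype=float)
    z_remove = np.asarray(z_remove)
    z_keep = np.asarray(z_keep)
    y = np.empty_like(x)
    for k in np.unique(z_keep):
        mask = z_keep == k
        y[mask] = affine_barycenter_map(x[mask], z_remove[mask])
    return y

# Synthetic data (hypothetical): x depends on both cofactors.
rng = np.random.default_rng(0)
n = 2000
z_remove = rng.integers(0, 2, size=n)   # cofactor whose effect is removed
z_keep = rng.integers(0, 3, size=n)     # cofactor whose effect is kept
x = 2.0 * z_keep + 3.0 * z_remove + (1.0 + z_remove) * rng.normal(size=n)

y = conditional_barycenter_map(x, z_remove, z_keep)
```

In this sketch, the transformed samples `y` have the same mean and standard deviation across the values of `z_remove` within each value of `z_keep`, while the trend in `z_keep` is preserved. Calling `affine_barycenter_map` with a single label that combines both cofactors would instead correspond to the classical barycenter under the same assumptions, collapsing all groups onto one distribution.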

Keywords

optimal transport, barycenter problem, normalizing flows, conditional distributions

2010 Mathematics Subject Classification

49Qxx, 62G07

Tabak’s work was partially supported by ONR grant N00014-15-1-2355.

Received 26 August 2022

Received revised 15 January 2024

Accepted 16 January 2024

Published 18 July 2024