Evan Patterson
Stanford University, Statistics Department
Special Session on Applied Category
AMS Fall Western Sectional Meeting, November 9-10, 2019
Is there a correspondence between two graphs?
Aflalo et al 2015; Alvarez-Melis et al 2017; Vayer et al 2018
Monge problem (1781): Given measures μ∈Prob(X) and ν∈Prob(Y) and cost function c:X×Y→R,
minimizeT:X→Y:Tμ=ν∫Xc(x,T(x))μ(dx)Image: Villani 2003
Monge problem is combinatorial and nonconvex.
Kantorovich's relaxation (1942): Replace deterministic map with probabilistic coupling:
minimizeπ∈Coup(X,Y)∫X×Yc(x,y)π(dx,dy)where
Coup(X,Y):={π∈Prob(X×Y):projXπ=μ,projYπ=ν}.New problem is convex, in fact a linear program.
When cost is a metric d:X×X→R, we get the Wasserstein metric on Prob(X):
Wp(μ,ν):=infApplied to graph matching in two ways:
Featurize vertices of both graphs in common metric space, then compute Wasserstein distance.
Convert graphs into distinct metric spaces on vertices (via shortest path metric), then compute Gromov-Wasserstein distance.
Problem: Discards significant information about edges.
Goal: Construct a Wasserstein-style metric on graphs that respects both vertices and edges, in a sense to be defined.
So, this informal diagram should not commute!
In fact, no reason to restrict to graphs; generalizing to \mathcal{C}-sets even points the way towards a solution.
Recall: For \mathcal{C} a small category, a \mathcal{C}-set is a functor X: \mathcal{C} \to \mathbf{Set}.
The category of \mathcal{C}-sets is the functor category [\mathcal{C},\mathbf{Set}].
Example: When \mathcal{C} = , a \mathcal{C}-set is a (directed) graph.
Example: When \mathcal{C} = , a \mathcal{C}-set is a symmetric graph.
Nearly the same as an undirected graph.
Other examples
For applications: Attributes can be modeled in \mathcal{C}, to get vertex-attributed graphs, edge-attributed graphs, and so on.
A \mathcal{C}-set in a category \mathcal{S} is a functor X: \mathcal{C} \to \mathcal{S}.
For us, useful categories \mathcal{S} include:
Leads to \mathcal{C}-sets, measurable \mathcal{C}-spaces, measure \mathcal{C}-spaces, metric \mathcal{C}-spaces, and so on.
In this talk, I give a sketch. A systematic development is in the paper.
A Markov kernel M: X \to Y is a measurable assignment of each point x \in X to a probability measure M(x) \in \mathrm{Prob}(Y).
Other names for Markov kernels:
There is a category Markov of measurable spaces and Markov kernels.
Let \mu \in \mathrm{Prob}(X) and \nu \in \mathrm{Prob}(Y).
For any coupling \pi \in \mathrm{Coup}(\mu,\nu), the disintegration (conditional probability distribution) M: X \to Y satisfies
\mu \cdot M = \nu.Conversely, for any Markov kernel M: X \to Y with \mu \cdot M = \nu, there is a product
\mu \otimes M \in \mathrm{Coup}(\mu,\nu).In fact, this correspondence is functorial.
Proposition (folkore?): Under regularity conditions, there is an isomorphism between
Interpretation: Markov kernels allow a "directed" version of optimal transport.
\mathbf{Meas} embeds in \mathbf{Markov} as the deterministic Markov kernels:
\mathcal{M}: \mathbf{Meas} \hookrightarrow \mathbf{Markov}.Induces a relaxation functor by post-composition:
\mathcal{M}_*: [\mathcal{C},\mathbf{Meas}] \to [\mathcal{C},\mathbf{Markov}].Definition. A Markov morphism X \to Y of measurable \mathcal{C}-spaces X and Y is a morphism \mathcal{M}_*(X) \to \mathcal{M}_*(Y).
So, a Markov morphism \Phi: X \to Y of graphs X and Y consists of Markov kernels \Phi_V: X(V) \to Y(V) and \Phi_E: X(E) \to Y(E) such that
Important: Graph homomorphism is NP-hard, but Markov graph morphism is a linear feasibility problem.
Examples of Markov morphisms:
But more exotic things can happen because mass can be "split".
Example: X = self loop, Y = directed cycle.
No graph homomorphisms X \to Y, but there is a unique Markov morphism \Phi: X \to Y:
\Phi_V(*) \sim \mathrm{Unif}(Y(V)), \qquad \Phi_E(*) \sim \mathrm{Unif}(Y(E)).Now, the metric side of matching \mathcal{C}-sets.
Let \mathbf{Met} be the category of Lawvere metric spaces and maps. (Note choice of morphisms.)
Definition. A metric category is a category \mathcal{S} enriched in \mathbf{Met}, i.e., the hom-sets \mathcal{S}(X,Y) are Lawvere metric spaces.
Definition. A morphism f: X \to Y in \mathcal{S} is short if for all morphisms g, g': Y \to Z and h, h': W \to X,
d(fg,fg') \leq d(g,g') \quad\text{and}\quad d(hf,h'f) \leq d(h,h').Short morphisms of \mathcal{S} form a subcategory \mathrm{Short}(\mathcal{S}).
Category \mathbf{Met} with supremum metric
d_\infty(f,g) := \sup_{x \in X} d_Y(f(x),g(x)), \qquad f,g \in \mathbf{Met}(X,Y).Short morphisms are short maps:
d_Y(f(x),f(x')) \leq d_X(x,x'), \quad \forall x,x' \in X.Proposition: For any metric category \mathcal{S}, \mathrm{Short}(\mathcal{S}) is enriched in \mathrm{Short}(\mathbf{Met}).
Category \mathbf{MM} of mm spaces and measurable maps, with L^p metric, 1 \leq p < \infty:
d_p(f,g) := \left( \int_X d_Y(f(x),g(x))^p\, \mu_X(dx) \right)^{1/p}, \qquad f,g \in \mathbf{MM}(X,Y).Proposition: A map f: X \to Y is short iff
\mu_X f := \mu_X \circ f^{-1} \leq \mu_Y,and
d_Y(f(x),f(x')) \leq d_X(x,x'), \quad \forall x,x' \in X.Let \mathcal{C} be a finitely presented category and \mathcal{S} a metric category.
Idea: For X,Y \in [\mathcal{C},\mathcal{S}], consider distance from naturality of transformation \phi: X \to Y at c \in \mathcal{C}:
Theorem: For any 1 \leq p \leq \infty, a Lawvere metric on [\mathcal{C},\mathcal{S}] is defined by
d(X,Y) := \inf_{\phi: X \to Y} \sideset{}{_p}\sum_{f: c \to c'} d(Xf \cdot \phi_{c'}, \phi_c \cdot Yf),where
Note: Condition that each \phi_c \in \mathrm{Short}(\mathcal{S}) is needed for triangle inequality.
Category \mathbf{MMarkov} of mm spaces and Markov kernels, with Wasserstein metric:
W_p(M,N) := \inf_{\Pi \in \mathrm{Coup}(M,N)} \left( \int_{X \times Y \times Y} d_Y(y,y')^p\, \Pi(dy,dy'\,|\,x)\, \mu_X(dx) \right)^{1/p}Generalizes both classical L^p and Wasserstein metrics:
Proposition: Under regularity conditions, a Markov kernel M: X \to Y is short iff
\mu_X M \leq \mu_Yand there exists \Pi \in \mathrm{Prod}(X,Y) such that
\int_{Y \times Y} d_Y(y,y')^p\, \Pi(dy,dy\,|\,x,x') \leq d_X(x,x')^p, \qquad \forall x,x' \in X.Consequence: Via the theorem, a Wasserstein-style metric on metric measure \mathcal{C}-spaces, computable by solving a linear program.
Paper: Hausdorff and Wasserstein metrics on graphs and other structured data, 2019. arXiv:1907.00257.