How to prove equations using diagrams, part 2
Weak equivalence and E-diagrams
This post is cross-posted at the Topos Institute blog.
This is the second part in a series about diagrammatic reasoning, inspired by e-graphs. Last time, we reviewed the concept of initial functor and showed by example how to calculate with diagrams and initial functors. This time, we make that calculus more systematic and we reconceive e-graphs in terms of initial functors.
Weak equivalence of diagrams
We’ve been deriving equations by chaining together initial functors between diagrams, going in either direction. Let’s give a name to this equivalence relation on diagrams.
Definition 1 A weak equivalence from a diagram
Two diagrams in a category are said to be weakly equivalent if they can be connected by a zig-zag of weak equivalences.
Why call this relation weak equivalence? First, it is not equivalence in any familiar 2-category of diagrams. Moreover, though initial functors are closed under composition and any isomorphism of categories is initial, as follows directly from the definition, initial functors are most interesting when they’re not invertible. So, in order to make weak equivalence be an equivalence relation, we have to consider zig-zags of initial functors.1
By definition, weak equivalence of diagrams preserves limits whenever they exist. Sometimes this property can be used to extract commutation relations from diagrams, as we saw last time. However, we caution that weak equivalence does not preserve commutativity:2 if a commutative diagram
The following weak equivalence of diagrams is a minimal counterexample.
The functor shown is initial because the limit of both diagrams is the equalizer of
so that limit of both diagrams is
It can be a useful feature of weak equivalence that it works just as well regardless of whether the diagrams involved commute. As it happens, I first got into this topic by studying “Tonti diagrams,” which present partial differential equations in physics (Patterson et al. 2023). These diagrams do not commute; if they did, the equations would already be solved!
E-diagrams
We now attempt to make contact with e-graphs. That’s less straightforward than it might seem since, taken at face value, the concrete descriptions of e-graphs offered by Nelson (1980; see also Detlefs, Nelson, and Saxe 2005, sec. 4.2) and by Willsey et al. (2021, sec. 2.1) look rather different. We reproduce features of both without claiming to be fully faithful to either.
Definition 2 An e-diagram over a diagram
The pieces constituting an e-diagram have interpretations in e-graph jargon:
- The target category
is a theory, most clearly when is finitely presented, so that its generating graph is a signature, defining types and (unary) operations, and its path equations are equational axioms. - The diagram
is a term graph in the theory , particularly when the indexing category is freely generated by a finite graph. The objects of are e-class IDs. - The category
is something like a hash cons. Its objects are canonical e-class IDs.3 Below, we give conditions on the functor that make the morphisms of encode the mapping of a hash cons. - As for the functor
, the object map assigns each e-class ID to a canonical e-class ID, like in a union-find, while the morphism map just exists to witness the initiality of .
We now define the category of all e-diagrams over a diagram.
Definition 3 A morphism of e-diagrams
We do not assume that the functor
In other words, the category of e-diagrams over a diagram
The trivial e-diagram
Any diagram
The terminology is apt because the initial e-diagram is, in practice, how an e-diagram will be initialized from an existing diagram (term graph).
The ideal e-diagram
What is not so obvious is that the category of e-diagrams over a fixed diagram also has a terminal object. This follows from the foundational theorem by Street and Walters (1973) that initial functors form the left class of morphisms in an orthogonal factorization system (OFS) on
Theorem 1 (Comprehensive factorization) Initial functors and discrete opfibrations form the left and right classes, respectively, of an orthogonal factorization system on
Recall that a functor
The existence of the comprehensive factorization system is a powerful result with many consequences, such as:
- the cancellation property of the left class in an OFS implies that any functor
constituting a morphism of e-diagrams is initial, as noted above; - a closure property of the left class states that initial functors are stable under pushout, as used below.
Moreover, comprehensive factorization implies that the factorization of a diagram
The existence and uniqueness of a functor
The e-diagram produced by comprehensive factorization is ideal in the sense that the “hash cons invariant” (Willsey et al. 2021, Definition 2.7) is satisfied for any canonical e-node that can be formed. Rephrased in e-graph jargon, the functor
So, is the ideal e-diagram the only e-diagram we’ll ever need? No, because in general it cannot be computed. As we’ll have occasion to study in a future post, the comprehensive factorization of a functor
Rewriting e-diagrams
Having codified the objects of interest as e-diagrams, we turn to manipulating e-diagrams using the formalism of pushout-based rewriting. The goal is to show that calculations similar to last time’s can be carried out by applying rewrite rules, where each rule application is guaranteed to preserve weak equivalence of diagrams. Note that this scheme is not intended as a guide to implementation. Though e-graphs can be implemented using rewriting techniques, it should likely not be in terms of initial functors between categories. Rather, having proved the procedure correct, we can forget about initiality and compile to rewrite rules operating directly on the graphs generating the diagram shapes. As a proof of concept, my colleague Kris Brown has implemented e-graphs using hypergraph rewriting (Brown 2025).
Rewrites of an e-diagram
Rewriting over a fixed diagram
The simplest rewrites change the e-diagram
along with a match of its left-hand side, namely any map
to obtain a new e-diagram
Rewriting the diagram itself
Given a rewrite rule
The result is a new e-diagram
Outlook
In this second post on diagrammatic reasoning, we filled in aspects of the formalism neglected in the first post, proposing a definition of e-diagram and showing how to rewrite e-diagrams by applying rules in the form of initial functor motifs. While this is all “just” formalism, with the key ideas already present in the examples and calculations from last time, it has been a good excuse to review the comprehensive factorization system on
Having set out the basic formalism, we are now free to embelish it by considering categories with extra structure. Next time we’ll study diagrams and initial functors for categories with finite products, recovering the standard setting of e-graphs that allows constants and operations of arity greater than one. More ambitiously, in future posts, we hope to show the benefits of a clean categorical semantics by generalizing to new settings where e-graphs are not standardly applied.
References
Footnotes
Instead of passing to a mere relation of weak equivalence, which forgets how two diagrams are weakly equivalent, the believing structuralist would rather localize at the initial functors, formally inverting that class of morphisms within the category of diagrams. In a paper led by Kevin Carlson (Carlson et al. 2024), we proved that localizing at initial functors is equivalent to localizing at the largest class of diagram morphisms that preserve solutions to equations (lifts against any discrete opfibration). This justifies defining weak equivalences of diagrams to be initial functors.↩︎
Recall from the first post that a diagram commutes if, for every pair of parallel morphisms in the diagram’s shape, their images in the target category are equal.↩︎
Calling a chosen representative of an equivalence class “canonical” is traditional in computer science but should not be confused with “canonical” as a term of art in mathematics. Far from being mathematically canonical, the choice of elements to play the role of canonical e-class IDs is completely arbitrary. In a union-find data structure, the canonical IDs are usually taken to be a subset of the set of all IDs—a set-theoretic distinction meaningless in structuralist mathematics. What is canonical is the ideal e-diagram introduced later, since it is unique up to unique isomorphism.↩︎
Writing initial functors with “
” arrows and discrete opfibrations with “ ” arrows calls to mind the ur-example of a factorization system, the epi-mono factorization in . However, the analogy runs deeper than that: the comprehensive factorization in categorifies the epi-mono factorization in . A discrete opfibration (as well as a discrete fibration) is a map into a 1-category whose fibers are 0-categories, i.e., sets, while an injection is a map into a 0-category whose fibers are (-1)-categories, i.e., propositions. Moreover, an initial functor is a functor such that each comma category (“lax pullback”) is non-empty and connected, while a surjection is a function such that each pullback is nonempty. I thank David Jaz Myers for pointing out this analogy to me.↩︎In the e-graph engine
egglog
, these heuristics take of the form of schedules for applying rewrite rules.↩︎