Representation and Invariance of Scientific Structures1 serves as a summary and culmination of the life work of Patrick Suppes (1922–2014), long-serving Professor of Philosophy at Stanford. It is the last book that Suppes published during his prolific career as a researcher in psychology, education, and philosophy of science.2 Early drafts, entitled Set-theoretical Structures in Science, emerged from lecture notes in 1962, but the book did not officially appear until forty years later. Although it has already, and unjustly, fallen out of print, Representation and Invariance remains freely available in digital form through CSLI Publications.
Patrick Suppes pioneered what is now called the semantic, or model-theoretic, view of scientific theories. According to Suppes, scientific theorizing involves the construction of a hierarchy of interlinked set-theoretic models—models of theories, models of experiments, and models of data. The model-theoretic view, and the associated ideas of representation and invariance, have been influential among philosophers of science, with the book under review receiving the 2003 Lakatos Award. Yet these ideas are little appreciated by working scientists or statisticians. Even within the philosophy of science, Suppes’ emphasis on probabilistic and statistical models has been largely lost by his philosophical descendants.
I therefore begin the review by explaining how the notion of a logical theory and its models, introduced by Alfred Tarski for purely mathematical purposes, can be reinterpreted in the empirical context of science. This leads naturally to the ideas of representation and invariance theorems, in both mathematical and scientific contexts, and finally to the content of Representation and Invariance, which is, as the title suggests, all about representation and invariance in many branches of science, from geometry and physics to psychology and linguistics.
Models in mathematics and science
Mathematical logic studies the interplay between syntax, the symbolic manipulation of formal languages, and semantics, the assignment of meaning to sentences expressed in a formal language. Corresponding to these categories are the two fundamental entities of logic, theories and models.
A logical theory, inhabiting the world of syntax, defines a language of basic types, functions, and predicates, as well as the axioms that they must obey. A typical theory is the theory of groups, consisting of a single type $G$, a binary function symbol $\cdot: G \times G \to G$ (multiplication), a unary function symbol $(-)^{-1}: G \to G$ (inverse), and a nullary function symbol, or constant, $e: G$ (identity), together with the axioms defining a group (the laws of associativity, left and right identity, and left and right inverse), expressed in a suitable formal language.3 For instance, the associative law might be written as:
$$\forall x, y, z: G.\quad (x \cdot y) \cdot z = x \cdot (y \cdot z).$$
A set-theoretic model of a theory, inhabiting the world of semantics, assigns to each of the theory’s basic types, functions, and predicates a corresponding set, function, or relation, respectively, which have the appropriate domains and codomains and satisfy the axioms. Thus, a model of the theory of groups consists of a set $G$, a binary operation on $G$, and so on, satisfying the group axioms. Unsurprisingly, then, a model of the theory of groups is just a group in the ordinary sense.
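To make the definition concrete, here is a minimal Python sketch that checks by brute force whether a finite structure is a model of the theory of groups. The function name `is_group_model` and the two example structures are illustrative choices, not anything taken from the book.

```python
from itertools import product

def is_group_model(carrier, mul, inv, e):
    """Check the group axioms by brute force on a finite structure."""
    elems = list(carrier)
    # The operations must have the right codomain: results stay in the carrier.
    closed = (e in carrier
              and all(mul(x, y) in carrier for x, y in product(elems, repeat=2))
              and all(inv(x) in carrier for x in elems))
    if not closed:
        return False
    assoc = all(mul(mul(x, y), z) == mul(x, mul(y, z))
                for x, y, z in product(elems, repeat=3))
    ident = all(mul(e, x) == x and mul(x, e) == x for x in elems)
    inverse = all(mul(inv(x), x) == e and mul(x, inv(x)) == e for x in elems)
    return assoc and ident and inverse

# Integers modulo 5 under addition: a model of the theory.
Z5 = set(range(5))
z5_ok = is_group_model(Z5, lambda x, y: (x + y) % 5, lambda x: (-x) % 5, 0)

# Same carrier under subtraction: associativity fails, so not a model.
sub_ok = is_group_model(Z5, lambda x, y: (x - y) % 5, lambda x: x, 0)
```

The same symbols of the theory receive two different interpretations here; only the first satisfies the axioms, and so only the first is a model.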
The idea of a theory and its models is fundamental to mathematical logic and is known to every student of the subject. What is less commonly known is how the logical notion of “model” figures in the empirical sciences. I argue, following many others, that scientists construct empirical models of scientific theories, with the word “model” understood in a sense compatible with logic. Patrick Suppes is generally credited as being the first to point this out, in a paper entitled “A comparison of the meaning and uses of models in mathematics and the empirical sciences.”4 This paper is reproduced almost verbatim in Section 2.1 of Representation and Invariance.
Consider as paradigm the scientific theory of classical particle mechanics. Physics textbooks rarely, if ever, present Newtonian mechanics in a formal, axiomatic style, but doing so is fairly straightforward.5 In an axiomatization of the theory, we might have a type $P$, representing the particles in the system; a type $T$, representing the time interval under consideration; a type $N$, indexing the external forces; a mass function $m: P \to \mathbb{R}$; a position function $x: P \times T \to \mathbb{R}^3$; an internal force function $f: P \times P \times T \to \mathbb{R}^3$, giving the force exerted by one particle on another at a given time; and an external force function $g: P \times T \times N \to \mathbb{R}^3$, giving the external force on a particle at a given time. Axioms would include positivity of the mass function, twice differentiability of the position function, and, of course, Newton’s three laws. For example, Newton’s third law, of equal and opposite reaction, reads:
$$\forall p, q: P,\ t: T.\quad f(p, q, t) = -f(q, p, t).$$
Suppes axiomatizes classical particle mechanics along these lines in Representation and Invariance, to which the interested reader is referred for further detail.6
Since the time of Newton, astronomers have constructed models of this theory to represent and predict the dynamics of the solar system. In a simple model of the solar system, the particle type $P$ is interpreted as a set with ten elements, corresponding to the Sun and its nine planets (or dwarf planets); the internal forces are given by the law of universal gravitation
$$f(p, q, t) = G\, m(p)\, m(q)\, \frac{x(q, t) - x(p, t)}{\lVert x(q, t) - x(p, t) \rVert^3}, \qquad p \neq q,$$
(with $f(p, q, t)$ read as the force on particle $p$ due to particle $q$), and there are no external forces, meaning that the index type $N$ is interpreted as the empty set.7 Given an assignment of the masses $m(p)$, the gravitational constant $G$, and the initial positions and velocities $x(p, t_0)$ and $\dot{x}(p, t_0)$, there exists a unique solution, on a time interval containing $t_0$, of the differential equation that is Newton’s second law. With this solution in hand, we obtain a model of the theory of classical particle mechanics.
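As a rough numerical illustration of how such a model is obtained, the following sketch integrates Newton's second law with gravitational internal forces and no external forces. Several simplifications are assumed that are not in the text: two bodies rather than ten, motion in a plane, units in which the gravitational constant is 1, and a velocity Verlet integrator. The names `accelerations` and `simulate` are hypothetical.

```python
import math

G = 1.0  # gravitational constant in arbitrary units (an assumption)

def accelerations(masses, pos):
    """Acceleration of each body from pairwise gravitational forces."""
    n = len(masses)
    acc = [[0.0, 0.0] for _ in range(n)]
    for p in range(n):
        for q in range(n):
            if p == q:
                continue
            dx = pos[q][0] - pos[p][0]
            dy = pos[q][1] - pos[p][1]
            r3 = math.hypot(dx, dy) ** 3
            # Force on p due to q, divided by the mass of p.
            acc[p][0] += G * masses[q] * dx / r3
            acc[p][1] += G * masses[q] * dy / r3
    return acc

def simulate(masses, pos, vel, dt, steps):
    """Velocity Verlet integration; returns position snapshots and final velocities."""
    pos = [list(x) for x in pos]
    vel = [list(v) for v in vel]
    acc = accelerations(masses, pos)
    history = [[tuple(x) for x in pos]]
    for _ in range(steps):
        for p in range(len(masses)):
            for k in range(2):
                vel[p][k] += 0.5 * dt * acc[p][k]  # half kick
                pos[p][k] += dt * vel[p][k]        # drift
        acc = accelerations(masses, pos)
        for p in range(len(masses)):
            for k in range(2):
                vel[p][k] += 0.5 * dt * acc[p][k]  # second half kick
        history.append([tuple(x) for x in pos])
    return history, vel

# A heavy "sun" and a light "planet", started with zero total momentum.
masses = [1.0, 1e-3]
pos0 = [(0.0, 0.0), (1.0, 0.0)]
vel0 = [(0.0, -1e-3), (0.0, 1.0)]
history, final_vel = simulate(masses, pos0, vel0, dt=0.01, steps=628)
```

The list `history` plays the role of the position function of the theoretical model, sampled on a fine time grid; the masses and the force law are fixed by the interpretation chosen above.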
To test the theory, astronomers would compare this model of the theory with another model built from empirical data. To avoid confusion between the two models, we call the former theoretical and the latter empirical.8 The empirical model is a model of the theory’s empirical reduct, or observable fragment: what remains of the theory when restricted to terms corresponding to observational or experimental data. In the case of classical particle mechanics, the empirical reduct would include the mass and position functions, which are estimated from observational data, but exclude the force functions, which are not measurable. In the empirical model, the type $T$ is interpreted as a finite set of time points $\{t_1, \dots, t_k\}$, corresponding to observations made over time. The empirical model is then determined by an assignment of masses $m(p)$ and positions $x(p, t_i)$ for each celestial body $p$. As often happens, the theoretical model is continuous and the empirical model is discrete, but a comparison can be made by embedding the latter in the former. The theory is accurate to the extent that the position functions of the two models agree.
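The comparison described above can be sketched in a few lines. In this hedged example the theoretical model is a stand-in (a unit circular orbit given in closed form), the "observations" are invented numbers rather than real data, and root-mean-square distance is just one possible measure of agreement. Embedding the discrete model in the continuous one amounts to evaluating the theoretical position function at the observed times.

```python
import math

def theoretical_position(t):
    """Stand-in theoretical model: a unit circular orbit with period 2*pi."""
    return (math.cos(t), math.sin(t))

# Hypothetical empirical model: finitely many time points with estimated
# positions. These values are made up for illustration only.
empirical = {
    0.0: (1.01, 0.00),
    0.5: (0.87, 0.49),
    1.0: (0.54, 0.83),
    1.5: (0.08, 1.00),
}

def rms_discrepancy(theory, data):
    """Root-mean-square distance between the two models at the observed times."""
    squared = [(theory(t)[0] - x) ** 2 + (theory(t)[1] - y) ** 2
               for t, (x, y) in data.items()]
    return math.sqrt(sum(squared) / len(squared))

error = rms_discrepancy(theoretical_position, empirical)

# Sanity check: data read off the theoretical model itself agree exactly.
exact = {t: theoretical_position(t) for t in empirical}
self_error = rms_discrepancy(theoretical_position, exact)
```

Whether a given value of `error` counts as agreement is a statistical question that depends on the measurement errors, and this sketch does not attempt to model them.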
In this way, the concept of a model, originating in mathematical logic, can help to explain how scientific theories are connected with experimental data. The picture I have sketched is suggestive but seriously incomplete. As every scientist knows, there is a large gap between what is directly observed and what counts as experimental data when evaluating a theory. We do not directly observe the positions of celestial bodies in space; we estimate them, with some degree of accuracy, from telescope images and other astronomical data. Considerable background theory is often needed to make any sense of the raw data at all. Thus, in another early paper, Suppes argues that the primary scientific theory must be complemented with a theory of the experiment and concomitant models of data.9 Another important question is how to quantify the agreement between theoretical and empirical models, using numerical and statistical analysis. To his credit, Suppes takes this issue seriously, especially in the just-cited paper, but subsequent philosophers have tended to downplay or ignore it. My impression is that, even after some sixty years of amplification and development, the model-theoretic view has not attained the status of a complete and satisfactory account of the scientific method (a tall order, admittedly). Nevertheless, I find it inspiring, and I hope to contribute to it in my own future research.
You may be wondering whether these esoteric ideas about theories and models have any practical consequences. In fact, the question of how scientific theories come to be connected with, and confirmed or disconfirmed by, experiments is central to the philosophy and methodology of science. To pursue this in depth would be too long a digression, but to illustrate let us briefly consider the implications for falsificationism.
According to Karl Popper, arguably the most influential 20th-century philosopher of science, scientific theories can never be confirmed, no matter the preponderance of positive evidence, but they can be disconfirmed, or falsified, by sufficiently strong negative evidence. Under a naive form of falsificationism, we might think that scientific theories can be directly falsified, but the model-theoretic view suggests otherwise. It is models of theories, not the theories themselves, that are tested by experiments. In general, it is possible to falsify a parametric family of models, but impossible to falsify the class of all models of the theory, for it is too large.10 It is therefore impossible to falsify the theory itself. Actually, if we take the model-theoretic view seriously, we must deny that the theory has any truth value at all; a theory is not true or false per se, but true or false only in a possible model. The history of science lends support to these logical considerations. When a successful theory fails to account for some new observation, the first and most common response is not to declare the theory falsified, but to look for new models of the theory that account for the anomaly. Famously, the existence of the planet Neptune was first hypothesized to explain deviations from the expected orbit of Uranus and only later confirmed by a painstaking search of the night sky. The theory of classical celestial mechanics is extremely flexible and is capable of producing an enormous range of models, simply by postulating the existence of new, unseen bodies.11
Much more deserves to be said about the role of models in mathematics and science, but alas, the heart of Representation and Invariance lies elsewhere. Only Chapters 1 and 2 study models in science from a general point of view, and this material serves mainly as motivation for the rest of the book. Mostly, the book is devoted to exhibiting the structure of specific mathematical and scientific theories. The theories are drawn from a wide range of disciplines, but the analysis is united by the goal of proving representation and invariance theorems. And this brings us to the main theme of the book.
Representation and invariance
Why should anyone go to the trouble of axiomatizing a scientific theory? If given a formalization of a scientific theory, what would you do with it?
One appealing answer is that mathematical scientists should try to do with formal scientific theories exactly what mathematicians do with their own theories: construct and, if possible, classify their models. Whenever an interesting new abstract theory crops up, mathematicians will explore the consequences of its axioms, proving various theorems, but they will also try to construct new models of the theory, which can serve as counterexamples or point towards unexpected phenomena. If luck is on the mathematicians’ side, this process of constructing new models will eventually culminate in a characterization or even a classification of the theory’s models, up to isomorphism. The result is a representation theorem.
To state this point more carefully, a representation theorem for an axiomatized theory asserts that every model of the theory is isomorphic to a model belonging to some restricted subclass of models, where “isomorphic” refers to the notion of isomorphism defined by the theory.12 Here are a few famous representation theorems from the mathematical sciences, some of which appear in Representation and Invariance:
- Every finite-dimensional real vector space is isomorphic to $\mathbb{R}^n$ for some $n$.
- Hurwitz’s theorem: Every finite-dimensional normed division algebra over the real numbers is isomorphic to the real numbers, the complex numbers, the quaternions, or the octonions.
- Cayley’s theorem: Every group is isomorphic to a transformation group, i.e., is embedded in a symmetric group.
- Yoneda’s lemma:13 Every (locally small) category $\mathcal{C}$ is embedded in the category of contravariant functors $\mathcal{C}^{\mathrm{op}} \to \mathbf{Set}$.
- Fundamental theorem of finite abelian groups: Every finite abelian group is isomorphic to a direct sum of cyclic groups of prime-power order.
- Classification of simple finite groups: Every simple finite group is isomorphic to a cyclic group of prime order, an alternating group of degree at least 5, a group of Lie type, or one of a small class of exceptional groups, the “sporadic groups.”
- Kleene’s theorem: Every regular language is decidable by a finite automaton.
- Church-Turing thesis (or part of it): Every partial recursive function is computable by a Turing machine.
- Newton’s shell theorem: A spherically symmetric body exerts a gravitational force equal to that of a point mass at its center with the same total mass, hence every system of noncolliding spherical bodies is equivalent to a system of particles.
- Every system of classical particle mechanics is equivalent to a subsystem of a closed system (i.e., a system with no net external force on any particle).
- Cox’s theorem: Every system of assigning numerical plausibilities to logical propositions, in accordance with certain postulates of “common sense,” is equivalent to a (finitely additive) probability measure.
- de Finetti’s theorem: Every exchangeable sequence of Bernoulli random variables is a mixture of sequences of independent and identically distributed Bernoulli random variables.
Apparently I like cataloging representation theorems, as this list has grown longer than I intended. Even so, I have left out many excellent theorems and the reader is invited to think of their own favorites.
We should draw several lessons from these examples. First, representation theorems need not be deep, though sometimes they are very deep. The difficulty of a representation theorem can range from easy, as in Cayley’s theorem, whose proof is about a paragraph long, to extraordinarily challenging, as in the classification of simple finite groups, whose proof is some 10,000 pages long and is perhaps the most complicated proof ever devised by human beings. As the case of groups also shows, representation theorems are far from unique. Any subclass of models can be chosen to represent the class of all models, provided it contains a member of every isomorphism class. In particular, there is always a trivial representation consisting of all models of the theory. At the other extreme, a classification is a minimal representation theorem, whose representing class contains no two isomorphic members. Whether a representation theorem is interesting depends on whether the representing class has real conceptual content. Thus, the discovery of representation theorems is not a mechanical process; it requires creativity and subject-matter knowledge.
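Cayley's theorem is simple enough that its proof can be carried out mechanically for a small example. The sketch below sends each group element to the permutation given by left multiplication and then checks that the result is an injective homomorphism into a symmetric group. The Klein four-group is an arbitrary choice of example, and `cayley_embedding` and `compose` are hypothetical helper names.

```python
from itertools import product

def cayley_embedding(elems, mul):
    """Send each g to the permutation x -> g*x, as a tuple of indices into elems."""
    index = {x: i for i, x in enumerate(elems)}
    return {g: tuple(index[mul(g, x)] for x in elems) for g in elems}

def compose(s, t):
    """Composition of permutations: (s after t)(i) = s[t[i]]."""
    return tuple(s[i] for i in t)

# Example: the Klein four-group, realized as bit pairs under XOR.
V4 = [(0, 0), (0, 1), (1, 0), (1, 1)]
xor = lambda a, b: (a[0] ^ b[0], a[1] ^ b[1])
rep = cayley_embedding(V4, xor)

# Each image is a genuine permutation of the four indices.
is_permutation = all(sorted(p) == list(range(len(V4))) for p in rep.values())
# Distinct group elements give distinct permutations.
is_injective = len(set(rep.values())) == len(V4)
# Multiplication in the group corresponds to composition of permutations.
is_homomorphism = all(rep[xor(g, h)] == compose(rep[g], rep[h])
                      for g, h in product(V4, repeat=2))
```

Here the representing class is the class of permutation groups, and the check confirms that this particular group is isomorphic to a member of it.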
Once a representation theorem has been established, the natural next questions to ask are whether multiple representations exist and, if so, how they are related. For Suppes, such questions are answered by “invariance theorems.” Given their centrality to Representation and Invariance, it is surprisingly difficult to find out, with any degree of precision, what an invariance theorem actually is. The closest I can find to a general definition is buried in a section about invariance in theories of measurement:
A representation theorem should ordinarily be accompanied by a matching invariance theorem stating the degree to which a representation of a structure is unique. In the mathematically simple and direct cases it is easy to identify the group as some well-known group of transformations. For more complicated structures, for example, structures that satisfy the axioms of a scientific theory, it may be necessary to introduce more complicated apparatus, but the objective is the same, to wit, to characterize meaningful concepts in terms of invariance.14
I will try to explain invariance more carefully, but first, to illustrate the scope of the concept, I list a few examples of invariance theorems, some of which are treated in the book:
- The representation of a finite-dimensional real vector space as Cartesian space $\mathbb{R}^n$ is invariant under the group of invertible linear transformations of $\mathbb{R}^n$.15
- The representation of a Euclidean space, with relations of betweenness and equidistance, as Cartesian space is invariant under the group of similarity transformations (rotations, reflections, translations, and uniform scalings).
- The representation of classical spacetime as $\mathbb{R}^4$ (with three space dimensions and one time dimension) is invariant under the group of Galilean transformations.16
- The representation of relativistic spacetime as $\mathbb{R}^4$ is invariant under the group of Lorentz transformations.
- Two finite-dimensional vector spaces are isomorphic if and only if they have the same dimension.
- If two topological manifolds are isomorphic, then they have the same dimension.
- Ornstein isomorphism theorem: Two Bernoulli schemes are isomorphic (as measure-preserving dynamical systems) if and only if they have the same entropy.
- Conservation of energy: In a system of classical particle mechanics whose forces are given by a potential energy function, not explicitly depending on time, the total energy is conserved.
- Conservation of momentum: In a system of classical particle mechanics with no external forces, the total momentum is conserved.
It seems that, for Suppes, invariance theorems can take many different forms. So what is invariance, anyway?
The most fundamental kind of invariance theorem, represented by the first four examples, explains the degree to which the representing isomorphisms in a representation theorem are unique. This is easiest to understand when the representation theorem is a classification, as in the examples above. In this case, for any model $M$, there is exactly one isomorphic model $R$ in the representing class, but there may be, and usually are, many different representing isomorphisms between $M$ and $R$. Indeed, suppose $f: M \to R$ is one such isomorphism. For every automorphism $\alpha$ of $R$ (isomorphism from $R$ to itself), the composite $\alpha \circ f$ is another isomorphism between $M$ and $R$. Conversely, any two isomorphisms $f, g: M \to R$ are related in this way: the composite $\alpha = g \circ f^{-1}$ is an automorphism of $R$ satisfying $g = \alpha \circ f$. The automorphism group of $R$ thus contains, via its action on representations, all possible information about the uniqueness, or lack of it, in representing any model $M$ by the model $R$. An invariance theorem is a description of the automorphism group of each representing model in a classification.17
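The relationship between representing isomorphisms and automorphisms can be verified by brute force in a small case. In this sketch the model $M$ is taken to be the units modulo 5 under multiplication and the representing model $R$ the integers modulo 4 under addition; both choices, and the helper names `isomorphisms` and `key`, are assumptions made for illustration.

```python
from itertools import permutations

def isomorphisms(A, op_a, B, op_b):
    """All bijections f: A -> B with f(x*y) = f(x)*f(y), found by brute force."""
    found = []
    for image in permutations(B):
        f = dict(zip(A, image))
        if all(f[op_a(x, y)] == op_b(f[x], f[y]) for x in A for y in A):
            found.append(f)
    return found

def key(f):
    """Hashable form of a finite map, so that maps can be collected in sets."""
    return tuple(sorted(f.items()))

M = [1, 2, 3, 4]                      # units modulo 5 under multiplication
R = [0, 1, 2, 3]                      # integers modulo 4 under addition
times5 = lambda x, y: (x * y) % 5
plus4 = lambda x, y: (x + y) % 4

isos = isomorphisms(M, times5, R, plus4)   # representing isomorphisms M -> R
autos = isomorphisms(R, plus4, R, plus4)   # automorphism group of R

# Fix one isomorphism f and compose it with every automorphism of R.
f = isos[0]
composites = {key({x: alpha[f[x]] for x in M}) for alpha in autos}
all_isos = {key(g) for g in isos}
```

In this example `composites` and `all_isos` coincide: the representing isomorphisms are exactly the composites of one fixed isomorphism with the automorphisms of $R$, so there are as many of them as there are automorphisms.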
Invariance theorems are studies in symmetry. The automorphism group of a model is its group of symmetries: the transformations of the model into itself that preserve the operations of the theory. Highly symmetric models admit many different representing isomorphisms, whereas models with little symmetry admit few representing isomorphisms. In the extreme case, the representing model has a trivial automorphism group, containing only the identity transformation, and the representing isomorphisms are unique. The natural numbers $\mathbb{N}$, as a model of Peano arithmetic, are an example: the only automorphism of $\mathbb{N}$ is the identity. However, this situation is not very common. The models we create in mathematics and physics tend to be highly symmetric, as in the four geometric examples given above. The Cartesian model of Euclidean geometry, for instance, admits a large group of geometric symmetries, the similarity transformations. In cases like this, where we have a preferred model, it is especially important to characterize its symmetries by an invariance theorem. If we wish to study a theory by studying a particular model, we must ensure that any notions defined in terms of the model are invariant under its automorphisms. Otherwise, we are not studying the theory, but only some irrelevant detail of the model.
This brings us to the second kind of invariance theorem in Representation and Invariance, concerning “invariants” in the sense familiar to mathematicians. An invariant is a function on the space of models of a theory that takes equal values on isomorphic models. Suppes occasionally refers to theorems establishing an invariant as “invariance theorems.” The next three examples in the list, about the dimensions of vector spaces and topological manifolds and the entropies of Bernoulli processes, are invariance theorems of this kind. An invariant is complete if it takes equal values only on isomorphic models. Complete invariants are coveted because they classify models up to isomorphism. Dimension is a complete invariant for (finite-dimensional) vector spaces, but is not complete for manifolds.
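The difference between an invariant and a complete invariant can be seen in miniature with finite groups, an example chosen here for convenience rather than drawn from the book. Cardinality is an invariant of finite groups, but it is not complete: the cyclic group of order 4 and the Klein four-group have the same cardinality, yet a second invariant, the sorted list of element orders, shows that they are not isomorphic. The helper `element_orders` is a hypothetical name.

```python
def element_orders(elems, mul, e):
    """Sorted list of element orders, an isomorphism invariant of a finite group."""
    orders = []
    for x in elems:
        n, y = 1, x
        while y != e:       # multiply by x until the identity is reached
            y = mul(y, x)
            n += 1
        orders.append(n)
    return sorted(orders)

Z4 = [0, 1, 2, 3]                            # cyclic group of order 4
V4 = [(0, 0), (0, 1), (1, 0), (1, 1)]        # Klein four-group
add4 = lambda a, b: (a + b) % 4
xor = lambda a, b: (a[0] ^ b[0], a[1] ^ b[1])

same_cardinality = len(Z4) == len(V4)
orders_z4 = element_orders(Z4, add4, 0)
orders_v4 = element_orders(V4, xor, (0, 0))
```

Since isomorphic groups must have the same element orders, the differing lists rule out an isomorphism even though the cardinalities agree.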
The third and final kind of invariance theorem, on display in the last two examples, establishes a conservation law for models of a dynamical theory. At first glance, a conserved quantity may look like an invariant, inasmuch as both involve something that “stays constant” or “remains unchanged,” but this impression is misleading. An invariant stays constant under isomorphism between different models, whereas a conserved quantity stays constant under time evolution within a single model. (In particular, conservation laws only make sense for theories with a time component or an analogous notion of dynamics.) Conservation laws are, if anything, more closely related to symmetry and to invariance theorems in the original sense. According to Noether’s first theorem, under certain technical conditions, there is a one-to-one correspondence between one-parameter symmetry groups and conservation laws.18 For instance, conservation of energy corresponds to time translation symmetry and conservation of momentum to space translation symmetry. But you should not conclude from this that any conservation law is just a theorem about symmetry groups in disguise. Noether’s theorem applies only to continuous symmetries of variational problems, as found in Lagrangian and Hamiltonian mechanics. In the absence of a variational principle, there is, as far as I know, no general relationship between symmetries and conservation laws.19 Noether’s theorem is mentioned only in passing in Representation and Invariance.
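To see what "constant under time evolution within a single model" means, consider conservation of momentum. As a sketch, assume a finite system with constant masses and no external forces, and write $m(p)$ for the mass of particle $p$, $x(p, t)$ for its position, and $f(p, q, t)$ for the force on $p$ due to $q$ (notation adopted here for illustration). Newton's second law gives the second equality below, and Newton's third law, $f(p, q, t) = -f(q, p, t)$, makes each pair of terms cancel:

$$\frac{d}{dt} \sum_{p} m(p)\, \dot{x}(p, t) = \sum_{p} m(p)\, \ddot{x}(p, t) = \sum_{p} \sum_{q \neq p} f(p, q, t) = \sum_{\{p, q\},\ p \neq q} \bigl( f(p, q, t) + f(q, p, t) \bigr) = 0.$$

The total momentum is therefore the same at every time in any one model, with no appeal to a symmetry of the force law.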
In summary, then, Representation and Invariance features at least three kinds of invariance theorems. They are all related to symmetry and invariance, but they are not, in general, reducible to each other.
Contents of the book
You should now have a good idea of what this book is about and what its major themes are. Now let us turn to the contents of the book itself. Contrary to what my long-winded introduction might suggest, Representation and Invariance is not mainly a book about representation and invariance in general, but about specific representation theorems and invariance theorems in the sciences. The aim seems to be to demonstrate the utility and ubiquity of the ideas of representation and invariance by showing how they manifest in diverse branches of the physical and social sciences.
In a memorable passage Suppes declares himself to be an “unreconstructed pluralist.”20 Though made in the context of reductionism in science, this remark fairly sums up the spirit of the book and the author’s philosophical work. In my view, its pluralistic spirit is both the greatest strength and the greatest weakness of Representation and Invariance. Suppes draws on his unusually broad set of interests and published work to present a wide array of examples from geometry, probability theory, classical physics, quantum physics, psychology, linguistics, and measurement theory. He never lets a rigid theoretical framework get in the way of a compelling example. Nor does he fall into the trap, so easy in the philosophy of science, of caricaturing the scientific method or the content of science through an excessive reductivism. But by the same token, in his treatments of models in science, and of representation and invariance, Suppes makes no pretense at systematicity. My analysis of what constitutes an invariance theorem, incomplete though it is, is more systematic than any to be found in Representation and Invariance. Readers looking for a systematic account of how models are used in science, or of the role played by representation and invariance theorems, will be disappointed. What readers will find is a delightful zoo of representation and invariance theorems, presented with mathematical detail and in historical context.
The book’s content is organized into four chapters about the general concepts of models, representation, and invariance, followed by four chapters about representation and invariance theorems in different disciplines. Even the early Chapters 3 and 4 are amply illustrated by examples from measurement theory. Measurement theory is the study of how qualitative relations can be represented quantitatively, motivated by psychometrics. It forms an important part of Suppes’ life work, and I will not do it justice here. The monumental Chapter 5, spanning some 130 pages, is about the representation and philosophical interpretation of probability theory. Chapters 6 and 7 are about representations of space and time and representations in mechanics, respectively. These categories are construed unusually broadly to include not just classical and quantum physics, but also philosophical and psychological aspects of space and time. For instance, there is a fascinating section about whether visual space (space as perceived by human beings) has a Euclidean geometry or some other kind of geometry.21 Representation and Invariance concludes with another doorstopper, a long Chapter 8 on representations of language.
The content of Representation and Invariance tends to fall into one of two categories: fairly standard, textbook-level material from a particular field, or more esoteric, original material based on the published research of Suppes and his collaborators. Yet even the “textbook” material is presented with a unique gloss. For instance, Suppes recapitulates many famous results from physics, but in an axiomatic, synthetic style no longer found in physics textbooks. The prerequisites on the part of the reader are minimal; prior knowledge of the very basics of logic and probability should suffice. Moreover, the book’s episodic style means that if you get lost or bored by a particular section, you can usually move on to the next section without loss of continuity.
Representation and Invariance is eclectic and stimulating, the product of a lifetime of intellectual wandering that is, perhaps, no longer possible within today’s academy. Given its breadth of topics, I doubt there is any single person who could not find something to learn from this book. I would recommend it to anyone curious about how ideas from mathematical logic manifest in the empirical sciences.
Patrick Suppes, 2002. Representation and Invariance of Scientific Structures. CSLI Publications. PDF.
The collected papers of Patrick Suppes are freely available through the Suppes Corpus, hosted by Stanford University.
Patrick Suppes, 1960. “A comparison of the meaning and uses of models in mathematics and the empirical sciences.” Synthese, Vol. 12, No. 2/3, pp. 287-301. In: Suppes Corpus.
Granted, it is more straightforward in principle than in practice. The trouble with formalization in first-order predicate logic with identity, or “standard formalization” as Suppes calls it, is that scientific theories presuppose a good deal of mathematics, which must be encoded in elementary logic before the theory can be expressed (Suppes, 2002, Sec. 2.2). Classical mechanics assumes at least the construction of the real numbers and calculus, while quantum mechanics requires considerably more, in the vein of functional analysis and probability theory. Suppes sidesteps this problem by working in informal set theory, in much the same style as everyday mathematics. His official slogan is that “to axiomatize a theory is to define a set-theoretical predicate” (Suppes, 2002, Sec. 2.3). That is fine as far as it goes, but clearly departs from the ideal of complete formalization. A truly practical foundation of mathematics is needed to formalize not just mathematics but scientific knowledge as well.
To be more explicit about our assumptions, we could combine the theory of classical particle mechanics with an axiomatization of universal gravitation, yielding a more specialized theory which might be called the theory of classical celestial mechanics.
Patrick Suppes, 1962. “Models of data.” Logic, Methodology and Philosophy of Science: Proceedings of the 1960 International Congress. In: Suppes Corpus.
It is possible, using statistical theory, to make mathematical sense out of claims that a certain class of models is “small enough” or “too large” to test empirically. For an explicit link between falsificationism and VC theory (computational learning theory), see: David Corfield, Bernhard Schölkopf, and Vladimir Vapnik, 2009. “Falsificationism and statistical learning theory: Comparing the Popper and Vapnik-Chervonenkis dimensions.” Journal for General Philosophy of Science, Vol. 40, No. 1, pp. 51-58. DOI.
Lest this remark seem unfair, I should point out that physicists have rescued contemporary cosmological theory using exactly this device: positing the existence of vast amounts of unseen matter. So far all attempts to detect this famous dark matter have failed, and its existence remains purely hypothetical.
More precisely, the statement given is the Yoneda embedding, usually treated as a special case of the general Yoneda lemma. The Yoneda embedding generalizes Cayley’s theorem as follows. View a group $G$ as a category with one object and invertible morphisms. Then the category of functors $G^{\mathrm{op}} \to \mathbf{Set}$ is just the category of $G$-sets and $G$-equivariant maps. Thus, an embedding of $G$ (as a category) into this functor category is an embedding of $G$ (as a group) into the automorphism group of some $G$-set $X$. In particular, $G$ is embedded in the symmetric group on $X$. (What is this mysterious $G$-set $X$? It is simply $G$ acting on itself, as you can see from the proofs of Cayley’s or Yoneda’s embedding.)
The precise meaning of this statement is: for any two isomorphisms $f, g: V \to \mathbb{R}^n$, there exists an invertible linear transformation $A: \mathbb{R}^n \to \mathbb{R}^n$ such that $g = A \circ f$, and, conversely, for any isomorphism $f: V \to \mathbb{R}^n$ and any invertible linear transformation $A: \mathbb{R}^n \to \mathbb{R}^n$, the composition $A \circ f$ is another isomorphism. The next several invariance theorems are interpreted similarly, but are less trivial.
For this and the subsequent invariance theorem to make sense, classical and relativistic spacetime must be axiomatized. To do this, Suppes introduces frames of reference, a quantitative device familiar to physicists, but also provides references to purely qualitative axiomatizations, in the style of classical Euclidean geometry.
What about representation theorems which are not classifications, whose representing classes contain isomorphic models? In such cases, to determine the degree of uniqueness of representation we must understand more than the automorphisms of representing models. We must also understand the isomorphisms between distinct representing models. The appropriate notion of symmetry is not a group but a groupoid, which we might call the isomorphism groupoid. Such cases are not treated in Representation and Invariance and seem to be less well understood. Yet examples abound if one looks for them. Can you think of any?
For a mathematically respectable but still readable treatment of Noether’s theorems, see: Peter Olver, 1993. Applications of Lie Groups to Differential Equations. 2nd edition. Springer-Verlag New York.
To appreciate the subtlety, notice that the theory of classical particle mechanics always satisfies conservation of momentum, assuming there are no external forces, but it does not require that the forces be translationally invariant or be derived from a translationally invariant potential energy. This is not a contradiction to Noether’s theorem because Newtonian mechanics, in its traditional “three laws” form, is not derived from a variational principle. Instead, conservation of momentum is “baked into” the theory, a nearly immediate consequence of Newton’s third law (Suppes, 2002, p. 325). The Lagrangian and Hamiltonian versions of classical mechanics, by contrast, are based on a variational principle and do not take Newton’s third law as an axiom. Conservation of momentum no longer holds automatically but does hold when the potential energy has space translation symmetry. Revealing conservation laws to be a manifestation of symmetry is just one way in which Lagrangian and Hamiltonian mechanics improve upon Newton’s original formulation.