Symbolic regression

Symbolic regression is regression over spaces of mathematical expressions, i.e. the functional form of the model is searched for along with its parameters. Most existing systems are based on genetic programming; in fact, some authors incorrectly treat the two terms as synonymous.
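
To make the search space concrete, here is a toy sketch (random search over small expression trees, not the algorithm of any system listed below): an expression is a tree of primitive operators over a variable and constants, scored by its fit to data. Genetic-programming systems such as Eureqa replace the blind random search with mutation, crossover, and selection, usually under a complexity penalty.

```python
# Toy symbolic regression as random search over small expression trees.
# Real systems (e.g. Eureqa, via genetic programming) evolve trees with
# mutation, crossover, and selection, trading off error vs. complexity.
import math
import random

OPS = {"+": lambda a, b: a + b,
       "-": lambda a, b: a - b,
       "*": lambda a, b: a * b}
LEAVES = ["x", 1.0, 2.0, math.pi]

def random_tree(depth=3):
    """A tree is either a leaf or a tuple (op, left, right)."""
    if depth == 0 or random.random() < 0.3:
        return random.choice(LEAVES)
    return (random.choice(list(OPS)), random_tree(depth - 1), random_tree(depth - 1))

def evaluate(tree, x):
    if tree == "x":
        return x
    if isinstance(tree, float):
        return tree
    op, left, right = tree
    return OPS[op](evaluate(left, x), evaluate(right, x))

def mse(tree, xs, ys):
    return sum((evaluate(tree, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# "Law" to rediscover from data: y = x**2 + 2*x (noise-free for simplicity).
xs = [i / 10 for i in range(-20, 21)]
ys = [x * x + 2 * x for x in xs]

best = min((random_tree() for _ in range(20000)), key=lambda t: mse(t, xs, ys))
print(best, mse(best, xs, ys))
```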

Literature

  • Schmidt & Lipson, 2009: Distilling free-form natural laws from experimental data (doi)
  • Brunton, Proctor, Kutz, 2016: Discovering governing equations from data by sparse identification of nonlinear dynamical systems (doi, arxiv)

Projects

  • Eureqa: Developed at Cornell, commercialized by Nutonian
    • Based on genetic programming
    • Schmidt & Lipson, 2009 [original paper, see above]
    • Stoutemyer, 2013: Can the Eureqa symbolic regression program, computer algebra, and numerical analysis help each other? (pdf, arxiv)
  • HeuristicLab
    • Framework for metaheuristics, especially genetic programming
    • Includes modules for symbolic regression and symbolic classification
    • Wagner & Affenzeller, 2005: HeuristicLab: A generic and extensible optimization environment (doi)
  • Mathgen (see also SCIgen and snarXiv)
    • Generates random “math papers,” including equations
    • Equations are nonsensical and not always syntactically valid

Related topics

This topic is obviously related to computer algebra. It is also related, more loosely, to the assortment of other topics in statistics and ML collected below.

Learning neural network topologies

  • Kwok & Yeung, 1997: Constructive algorithms for structure learning in feedforward neural networks for regression problems (doi, pdf)
  • NeuroEvolution of Augmenting Topologies (NEAT)
    • NEAT: Stanley & Miikkulainen, 2002: Evolving neural networks through augmenting topologies (doi, tech report)
    • HyperNEAT: Stanley et al, 2009: A hypercube-based encoding for evolving large-scale neural networks (doi, pdf)
    • ES-HyperNEAT: Risi & Stanley, 2012: An enhanced hypercube-based encoding for evolving the placement, density, and connectivity of neurons (doi)
  • Cortes et al, 2016: AdaNet: Adaptive structural learning of artificial neural networks (arxiv)
    • Inspired by boosting in ML, cf. references below
    • Friedman, 2001: Greedy function approximation: A gradient boosting machine (doi)
    • Grubb & Bagnell, 2011: Generalized boosting algorithms for convex optimization (arxiv)

Bayesian nonparametrics

  • Adams, Wallach, Ghahramani, 2010: Learning the structure of deep sparse graphical models (arxiv, pdf)

Bayesian hyperparameter optimization (wiki)

An example of combinatorial search guided by Bayesian inference.
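
A minimal 1-D sketch of the surrogate-model loop (in the spirit of the GP-based approach of Snoek et al. below, not their implementation): fit a Gaussian process to the (hyperparameter, loss) pairs seen so far, then evaluate next wherever expected improvement is largest. The loss function and learning-rate grid here are made-up stand-ins for an expensive training run.

```python
# Minimal 1-D Bayesian optimization with a Gaussian-process surrogate and
# expected improvement; the "loss" is a hypothetical stand-in objective.
import numpy as np
from scipy.stats import norm

def rbf(a, b, length=0.3):
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / length) ** 2)

def gp_posterior(x_obs, y_obs, x_new, jitter=1e-6):
    K = rbf(x_obs, x_obs) + jitter * np.eye(len(x_obs))
    Ks = rbf(x_obs, x_new)
    solve = np.linalg.solve(K, Ks)                    # K^{-1} k_*
    mu = solve.T @ y_obs
    var = np.diag(rbf(x_new, x_new)) - np.sum(Ks * solve, axis=0)
    return mu, np.sqrt(np.maximum(var, 1e-12))

def expected_improvement(mu, sigma, best):
    z = (best - mu) / sigma                           # we are minimizing the loss
    return (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

def loss(lr):                                         # stand-in for a training run
    return (np.log10(lr) + 2.0) ** 2 + 0.05 * np.random.randn()

grid = np.logspace(-5, 0, 200)                        # candidate learning rates
x_obs = np.array([1e-5, 1e-1])                        # two initial evaluations
y_obs = np.array([loss(x) for x in x_obs])

for _ in range(10):
    mu, sigma = gp_posterior(np.log10(x_obs), y_obs, np.log10(grid))
    nxt = grid[np.argmax(expected_improvement(mu, sigma, y_obs.min()))]
    x_obs, y_obs = np.append(x_obs, nxt), np.append(y_obs, loss(nxt))

print("best learning rate found:", x_obs[np.argmin(y_obs)])
```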

  • Snoek, Larochelle, Adams, 2012: Practical Bayesian optimization of ML algorithms (arxiv, GitHub)
    • Uses Gaussian processes
  • Bergstra, Yamins, Cox, 2013: Making a science of model search: Hyperparameter optimization in hundreds of dimensions for vision architectures (pdf)
    • Tree-structured Parzen estimator

Criticism by B. Recht and K. Jamieson (1, 2):

  • Li, Jamieson, et al, 2016: Hyperband: A novel bandit-based approach to hyperparameter optimization (pdf, arxiv)

Bayesian numerical analysis

Using Bayesian analysis when there is no sampling error, only uncertainty from performing finitely many computations. See the webpage of Hennig & Osborne.
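
One concrete instance is Bayesian quadrature (the Bayesian Monte Carlo of Rasmussen & Ghahramani below): place a GP prior on the integrand, condition on a handful of evaluations at nodes x_1, ..., x_n, and the integral itself acquires a Gaussian posterior. The standard result, stated without derivation:

```latex
% f ~ GP(0, k), observed at nodes X = (x_1, ..., x_n); target Z = \int f(x)\,d\pi(x).
\begin{aligned}
\mathbb{E}[Z \mid f(X)] &= z^{\top} K^{-1} f(X),
  \qquad z_i = \int k(x, x_i)\, d\pi(x), \quad K_{ij} = k(x_i, x_j), \\
\operatorname{Var}[Z \mid f(X)] &= \iint k(x, x')\, d\pi(x)\, d\pi(x') \;-\; z^{\top} K^{-1} z .
\end{aligned}
```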

  • Diaconis, 1988: Bayesian numerical analysis (pdf, tech report)
  • Rasmussen & Ghahramani, 2003: Bayesian Monte Carlo (pdf)
  • Briol et al, 2016: Probabilistic integration: A role for statisticians in numerical analysis? (arxiv)
  • Cockayne et al, 2017: Bayesian probabilistic numerical methods (arxiv)

Automatic differentiation

  • Baydin et al, 2015: Automatic differentiation in ML: a survey (arxiv)
  • Griewank & Walther, 2008: Evaluating derivatives (doi)
  • Berz, 1992: Automatic differentiation as nonarchimedean analysis (pdf)
  • Kantor & Solodovnikov, 1989: Hypercomplex numbers
    • Dual numbers as a very special case of hypercomplex numbers
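
The dual-number remark can be made concrete: forward-mode AD is ordinary arithmetic on pairs (value, derivative), i.e. on numbers a + b·ε with ε² = 0. A minimal self-contained sketch (not any particular library's implementation):

```python
# Forward-mode automatic differentiation via dual numbers a + b*eps, eps**2 = 0:
# each overloaded operation propagates exact derivatives via the chain/product
# rule, so there is no truncation error as with finite differences.
import math

class Dual:
    def __init__(self, value, deriv=0.0):
        self.value, self.deriv = value, deriv

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.value * other.value,
                    self.value * other.deriv + self.deriv * other.value)

    __rmul__ = __mul__

def sin(x):                      # lift a primitive: d/dx sin(x) = cos(x)
    return Dual(math.sin(x.value), math.cos(x.value) * x.deriv)

def f(x):
    return 3 * x * x + sin(x)    # f'(x) = 6x + cos(x)

x = Dual(2.0, 1.0)               # seed derivative dx/dx = 1
y = f(x)
print(y.value, y.deriv)          # prints f(2) and f'(2) = 12 + cos(2)
```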

Complexity of expressions

  • Vladislavleva et al, 2009: Order of nonlinearity as a complexity measure for models generated by symbolic regression via Pareto genetic programming (doi)