Stability

There is a close connection in ML and statistics between the stability of a learning procedure under perturbations of its training data and its ability to generalize to unseen data.

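One way to make this concrete is uniform stability (rough sketch of Bousquet & Elisseeff, 2002, cited below): an algorithm A is uniformly β-stable if removing any single point i from the training sample S changes its loss at any point z by at most β,

    \[ \sup_{S,\,i,\,z} \bigl| \ell(A_S, z) - \ell(A_{S^{\setminus i}}, z) \bigr| \le \beta . \]

If the loss is bounded by M, β-stability gives, with probability at least 1 − δ over a sample of size n, roughly

    \[ R(A_S) \;\le\; \widehat{R}_n(A_S) + 2\beta + (4 n \beta + M)\sqrt{\frac{\ln(1/\delta)}{2n}} , \]

so any β = o(1/√n) is enough for the empirical risk to track the true risk (regularized ERM typically has β = O(1/(λn))).
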
Literature

Statistics

  • Meinshausen & Bühlmann, 2010: Stability selection (doi, arxiv); a toy sketch of the idea follows this list
    • Textbook treatment in Bühlmann & van de Geer, 2011, Statistics for High-Dimensional Data, Ch. 10: Stable solutions
    • Refined in Shah & Samworth, 2012: Variable selection with error control: another look at stability selection (doi, arxiv)
    • R packages: hdi (paper), stabs
  • Yu, 2013: Stability (doi, arxiv, pdf)
    • Start here: Nice overview and review of literature
  • Lim & Yu, 2016: Estimation stability with cross-validation (doi, arxiv)
    • Sec 2.7: Stability of CV (unstable in high dim. with collinearity) vs ES-CV (more stable)
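
A minimal Python sketch of the stability-selection idea (my own toy version, not the hdi/stabs implementations; the penalty alpha=0.1, the 100 half-subsamples, and the 0.6 frequency threshold are arbitrary illustrative choices):

    import numpy as np
    from sklearn.linear_model import Lasso

    def stability_selection(X, y, alpha=0.1, n_subsamples=100, threshold=0.6, seed=0):
        """Selection frequencies of the lasso over random half-subsamples."""
        rng = np.random.default_rng(seed)
        n, p = X.shape
        counts = np.zeros(p)
        for _ in range(n_subsamples):
            idx = rng.choice(n, size=n // 2, replace=False)    # draw a half-subsample
            coef = Lasso(alpha=alpha, max_iter=10_000).fit(X[idx], y[idx]).coef_
            counts += (coef != 0)                               # record which variables were selected
        freq = counts / n_subsamples
        return freq, np.flatnonzero(freq >= threshold)          # the "stable" set

    # toy data: 3 true signals among 50 predictors
    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 50))
    y = X[:, :3] @ np.array([2.0, -1.5, 1.0]) + rng.normal(size=200)
    freq, stable = stability_selection(X, y)
    print(stable)   # typically [0 1 2]

Shah & Samworth's refinement is about choosing the threshold with error control; this sketch just fixes it at 0.6.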

Machine Learning

  • Bousquet & Elisseeff, 2002: Stability and generalization (pdf)
  • Mukherjee et al, 2006: Learning theory: stability is sufficient for generalization and necessary and sufficient for consistency of empirical risk minimization (doi, preprint)
    • Baby version in Nature, 2004: General conditions for predictivity in learning theory (doi)
    • See also Rakhlin, Mukherjee, Poggio, 2004: On stability and concentration of measure (pdf)
  • Shalev-Shwartz, Shamir, Srebro, Sridharan, 2010: Learnability, stability, and uniform convergence (pdf)
    • Textbook treatment in Shalev-Shwartz & Ben-David, 2014, Understanding Machine Learning, Ch. 13: Regularization and stability
    • Important reference for Dwork et al, 2015 (arxiv)
  • Xu et al, 2012: Sparse algorithms are not stable (doi, pdf); see the toy illustration at the end of this section
  • Thakurta & Smith, 2013: Differentially private feature selection via stability arguments, and the robustness of the lasso (pdf)
    • Textbook treatment in Dwork & Roth, 2014, Algorithmic Foundations of Differential Privacy, Ch. 7: When worst-case sensitivity is atypical
    • Motivated by differential privacy, but provides independently interesting sufficient conditions for stability of lasso
    • Sec 1.2: Nice survey of literature on stability
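
A quick toy illustration of the lasso-vs-ridge instability point (my own example, not taken from any of the papers above): fit both on bootstrap resamples of a design with near-duplicate columns and see how much the coefficients move.

    import numpy as np
    from sklearn.linear_model import Lasso, Ridge

    rng = np.random.default_rng(0)
    n = 100
    Z = rng.normal(size=(n, 10))
    X = np.hstack([Z, Z + 0.01 * rng.normal(size=Z.shape)])   # every column has a near-duplicate
    y = Z[:, 0] + Z[:, 1] + rng.normal(size=n)

    def coef_spread(model, n_boot=200):
        """Per-coefficient standard deviation across bootstrap resamples."""
        coefs = []
        for _ in range(n_boot):
            idx = rng.integers(0, n, size=n)                  # bootstrap resample
            coefs.append(model.fit(X[idx], y[idx]).coef_)
        return np.std(coefs, axis=0)

    print("lasso max spread:", coef_spread(Lasso(alpha=0.05, max_iter=10_000)).max())
    print("ridge max spread:", coef_spread(Ridge(alpha=1.0)).max())
    # the lasso weight for a signal variable jumps between it and its near-duplicate copy
    # from resample to resample; ridge splits the weight across the pair and barely moves

Stability selection (above) is one response: aggregate the unstable selections over many perturbations instead of trusting any single fit.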