Randomized linear algebra

Randomized linear algebra, aka sketching, uses randomized embeddings to reduce the dimensionality, and thereby improve the computational efficiency, of large-scale problems in numerical linear algebra.
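In the simplest instance, a random sketching matrix compresses a tall least-squares problem before solving it ("sketch and solve"). A minimal NumPy illustration; the dimensions and the dense Gaussian sketch here are arbitrary choices for demonstration, not taken from any particular reference:

```python
import numpy as np

rng = np.random.default_rng(0)

# A tall least-squares problem: n rows, d << n unknowns.
n, d = 10_000, 20
A = rng.standard_normal((n, d))
x_true = rng.standard_normal(d)
b = A @ x_true + 0.01 * rng.standard_normal(n)

# Gaussian sketch S: compress the n rows down to m = O(d) rows.
m = 200
S = rng.standard_normal((m, n)) / np.sqrt(m)

# Solving the sketched m x d problem is far cheaper than the n x d one.
x_exact, *_ = np.linalg.lstsq(A, b, rcond=None)
x_sketch, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)

# x_sketch approximates x_exact; the residual distortion scales like sqrt(d/m).
print(np.linalg.norm(x_sketch - x_exact) / np.linalg.norm(x_exact))
```

Forming S @ A with a dense Gaussian sketch costs O(mnd); structured sketches such as the fast Johnson–Lindenstrauss transform reduce this further.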

Community activity

Simons Institute:

  • 2013 Big Data Boot Camp: Drineas & Mahoney: Past, present and future of randomized numerical linear algebra (video)
  • 2018 Foundations of Data Science Boot Camp (abstracts, video)
    • Woodruff & Clarkson: Sketching for linear algebra
    • Mahoney: Sampling for linear algebra, statistics, and optimization
  • 2018 Workshop: Randomized numerical linear algebra and applications (abstracts, video)

Literature

Surveys

  • Halko, Martinsson, Tropp, 2011: Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions (doi, arxiv)
  • Mahoney, 2011: Randomized algorithms for matrices and data (doi, arxiv)
    • See also Mahoney’s 2013 lecture notes (arxiv)
    • Review article: Drineas & Mahoney, 2016, in Communications of the ACM: RandNLA: randomized numerical linear algebra (doi)
  • Woodruff, 2014: Sketching as a tool for numerical linear algebra (doi, arxiv)
  • Kannan & Vempala, 2017: Randomized algorithms in numerical linear algebra (doi, pdf)

Random projections and the Johnson–Lindenstrauss lemma

There are now several textbook treatments:

  • Vempala, 2004: The Random Projection Method
  • Vershynin, 2018: High-dimensional Probability
    • Sec 5.3: Application: Johnson–Lindenstrauss lemma
    • Sec 9.3: The Johnson–Lindenstrauss lemma for infinite sets
  • Wainwright, 2019: High-dimensional Statistics, Example 2.12: Johnson–Lindenstrauss embedding

See also the literature on metric embeddings.
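For concreteness, a quick numerical check of the Johnson–Lindenstrauss lemma with a dense Gaussian map (the parameters below are chosen arbitrarily for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# N points in R^dim, projected down to k dimensions.
N, dim, k = 50, 5_000, 500
X = rng.standard_normal((N, dim))

# Gaussian J-L map with entries N(0, 1/k): squared norms are preserved in expectation.
P = rng.standard_normal((k, dim)) / np.sqrt(k)
Y = X @ P.T

def pairwise_dists(Z):
    # ||a - b||^2 = ||a||^2 + ||b||^2 - 2 <a, b>, via the Gram matrix.
    sq = (Z ** 2).sum(axis=1)
    D2 = sq[:, None] + sq[None, :] - 2 * Z @ Z.T
    return np.sqrt(np.maximum(D2, 0.0))

# All pairwise distances survive the projection up to distortion ~ 1/sqrt(k).
iu = np.triu_indices(N, k=1)
ratios = pairwise_dists(Y)[iu] / pairwise_dists(X)[iu]
print(ratios.min(), ratios.max())
```

Note that k depends on the number of points N and the tolerated distortion, but not on the ambient dimension.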

The fast Johnson–Lindenstrauss transform (FJLT) is still confined to the research literature:

  • Ailon & Chazelle, 2009: The fast Johnson–Lindenstrauss transform and approximate nearest neighbors (doi)
    • Conference version: Ailon & Chazelle, 2006: Approximate nearest neighbors and the fast Johnson–Lindenstrauss transform (doi)
    • Review article: Ailon & Chazelle, 2010, in Communications of the ACM: Faster dimension reduction (doi)
  • Matoušek, 2008: On variants of the Johnson–Lindenstrauss lemma (doi, pdf)
    • Alternate proof of, and slight improvement on, the FJLT
    • Readable and self-contained exposition, but it can be made much simpler, nearly trivial, with the modern machinery of measure concentration
    • All lemmas in Sec 2 are now standard (Vershynin, 2018, High-dimensional Probability, Chapter 2)
    • The key Proposition 3.2 is Bernstein’s inequality for sub-exponential random variables (Vershynin, 2018, Theorem 2.8.2, esp. p. 35)
    • With this background, Theorem 3.1 (the J-L lemma with independent sub-Gaussian projection coefficients) is proved in one paragraph
    • Lemma 4.2 is equivalent to saying \(Y\) has sub-exponential norm at most \(C \alpha / \sqrt{2q}\) (Vershynin, 2018, Proposition 2.7.1 (v))
    • Theorem 4.1 (a sparse J-L lemma for vectors with small \(\ell_\infty\) norm) then follows quickly from Bernstein’s inequality (I don’t think Matoušek’s truncation argument is necessary)
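For intuition, here is a minimal sketch of the subsampled randomized Hadamard transform (SRHT): random sign flips, a fast Walsh–Hadamard transform, then coordinate subsampling. Ailon & Chazelle's original FJLT applies a sparse Gaussian projection after the Hadamard stage; plain subsampling is a common simplification, used here only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def fwht(x):
    """Fast Walsh-Hadamard transform, O(n log n); len(x) must be a power of 2."""
    x = x.copy()
    n, h = len(x), 1
    while h < n:
        for i in range(0, n, 2 * h):
            a = x[i:i + h].copy()
            b = x[i + h:i + 2 * h].copy()
            x[i:i + h] = a + b
            x[i + h:i + 2 * h] = a - b
        h *= 2
    return x / np.sqrt(n)  # orthonormal normalization

# SRHT: x -> sqrt(n/k) * R H D x, with D random signs and R a uniform row sample.
n, k = 1024, 64
x = rng.standard_normal(n)

signs = rng.choice([-1.0, 1.0], size=n)      # D: random sign flips flatten the spectrum
rows = rng.choice(n, size=k, replace=False)  # R: uniform subsample of k coordinates
y = np.sqrt(n / k) * fwht(signs * x)[rows]

# The Euclidean norm is preserved in expectation; check this draw is close.
print(np.linalg.norm(y), np.linalg.norm(x))
```

The point of the sign flips is that H D spreads the mass of any fixed vector nearly uniformly over all coordinates (with high probability), so uniform subsampling afterwards preserves norms; the whole map costs O(n log n) instead of the O(nk) of a dense Gaussian sketch.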