I am an Associate Professor in the Department of Statistics at the University of Oxford, and a Tutorial Fellow at University College, Oxford. I work on developing methodologies and theoretical foundations for Machine Learning.
Publications
2021
T. Farghly, P. Rebeschini, Time-independent Generalization Bounds for SGLD in Non-convex Settings, in Advances in Neural Information Processing Systems 34, 2021.
We establish generalization error bounds for stochastic gradient Langevin dynamics (SGLD) with constant learning rate under the assumptions of dissipativity and smoothness, a setting that has received increased attention in the sampling/optimization literature. Unlike existing bounds for SGLD in non-convex settings, ours are time-independent and decay to zero as the sample size increases. Using the framework of uniform stability, we establish time-independent bounds by exploiting the Wasserstein contraction property of the Langevin diffusion, which also allows us to circumvent the need to bound gradients using Lipschitz-like assumptions. Our analysis also supports variants of SGLD that use different discretization methods, incorporate Euclidean projections, or use non-isotropic noise.
@inproceedings{farghly2021time,
title = {Time-independent Generalization Bounds for SGLD in Non-convex Settings},
author = {Farghly, Tyler and Rebeschini, Patrick},
booktitle = {Advances in Neural Information Processing Systems 34},
year = {2021},
publisher = {Curran Associates, Inc.}
}
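The abstract above concerns SGLD with a constant learning rate. As a minimal illustrative sketch (the variable names, the toy quadratic loss, and all parameter values are mine, not the paper's), one SGLD iteration is a stochastic gradient step plus Gaussian noise scaled by the learning rate and an inverse temperature:

```python
import numpy as np

def sgld_step(w, grad, eta, beta, rng):
    """One SGLD update with constant learning rate.

    w    : current parameter vector
    grad : stochastic gradient of the empirical loss at w
    eta  : constant learning rate
    beta : inverse temperature
    """
    noise = rng.normal(size=w.shape)
    return w - eta * grad + np.sqrt(2.0 * eta / beta) * noise

# Toy example: minimise f(w) = ||w||^2 / 2, whose gradient at w is w.
rng = np.random.default_rng(0)
w = np.ones(5)
for _ in range(1000):
    w = sgld_step(w, w, eta=0.01, beta=100.0, rng=rng)
```

After many iterations the iterates fluctuate around the minimiser, with fluctuation size controlled by `eta` and `beta`; this is a sketch of the update rule only, not of the paper's stability analysis.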
2020
D. Richards, P. Rebeschini, Graph-Dependent Implicit Regularisation for Distributed Stochastic Subgradient Descent, Journal of Machine Learning Research, vol. 21, no. 34, 1–44, 2020.
@article{JMLR:v21:18-638,
author = {Richards, Dominic and Rebeschini, Patrick},
title = {Graph-Dependent Implicit Regularisation for Distributed Stochastic Subgradient Descent},
journal = {Journal of Machine Learning Research},
year = {2020},
volume = {21},
number = {34},
pages = {1--44}
}
D. Richards, P. Rebeschini, L. Rosasco, Decentralised Learning with Random Features and Distributed Gradient Descent, in Proceedings of the 37th International Conference on Machine Learning, 2020, vol. 119, 8105–8115.
@inproceedings{richards2020decentralised,
title = {Decentralised Learning with Random Features and Distributed Gradient Descent},
author = {Richards, Dominic and Rebeschini, Patrick and Rosasco, Lorenzo},
booktitle = {Proceedings of the 37th International Conference on Machine Learning},
pages = {8105--8115},
year = {2020},
volume = {119},
series = {Proceedings of Machine Learning Research},
month = {13--18 Jul},
publisher = {PMLR}
}
2019
D. Martínez-Rubio, V. Kanade, P. Rebeschini, Decentralized Cooperative Stochastic Bandits, in Advances in Neural Information Processing Systems 32, H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, and R. Garnett, Eds. Curran Associates, Inc., 2019, 4529–4540.
@incollection{NIPS2019_8702,
title = {Decentralized Cooperative Stochastic Bandits},
author = {Mart\'{\i}nez-Rubio, David and Kanade, Varun and Rebeschini, Patrick},
booktitle = {Advances in Neural Information Processing Systems 32},
editor = {Wallach, H. and Larochelle, H. and Beygelzimer, A. and d\textquotesingle Alch\'{e}-Buc, F. and Fox, E. and Garnett, R.},
pages = {4529--4540},
year = {2019},
publisher = {Curran Associates, Inc.}
}
T. Vaskevicius, V. Kanade, P. Rebeschini, Implicit Regularization for Optimal Sparse Recovery, in Advances in Neural Information Processing Systems 32, H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, and R. Garnett, Eds. Curran Associates, Inc., 2019, 2972–2983.
@incollection{NIPS2019_8562,
title = {Implicit Regularization for Optimal Sparse Recovery},
author = {Vaskevicius, Tomas and Kanade, Varun and Rebeschini, Patrick},
booktitle = {Advances in Neural Information Processing Systems 32},
editor = {Wallach, H. and Larochelle, H. and Beygelzimer, A. and d\textquotesingle Alch\'{e}-Buc, F. and Fox, E. and Garnett, R.},
pages = {2972--2983},
year = {2019},
publisher = {Curran Associates, Inc.}
}
D. Richards, P. Rebeschini, Optimal Statistical Rates for Decentralised Non-Parametric Regression with Linear Speed-Up, in Advances in Neural Information Processing Systems 32, H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, and R. Garnett, Eds. Curran Associates, Inc., 2019, 1216–1227.
@incollection{NIPS2019_8405,
title = {Optimal Statistical Rates for Decentralised Non-Parametric Regression with Linear Speed-Up},
author = {Richards, Dominic and Rebeschini, Patrick},
booktitle = {Advances in Neural Information Processing Systems 32},
editor = {Wallach, H. and Larochelle, H. and Beygelzimer, A. and d\textquotesingle Alch\'{e}-Buc, F. and Fox, E. and Garnett, R.},
pages = {1216--1227},
year = {2019},
publisher = {Curran Associates, Inc.}
}
P. Rebeschini, S. Tatikonda, A new approach to Laplacian solvers and flow problems, Journal of Machine Learning Research, vol. 20, no. 36, 2019.
@article{rebeschini2019new,
title = {A new approach to Laplacian solvers and flow problems},
author = {Rebeschini, Patrick and Tatikonda, Sekhar},
journal = {Journal of Machine Learning Research},
volume = {20},
number = {36},
year = {2019},
publisher = {Journal of Machine Learning Research}
}
2017
P. Rebeschini, S. C. Tatikonda, Accelerated consensus via Min-Sum Splitting, in Advances in Neural Information Processing Systems 30, I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, Eds. Curran Associates, Inc., 2017, 1374–1384.
@incollection{NIPS2017_6736,
author = {Rebeschini, Patrick and Tatikonda, Sekhar C},
booktitle = {Advances in Neural Information Processing Systems 30},
editor = {Guyon, I. and Luxburg, U. V. and Bengio, S. and Wallach, H. and Fergus, R. and Vishwanathan, S. and Garnett, R.},
pages = {1374--1384},
publisher = {Curran Associates, Inc.},
title = {Accelerated consensus via Min-Sum Splitting},
year = {2017},
bdsk-url-1 = {http://papers.nips.cc/paper/6736-accelerated-consensus-via-min-sum-splitting.pdf}
}
2016
P. Rebeschini, S. Tatikonda, Decay of correlation in network flow problems, in 2016 Annual Conference on Information Science and Systems (CISS), 2016, 169–174.
@inproceedings{7460496,
author = {Rebeschini, P. and Tatikonda, S.},
booktitle = {2016 Annual Conference on Information Science and Systems (CISS)},
doi = {10.1109/CISS.2016.7460496},
month = mar,
pages = {169--174},
title = {Decay of correlation in network flow problems},
year = {2016},
bdsk-url-1 = {http://dx.doi.org/10.1109/CISS.2016.7460496}
}
2015
P. Rebeschini, R. van Handel, Can local particle filters beat the curse of dimensionality?, Ann. Appl. Probab., vol. 25, no. 5, 2809–2866, Oct. 2015.
@article{rebeschini2015,
author = {Rebeschini, Patrick and van Handel, Ramon},
doi = {10.1214/14-AAP1061},
fjournal = {The Annals of Applied Probability},
journal = {Ann. Appl. Probab.},
month = oct,
number = {5},
pages = {2809--2866},
publisher = {The Institute of Mathematical Statistics},
title = {Can local particle filters beat the curse of dimensionality?},
volume = {25},
year = {2015},
bdsk-url-1 = {https://doi.org/10.1214/14-AAP1061},
bdsk-url-2 = {http://dx.doi.org/10.1214/14-AAP1061}
}
P. Rebeschini, R. van Handel, Phase transitions in nonlinear filtering, Electron. J. Probab., vol. 20, 46 pp., 2015.
@article{rebeschini2015b,
author = {Rebeschini, Patrick and van Handel, Ramon},
doi = {10.1214/EJP.v20-3281},
fjournal = {Electronic Journal of Probability},
journal = {Electron. J. Probab.},
pages = {46 pp.},
pno = {7},
publisher = {The Institute of Mathematical Statistics and the Bernoulli Society},
title = {Phase transitions in nonlinear filtering},
volume = {20},
year = {2015},
bdsk-url-1 = {https://doi.org/10.1214/EJP.v20-3281},
bdsk-url-2 = {http://dx.doi.org/10.1214/EJP.v20-3281}
}
P. Rebeschini, A. Karbasi, Fast Mixing for Discrete Point Processes, in Proceedings of The 28th Conference on Learning Theory, Paris, France, 2015, vol. 40, 1480–1500.
We investigate the systematic mechanism for designing fast mixing Markov chain Monte Carlo algorithms to sample from discrete point processes under the Dobrushin uniqueness condition for Gibbs measures. Discrete point processes are defined as probability distributions μ(S) ∝ exp(βf(S)) over all subsets S ∈ 2^V of a finite set V through a bounded set function f : 2^V → ℝ and a parameter β > 0. A subclass of discrete point processes characterized by submodular functions (which include log-submodular distributions, submodular point processes, and determinantal point processes) has recently gained a lot of interest in machine learning and has been shown to be effective for modeling diversity and coverage. We show that if the set function (not necessarily submodular) displays a natural notion of decay of correlation, then, for β small enough, it is possible to design fast mixing Markov chain Monte Carlo methods that yield error bounds on marginal approximations that do not depend on the size of the set V. The sufficient conditions that we derive involve a control on the (discrete) Hessian of set functions, a quantity that has not been previously considered in the literature. We specialize our results for submodular functions, and we discuss canonical examples where the Hessian can be easily controlled.
@inproceedings{pmlr-v40-Rebeschini15,
address = {Paris, France},
author = {Rebeschini, Patrick and Karbasi, Amin},
booktitle = {Proceedings of The 28th Conference on Learning Theory},
editor = {Gr{\"u}nwald, Peter and Hazan, Elad and Kale, Satyen},
month = {03--06 Jul},
pages = {1480--1500},
publisher = {PMLR},
series = {Proceedings of Machine Learning Research},
title = {Fast Mixing for Discrete Point Processes},
volume = {40},
year = {2015},
bdsk-url-1 = {http://proceedings.mlr.press/v40/Rebeschini15.html}
}
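The distribution μ(S) ∝ exp(βf(S)) in the abstract above can be sampled with a single-site Markov chain. As a hedged sketch (the toy set function f(S) = −|S|, the parameter values, and all names are illustrative; this is not the specific chain analysed in the paper), a Gibbs-style update picks one element and resamples its membership from its conditional distribution:

```python
import math
import random

def gibbs_flip_step(S, V, f, beta, rng):
    """One single-site Gibbs update for mu(S) ∝ exp(beta * f(S)):
    pick an element of V uniformly at random and resample its membership
    in S from the conditional distribution given the rest of the set."""
    v = rng.choice(V)
    S_in = S | {v}
    S_out = S - {v}
    p_in = math.exp(beta * f(S_in))
    p_out = math.exp(beta * f(S_out))
    return S_in if rng.random() < p_in / (p_in + p_out) else S_out

# Toy example: f(S) = -|S| with beta = 1 favours small sets; in the
# stationary distribution each element is independently included with
# probability e^{-1} / (1 + e^{-1}) ≈ 0.27.
V = list(range(6))
f = lambda S: -len(S)
rng = random.Random(0)
S = frozenset(V)
sizes = []
for t in range(4000):
    S = gibbs_flip_step(S, V, f, beta=1.0, rng=rng)
    if t >= 2000:          # discard burn-in, then record set sizes
        sizes.append(len(S))
```

The paper's point is about when such chains mix fast uniformly in |V|; the snippet only shows the mechanics of the update.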
2014
P. Rebeschini, R. van Handel, Comparison Theorems for Gibbs Measures, Journal of Statistical Physics, vol. 157, no. 2, 234–281, Oct. 2014.
The Dobrushin comparison theorem is a powerful tool to bound the difference between the marginals of high-dimensional probability distributions in terms of their local specifications. Originally introduced to prove uniqueness and decay of correlations of Gibbs measures, it has been widely used in statistical mechanics as well as in the analysis of algorithms on random fields and interacting Markov chains. However, the classical comparison theorem requires validity of the Dobrushin uniqueness criterion, essentially restricting its applicability in most models to a small subset of the natural parameter space. In this paper we develop generalized Dobrushin comparison theorems in terms of influences between blocks of sites, in the spirit of Dobrushin–Shlosman and Weitz, that substantially extend the range of applicability of the classical comparison theorem. Our proofs are based on the analysis of an associated family of Markov chains. We develop in detail an application of our main results to the analysis of sequential Monte Carlo algorithms for filtering in high dimension.
@article{Rebeschini2014,
author = {Rebeschini, Patrick and van Handel, Ramon},
day = {01},
doi = {10.1007/s10955-014-1087-7},
issn = {1572-9613},
journal = {Journal of Statistical Physics},
month = oct,
number = {2},
pages = {234--281},
title = {Comparison Theorems for Gibbs Measures},
volume = {157},
year = {2014},
bdsk-url-1 = {https://doi.org/10.1007/s10955-014-1087-7},
bdsk-url-2 = {http://dx.doi.org/10.1007/s10955-014-1087-7}
}
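For orientation, and in notation of my choosing rather than the paper's: the classical Dobrushin comparison theorem mentioned in the abstract bounds differences of single-site marginals of two measures μ and ν via an influence matrix. Writing C_{ij} for the Dobrushin influence coefficients of μ and b_j for a measure of how much the local specifications of μ and ν differ at site j, the bound takes the form

```latex
\|\mu_i - \nu_i\|_{\mathrm{TV}} \;\le\; \sum_j D_{ij}\, b_j,
\qquad
D := \sum_{n \ge 0} C^n = (I - C)^{-1},
```

valid under the Dobrushin uniqueness condition that C is a contraction in a suitable operator norm, so that the geometric series converges. The paper's generalization replaces these single-site influences by influences between blocks of sites.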