Large Deviations
09 Nov 2005 17:39
The limit theorems of probability theory --- the weak and strong laws of large numbers, the central limit theorem, etc. --- basically say that averages taken over large samples (of well-behaved independent, identically distributed random variables) converge on expectation values. (The strong law of large numbers asserts almost-sure convergence, the central limit theorem asserts a kind of convergence in distribution, etc.) These results say little or nothing about the rate of convergence, however, which is often important for many applications of probability theory, e.g., statistical mechanics. One way to address this is the theory of large deviations. (I believe the terminology goes back to Varadhan in the 1970s, but that's just an impression, rather than research.)
Let me say things sloppily first, so the idea comes through, and then more
precisely, so people who know the subject won't get too upset.
Suppose X is a random variable with expected value EX, and we
consider
, the sample mean
of n samples of
X.
"obeys a large deviations principle" if there is a
non-negative function r, called the rate function, such
that
.
(The rate function has to obey some sensible but technical continuity
conditions.) This is a large deviation result, because the difference
between the empirical mean and the expectation is remaining constant
as n grows --- there has to be a larger and large conspiracy, as it
were, among the samples to keep deviating from the expectation in the same way.
Now, one reason what I've stated isn't really enough to satisfy a mathematician
is that the right-hand side converges on zero, so the functional form of the
probability could be anything which also converges on zero and that'd be
satisfied, but we want to pick out exponential convergence. The usual
way is to look at the limiting growth rate of the probability. Also, we want
the probability that the difference between the empirical mean and the
expectation falls into any arbitrary set. So one usually sees the LDP asserted
in some form like, for any reasonable set A,
(Actually, to be completely honest, I really shouldn't be assuming
that there is a limit to those probabilities. Instead I should connect the lim
inf of that expression to the infimum of the rate function over the interior
of A, and the lim sup to the infimum of the rate function over the
closure of A.)
Similar large deviation principles can be stated for the empirical distribution, the empirical process, functionals of sample paths, etc., rather than just the empirical mean. There are tricks for relating LDPs on higher-level objects, like the empirical distribution over trajectories, to LDPs on lower-level objects, like empirical means. (These go under names like "the contraction principle".)
Since ergodic theory extends the probabilistic limit laws to stochastic processes, rather than just sequences of independent variables, it shouldn't be surprising that large deviation principles also hold for some stochastic processes. I am particularly interested in LDPs for Markov processes, and their applications. There are further important connections to information theory, since in an awful lot of situations, the large deviations rate function is the Kullback-Leibler divergence, a.k.a. the relative entropy.
Related, but strictly speaking distinct topics:
- Finite-sample deviation inequalities, such as the Bernstein, Chernoff and Hoeffding inequalities, which bound the probability of averages departing by more than a certain amount from expectation values at given finite sample sizes;
- Concentration of measure, roughly speaking upper bounds on deviation probabilities holding uniformly over large classes of functions. (Note that large deviations principles have match upper and lower bounds, and need only hold asymptotically.)
See also: Exponential Families of Probability Measures; Maximum entropy
- Recommended:
- James Bucklew, Large Deviation Techniques in Decision, Simulation, and Estimation
- Thomas Cover and Joy Thomas, Elements of Information Theory [Very nice chapter on large deviations for IID sequences]
- Amir Dembo and Ofer Zeitouni, Large Deviations Techniques and Applications [Chapters 2, 4 and 5, and parts of chapter 6, are available in postscript format via Prof. Dembo's page for his course on large deviations]
- Frank den Hollander, Large Deviations [Nice introductory text for people with an applied probability background. Short.]
- Richard S. Ellis
- "The Theory of Large Deviations: from Boltzmann's 1877 Calculation to Equilibrium Macrostates in 2D Turbulence", Physica D 133 (1999): 106--136
- Entropy, Large Deviations, and Statistical Mechanics
- M. I. Friedlin and A. D. Wentzell, Random Perturbations of Dynamical Systems
- Hugo Touchette, "The Large Deviations Approach to Statistical Mechanics", Physics Reports 478 (2009): 1--69, arxiv:0804.0327
- S. R. S. Varadhan, "Large Deviations", Annals of Probability 36 (2008): 397--419 [Copy via Prof. Varadhan. Wald Lecture for 2005.]
- Recommended, more specialized:
- R. R. Bahadur, Some Limit Theorems in Statistics [1971. The notation is now much more transparent, and the proofs of many basic theorems considerably simplified. But if there's a better source for statistical applications than this little book, I've yet to find it.]
- Julien Barré, Freddy Bouchet, Thierry Dauxois and Stefano Ruffo, "Large deviation techniques applied to systems with long-range interactions", cond-mat/0406358 = Journal of Statistics Physics 119 (2005): 677--713
- Michel Benaïn and Jörgen W. Weibull, "Deterministic Approximation of Stochastic Evolution in Games", Econometrica 71 (2003): 879--903 [JSTOR]
- Arijit Chakrabarty, "Effect of truncation on large deviations for heavy-tailed random vectors", arxiv:1107.2476
- J.-R. Chazottes and D. Gabrielli, "Large deviations for empirical entropies of Gibbsian sources", math.PR/0406083 = Nonlinearity 18 (2005): 2545--2563 [This is a very cool result which shows that block entropies, and entropy rates estimated from those blocks, obey the large deviation principle even as one lets the length of the blocks grow with the amount of data, provided the block-length doesn't grow too quickly (only logarithmically). I wish I could write papers like this.]
- W. De Roeck, Christian Maes and Karel Netocny, "H-Theorems from Autonomous Equations", cond-mat/0508089 [this basically derives the H-theorem of statistical mechanics as a large deviations result, assuming a certain reasonable Markovian form for the macroscopic dynamics. In fact, we have a separate argument that you don't have that Markovian form, you're just not trying hard enough; see here]
- Paul Dupuis, "Large Deviations Analysis of Some Recursive Algorithms with State-Dependent Noise", Annals of Probability 16 (1988): 1509--1536 [Open access]
- Gregory L. Eyink
- "Action principle in nonequilbrium statistical dynamics," Physical Review E 54 (1996): 3419--3435 [Least action as a consequence of Markovian LDP]
- "A Variational Formulation of Optimal Nonlinear Estimation," physics/0011049 [Nice connections between optimal state estimation (assuming a known form for the underlying stochastic process), nonequilibrium statistical mechanics, and large deviations theory, leading to tractable-looking numerical schemes for estimation.]
- Jin Feng and Thomas G. Kurtz, Large Deviations for Stochastic Processes [Online]
- Fuqing Gao and Xingqiu Zhao, "Delta method in large deviations and moderate deviations for estimators", Annals of Statistics 39 (2011): 1211-1240, arxiv:1105.3552 [This is based on an extension of the "contraction principle" which is of independent interest]
- S. Orey and S. Peliken, "Large deviations principles for stationary processes", Annals of Probability 16 (1988): 1481--1496
- Eric Smith, "Large-deviation principles, stochastic effective actions, path entropies, and the structure and meaning of thermodynamic descriptions", arxiv:1102.3938
- Eric Smith and Supriya Krishnamurthy, "Symmetry and Collective Fluctuations in Evolutionary Games", SFI Working Paper 11-03-010
- To read:
- Paul H. Algoet and Brian H. Marcus, "Large Deviation Theorems for Empirical Types of Markov Chains Constrained to Thin Sets," IEEE Trans. Info. Theory 38 (1992): 1276--1291
- Alexei Andreanov, Giulio Biroli, Jean-Philippe Bouchaud, and Alexandre Lefèvre, "Field theories and exact stochastic equations for interacting particle systems", Physical Review E 74 (2006): 030101 = cond-mat/0602307
- Ellen Baake, Frank den Hollander and Natali Zint, "How T-Cells Use Large Deviations to Recognize Foreign Antigens", arxiv:q-bio.SC/0605016 [Presumably == the paper of the same title in Journal of Mathematical Biology 57 (2008): 841--861, but that orders the authors Zint, Baake and den Hollander.]
- L. Bertini, A. De Sole, D. Gabrielli, G. Jona-Lasinio, C. Landim, "Large deviation approach to non equilibrium processes in stochastic lattice gases", arxiv:math/0602557
- Matthias Birkner, Andreas Greven and Frank den Hollander, "Quenched large deviation principle for words in a letter sequence", arxiv:0807.2611
- Igor Bjelakovic, Jean-Dominique Deuschel, Tyll Krueger, Ruedi
Seiler, Rainer Siegmund-Schultze and Arleta Szkola
- "A quantum version of Sanov's theorem", quant-ph/0412157 = Communications in Mathematical Physics 260 (2005): 659--671 [Quantum large deviations!]
- "Typical support and Sanov large deviations of correlated states", math.PR/0703772 = Communications in Mathematical Physics 279 (2008): 559--584
- Amarjit Budhiraja, Paul Dupuis, Markus Fischer, "Large deviation properties of weakly interacting processes via weak convergence methods", arxiv:1009.6030
- Amarjit Budhiraja, Paul Dupuis, Vasileios Maroulas, "Large deviations for infinite dimensional stochastic dynamical systems", Annals of Applied Probability 36 (2008): 1390--1420 = arxiv:0808.3631
- Arijit Chakrabarty, "Central Limit Theorem and Large Deviations for truncated heavy-tailed random vectors", arxiv:1003.2159
- Sourav Chatterjee and S. R. S. Varadhan, "The large deviation principle for the Erdos-Renyi random graph", arxiv:1008.1946
- Po-Ning Chen, "Generalization of Gartner-Ellis theorem", IEEE Transactions on Information Theory 46 (2000): 2752--2760
- Zhiyi Chi
- "Large deviations for template matching between point processes", Annals of Applied Probability 15 (2005): 153--174 = math.PR/0503463
- "On the asymptotic of likelihood ratios for self-normalized large deviations", arxiv:0709.1506
- Igor Chueshov and Annie Millet, "Stochastic 2D hydrodynamical type systems: Well posedness and large deviations", arxiv:0807.1810
- A. de Acosta, "A general nonconvex large deviation result II", Annals of Probability 32 (2004): 1873--1901 = math.PR/0410101
- Zach Deitz and Sunder Sethuraman, "Large deviations for a class of nonhomgeneous Markov chains", math.PR/0404230
- B. Derrida, "Non equilibrium steady states: fluctuations and large deviations of the density and of the current", cond-mat/0703762
- B. Derrida, Joel L. Lebowitz and Eugene R. Speer, "Exact Large Deviation Functional for the Density Profile in a Stationary Nonequilibrium Open System," cond-mat/0105110
- Paul Dupuis and Richard S. Ellis, A Weak Convergence Approach to the Theory of Large Deviations [PDF preprint]
- Vlad Elgart and Alex Kamenev, "Rare Events Statistics in Reaction--Diffusion Systems", cond-mat/0404241 [i.e., large deviations]
- Andreas Engel, Remi Monasson and Alexander K. Hartmann, "On Large Deviation Properties of Erdos-Renyi Random Graphs", Journal of Statistical Physics 117 (2004): 387--426
- Hans Follmer and Steven Orey, "Large Deviations for the Empirical Field of a Gibbs Measure", Annals of Probability 16 (1988): 961--977
- Cristian Giardina', Jorge Kurchan, Luca Peliti, "Direct evaluation of large-deviation functions", cond-mat/0511248 ["numerical [evaluation of] probabilities of large deviations of physical quantities, such as current or density, that are local in time. The large-deviation functions are given in terms of the typical properties of a modified dynamics, and since they no longer involve rare events, can be evaluated efficiently and over a wider ranges of values."]
- Yuri Golubev, Vladimir Spokoiny, "Exponential bounds for minimum contrast estimators", arxiv:0901.0655
- Nathael Gozlan and Christian Léonard
- "A large deviation approach to some transportation cost inequalities", math.PR/0510601
- "Transport inequalities. A survey", arxiv:1003.3852
- Alice Guionnet, "Large deviations and stochastic calculus for large random matrices", Probability Surveys 1 (2004): 72--172 [Open access]
- O. V. Gulinskii and R. S. Liptser, "Example of Large Deviations for Stationary Processes", Theory of Probability and Applications 44 (1999): 211--225 [PDF]
- Te Sun Han
- "Hypothesis Testing with the General Source", IEEE Transactions on Information Theory 46 (2000): 2415--2427 = math.PR/0004121 ["The asymptotically optimal hypothesis testing problem with the general sources as the null and alternative hypotheses is studied.... Our fundamental philosophy in doing so is first to convert all of the hypothesis testing problems completely to the pertinent computation problems in the large deviation-probability theory. ... [This] enables us to establish quite compact general formulas of the optimal exponents of the second kind of error and correct testing probabbilities for the general sources including all nonstationary and/or nonergodic sources with arbitrary abstract alphabet (countable or uncountable). Such general formulas are presented from the information-spectrum point of view."]
- "An information-spectrum approach to large deviation theorems", cs.IT/0606104
- Henrik Hult and Gennady Samorodnitsky, "Large deviations for point processes based on stationary sequences with heavy tails", Journal of Applied Probability 47 (2010): 1--40
- Svante Janson, "Large deviations for sums of partly dependent random variables", Random Structures and Algorithms 24 (2004): 234--248 ["We use and extend a method by Hoeffding to obtain strong large deviation bounds for sums of dependent random variables with suitable dependency structure. The method is based on breaking up the sum into sums of independent variables. Applications are given to U-statistics, random strings and random graphs." Applied here only to Erdos-Renyi (IID) random graphs, but might be extendable to Markov random graphs...? PDF preprint]
- Giovanni Jona-Lasinio, "From fluctuations in hydrodynamics to nonequilibrium thermodynamics", arxiv:1003.4164
- Vladislav Kargin, "A Large Deviation Inequality for Vector Functions on Finite Reversible Markov Chains", math.PR/0508538
- Gerhard Keller, Equilibrium States in Ergodic Theory [blurb]
- Michael Keyl, "Quantum state estimation and large deviations", quant-ph/0412053
- Yuri Kifer, "Large deviations and adiabatic transitions for dynamical systems and Markov processes in fully coupled averaging", arxiv:0710.2405
- Yuichi Kitamura, "Empirical likelihood methods in econometrics: Theory and Practice", Cowles Foundation Discussion Paper No. 1569 (2006)
- F. Klebaner and R. Liptser, "Large Deviations for Past-Dependent Recursions", math.PR/0603407 [Corrected version of Problems of Information Transmission 32 (1996): 23--34]
- Ioannis Kontoyiannis and S. P. Meyn
- "Large deviations asymptotics and the spectral theory of multiplicatively regular Markov processes", math.PR/0509310 = Electronic Journal of Probability 10 (2005): 61--123
- "Spectral Theory and Limit Theorems for Geometrically Ergodic Markov Processes", math.PR/0209200 = Annals of Applied Probability 13 (2003): 304--362
- "Computable exponential bounds for screened estimation and simulation", Annals of Applied Probability 18 (2008): 1491--1518, arxiv:math/0612040
- D. Lacoste, A. W. C. Lau and K. Mallick, "Fluctuation theorem and large deviation function for a solvable model of a molecular motor", Physical Review E 78 (2008): 011915
- Vivien Lecomte, Cécile Appert-Rolland, and
Frédéric van Wijland
- "Thermodynamic formalism for systems with Markov dynamics", cond-mat/0606211
- "Thermodynamic formalism and large deviation functions in continuous time Markov dynamics", cond-mat/0703435
- Vivien Lecomte and Julien Tailleur, "A numerical approach to large deviations in continuous time", Journal of Statistical Mechanics: Theory and Experiment 2007: P03004
- Raphael Lefevere, Mauro Mariani, Lorenzo Zambotti, "Large deviations for renewal processes", arxiv:1009.2659
- Robert Sh. Liptser and Anatolii A. Pukhalskii, "Limit theorems on large deviations for semimartingales", math.PR/0510028 [But published in a journal in 1992]
- Yutao Ma, Ran Wang, Liming Wu, "Moderate Deviation Principle for dynamical systems with small random perturbation", arxiv:1107.3432
- Claudio Macci, "Large Deviations for Empirical Estimators of the Stationary Distribution of a Semi-Markov Process with Finite State Space", Communications in Statistics: Theory and Methods 37 (2008): 3077--3089
- Satya N. Majumdar and Alan J. Bray, "Large-Deviation Functions for Nonlinear Functionals of a Gaussian Stationary Markov Process", cond-mat/0202138 = Physical Review E 65 (2002): 051112
- David McAllester, "A Statistical Mechanics Approach to Large Devations Theorems" [E-print available via CiteSeer --- published?]
- Abdelkader Mokkadem, Mariane Pelletier and Baba Thiam, "Large and moderate deviations principles for kernel estimators of the multivariate regression", math.ST/0703341
- K. Netocny and F. Redig, "Large deviations for quantum spin systems", math-ph/0404018 = Journal of Statistical Physics 117 (2004): 521--547
- Enzo Olivieri and Maria Eulalia Vares, Large Deviations and Metastability [Blurb]
- Huyen Pham, "Some applications and methods of large deviations in finance and insurance",math.PR/0702473
- Anatoly Puhalskii, Large Deviations and Idempotent Probability
- Anatolii A. Puhalskii, "Stochastic processes in random graphs", math.PR/0402183 [Large deviations for Erdos-Renyi graphs. Memo to self: how much work would it be to extend this to Markovian graphs?]
- Hong Qian, "Relative Entropy: Free Energy Associated with Equilibrium Fluctuations and Nonequilibrium Deviations", math-ph/0007010 = Physical Review E 63 (2001): 042103
- Olivier Rivoire, "The cavity method for large deviations", cond-mat/0506164 = Journal of Statistical Mechanics: Theory and Experiment (2005): P07004 ["A method is introduced for studying large deviations in the context of statistical physics of disordered systems. The approach, based on an extension of the cavity method to atypical realizations of the quenched disorder, allows us to compute exponentially small probabilities (rate functions) over different classes of random graphs."]
- David Ruelle, Thermodynamic Formalism
- L. Saulis and V. A. Statulevicius, Limit Theorems for Large Deviations
- Carolyn Schroeder, "I-Projection and Conditional Limit Theorems for Discrete Parameter Markov Processes", Annals of Probability 21 (1993): 721--758
- Adam Shwartz, Large Deviations in Performance Modeling
- Joe Suzuki, "A Markov chain analysis of genetic algorithms: large deviation principle approach", Journal of Applied Probability 47 (2010): 967--975
- Vincent Y. F. Tan, Animashree Anandkumar, Lang Tong and Alan S. Willsky, "A Large-Deviation Analysis of the Maximum-Likelihood Learning of Markov Tree Structures", IEEE Transactions on Information Theory 57 (2011): 1714--1735, arxiv:0905.0940 [Large deviations for Chow-Liu trees]
- Wei Wang, A. J. Roberts and Jinqiao Duan, "Large deviations for slow-fast stochastic partial differential equations", arxiv:1001.4826 ["the rate function is exactly that of the averaged equation plus the fluctuating deviation which is a stochastic partial differential equation with small Gaussian perturbation"]
- Lingjiong Zhu, "Process-Level Large Deviations for General Hawkes Processes", arxiv:1108.2431
- To write:
- CRS, "Large Deviations in Exponential Families of Stochastic Automata"
