September 30, 2009

Books to Read While the Algae Grow in Your Fur, September 2009

Alexander Rosenberg, Economics: Mathematical Politics or Science of Diminishing Returns?
Rosenberg is a philosopher of science focused on biology and neo-classical economics. This is his second attempt at assessing the latter. (I haven't read his first go-round and it doesn't seem to be necessary.) Ordinarily, he says, the goal of science is to make increasingly accurate and precise predictions about the world. (This includes the historical sciences like geology and paleontology, which predict new evidence rather than new events at definite future times. [Different branches of astronomy actually make both kinds of predictions.]) Economics, however, is incredibly bad at prediction --- certainly at the improving-precision part. Rosenberg does not argue this point very strongly, taking it to be more or less obvious to anyone familiar with the state of economics, and I'm inclined to agree. This picture might be complicated by a detailed consideration of applied econometric models, but even when those work, they are very poorly grounded in economic theory*. (Incidentally, one of the pleasures of reading this was seeing Rosenberg assault Friedman's "Methodology of Positive Economics" essay, whose influence has been profound and utterly malign.) Rosenberg then has two questions: (1) if economics does share the usual goal of science, what are its prospects for achieving it? (2) if it does not have that goal, what is it trying to achieve --- or, perhaps better, what is the kind of thing economists do and want to keep doing fitted to achieve?
As to (1) he is intensely skeptical, because he sees microeconomic explanations as grounded in intentional explanations, a not-too-compelling formalization of folk psychology (desires mapping to utility functions and beliefs to subjective probability distributions). He is extremely skeptical, on strictly philosophical grounds, about intentional explanations being made much better in the future than they have been through recorded history. He's also skeptical that we could ever have something like a cognitive-scientific or neuro-scientific theory which explains behavior and recovers folk psychology as a useful approximation in certain domains; I completely fail to follow his argument here. What I seem to understand would imply that he thinks thermostats and self-guided missiles are impossible, so if I am right he should really listen to Uncle Norbert, especially before the inevitable robot uprising (I can just see his last words being, a la pp. 142--143, "this robot can't really be trying to kill me, because if that intention were represented in one part of its computational system, l1, who is the interpreter who treats the configuration of memory registers in l1 as expressing this goal? Surely it must be some other sub-system, call it l2, which reads l1, but then we face the same question all over again for l2 --- urk!"); but no doubt I am wrong and he has some more reasonable idea which he does not, however, convey. The resounding experimental failure of maximizing-subjectively-expected-utility theory is not addressed. (Perhaps this was less clear in 1994 than it is now, but I doubt it.) I suspect that he would feel any of the models of choice proposed in behavioral economics are subject to the same critique, mutatis mutandis, that he makes of conventional microeconomics, because they're basically intentional.
As to (2), Rosenberg argues as follows. There is a three-way relationship between a discipline's goals, its theories, and its methods: given the goals (say, maximizing predictive accuracy), the theories tell us something about how well different methods will meet the goals. Likewise if you fix the goals and methods, only certain kinds of theories will be acceptable or reachable. And if you fix the theories and methods, you constrain the goals you can attain. (Rosenberg's argument here is very close to that of Larry Laudan in his great book Science and Values, and I think it's correct.) If we take neo-classical methods and theories as given, what might economics be successfully aiming at? Clearly not, by the previous argument, scientific prediction. Rather, Rosenberg offers two possibilities, not mutually exclusive. On the one hand, maybe it's really a species of hyper-formalized social-contract theory from political philosophy, with (as he says) the Walrasian auctioneer in the role of Hobbes's sovereign. Or: maybe it's a species of applied mathematics, interested in the implications of interacting transitive preference orderings. As he says, applied mathematicians are rarely interested in whether their math can, in fact, be applied to the real world — that's not their department.
Excusing economics's poor track-record as an empirical science by saying it's really political philosophy and/or applied math may be a defense worse than the original accusation. As Rosenberg notes, it makes the idea of attending to what economists have to say about policy matters rather odd; at best one should listen to them as much as to any other sect of political philosophers. I would suspect that Rosenberg was proposing this maliciously, but he seems to be sincere and not just good at writing with a straight face. I don't think economics is in quite such a plight as he does, but having just put the book down I admit I'm hard-pressed to articulate why.
*: For instance, when real business cycle theorists and their kin fit dynamic stochastic "general equilibrium"** models to empirical time-series of macro-economic quantities, these time series are first de-trended, i.e., made stationary. This has no justification in the representative-agent story underlying the models, but seems, at present, to be essential to actually getting estimates. Typically the de-trending is done through the "Hodrick-Prescott" filter***, again with no theoretical justification, and the business cycle is operationally defined as "the residuals of the filter". I suspect that most of the predictive ability of DSGEs comes from the filter, plus implicitly doing a moving-average smoothing of the residuals. It would be interesting to pit them against naive nonparametric forecasting (along say these lines).
**: I use the scare-quotes because I don't agree that representative agent models are general equilibrium models.
***: Known in statistics decades before Hodrick and Prescott as a "smoothing spline". (The word "spline" does not appear in their paper, and they are entirely innocent of the vast literature on how much smoothing to do.)
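To make the filter in these footnotes concrete, here is a minimal sketch of Hodrick-Prescott de-trending as penalized least squares: choose the trend tau to minimize the sum of squared residuals plus lam times the sum of squared second differences of tau. This is a discrete analogue of the roughness penalty behind smoothing splines. The function name `hp_filter`, the dense solver, and the default lam=1600 (the value conventionally used for quarterly data) are assumptions made for illustration, not anything taken from a DSGE code base.

```python
def hp_filter(y, lam=1600.0):
    """Return (trend, cycle) for the series y.

    Minimizes sum_t (y_t - tau_t)^2
              + lam * sum_t (tau_{t+1} - 2*tau_t + tau_{t-1})^2,
    whose first-order condition is the linear system
    (I + lam * D'D) tau = y, with D the second-difference matrix.
    """
    n = len(y)
    if n < 3:
        raise ValueError("need at least 3 observations")

    # Build A = I + lam * D'D as a dense matrix (fine for short series).
    A = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    coef = (1.0, -2.0, 1.0)
    for r in range(n - 2):
        cols = (r, r + 1, r + 2)
        for i, ci in zip(cols, coef):
            for j, cj in zip(cols, coef):
                A[i][j] += lam * ci * cj

    # Solve A tau = y by Gaussian elimination with partial pivoting.
    b = [float(v) for v in y]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(A[i][k]))
        A[k], A[p] = A[p], A[k]
        b[k], b[p] = b[p], b[k]
        for i in range(k + 1, n):
            f = A[i][k] / A[k][k]
            if f != 0.0:
                for j in range(k, n):
                    A[i][j] -= f * A[k][j]
                b[i] -= f * b[k]

    # Back-substitution.
    tau = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(A[i][j] * tau[j] for j in range(i + 1, n))
        tau[i] = (b[i] - s) / A[i][i]

    # The "business cycle" is operationally the residual of the filter.
    cycle = [yi - ti for yi, ti in zip(y, tau)]
    return tau, cycle
```

A straight line has zero second differences, so it passes through the filter unchanged and its "cycle" is zero; everything else about the split between trend and cycle is governed by the choice of lam, which is exactly the how-much-smoothing question.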
Sarah Graves, Wicked Fix
More cozy comfort-reading about sordid multiple homicide. (But whatever happened to Sam's girlfriend from the previous book?)
John Billheimer, Highway Robbery
Well-written, amusing and absorbing mystery novel about a family of highway engineers in West Virginia. The only thing keeping it from being perfect-for-me mind-candy is that part of the plot turns on making fun of environmentalists; but you can't have everything. This is the second book in a series; I've not read the others but will look them up.
Bent Jesper Christensen and Nicholas M. Kiefer, Economic Modeling and Inference
Review: An Optimal Path to a Dead End.
Joe Hill and Gabriel Rodriguez, Locke and Key, vol. 2: Head Games
High-grade comic book mind-candy. Definitely needs the earlier book.
Chelsea Cain, Evil at Heart
Great, if somewhat stomach-turning, mind-candy. (Probably needs the earlier books.) I actually wish Cain did more with the media-frenzy angle, however.
Possible continuity error: Isn't Susan awfully unconcerned about leaving her car alone in really dodgy neighborhoods, after she broke out one of its windows?
Thomas Levenson, Newton and the Counterfeiter: The Unknown Detective Career of the World's Greatest Scientist
Or: Newton demands the noose. --- A wonderfully readable little biography of Newton, with the hook of looking at how he tackled his second career as Warden of the Mint, in charge of actually producing the English currency, and of catching and punishing counterfeiters. In particular, Levenson focuses on Newton's pursuit of a counterfeiter of particular skill and temerity, one William Chaloner, providing a great opportunity to explain the criminal underworld in which such figures lived, and the vast opportunities opening up for them as a result of the social transformations of which Newton was at once symbol, beneficiary and further driver. (Any idiot understands stealing a hunk of metal, and almost any idiot can grasp substituting pewter for silver, but the higher reaches of monetary crime require numeracy and comfort with sophisticated abstractions.) This is, in short, a portrait of the foundations of our world being laid, from the intellectual system of rational scientific explanation, to states powerful enough to enforce written laws on millions and raise the funds needed to wage war across the world, to global commerce and flows of money, and stock-market Ponzi schemes in which geniuses lose fortunes. Enthusiastically recommended if any of this sounds the least bit appealing.
Kat Richardson, Vanished
Mind-candy: An American shaman in London. Ends in medias res, though not with a cliff-hanger.
Halbert White, Estimation, Inference, and Specification Analysis
Review: How to Tell That Your Model Is Wrong; and, What Happens Afterwards.
House of Mystery: Love Stories for Dead People
More tales from the bar, plus a really unfortunate basement.
Tiziano Sclavi et al., The Dylan Dog Case Files
No purchase link because I actually dis-recommend it: predictable, tedious, implausible, not scary, excruciating when it tries to be funny, ultimately tiresome. (The drawing is, I admit, pretty good, but nowhere near the covers Mignola provides for the translated edition.) Is this really that popular in Italy? If so, does the original have virtues which did not survive translation, or does the old country simply have no taste at all in comics?
Phil and Kaja Foglio, Agatha Heterodyne and the Circus of Dreams and Agatha Heterodyne and the Clockwork Princess
Volumes 4 and 5 of Girl Genius. Go read.
I. J. Parker, The Convict's Sword
Converging murder cases in Heian-era Japan. Stands alone, but I enjoyed it more for knowing the back-story. (Previous volumes in the series: 1 and 2, 3, 4, 5.)
Madeleine E. Robins, Point of Honour
Your basic hard-boiled female private-eye detective novel, which also happens to be a historical mystery and a Regency romance; the charming love-child of Jane Austen, or perhaps Georgette Heyer, and Dashiell Hammett. I read it in one sitting from the beginning — "It is a truth universally acknowledged that a Fallen Woman of good family must, soon or late, descend to whoredom" — to the end, and really want the sequel.
(Read following up on an old review by Kate Nepveu.)
Update: The sequel is as good.

Books to Read While the Algae Grow in Your Fur; Enigmas of Chance; The Dismal Science; Pleasures of Detection, Portraits of Crime; Scientifiction and Fantastica; Philosophy; Writing for Antiquity; The Great Transformation

Posted by crshalizi at September 30, 2009 23:59 | permanent link

September 25, 2009

Miniature Pearl

It would be wrong to say that Judea Pearl knows more about causal inference than anyone else — I can think of some rivals very close to where I'm writing this — but he certainly knows a lot, and has worked tirelessly to formulate and spread the modern way of thinking about the subject, centered around graphical models and their associated structural equations. I remember spending many happy hours with his book Causality when it came out in 2000, and look forward to spending more with the new edition, which is making its way to me through the mail now. In the meanwhile, however, there is what he describes as "A new survey paper, gently summarizing everything I know about causation (in only 43 pages)":

"Causal Inference in Statistics: An Overview", forthcoming in Statistics Surveys 3 (2009): 96--146 [Free PDF]
Abstract: This review presents empirical researchers with recent advances in causal inference, and stresses the paradigmatic shifts that must be undertaken in moving from traditional statistical analysis to causal analysis of multivariate data. Special emphasis is placed on the assumptions that underly all causal inferences, the languages used in formulating those assumptions, the conditional nature of all causal and counterfactual claims, and the methods that have been developed for the assessment of such claims. These advances are illustrated using a general theory of causation based on the Structural Causal Model (SCM) described in Pearl (2000a), which subsumes and unifies other approaches to causation, and provides a coherent mathematical foundation for the analysis of causes and counterfactuals. In particular, the paper surveys the development of mathematical tools for inferring (from a combination of data and assumptions) answers to three types of causal queries: (1) queries about the effects of potential interventions, (also called "causal effects" or "policy evaluation") (2) queries about probabilities of counterfactuals, (including assessment of "regret," "attribution" or "causes of effects") and (3) queries about direct and indirect effects (also known as "mediation"). Finally, the paper defines the formal and conceptual relationships between the structural and potential-outcome frameworks and presents tools for a symbiotic analysis that uses the strong features of both.

The paper assumes a reader who's reasonably well-grounded in statistics, though not necessarily in the causal-inference literature. (Of such readers, I imagine applied economists might have more unlearning to do than most, because they will keep asking "but when do I start estimating beta?") It's not ideally calibrated for a reader coming from, say, machine learning.

One theme running through the paper is the futility of trying to define causality in purely probabilistic terms, and the fact that cases where it looks like one can do so are really cases where causal assumptions have been smuggled in. Another is that once you realize counterfactual or mechanistic assumptions are needed, the graphical-models/structural equation framework makes it immensely easier to reason about them than does the rival "potential outcomes" framework. In fact, the objects which the potential outcomes framework takes as its primitives can be constructed within the structural framework, so the correct part of the former is a subset of the latter. And by reasoning on graphical models it is easy to see that confounding can be introduced by "controlling for" the wrong variables, something explicitly denied by leading members of the potential-outcomes school. (Pearl quotes them making this mistake, and manages to pull off a more-in-sorrow-than-in-glee tone while doing so.) Mostly, however, the paper is about showing off what can be done within the new framework, which is really pretty impressive, and ought to be part of the standard tool-kit of data analysis. If you are not already familiar with it, this is an excellent place to begin, and if you are you will enjoy the elegant and comprehensive presentation.
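The point about "controlling for" the wrong variables can be seen in a toy simulation. The sketch below is one standard illustration (conditioning on a common effect, or "collider"), not necessarily the example used in Pearl's paper: X and Y are generated independently, Z is caused by both, and looking only within a stratum of Z manufactures a dependence between X and Y that is not there in the population. The variable names, the noise level, and the threshold on Z are all arbitrary choices made for the illustration.

```python
import random


def corr(xs, ys):
    """Sample Pearson correlation of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def collider_demo(n=20000, seed=0):
    """Return (marginal, within-stratum) correlation of X and Y.

    Causal structure assumed: X -> Z <- Y, with X and Y independent.
    """
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [rng.gauss(0, 1) for _ in range(n)]
    # Z is a common effect of X and Y, plus a little independent noise.
    z = [xi + yi + rng.gauss(0, 0.5) for xi, yi in zip(x, y)]
    # Crude "control": look only at cases in one stratum of Z.
    keep = [i for i in range(n) if z[i] > 1.0]
    marginal = corr(x, y)
    conditional = corr([x[i] for i in keep], [y[i] for i in keep])
    return marginal, conditional


if __name__ == "__main__":
    marginal, conditional = collider_demo()
    print("corr(X, Y) overall:        ", round(marginal, 3))
    print("corr(X, Y) given Z > 1:    ", round(conditional, 3))
```

The overall correlation should be close to zero, while the within-stratum correlation is clearly negative: among cases with a large Z, a small X has to be compensated by a large Y. On the graph this is immediate, since conditioning on Z opens the path X -> Z <- Y.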


Looking back over what I write in this blog, I feel like, on the one hand, there's too little of it lately, and on the other hand, it's too tilted towards negative, critical stuff. While not regretting at all being negative and critical about stupid ideas that need to be criticized (or, really, pulverized), I will try to expand and balance my output by posting at least once a week on some good science. We'll see how this goes.


Enigmas of Chance

Posted by crshalizi at September 25, 2009 10:12 | permanent link

September 23, 2009

Next Week at the Statistics Seminar: Selecting Demanding Models

Attention conservation notice: Only of interest if you (1) care about statistical model selection and (2) are in Pittsburgh on Monday afternoon.
"Composite Likelihood Bayesian Information Criteria for Model Selection in High-Dimensional Data"
Prof. Peter Song, Dept. of Biostatistics, University of Michigan
Abstract: For high-dimensional data with complicated dependency structures, the full likelihood approach often renders to intractable computational complexity. This imposes difficulty on model selection as most of the traditionally used information criteria require the evaluation of the full likelihood. We propose a composite likelihood version of the Bayesian information criterion (BIC) and establish its consistency property for the selection of the true underlying model. Under some mild regularity conditions, the proposed BIC is shown to be selection consistent, where the number of potential model parameters is allowed to increase to infinity at a certain rate of the sample size. Simulation studies demonstrate the empirical performance of this new BIC criterion, especially for the scenario that the number of parameters increases with the sample size.
Place and Time: 4--5 pm on Monday, 28 September 2009, in Doherty Hall A310

As always, the seminar is free and open to the public.

Enigmas of Chance

Posted by crshalizi at September 23, 2009 16:53 | permanent link

September 09, 2009

My Combinatorics Problem, Let Me Show You It

Because I am too lazy to solve it myself, and too dumb to find it in the literature: How many ways are there to file N letters among k folders, such that no folder is empty? (What I am really interested in doing is counting the number of non-isomorphic functions of a discrete random variable.) The first reader to provide a solution gets a free year's subscription to the blog.

Update, 2 hours later: Thanks to reader M.L. for pointing me to Stirling numbers of the second kind, and to readers L.Z. and Jake Hofman for reading the sentence in parentheses and directing me to Bell numbers and counting surjections. Also to the four (!) others who wrote in after them with solutions. — Now that I see the answer, I suspect I did this problem in one of David Griffeath's classes...
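For the record, the three counts mentioned in the update can be sketched in a few lines. This assumes the letters are distinguishable; whether the folders are labeled decides which number is wanted. With labeled folders the answer is the number of surjections from an N-set onto a k-set, obtained by inclusion-exclusion; dividing by k! gives the Stirling number of the second kind (unlabeled folders); summing Stirling numbers over k gives the Bell number, which counts all partitions of an N-set. The function names are just for illustration.

```python
from math import comb, factorial


def surjections(n, k):
    """Ways to file n distinct letters in k labeled folders, none empty.

    Inclusion-exclusion over the set of folders forced to be empty.
    """
    return sum((-1) ** j * comb(k, j) * (k - j) ** n for j in range(k + 1))


def stirling2(n, k):
    """Stirling number of the second kind: n distinct letters split
    into k non-empty unlabeled folders."""
    return surjections(n, k) // factorial(k)


def bell(n):
    """Bell number: partitions of an n-set into any number of blocks."""
    return sum(stirling2(n, k) for k in range(n + 1))


if __name__ == "__main__":
    print(surjections(4, 2))  # labeled folders
    print(stirling2(4, 2))    # unlabeled folders
    print(bell(4))            # all partitions of 4 letters
```

For 4 letters and 2 folders this gives 14 labeled filings and 7 unlabeled ones, and there are 15 partitions of a 4-element set in all.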

Mathematics

Posted by crshalizi at September 09, 2009 14:33 | permanent link

"Two Problems on Stirring Processes --- and their Solutions" (This Week at the Statistics Seminar)

Attention conservation notice: Irrelevant unless you are (a) interested in interacting particle systems and/or stochastic processes on graphs, and (b) in Pittsburgh.

This week (in fact, tomorrow!) at the statistics seminar:

Thomas M. Liggett, "Two Problems on Stirring Processes — and their Solutions"
Abstract: Consider a finite or infinite graph G=(V,E), together with an assignment of nonnegative rates c_e to e \in E. For each edge e, there is a Poisson process \Pi_e of rate c_e. A Markov process is defined on this structure by placing labels on V, and then interchanging the labels at the two vertices joined by e at the event times of \Pi_e. If one considers the motion of just one label, this is a random walk on V. If all labels are 0 or 1, it is the symmetric exclusion process on V. If all labels are distinct, it is a random walk on the set of permutations of V. In this talk, I will describe recent solutions to two problems about these processes:
1. Suppose G=Z^1 and c_e=1 for each edge. Consider the symmetric exclusion process starting with the configuration ... 1 1 1 0 0 0 .... , and let N_t be the number of 1's to the right of the origin at time t. This is a sum of negatively correlated Bernoulli random variables. In 2000, R. Pemantle asked whether N_t satisfies a central limit theorem. I will explain how a new negative dependence concept leads to a positive answer to this question. This is partly based on joint work with J. Borcea and P. Brändén.
2. Suppose G is the complete graph on n vertices, and consider both the random walk on V and the random walk on the set of permutations of V. Each is a reversible, finite state, Markov chain, with n and n! states respectively. The exponential rate of convergence to equilibrium (which is the uniform distribution) for such a chain is determined by the smallest non-zero eigenvalue of -Q, where Q is the transition rate matrix of the chain. Let \lambda_1 and \lambda_2 be these values for the two processes. It is elementary that \lambda_2 \leq \lambda_1. Based on explicit computations in some special cases, D. Aldous conjectured in 1992 that \lambda_1=\lambda_2. I will describe some elements of the approach that leads to a proof of this conjecture. This is joint work with P. Caputo and T. Richthammer.
Time and Place: Thursday, 10 September 2009, 4:30--5:30 pm, Giant Eagle Auditorium (Baker Hall A53)

The seminar is free and open to the public.
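For readers who would like to play with the process described in the abstract, here is a minimal simulation sketch for a finite graph. It uses the fact that independent Poisson clocks on the edges can be merged into one clock whose rate is the sum of the edge rates, with each ring assigned to an edge with probability proportional to its rate. The function name, the argument layout, and the truncation of the integer line to a finite segment in the usage example are all assumptions made for illustration; nothing here comes from the talk itself.

```python
import random


def simulate_stirring(edges, rates, labels, t_max, seed=None):
    """Run the stirring process on a finite graph up to time t_max.

    edges  : list of (u, v) vertex pairs
    rates  : list of nonnegative rates, one per edge
    labels : dict mapping each vertex to its initial label
    Returns a new dict with the labels at time t_max.
    """
    rng = random.Random(seed)
    labels = dict(labels)  # do not mutate the caller's configuration
    total = sum(rates)
    if total <= 0:
        return labels  # no clock ever rings

    # Merged Poisson clock: exponential waiting times at rate `total`.
    t = rng.expovariate(total)
    while t <= t_max:
        # The ringing edge is chosen with probability rate / total.
        u, v = rng.choices(edges, weights=rates, k=1)[0]
        # Stirring step: interchange the labels at the two endpoints.
        labels[u], labels[v] = labels[v], labels[u]
        t += rng.expovariate(total)
    return labels


if __name__ == "__main__":
    # Symmetric exclusion on a finite stand-in for the integer line:
    # 1's at and to the left of the origin, 0's to the right.
    L = 50
    vertices = list(range(-L, L + 1))
    edges = [(i, i + 1) for i in range(-L, L)]
    rates = [1.0] * len(edges)
    start = {v: 1 if v <= 0 else 0 for v in vertices}
    end = simulate_stirring(edges, rates, start, t_max=10.0, seed=1)
    n_t = sum(end[v] for v in vertices if v > 0)
    print("number of 1's to the right of the origin:", n_t)
```

With 0/1 labels this is the symmetric exclusion process (the number of 1's is conserved); with all labels distinct, the same code gives the random walk on permutations of V, and following a single label gives the random walk on V.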

Enigmas of Chance; Mathematics

Posted by crshalizi at September 09, 2009 14:24 | permanent link

Three-Toed Sloth:   Hosted, but not endorsed, by the Center for the Study of Complex Systems