According to the principles of nominalism, what (its critics asked) is, for instance, a Chopin piano concerto? Is it a piece of paper covered with musical notation? Or is it, perhaps, an event that occurred in Chopin's mind? Or is it every particular instance of its performance?
The expositions preceding such questions are simple, clear, engaging, accurate; they are also non-revelatory, but then this is supposed to be a popular work and not original scholarship. I am happy to say that this book (unlike his previous one) displays Kolakowski's abundant talents as a critic of philosophy, someone who can explain to the reader what a philosopher attempted, and why, and why it might matter to us, without at the same time letting himself be captured by the writer he happens to be explaining.
Does God's absolute omnipotence really entail the consequence that all the moral rules He revealed to us are His arbitrary decree, and that it makes no sense to say that they are good in themselves, independently of being decreed by Him?
Let us assume that God, in His omnipotence, is causing us to imagine everything we experience and think of as real, and that the world of our perception is an illusion. What would be the difference between this world and a real world identical with it in content? How could the reality be described so as to distinguish it from the illusion?
Husserl forced us to confront an uncomfortable alternative: either we accept the restrictions of empiricism, turning away from the great philosophical tradition — the search for truth, meaning, and the nature of being — and impoverishing European culture, or we must accept some form of transcendentalism, not necessarily Husserl's reduction and his idealism, but the belief that the human mind can have some insight into being and truth.
Certainly Kolakowski thinks we are faced with this choice, and strongly hints, though he doesn't unequivocally state, that we should go for the second option. But notice that he doesn't say the second option is true, just that it needs to be accepted if certain once-valued cultural traditions are to retain their legitimacy. In other words it is a this-worldly, consequentialist, indeed vulgarly pragmatist argument for transcendentalism! Even accepting the dilemma at face value, one might well feel that an honest continuation of the tradition of devotedly seeking the truth would involve giving up ideas that seem, in retrospect, like wishful thinking...
Books to Read While the Algae Grow in Your Fur; Scientifiction and Fantastica; The Continuing Crises; The Beloved Republic; Philosophy; Complexity; Writing for Antiquity; The Commonwealth of Letters; The Progressive Forces; The Dismal Science
Posted by crshalizi at December 31, 2007 23:59 | permanent link
Via Bob Sutton, I learn that Teresa Amabile revealed the secret of my success, such as it is, back in 1981:
Using edited excerpts from actual negative and positive book reviews, this research examined the hypothesis that negative evaluators of intellectual products will be perceived as more intelligent than positive evaluators. The results strongly supported the hypothesis. Negative reviewers were perceived as more intelligent, competent, and expert than positive reviewers, even when the content of the positive review was independently judged as being of higher quality and greater forcefulness. At the same time, in accord with previous research, negative reviewers were perceived as significantly less likable than positive reviewers. The results on intelligence ratings are seen as bolstering the self-presentational explanation of the tendency shown by intellectually insecure individuals to be negatively critical.
I am mentioned by name in Sutton's comments. A resolution may be called for.
Manual trackback: Matt McIrvin; Science After Sunclipse; Ars Mathematica
Posted by crshalizi at December 31, 2007 22:22 | permanent link
Claude Bernard was one of the great scientists of the 19th century, and arguably of all time. By most accounts (including his own), he played a pivotal role in reshaping physiology into an experimental science, based on physics and chemistry, which sought to uncover the mechanisms behind normal and pathological processes in organisms by deliberately and systematically manipulating them in precisely reproducible ways. This ideal continues to dominate large parts of biology, including very much molecular biology, and even influenced psychology (via behaviorism) and the arts (via Zola, whose grasp of what it meant to conduct an experiment was at best tenuous). Two key ideas we normally associate with the 20th century — falsificationism and homeostasis — were explicitly central to Bernard's thought, but I've written about that elsewhere and won't repeat myself.
Bernard did, however, have a scientific weakness, which was to disdain quantitative studies in biology, and more especially statistics. He was very much of the school which held that the need for statistical analysis was never anything more than a sign of bad experimental design. (He put it much more elegantly.) He also looked down on contemporary quantitative studies. Kelvin is supposed to have said that "When you can measure what you are speaking about, and express it in numbers, you know something about it; but when you cannot express it in numbers, your knowledge is of a meager and unsatisfactory kind; it may be the beginning of knowledge, but you have scarcely in your thoughts advanced to the state of science." Bernard (as it were) pre-emptively countered this argument, as follows:
The application of mathematics to natural phenomena is the aim of all science, because phenomenal law should always be mathematically expressed. To this end, data used in calculations should be results of well-analyzed facts, so that we may be sure that we fully know the conditions of the phenomena between which we wish to establish an equation. Now, I think that efforts of this kind are premature in most vital phenomena, precisely because these phenomena are so complex that we must not only assume, but are in fact certain that, beside the few among their conditions which we know, there are numberless others which are still totally unknown. I believe that the most useful path for physiology and medicine to follow now is to seek to discover new facts instead of trying to reduce to equations the facts which science already possesses. This does not mean that I condemn the application of mathematics to biological phenomena, because the science will later be established by this alone; only I am convinced that, since a complete equation is impossible for the moment, qualitative must necessarily precede quantitative study of phenomena. [An Introduction to the Study of Experimental Medicine, p. 129]
In other words, a meager and unsatisfying understanding can very well find its expression in numbers, if prior scientific knowledge hasn't marked out what you should be measuring, how you should be measuring it, and what other measures you should be relating it to. The difficulties of physiological measurement are compounded by the fact that organisms have inner environments, physio-chemical variables internal to the organism, whose homeostatic maintenance makes organisms more or less autonomous, and these too need to be identified and measured. Though I am myself a quantitative scientist and a statistician — in fact, co-author on a just-submitted paper on the statistical analysis of neurophysiological data — I must say that Bernard had a point. If I look at successful ventures into quantitative biology (e.g.), there is a long history of qualitative science behind them, getting us to the point where we do grasp at least some of the "conditions of the phenomena".
But I promised cats. Bernard made all these points in, among other places, his classic Introduction to the Study of Experimental Medicine, giving over a fairly long section (part II, chapter II, section IX; pp. 129--140 of the English translation) to explaining why he did not look kindly on "calculations in the study of living beings", permitting himself what I can only call snark. Drawing a veil over the episode in which an enterprising physiologist claimed to be collecting samples from a railway station men's room (pp. 134--135; perhaps a precursor of Sen. Craig?), let us consider cats (p. 132):
In the part of their investigation devoted to nutrition, Bidder and Schmidt described a very notable experiment, perhaps one of the most laborious ever performed. From the point of view of elementary analysis [i.e., tracking the amounts of different chemical elements], they kept a balance sheet of everything taken in and given out by a cat during eight days' nourishment and nineteen days' fasting. But this cat was in a physiological condition of which they were unaware; she was pregnant, and had her kittens on the seventeenth day of the experiment. In these circumstances, our authors considered the kittens as excretions, and calculated them with other eliminated materials as a simple loss of weight. I believe that these interpretations should be rectified when trying to define such complex phenomena.
The fate of the cat and her kittens is not recorded, but I like to imagine them slinking around Bidder and Schmidt's laboratories in Dorpat like a pride of domestic sphinxes.
Manual trackback: Siris
Posted by crshalizi at December 28, 2007 18:06 | permanent link
Somebody once told me about a model for the growth of links on the Web by analogy with Hebbian learning. It was not either of the two people I thought it was. If it was you, or if you have any idea what I'm talking about, please drop me a line (cshalizi at stat dot oryx dot cmu dot edu, deleting the name of a genus of antelope).
Posted by crshalizi at December 04, 2007 15:47 | permanent link
Gary Farber of Amygdala ("all the news that gives me fits") is, simply, one of the best bloggers around, and has been for years; he's also in a real mess and could use your help. Subscribe, why don't you?
(I see that Lizardbreath at Unfogged has beat me to the title, but it's still a good one so I'll use it anyway.)
Posted by crshalizi at December 03, 2007 14:45 | permanent link
Attention conservation notice: A strained conceit which involves two different mathematical sciences, and has at least one big problem even if you grant the outlandish major premise.
A basic part of modern ideas about causality and causal inference is what's sometimes called the "Neyman-Rubin counterfactual conception of causality" --- which is a mouthful, even if it does commemorate one of my heroes. The basic idea is simple: the causal effect of some difference in treatments on an outcome variable is the average difference in the value of the outcome between the case where one treatment was applied and the case where the other was applied to the same member of the population. The causal effect of a 419 e-mail on someone's net worth, for example, is the difference between their net worth if they answer the letter and if they do not. This is pretty obviously the kind of thing one wants to know about possible manipulations — though if you want to get into the subtleties you should really read Glymour * and Pearl — but there is a problem: how can you both apply and not apply the treatment? How can you both answer and not answer the letter? (I am somewhat disappointed that there isn't an industry-funded P. T. Barnum Institute in Lagos, putting out press releases claiming that people who answered 419 letters would have lost even more money had they not done so.)
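To make the difficulty concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the five people, their net worths, and the function names are assumptions, not data or code from any real study. It pretends we can see both potential outcomes for each person, computes the average causal effect from them, and then compares it with what the naive comparison gives when only one outcome per person is observed.

```python
from statistics import mean

# Hypothetical potential outcomes (net worth in dollars) for five people:
# "y_ignore" if they ignore the 419 e-mail, "y_answer" if they answer it.
# In reality only one of the two is ever observed for any one person.
population = [
    {"y_ignore": 1000, "y_answer": 200},
    {"y_ignore": 5000, "y_answer": 3000},
    {"y_ignore": 300, "y_answer": 0},
    {"y_ignore": 20000, "y_answer": 15000},
    {"y_ignore": 700, "y_answer": 100},
]


def average_causal_effect(units):
    """Average, over units, of (outcome if answering) - (outcome if ignoring)."""
    return mean(u["y_answer"] - u["y_ignore"] for u in units)


def observed_difference(units, answered):
    """Naive comparison that uses only the outcome actually observed per unit."""
    treated = [u["y_answer"] for u, a in zip(units, answered) if a]
    control = [u["y_ignore"] for u, a in zip(units, answered) if not a]
    return mean(treated) - mean(control)


if __name__ == "__main__":
    # Suppose (again hypothetically) that the poorer people are the ones who answer.
    answered = [True, False, True, False, True]
    print("average causal effect:", average_causal_effect(population))
    print("naive observed difference:", observed_difference(population, answered))
```

In this toy population the two numbers differ a great deal, because who answers is tangled up with how rich they were to begin with; that gap is what the statistical methods mentioned next try to close.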
Trying to square this circle has led to a lot of very ingenious statistical work, like propensity score methods and the consistent discovery algorithms of Spirtes, Glymour and Scheines. But now, via Wolfgang, I learn of something which makes me think that there is after all a physical solution:
Ruling out EPR correlations as unacceptable, let's concentrate on the bit about getting branches of the wave function to communicate with each other. Basically, Polchinski shows how to construct an apparatus where you make two measurements on the same particle in sequence, and the result of the second measurement depends on what you would have done had the first measurement been different. He turns this into a situation where "in effect, the apparatus reads the observer's mind", but what's more interesting to me is that this is just the kind of thing one would want to do for causal inference.
Here's how it goes. Step 1: Take a charged spin-1/2 particle like an electron. If we measure its spin along our favorite axis, say with a Stern-Gerlach device, we can get only one of two results, either +1/2 or -1/2, depending on the path the particle takes through the device. Do so, record the measurement, and rejoin the two paths. Step 2: If the spin was +1/2, do nothing. If the spin was -1/2, either (a) do nothing, or (b) rotate the spin with a magnetic field. Step 3: subject the particle to a field coupled to a suitably-constructed nonlinear observable; this is where the nonlinearity comes in. (I realize that sentence means nothing if you don't know some ordinary quantum mechanics; the point is, it's doing something that's not possible in ordinary QM.) Step 4: If you measured +1/2 in step 1, measure the spin again. Otherwise, do nothing. Upshot: If you take a measurement in step 4, you get +1/2 if you took action (a) in step 2, and -1/2 if you took action (b). (That's not obvious but does follow from Polchinski's calculations.) But to take either action, you'd have had to have measured a spin of -1/2 in step 1, which means you don't take a measurement at all in step 4. Thus, "the action and observation are in two different branches of the wave function".
Polchinski's set-up is a little too simple to estimate the causal effect of responding to a 419 scam, but that's just an implementation detail. Quantum mechanics seems very strongly to be a linear theory, but if that's wrong and it's really nonlinear (if only very slightly), then, pretty generically, doing causal inference should just be a matter of constructing the right sequence of measurements to act as an "Everett phone" between branches of the wave-function. Good-bye, covariate matching; hello, interferometry!
*: I can't resist quoting a paragraph.
I am tempted to think [that the demand] that a cause always be relative to a specific alternative is an improvement on the bare counterfactual account of causal relations. The reason is this: My Uncle Schlomo smoked two packs of cigarettes a day, and I am firmly convinced that smoking two packs of cigarettes a day caused him to get lung cancer. But it may not be true that in the closest possible world in which Uncle Schlomo did not smoke two packs a day, he did not contract cancer. Reflecting on Schlomo's addictive personality, and his general weakness of will, it may well be that the closest possible world in which Schlomo did not smoke two packs of cigarettes a day is a world in which he smoked three packs a day. I can reconcile this reflection with the counterfactual analysis of causality by supposing ... that "smoking two packs of cigarettes a day caused him to get lung cancer" is elliptical speech, and what is meant, but not said, is that smoking two packs of cigarettes a day, rather than not smoking at all, caused Schlomo to contract lung cancer.
Posted by crshalizi at December 02, 2007 15:20 | permanent link
It's curious, one might say, how, ever since there's been a global economic system, the people who are supposedly just too dumb for civilization are the ones on its periphery, even as that periphery keeps moving. Thus wrote ibn Khaldûn (1377):
We have explained that the cultivated region of that part of the earth which is not covered by water has its center toward the north, because of the excessive heat in the south and the excessive cold in the north. The north and the south represent opposite extremes of cold and heat. It necessarily follows that there must be a gradual decrease from the extremes toward the center, which, thus, is moderate. The fourth zone is the most temperate cultivated region. The bordering third and fifth zones are rather close to being temperate. The sixth and second zones which are adjacent to them are far from temperate, and the first and seventh zones still less so. Therefore, the sciences, the crafts, the buildings, the clothing, the foodstuffs, the fruits, even the animals, and everything that comes into being in the three middle zones are distinguished by their temperate (well-proportioned character). The human inhabitants of these zones are more temperate (well-proportioned) in their bodies, color, character qualities, and (general) conditions. They are found to be extremely moderate in their dwellings, clothing, food stuffs, and crafts. They use houses that are well constructed of stone and embellished by craftsmanship. They rival each other in production of the very best tools and implements. Among them, one finds the natural minerals, such as gold, silver, iron, copper, lead, and tin. In their business dealings they use the two precious metals (gold and silver). They avoid intemperance quite generally in all their conditions. Such are the inhabitants of the Maghrib, of Syria, the two 'Iraqs, Western India (as-Sind), and China, as well as of Spain; also the European Christians nearby, the Galicians, and all those who live together with these peoples or near them in the three temperate zones. The 'Iraq and Syria are directly in the middle and therefore are the most temperate of all these countries.
The inhabitants of the zones that are far from temperate, such as the first, second, sixth, and seventh zones, are also farther removed from being temperate in all their conditions. Their buildings are of clay and reeds. Their foodstuffs are durra and herbs. Their clothing is the leaves of trees, which they sew together to cover themselves, or animal skins. Most of them go naked. The fruits and seasonings of their countries are strange and inclined to be intemperate. In their business dealings, they do not use the two noble metals, but copper, iron, or skins, upon which they set a value for the purpose of business dealings. Their qualities of character, moreover, are close to those of dumb animals. It has even been reported that most of the Negroes of the first zone dwell in caves and thickets, eat herbs, live in savage isolation and do not congregate, and eat each other. The same applies to the Slavs. The reason for this is that their remoteness from being temperate produces in them a disposition and character similar to those of the dumb animals, and they become correspondingly remote from humanity. The same also applies to their religious conditions. They are ignorant of prophecy and do not have a religious law, except for the small minority that lives near the temperate regions. (This minority includes,) for instance, the Abyssinians, who are neighbors of the Yemenites and have been Christians from pre-Islamic and Islamic times down to the present; and the Mali, the Gawgaw, and the Takrur who live close to the Maghrib and, at this time, are Muslims. They are said to have adopted Islam in the seventh [thirteenth] century. Or, in the north, there are those Slav, European Christian, and Turkish nations that have adopted Christianity. All the other inhabitants of the intemperate zones in the south and in the north are ignorant of all religion. (Religious) scholarship is lacking among them. 
All their conditions are remote from those of human beings and close to those of wild animals. "And He creates what you do not know."
Of course, ibn Khaldûn remains a great scholar and pioneer of social science for other reasons. (For example, he goes on in that chapter to rebuke the hereditarians for their scientific ignorance.)
Posted by crshalizi at December 01, 2007 19:15 | permanent link
Books to Read While the Algae Grow in Your Fur; Scientifiction and Fantastica; Enigmas of Chance; The Progressive Forces; The Dismal Science
Posted by crshalizi at November 30, 2007 23:59 | permanent link
Attention Conservation Notice: Of interest only if you have been following the mess about William Saletan's series of articles in Slate, saying that liberals are just as bad as creationists for refusing to accept the scientific evidence that black people are just inherently dumber than white people. Of course, it may not be interesting even if you have been following that. (For background, see here, here, here or here; I don't feel like rewarding Saletan's rubbish with a link, but it shouldn't take you but a moment to find it, if you want.)
Saletan has written an epilogue, titled "Regrets", to his series, which is a very curious piece of work indeed. Here's the end of it in its entirety (except for the links):
In researching this subject, I focused on published data and relied on peer review and rebuttals to expose any relevant issue. As a result, I missed something I could have picked up from a simple glance at Wikipedia.
For the past five years, J. Philippe Rushton has been president of the Pioneer Fund, an organization dedicated to "the scientific study of heredity and human differences." During this time, the fund has awarded at least $70,000 to the New Century Foundation. To get a flavor of what New Century stands for, check out its publications on crime ("Everyone knows that blacks are dangerous") and heresy ("Unless whites shake off the teachings of racial orthodoxy they will cease to be a distinct people"). New Century publishes a magazine called American Renaissance, which preaches segregation. Rushton routinely speaks at its conferences. I was negligent in failing to research and report this. I'm sorry. I owe you better than that.
In my first post about this, I said that there were two possible interpretations of Saletan's actions: that he didn't know that the ideas he was spreading were crap, or that he did, but spread them anyway to advance an agenda. Saying that the second interpretation was more charitable wasn't just a joke. Sadly, this partial mea culpa supports the first interpretation, that of incompetence. To put it in "shorter William Saletan" form, what he is saying is: I am shocked — shocked! — to discover that the people who devote their careers to providing supposedly-scientific backing for racist ideas are, in fact, flaming racists. And he does seem to be shocked, though it is hard (as Yglesias says) to see why, logically, he should strain out those gnats he displays for our horrified inspection while swallowing the camel of group inferiority (and telling his readers that camel is really great and the coming thing). This indicates a level of incompetence as a reporter and researcher that is really quite stunning — as Brad DeLong says, this seems like a trained incapacity.
But let me back up a minute to the bit about relying on "peer review and rebuttals to expose any relevant issue". There are two problems here.
One has to do with the fact that, as I said, it is really very easy to find the rebuttals showing that Rushton's papers, in particular, are a tragic waste of precious trees and disk-space. For example, in the very same issue of the very same journal as the paper by Rushton and Jensen which was one of Saletan's main sources, Richard Nisbett, one of the more important psychologists of our time, takes his turn banging his head against this particular wall. Or, again, if Saletan had been at all curious about the issue of head sizes, which seems to have impressed him so much, it would have taken about five minutes with Google Scholar to find a demonstration that this is crap. So I really have no idea what Saletan means when he claims he relied on published rebuttals — did he think they would just crawl into his lap and sit there, meowing to be read? If I had to guess, I'd say that the most likely explanation of Saletan's writings is that he spent a few minutes with a search engine looking for hits on racial differences in intelligence, took the first few blogs and papers he found that way as The Emerging Scientific Consensus, and then stopped. But detailed inquiry into just how he managed to screw up so badly seems unprofitable.
The other problem with his supposed reliance on peer review is that he seems confused about how that institution works. I won't rehash what I've already said about it, but only remark that passing peer review is better understood as saying a paper is not obviously wrong, not obviously redundant and not obviously boring, rather than as saying it's correct, innovative and important. Even this misses a deeper problem, a possible failure mode of the scientific community. A journal's peer review is only as good as the peers it uses as reviewers. If everyone, or almost everyone, who referees for some journal is in the grip of the same mistake, then they will not catch it in papers they review, and the journal will propagate it. In fact, since journals usually recruit new referees from their published authors or people recommended by old referees, mistakes and delusions can become endemic and self-confirming in epistemic communities associated with particular journals. To give a concrete example, the community using Physica A is pretty uniformly (and demonstrably) mistaken about how to tell when something is a power-law distribution, so what that journal publishes about power laws is unreliable, and those who derive their training and information from that journal go on to propagate the errors. It would be easy to find even more extreme examples from the physical and mathematical sciences (especially, I must say, among journals published by Elsevier), but it would take too long to explain why they are wrong.
Put simply, the problem is that any group of quack scholars with a shared delusion can put together a journal, dub each other peer reviewers, and go on their cheerful way by endorsing each other's work for their journal. (One of the ways you can tell that intelligent design creationism is a propaganda front and not a real, if stupid, scholarly movement is that their effort to put together just such a journal was never more than half-assed, and it's been moribund for some time now.) This isn't even always a bad thing, since sometimes people who seem like quacks are in fact right, and doing things like starting their own journals gives them a chance to get their act together and assemble a convincing case. But all of this does mean that the peer-review filter is a very weak and accepting one, especially on controversial topics. It does not seem unreasonable of me to ask that those who set themselves up as science reporters grasp this.
(I hope no one will mis-interpret me as saying that peer review is worthless — I think some form of it is essential, it's just not enough — or that I'm endorsing some silly social-constructionist view that science is just the views of the winners of the scientific community's internal political squabbles. If I thought that, I'd not be pursuing a scientific career, but rather making much more money, and reading many fewer boring papers and writing many fewer boring grant applications, on a sub-tropical island. Science is systematic and cumulative inquiry into what the world is like and how it works, and by and large one that succeeds in producing increasingly reliable and refined knowledge about the world. This is marvelous and inspiring, but it's still a social process implemented by East African Plains Apes [and some of their tools], and it's wise to be realistic about the implications of this fact.)
Let me close with a quotation (via Jessa Crispin) from William Langewiesche, which conveys, better than I could, something of what I was trying to say about the responsibilities of journalists:
"You have this precious, incredibly privileged thing," he said, "which is the reader's attention for a little while. And you can make the slightest misstep and the reader will put you down. People will say that the reader lives in a busy world. But that's not the reason why. The reason is that the writer blows it, and loses the reader's trust."
Saletan has blown it very badly indeed.
Update, 2 December: I should have said in the first place that my discussion of peer review and hysteresis is heavily indebted to the very illuminating chapter on "Market Failures in the Economy of Science" in the dissertation of Dr. Nienke Oomes. I hope that my mentioning this, even belatedly, might prod Nienke into publishing it, or at least putting it online.
Update, 4 December: Stephen Metcalf, Slate's critic-at-large, goes some way towards redeeming its honor.
Manual trackback: The Geomblog; 3 Quarks Daily; Crooked Timber; Quantum of Wantum; Egregious Moderation; Noli Irritare Leones; The Mahatma X Files; Brendan Nyhan; Rash Matters; The Useless Tree; Nanopolitan; IQ Review; Existence is Wonderful; OpenLeft; Earthli; Jewcy; Science After Sunclipse
Posted by crshalizi at November 30, 2007 13:35 | permanent link
There are components of the world's wave-function where modernity and the industrial and scientific revolutions began in Song-dynasty China. (They may well have higher amplitude than our component.) There, bad evolutionary psychologists use the handicap principle to explain why men are so attracted to women with feet so tiny they can barely walk.
Posted by crshalizi at November 29, 2007 13:46 | permanent link
Posted by crshalizi at November 28, 2007 22:55 | permanent link
Roe and Baker's argument is simple but ingenious and compelling. The climate system contains a lot of feedback loops. This means that the ultimate response to any perturbation or forcing (say, pumping 20 million years of accumulated fossil fuels into the air) depends not just on the initial reaction, but also how much of that gets fed back into the system, which leads to more change, and so on. Suppose, just for the sake of things being tractable, that the feedback is linear, and the fraction fed back is f. Then the total impact of a perturbation J is

$$ J + fJ + f^2 J + \cdots = \frac{J}{1-f}, \qquad |f| < 1. $$
If we knew the value of the feedback f, we could predict the response to perturbations just by multiplying them by 1/(1-f) — call this G for "gain". What happens, Roe and Baker ask, if we do not know the feedback exactly? Suppose, for example, that our measurements are corrupted by noise --- or even, with something like the climate, that f is itself stochastically fluctuating. The distribution of values for f might be symmetric and reasonably well-peaked around a typical value, but what about the distribution for G? Well, it's nothing of the kind. Increasing f just a little increases G by a lot, so starting with a symmetric, not-too-spread distribution of f gives us a skewed distribution for G with a heavy right tail.
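The skew can be seen in a short change-of-variables sketch. It assumes that f has a smooth probability density and looks only at f < 1, so that G is positive; the symbols p_f and p_G are introduced here for illustration and are not necessarily Roe and Baker's notation.

```latex
G = \frac{1}{1-f}
\quad\Longrightarrow\quad
\frac{dG}{df} = \frac{1}{(1-f)^2} = G^2 ,
\qquad
f = 1 - \frac{1}{G} .

p_G(g) = p_f\!\left(1 - \frac{1}{g}\right) \left|\frac{df}{dg}\right|
       = \frac{1}{g^2}\, p_f\!\left(1 - \frac{1}{g}\right),
\qquad g > 0 .
```

Since dG/df = G^2, a small error in f is magnified by the square of the gain, and the magnification is larger on the high side than on the low side. For large g the density of G falls off only like p_f(1)/g^2, so as long as f has any probability of being near 1, G has a heavy right tail.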
To illustrate, here is a histogram made from drawing 10,000 random numbers from a Gaussian distribution with mean 0.6 and a standard deviation of 0.1; think of these as the values of f, the strength of the feedback.
And here is the histogram of the corresponding values of G, the over-all gain.
If we can reduce the uncertainty in f, then of course the distribution of G will also narrow, but much more slowly than you might think. Here is what things look like when the standard deviation of f is cut by a half, to 0.05:
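The experiment just described is easy to re-run. What follows is a minimal sketch in Python, not the code behind the original histograms: the function names, the seed, and the decision to discard draws with f at or above 1 (where the linear-feedback sum diverges and 1/(1-f) is not a meaningful gain) are all assumptions made for illustration. It summarizes the distribution of G by its 5%, 50% and 95% quantiles rather than by plotting it.

```python
import random


def simulate_gain(mean_f=0.6, sd_f=0.1, n=10_000, seed=1):
    """Draw n feedback values f ~ Normal(mean_f, sd_f); return gains G = 1/(1-f).

    Draws with f >= 1 are discarded (an assumption of this sketch), since the
    feedback series diverges there and 1/(1-f) is not a sensible gain.
    """
    rng = random.Random(seed)
    gains = []
    for _ in range(n):
        f = rng.gauss(mean_f, sd_f)
        if f < 1.0:
            gains.append(1.0 / (1.0 - f))
    return gains


def quantile(values, q):
    """Crude empirical quantile: the element at position q*len in sorted order."""
    ordered = sorted(values)
    index = min(int(q * len(ordered)), len(ordered) - 1)
    return ordered[index]


def summarize(gains):
    """Return the 5%, 50% and 95% empirical quantiles of the gain."""
    return tuple(quantile(gains, q) for q in (0.05, 0.50, 0.95))


if __name__ == "__main__":
    for sd in (0.1, 0.05):
        low, mid, high = summarize(simulate_gain(sd_f=sd))
        print(f"sd(f)={sd}: 5%={low:.2f}, median={mid:.2f}, 95%={high:.2f}")
```

The median gain should sit near 1/(1-0.6) = 2.5 in both runs, while the 95% quantile lies further above the median than the 5% quantile lies below it. Halving the standard deviation of f narrows the spread but leaves the asymmetry in place.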
A few points seem worth making here.
In short: the fact that we will probably never be able to precisely predict the response of the climate system to large forcings is so far from being a reason for complacency it's not even funny.
Update, 27 November: I should have linked to the discussion of this paper over on RealClimate initially.
Manual trackback: Earning My Turns; Political Animal; Coyote Blog; A Change in the Wind; That's Almost Right
Posted by crshalizi at November 25, 2007 23:57 | permanent link
OK, boys and girls, settle down. I know everyone is anxious to leave for Thanksgiving break, but based on the reaction to the previous post around the blogs, we need to do a quick reading skills test.
A number of people have earnestly objected that Slate doesn't charge a subscription --- very true! Can anyone tell me the name of the figure of speech being employed in the title? Go ahead, Johnny. "Sarcasm; the use of irony to express contempt; from the Greek sarkazein, to tear flesh." Very good, Johnny.
Next, can someone tell me why the names of Saletan's sources appear underlined in the text? "Because they are anchors of hyperlinks." That's right! Did anyone follow those hyperlinks? You did! And what do you find there? No, Johnny, let someone else take a turn --- Cindy? "Demonstrations that, even if you accept IQ is valid, his sources are quacks who couldn't think their way out of a wet paper bag." Excellent! My, you're doing much better than the blogosphere, boys and girls. Let me just add that if you follow this controversy at all, you know that these people are quacks; you can even discover it by, at most, a quarter of an hour with your favorite search engine. I will leave you to judge whether a journalist who doesn't check up on his sources that way is doing his job.
Third point: Why do I not go into all the reasons why you shouldn't accept the usual IQ framework in the first place? Alex? "Because you wrote about 22,000 words in two parts doing so already, and linked to all that at the bottom of the post." Correct!
Finally, fourth point, why do I not say anything about the correlation between head sizes and IQ that impresses Saletan so much? Johnny? "Because you can't handle the truth?" I knew I was going to regret calling on you. Who else? Chris? "Because a construct's being correlated with a physical variable doesn't imply being physically meaningful in any way. Height is correlated with head size, so the sum of height and blood triglycerides will be correlated with head size." True, but maybe not all that compelling to the audience. Anyone else? Roxana? "Because the evidence for that correlation is taken apart in the piece you linked to about Rushton?" Right --- do you have something to add to that? "It's easy to discover from the literature that there really isn't any such correlation, once you design your study so that it's not hopelessly confounded." Did people catch what Roxie did there — she attached a link. Boys and girls, you should follow that link! Roxie, thanks for that article, and let me say that getting a result like that out of Vincent Sarich, of all people, is an especially nice catch.
OK, class, congratulations; you passed this little exam in basic on-line reading skills, unlike the blogosphere's various comment sections. Have a happy Thanksgiving, everyone!
Posted by crshalizi at November 21, 2007 14:00 | permanent link
William Saletan's recent venture into demanding that we squarely face the harsh light of his pseudo-scientific prejudices is, in itself, intensely boring — we've played this scene over and over again — but becomes more interesting when we try to trace it back to causes, and then forward again to effects.
His writing the story may be explained in one of two ways.
William Saletan is the national correspondent of Slate, and published this multi-part heap of rubbish there. This means it was approved by his editors. We may interpret their action in one of three ways.
The efficient alternative is, of course, to stop paying attention to Slate, or other magazines which publish idiotic and pseudo-scientific apologias for bigotry.
Updates: See next post before complaining. 25 November: Stupid mis-spelling fixed, thanks to Loren Spice.
Manual trackback: 3 Quarks Daily; Crooked Timber; American Nonsense; The Mahatma X Files; Quantum of Wantum; Language Log; onegoodmove; First Drafts; Nanopolitan; Ionian Enchantment; IQ Review; OpenLeft; Jewcy
Posted by crshalizi at November 20, 2007 23:29 | permanent link
Posted by crshalizi at November 20, 2007 21:45 | permanent link
Because one mindless, shapeless, blasphemous, unhealthily-expanding entity in the office just wasn't enough:
Posted by crshalizi at November 17, 2007 16:50 | permanent link
A passage from the fourth referee report I wrote today: "It would be unfair to compare the author's methodological advice to enjoining us to remember to breathe; it is more like reminding us not to hold forks by their pointy ends, which rather go into the food." But on second thought I deleted that; I grow soft.
Posted by crshalizi at November 09, 2007 23:50 | permanent link
November is, supposedly, National Novel Writing Month. In honor of this season, I would like to encourage everyone (not just the various participants) to read this post at Making Light, and the ensuing comments thread. #900, in particular, is very funny, but it really has to be appreciated in context.
Posted by crshalizi at November 08, 2007 08:20 | permanent link
Speaking, as we were, of the Futurists, I present today's evidence of their subliminal influence over the course of the twentieth century.
[Images: The Street Enters the House (Umberto Boccioni, 1911), left; the Stata Center at MIT (Frank Gehry, 2004), right.]
Eventually, as post-humanity mutates into a species of mind-bending Lovecraftian monstrosities, we'll not only be at home in such buildings, they'll keep the rain out, too.
(AP story via Science After Sunclipse; photo of Stata Center via flickr user highsmith.)
Posted by crshalizi at November 07, 2007 14:23 | permanent link
My grandfather used to tell a joke about a magazine running a contest where the first prize was an all-expenses-paid week in Pittsburgh --- and the second prize was two weeks. With that kind of baseline, it's nice to see it getting some love from National Geographic and from the Times, though calling Butler Street in Lawrenceville a "design district" may still be a bit of a stretch. (The Coca Cafe is indeed very nice, however.)
The funniest endorsement of Pittsburgh as a travel destination I've read lately is undoubtedly the Washington Post's account of Richard Mellon Scaife's marital troubles. What makes it extra amusing to me is that Scaife and his soon-to-be-ex-wife live in my neighborhood, and I go past his house on my usual running route. (I guessed the "Welcome home, Beauregard" sign was about the dog coming back from the vet's.) While I am, of course, sad that my fellow Shadysiders are having such an ugly divorce, I can't help feeling that it couldn't happen to a nastier wingnut. The role of obscene amounts of inherited money in fostering the whole sordid spectacle is yet more evidence that Andrew Carnegie was on to something when he declared "The man who dies thus rich dies disgraced."
(Via Kris Klinkner in e-mail, except for the Post story, via Krugman's blog.)
Postcards; The Running-Dogs of Reaction; Heard About Pittsburgh, PA
Posted by crshalizi at November 06, 2007 23:48 | permanent link
I present for your consideration two case studies in rhetorical self-fashioning. John Brunner almost saw this coming.
The morals you draw may be your own.
(The second via Halfway Down the Danube.)
Posted by crshalizi at November 05, 2007 20:26 | permanent link
From Marty Lederman at Balkinization:
Jack Goldsmith left OLC before he could complete the "replacement" torture opinion. Daniel Levin succeeded him. OLC had previously opined that waterboarding was lawful. Levin apparently (and understandably) was a bit skeptical -- so much so that he asked the military to subject him to waterboarding! (This is not your parents' OLC -- can you imagine what it would take for anyone after this to want to be Assistant Attorney General there?) Naturally, Levin concluded that the procedure was, well, torture, at least "unless performed in a highly limited way," and under guidelines the Administration had failed to implement. (No doubt Levin did suffer severe physical suffering, and that's in a situation far removed from being a detainee.)

At this point, Alberto Gonzales nevertheless insisted that Levin include in his December 30, 2004 opinion the footnote about how the legal analysis did not affect all previously approved techniques! It's not clear why Levin assented to this -- it's an outrageous and inappropriate thing for a White House Counsel to do -- but the footnote was included. (I should add that the December 2004 Levin opinion also included an analysis of "severe physical suffering" that is entirely unpersuasive and that is the basis for the counterintuitive (i.e., patently wrong) conclusion that waterboarding is not torture. I've criticized that portion of the Levin memo previously. Now I wonder whether that, too, was the work of Alberto Gonzales and David Addington, rather than Levin himself, and whether Levin's planned follow-up memo (see below) might have called that analysis into question.)
Levin then set about to write another opinion, one that would cut back on the approved techniques (and that would, at a minimum, repudiate or temper the previous OLC advice on waterboarding).
Unfortunately, at this point Gonzales was confirmed as AG -- and he fired Levin, replacing him with Steve Bradbury, who was more than happy to give Gonzales the legal advice they wanted. (No word -- yet -- on whether Bradbury was waterboarded.)
Let's go over that again. A "loyal conservative, Republican lawyer" cares enough for the law that he has himself waterboarded. He concludes that water-boarding is torture. (This is what everyone with experience says too, not to mention our own legal history.) Torture is a crime. For saying as much, he got fired. The reason is, this administration wants to torture.
The point of this torture is not to extract information; there are better ways to do that, which we have long used. The point of this torture is not to extract confessions; there are no show trials of terrorists or auto-de-fes in the offing. The point of this torture is to exercise unlimited, unaccountable power over other human beings; to negate the very point of our country, to our profound and lasting national shame.
Calling this administration "sadistic" insults thousands of sane, decent, kinky sexual perverts.
Manual trackback: Nanopolitan; Wintry Smile
The Beloved Republic; The Continuing Crisis; The Running-Dogs of Reaction
Posted by crshalizi at November 04, 2007 11:45 | permanent link
BLDGBLOG has just posted about the crazy, and highly depressing, architecture created by the Italian and Austro-Hungarian armies for their trench warfare in the Alps during WWI, with the trenches running as far up the mountains as they could get. The post is, as usual, excellent, with great photos and contemporary reportage by (of all people) H. G. Wells. I commend it to your attention. (I have long wondered whether some of Gramsci's remarks about "wars of position" vs. "wars of maneuver" were not colored by news of this conflict.)
This gives me the occasion to plug the best book I've read on the Alpine front, and one of the best memoirs of the Great War I've encountered period, Emilio Lussu's Sardinian Brigade (in the original, Un anno sull'altipiano). Lussu's real achievement here is to movingly evoke the proverbial "long stretches of boredom, punctuated by brief moments of terror" --- and he is very good at conjuring both futility and terror --- without histrionics. His auctorial voice remains cool, lucid, rational, slightly detached --- to mangle Wells, in a different connection, the voice is that of an intellect vast, cool, and not wholly sympathetic, though the story the voice tells is one of what it was like to be a dirty, bloody, suffering soldier in a palpably idiotic war. Writing at a literal remove --- twenty years later, and in exile owing to his outspoken opposition to Fascism --- may have helped achieve this effect. It deserves a wide audience.
Manual trackback: Kottke.org.
Posted by crshalizi at November 03, 2007 22:00 | permanent link
I will be teaching 36-462, "topics in statistics", in the spring. This is a special topics course for advanced undergraduates, intended to expose them to ideas they wouldn't see going through the ordinary curriculum.
A more detailed syllabus will follow on the course website once I actually draw it up. If you have any questions, please send e-mail.
Update, 7 January 2008: Behold the syllabus.
Posted by crshalizi at November 02, 2007 10:50 | permanent link
My mail spool got clobbered this afternoon; if you sent me anything between midnight last night and around 2 pm today, please resend. (If you sent something but have thought better of it, consider yourself reprieved.)
Posted by crshalizi at November 01, 2007 18:20 | permanent link
Never in history has such ruination --- physical and moral --- been associated with the name of one man. That the ruination had far deeper roots and far more profound causes than the aims and actions of this one man has been evident in the preceding chapters. That the previously unprobed depths of inhumanity plumbed by the Nazi regime could draw upon wide-ranging complicity at all levels of society has been equally apparent. But Hitler's name justifiably stands for all time as that of the chief instigator of the most profound collapse of civilization in modern times. The extreme form of personal rule which an ill-educated beerhall demagogue and racist bigot, a narcissistic, megalomaniac, self-styled national saviour was allowed to acquire and exercise in a modern, economically advanced, and cultured land known for its philosophers and poets, was absolutely decisive in the terrible unfolding of events in those fateful twelve years.

Hitler was the main author of a war leaving over 50 million dead and millions more grieving their lost ones and trying to put their shattered lives together again. Hitler was the chief inspiration of a genocide the like of which the world had never known, rightly to be viewed in coming times as a defining episode of the twentieth century. The Reich whose glory he had sought lay at the end wrecked, its remnants to be divided among the victorious and occupying powers. The arch-enemy, Bolshevism, stood in the Reich capital itself and presided over half of Europe. Even the German people, whose survival he had said was the very reason for his political fight, had proved ultimately dispensable to him.
Books to Read While the Algae Grow in Your Fur; The Collective Use and Evolution of Concepts; The Dismal Science; Writing for Antiquity; Philosophy
Posted by crshalizi at October 31, 2007 23:59 | permanent link
Has it really been two years since I did one of these? Where has the time gone?
The Little Professor gives an assortment of late 19th and early 20th century horror stories online, which I hereby include by reference. (Le Fanu in particular is hard to beat.)
More-contemporary seasonal fiction: John Aegard's perfectly mixed mash-up of Lovecraft and Peanuts, "The Great Old Pumpkin".
Bruce Sterling's "A Plain Tale from Our Hills" is not seasonal, but, when you read to the climax, you'll see why it's appropriate for today. (It is also anything but a plain tale; it would be fascinating to pick apart all the ways in which the reader's mind is being messed with in this story.)
Further to the Lovecraftian theme, Ectoplasmosis links to Footnotes to a Species Once Called Humanity, some being dictated a little way up the Monongahela from here. In additional local color, Ectomo shares a warning of eldritch, tentacled things in the Market Square district of Pittsburgh. Can it be mere coincidence that the district association's website has gone dead?
Shifting the scene to our west and south, "Your classic 50s drive-in-movie-monster plant" has invaded a lake in east Texas (via Light Reading).
Mostly Harmless belies its title by providing clips from Nosferatu: A Symphony of Horror, warning that "Children under the age of 21 and all other persons who are easily frightened should leave now!"
You will recall that in March, some Serbian vampire hunters attempted to properly stake the mortal remains of Slobodan Milosevic, so as to prevent him from troubling the world in un-death as he had in life. (Via Warren Ellis.) By all accounts, half a year later Milosevic is still, thankfully, dead: once again, I ask, could this be coincidence? (ObBook: Vampires, Burial and Death.)
A colleague from the former Yugoslavia once delivered a drunken monologue in which he gave all the many reasons why one couldn't trust in or associate with the Bosniaks, including in his list the self-evident wrongness of any culture which encouraged tolerance or even fondness for spiders. (The religious justification of this custom lies in this story about the hijra, but whether that's the reason for the custom is another issue.) While I don't think Chris at Mixing Memory has views on that question, he does find it important to ask: Do Infants Have an Innate Spider Detection Mechanism? And if Not, Shouldn't They?
Mind Hacks describes The Brain That Wouldn't Die as "another classic story of boy meets girl, boy loses girl in terrible car crash, boy keeps girl's head alive in neuroscience lab while looking for attractive new body." If you're not already intrigued, the fact that characters exclaim things like "The paths of experimentation twist and turn through mountains of miscalculations and often lose themselves in error and darkness!" probably wouldn't be enough to sell it to you. Fortunately enough, it's now free on line, and available for viewing tonight.
Mind Hacks has something of a thing going about brains, linking to Alex Klochkov's wonderfully creepy and gross pictures from an abandoned Russian neurological laboratory. This not only covers your "BRAINS! BRAINS! BRAINS!" needs, but even your need for actual brains in vats.
Finally, Nick Bostrom's fable of the dragon.
Manual Trackback: Muck and Mystery
Linkage; Scientifiction; Cthulhiana; Minds, Brains, and Neurons
Posted by crshalizi at October 31, 2007 15:15 | permanent link
Next week and the week after, I will attempt to convey everything in this class and my research which is relevant to machine learning in two one-hour talks on stochastic processes, over lunch.
If I am good, the hour next week will actually be enough to cover the material in the abstract, and the week after I'll talk about stochastic processes in space. If not, I'll finish up. (In other words: lectures will continue until morale improves.)
Posted by crshalizi at October 31, 2007 12:30 | permanent link
I broke down and submitted a paper to the Social Information Processing symposium; we'll see what the referees think of it. (I only submitted this to arxiv.org's "Computers and Society" category, and do not know how it got cross-listed under "Physics and Society".)
Long-time readers will probably not find much that's new in it, but it did force me to finally implement the model of neutral cultural diffusion in assortative social networks I've had in mind for about two years. The simulation (code available on request) confirmed that homophily plus diffusion together tend to make social types correlate with cultural traits, just as though there were some actual causal or expressive connection between the two. That needs to be another paper on its own, which will I dare say entail re-writing the simulation in something other than Perl.
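For readers who want to see the qualitative effect for themselves, here is a toy sketch of homophily plus neutral diffusion. It is emphatically not the Perl simulation mentioned above, whose details are not given here. It assumes two social types, a network in which each node mostly picks contacts of its own type, traits assigned at random with no regard to type, and voter-model copying in which a random node adopts the trait of a random contact. Every name and parameter (`build_network`, `homophily`, `sweeps`, and so on) is hypothetical.

```python
import random


def build_network(n_per_type=50, degree=6, homophily=0.95, rng=None):
    """Return (types, contacts); each node picks contacts, mostly of its own type."""
    rng = rng or random.Random(0)
    n = 2 * n_per_type
    types = [0] * n_per_type + [1] * n_per_type
    by_type = {0: list(range(n_per_type)), 1: list(range(n_per_type, n))}
    contacts = [[] for _ in range(n)]
    for i in range(n):
        for _ in range(degree):
            same = rng.random() < homophily
            pool = by_type[types[i] if same else 1 - types[i]]
            j = rng.choice(pool)
            while j == i:  # no self-contacts
                j = rng.choice(pool)
            contacts[i].append(j)
    return types, contacts


def trait_gap(types, traits):
    """Absolute difference between the two types in the share holding trait 1."""
    shares = []
    for t in (0, 1):
        members = [traits[i] for i in range(len(types)) if types[i] == t]
        shares.append(sum(members) / len(members))
    return abs(shares[0] - shares[1])


def run_once(sweeps=20, seed=0, **network_args):
    """Return (gap before diffusion, gap after diffusion) for one population."""
    rng = random.Random(seed)
    types, contacts = build_network(rng=rng, **network_args)
    n = len(types)
    # Neutral start: traits are assigned independently of type.
    traits = [rng.randint(0, 1) for _ in range(n)]
    before = trait_gap(types, traits)
    # Neutral diffusion: a random node copies the trait of a random contact.
    for _ in range(sweeps * n):
        i = rng.randrange(n)
        traits[i] = traits[rng.choice(contacts[i])]
    return before, trait_gap(types, traits)


def average_gaps(runs=30, **kwargs):
    """Average the before/after gaps over several independent populations."""
    results = [run_once(seed=s, **kwargs) for s in range(runs)]
    before = sum(b for b, _ in results) / runs
    after = sum(a for _, a in results) / runs
    return before, after


if __name__ == "__main__":
    print("strong homophily:", average_gaps(homophily=0.95))
    print("no homophily:    ", average_gaps(homophily=0.5))
```

Nothing in the copying rule knows about types, so any gap between the types that opens up under strong homophily is produced by network structure and drift alone. Setting `homophily=0.5` gives a comparison case in which contacts are chosen without regard to type.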
Also, driven by peer pressure and not wanting to feel too much like the eunuch in the bordello, I went and got an account on del.icio.us and Facebook, to go with the ones on LibraryThing and (thanks, Aaron!) Dopplr. I share this in part as a warning: if I'm using them, they're about to become terribly, terribly unfashionable (if they haven't already).
The Collective Use and Evolution of Concepts; Networks; Complexity; Self-Centered
Posted by crshalizi at October 27, 2007 16:00 | permanent link
The Santa Fe Institute is accepting applications for post-docs. For sheer concentrated intellectual stimulation — not to mention views like this from your office window — there is probably no better position for an independent-minded young scientist with interdisciplinary interests. The deadline is 15 November 2007.
Postdoctoral Fellowship Opportunities at the Santa Fe Institute

The Santa Fe Institute (SFI) is selectively seeking applications for Postdoctoral Fellows for appointments beginning Fall 2008.
Fellows are appointed for up to three years during which they pursue research questions of their own design and are encouraged to transcend disciplinary lines. SFI's unique structure and resources enable Fellows to collaborate with members of the SFI faculty, other Fellows, and researchers from around the world.
As the leader in multidisciplinary research, SFI has no formal programs or departments and we accept applications from any field. Research topics span the full range of natural and social sciences and often make connections with the humanities. Most research at SFI is theoretical and/or computational in nature, although some research includes an empirical component in collaboration with other institutions.
The compensation package includes a competitive salary and excellent health and retirement benefits. As full participants in the SFI community, Fellows are encouraged to invite speakers, organize workshops and working groups and engage in research outside their field. Funds are available to support this full range of research activities. Applications are welcome from candidates in any country. Successful foreign applicants must acquire an acceptable visa (usually a J-1) as a condition of employment. Women and minorities are especially encouraged to apply.
For complete information and application instructions, please follow the link to http://www.santafe.edu/postdocapp08.
The online application process opens October 15, 2007. Application deadline is November 15, 2007.
(I know I stole the phrase "adobe tower" from someone, but can't now remember who.)
Posted by crshalizi at October 24, 2007 09:50 | permanent link
The Neo-Futurists will be performing Too Much Light Makes the Baby Go Blind on the CMU campus, Wednesday the 24th and Thursday the 25th. (Porter Hall 100 is down the hall and a flight of stairs from my office.) The Neo-Futurists are like the original Futurists, only based in Chicago rather than Milan, and without the icky proto-fascist tendencies. This is your chance to see thirty quasi-randomly chosen plays performed in sixty minutes, and is so, so, worth paying $1d6 at the door.
Posted by crshalizi at October 22, 2007 10:44 | permanent link
What, if some day or night a demon were to steal after you in your loneliest loneliness and say to you: "This life as you now live it and have lived it, you will have to live once more and innumerable times more; and there will be nothing new in it, but every pain and every joy and every thought and sigh and everything unutterably small or great in your life will have to return to you, all in the same succession and sequence — even this spider and this moonlight between the trees, and even this moment and I myself. The eternal hourglass of existence is turned upside down again and again — and you with it, speck of dust!" — Would you not throw yourself down and gnash your teeth and curse the demon who spoke thus?
Posted by crshalizi at October 18, 2007 17:35 | permanent link
Attention Conservation Notice: About 11,000 words on the triviality of finding that positively correlated variables are all correlated with a linear combination of each other, and why this becomes no more profound when the variables are scores on intelligence tests. Unlikely to change the opinion of anyone who's read enough about the area to have one, but also unlikely to give enough information about the underlying statistical techniques to clarify them to novices. Includes multiple simulations, exasperation, and lots of unwarranted intellectual arrogance on my part.

Follows, but is independent of, two earlier posts on the subject of intelligence and its biological basis, and their own sequel on heritability and malleability. This doubtless more than exhausts your interest in reading about the subject; it has certainly exhausted my interest in writing about it.
Disclaimer: A decade ago, some of the senior faculty in my department, i.e., some of the people who will be voting on my contract renewal and tenure, helped put together a book called Intelligence, Genes and Success: Scientists Respond to The Bell Curve. Most, but not all, of the responses in that book were exceedingly negative. I cite some of that work below. Whether this should alter your evaluation of the case I make is for you to decide.

Thanks are due to (alphabetically) Carl Bergstrom, Matthew Berryman, John Burke, Mark Liberman, and Aaron Swartz for many helpful suggestions. But, of course, I'm the only one responsible for this, all remaining errors are my own, and it's not in any sense authorized or endorsed by anyone (in particular not by them).
Anyone who wanders into the bleak and monotonous desert of IQ and the nature-vs-nurture dispute eventually gets trapped in the especially arid question of what, if anything, g, the supposed general factor of intelligence, tells us about these matters. By calling g a "statistical myth" before, I made clear my conclusion, but none of my reasoning. This topic being what it is, I hardly expect this will change anyone's mind, but I feel a duty to explain myself.
To summarize what follows below ("shorter sloth", as it were), the case for g rests on a statistical technique, factor analysis, which works solely on correlations between tests. Factor analysis is handy for summarizing data, but can't tell us where the correlations came from; it always says that there is a general factor whenever there are only positive correlations. The appearance of g is a trivial reflection of that correlation structure. A clear example, known since 1916, shows that factor analysis can give the appearance of a general factor when there are actually many thousands of completely independent and equally strong causes at work. Heritability doesn't distinguish these alternatives either. Exploratory factor analysis being no good at discovering causal structure, it provides no support for the reality of g.
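The claim that many independent causes can masquerade as one general factor can be checked in a few lines. The sketch below is in the spirit of the 1916 example alluded to above (usually credited to Godfrey Thomson), under simplifying assumptions that are not taken from the post: each test score is the sum of a random subset of independent, unit-variance causes, and the "general factor" is approximated by the leading principal component of the correlation matrix, found by power iteration. All function names and parameter values are illustrative.

```python
import math
import random


def sampling_model_correlations(n_tests=8, n_causes=2000, p_use=0.5, seed=0):
    """Correlation matrix of tests that each sum a random subset of causes."""
    rng = random.Random(seed)
    subsets = [
        {c for c in range(n_causes) if rng.random() < p_use}
        for _ in range(n_tests)
    ]
    # Causes are independent with unit variance, so the covariance of two
    # tests is the size of their overlap, and each variance is a subset size.
    return [
        [len(a & b) / math.sqrt(len(a) * len(b)) for b in subsets]
        for a in subsets
    ]


def leading_factor(corr, iterations=500):
    """Power iteration: return (eigenvalue, unit eigenvector) of the top component."""
    k = len(corr)
    v = [1.0 / math.sqrt(k)] * k
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(k)) for i in range(k)]
        eigenvalue = math.sqrt(sum(x * x for x in w))
        v = [x / eigenvalue for x in w]
    return eigenvalue, v


if __name__ == "__main__":
    corr = sampling_model_correlations()
    eigenvalue, loadings = leading_factor(corr)
    print("loadings on the leading component:", [round(x, 3) for x in loadings])
    print("share of total variance:", round(eigenvalue / len(corr), 3))
```

Because every pair of tests shares some causes, every correlation is positive, and a matrix of all-positive correlations necessarily has a leading component with loadings of one sign. The component shows up, and accounts for a large share of the variance, even though by construction there is no single common cause anywhere in the model.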
These purely methodological points don't, themselves, give reason to doubt the reality and importance of g, but do show that a certain line of argument is invalid and some supposed evidence is irrelevant. Since that's about the only case which anyone does advance for g, however, which accords very poorly with other evidence, from neuroscience and cognitive psychology, about the structure of the mind, it is very hard for me to find any reason to believe in the importance of g, and many to reject it. These are all pretty elementary points, and the persistence of the debates, and in particular the fossilized invocation of ancient statistical methods, is really pretty damn depressing.
Unfortunately, I lack the skill to explain what's going wrong here in a completely non-technical way, other than unsupported assertions of "such-and-such doesn't work". Rather than just pontificate, I will try to explain, but presume that you know what things like variance and correlation are, and what a correlation matrix is, or at least that you used to.
Scores on intelligence tests are correlated with each other; people who do better than average on one test tend to do better than average on another. The same is true of school grades and many other measures of human performance. The idea that these correlations are due to a single "general factor of intelligence", what has come to be called g, originated with Charles Spearman at the turn of the 20th century. Spearman's idea was that a student's grade in, say, English was the sum of two factors, a general factor, common to all subjects, and a specific factor unique to English, plus random, noisy, test-to-test variability. Similarly grades in math would be the sum of the general factor, a math-specific factor, and noise. The specific factors were, Spearman thought, completely uncorrelated, so all the correlations between math and English grades would be due to the general factor. This implied that the partial correlations among test scores — the residual correlation left after controlling for the common factor — should be zero, which, in Spearman's original data, they were, or near enough. [1] Even though g was not directly observable [2], these vanishing partial correlations gave Spearman considerable confidence in his theory, and launched it upon the world.
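To make the vanishing-partial-correlation prediction concrete, here is a minimal simulation of a two-factor world. It is a sketch under assumptions that are not Spearman's data or procedure: scores are standardized, each equals a loading times g plus an independent test-specific part (with the specific factor and the noise lumped together), and, unlike Spearman, we get to observe g directly. The loadings and function names are made up for illustration.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Correlation of x and y after linearly controlling for z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


def simulate_two_factor(loadings=(0.8, 0.7, 0.6), n=20_000, seed=0):
    """Return (g, scores): unit-variance scores = loading * g + specific part."""
    rng = random.Random(seed)
    g = [rng.gauss(0, 1) for _ in range(n)]
    scores = [
        # Specific factor and noise are lumped into one independent term.
        [a * gi + math.sqrt(1 - a * a) * rng.gauss(0, 1) for gi in g]
        for a in loadings
    ]
    return g, scores


if __name__ == "__main__":
    g, scores = simulate_two_factor()
    english, math_scores = scores[0], scores[1]
    print("raw correlation:    ", round(pearson(english, math_scores), 3))
    print("partial, given g:   ", round(partial_corr(english, math_scores, g), 3))
```

Under these assumptions the raw correlation between two tests should come out close to the product of their loadings (0.8 × 0.7 = 0.56 for the first pair), while the partial correlation given g should be near zero, up to sampling noise. Real data violating that second pattern is exactly what counts against the two-factor theory.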
Spearman's political views were, by my lights, both abhorrent and stupid [3], but so what? (Fisher wasn't much better, but I'm not about to give up maximum likelihood, or even write off The Genetical Theory of Natural Selection.) The two-factor theory was a genuinely scientific theory of considerable scope and empirical content, which would have been very important if it was true. The way we can unambiguously tell that it had falsifiable empirical content is that it was, in fact, falsified. Looking at larger and more diverse data sets, it became clear that the partial correlations among scores on mental ability tests were not zero, or even close enough to attribute the difference to chance. Put to reasonably severe tests, it failed. I don't think there is anyone left who still seriously argues for Spearman's g in this sense.
Since Spearman's theory of g is about as refuted as a statistical hypothesis gets, why does g still feature in arguments about social policy and education? Isn't this as though some parties in the global warming debate had climate models involving phlogiston and caloric? The answer is that psychometricians responded to the difficulties of the one-general-factor theory by developing models with multiple unobserved factors. (The leading name here is that of Thurstone, whose classic paper "The Vectors of Mind" is definitely worth reading.) The bulk of the correlations between tests get attributed to a leading common factor, still called g. The smaller but non-negligible correlations left after accounting for this g are attributed to other, lesser factors. The reality and importance of g is held to follow from the fact that it accounts for so much of the correlations among the tests. A still later, and still subtler, strategy is that of hierarchical factor analysis: find multiple factors from the correlations among test scores, and then recursively find higher-order factors from the correlations among the lower-order factors, until finally only a single factor remains, which is declared to be g. (For an exposition of this last approach by one of its most prominent advocates, see John Carroll's contribution to Intelligence, Genes, and Success.)
These may sound like ad hoc face-saving maneuvers, but you could also argue that they're reasonable extensions of theory, keeping the working bits and fixing some broken ones. (After all, an important part of the mechanical theory of heat is explaining why heat does, in many ways, act like a subtle fluid.) The trouble is that, in turning factor analysis into a better tool for describing correlations, Thurstone et al. made it into a next-to-useless tool for explaining correlations, and most users of the tool then and since have never really grasped the problem. Since this is very important to what follows, I want to take some time to explain the differences between "exploratory" and "confirmatory" factor analyses, and why using the former to find causal structure is telling time with a stopped clock. No part of my argument is at all original; it's old news to people who actually study psychometric methods and causal inference in general. In fact, even, or especially, if you think that what I'm saying is weak-minded politically correct rubbish and I really ought to know better, I strongly urge you to read the linked papers by Borsboom (with the discussion) and by Glymour.
Exploratory factor analysis is a useful piece of descriptive statistics, and as such I expound and endorse it quite happily when I teach data mining. [4] The basic idea, as we saw with Spearman's theory, is to take a large number of correlated measurements, and to represent them as linear combinations of a smaller number of "factors" which are common to all or many of the measurements, plus idiosyncratic noise terms. The trick with exploratory factor analysis is that the factors are, themselves, linear combinations of the observed measurements, and all the methods for factor analysis are ways of searching for combinations which correlate well with the observations. There are various criteria you can use to do this, which all tend to give similar-but-not-identical results in practice. [5]
This is a perfectly reasonable and useful way to do data reduction and exploratory pattern hunting. One of the examples in my data-mining class is to take a ten-dimensional data set about the attributes of different models of cars, and boil it down to two factors which, together, describe 83 percent of the variance across automobiles. [6] The leading factor, the automotive equivalent of g, is positively correlated with everything (price, engine size, passengers, length, wheelbase, weight, width, horsepower, turning radius) except gas mileage. It basically says whether the car is bigger or smaller than average. The second factor, which I picked to be uncorrelated with the first, is most positively correlated with price and horsepower, and negatively with the number of passengers — the sports-car/mini-van axis.
Figure. Left: Relations between the first two principal components and the measured variables. Right: The individual cars plotted against the principal components.
In this case, the analysis makes up some variables which aren't too implausible-sounding, given our background knowledge. Mathematically, however, the first factor is just a weighted sum of the traits, with big positive weights on most variables and a negative weight on gas mileage. That we can make verbal sense of it is, to use a technical term, pure gravy. Really it's all just about redescribing the data.
This brings me to the other major sort of factor analysis, what's called "confirmatory" factor analysis. This is about checking a model where some latent, unobserved variables are supposed to account for the relations among the actual observations. To simplify, the logic is that if the model is right, then we should get certain patterns of correlations and no others — like checking whether the partial correlations are zero, as Spearman's original model required them to be, but adapted to other latent structures. This is a genuinely inferential and not just descriptive piece of statistics. It's also a pretty modest one, since failing one of these tests is decisive, but passing often isn't very informative, because, as we'll see, radically different arrangements of latent factors can give basically the same pattern of observed correlations. (In the jargon, the power of these tests can be very low at reasonable sample sizes.) It is very striking how infrequently one finds people who use exploratory factor analysis checking things with confirmatory factor analysis, for which I think a lot of blame must rest with teachers of statistics, myself included. If my two-factor model for the cars was right, then all of the correlation between (say) gas mileage and horsepower should be due to their respective correlations with the two factors, with no partial correlation between them once the factors are accounted for. The data falsify this hypothesis at any reasonable level of significance. I do not, however, teach my students to do this: mea culpa.
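The vanishing-partial-correlation check can be made concrete with a minimal Python sketch. It assumes, purely for illustration, that each test's correlation with the factor (its loading) is known; the loadings 0.8 and 0.6 and the "observed" correlation of 0.70 are made-up numbers, not values from the cars data.

```python
import math

def partial_correlation(r_xy, r_xf, r_yf):
    """Correlation between x and y after controlling for a third variable f."""
    numerator = r_xy - r_xf * r_yf
    denominator = math.sqrt((1.0 - r_xf ** 2) * (1.0 - r_yf ** 2))
    return numerator / denominator

# Hypothetical loadings of two tests on a single common factor.
loading_x, loading_y = 0.8, 0.6

# Under a one-factor model the tests correlate only through the factor,
# so their implied correlation is the product of the loadings.
implied_r_xy = loading_x * loading_y

# Residual correlation when the model holds exactly...
residual_if_model_holds = partial_correlation(implied_r_xy, loading_x, loading_y)
# ...and when the observed correlation is larger than the model allows.
residual_if_model_fails = partial_correlation(0.70, loading_x, loading_y)

print(residual_if_model_holds, residual_if_model_fails)
```

With real data the loadings are themselves estimated, so an actual confirmatory test has to account for sampling error; the sketch only shows which quantity is supposed to vanish.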
(And it's not just me. One of the most prominent ideas put forward on the basis of these exploratory techniques, aside from the general intelligence factor, is what's called the five factor theory of personality traits. This quite robustly fails confirmatory factor analyses: the "Big Five", despite being made up for the purpose, don't actually fit the correlations in the data, even on personality tests designed using the theory. This has done next to nothing to make personality psychologists rethink, revise, or discard the theory, and leads mild-mannered psychometricians to tear their hair in frustration.)
So: exploratory factor analysis exploits correlations to summarize data, and confirmatory factor analysis — stuff like testing that the right partial correlations vanish — is a prudent way of checking whether a model with latent variables could possibly be right. What the modern g-mongers do, however, is try to use exploratory factor analysis to uncover hidden causal structures. I am very, very interested in the latter pursuit, and if factor analysis was a solution I would embrace it gladly. But if factor analysis was a solution, when my students asked me (as they inevitably do) "so, how do we know how many factors we need?", I would be able to do more than point them to rules of thumb based on squinting at "scree plots" like this and guessing where the slope begins. (There are ways of estimating the intrinsic dimension of noisily-sampled manifolds, but that's not at all the same.) More broadly, factor analysis is part of a larger circle of ideas which all more or less boil down to some combination of least squares, linear regression and singular value decomposition, which are used in the overwhelming majority of work in quantitative social science, including, very much, work which tries to draw causal inferences without the benefit of experiments. A natural question — but one almost never asked by users of these tools — is whether they are reliable instruments of causal inference. The answer, unequivocally, is "no".
I will push extra hard, once again, Clark Glymour's paper on The Bell Curve, which patiently explains why these tools are just not up to the job of causal inference. (Maybe more than two people will follow that link this time.) They do not, of course, become reliable when used by the righteous, and Glymour was issuing such warnings long before Herrnstein and Murray's book appeared to trouble our counsels. The conclusions people reach with such methods may be right and may be wrong, but you basically can't tell which from their reports, because their methods are unreliable.
This is why I said that using factor analysis to find causal structure is like telling time with a stopped clock. It is, occasionally, right. Maybe the clock stopped at 12, and looking at its face inspires you to look at the sun and see that it's near its zenith, and look at shadows and see that they're short, and confirm that it's near noon. Maybe you'd not have thought to do those things otherwise; but the clock gives no evidence that it's near noon, and becomes no more reliable when it's too cloudy for you to look at the sun.
Now, I could go over the statistical issues involved in reliable causal inference, and why factor analysis doesn't measure up. But if I've learned anything teaching it's that examples are vastly more effective than proofs. (If you really want to know, start with Pearl and Spirtes, Glymour and Scheines.) So I'm going to show you some cases where you can see that the data don't have a single dominant cause, because I made them up randomly, but they nonetheless give that appearance when viewed through the lens of factor analysis. I learned this argument from a colleague, but so that they can lead a quiet life I'll leave them out of this; versions of the argument date back to Godfrey Thomson in the 1910s [7].
If I take any group of variables which are positively correlated, there will, as a matter of algebraic necessity, be a single dominant general factor, which describes more of the variance than any other, and all of them will be "positively loaded" on this factor, i.e., positively correlated with it. Similarly, if you do hierarchical factor analysis, you will always be able to find a single higher-order factor which loads positively onto the lower-order factors and, through them, the actual observables. [8] What psychologists sometimes call the "positive manifold" condition is enough, in and of itself, to guarantee that there will appear to be a general factor. Since intelligence tests are made to correlate with each other, it follows trivially that there must appear to be a general factor of intelligence. This is true whether or not there really is a single variable which explains test scores.
It is not an automatic consequence of the algebra that the apparent general factor describes a lot of the variance in the scores. Nonetheless, while less trivial, it is still trivial. Recall that factor analysis works only with the correlations among the measured variables. If I take an arbitrary set of positive correlations, provided there are not too many variables and the individual correlations are not too weak, then the apparent general factor will, typically, seem to describe a large chunk of the variance in the individual scores.
To support that statement, I want to show you some evidence from what happens with random, artificial patterns of correlation, where we know where the data came from (my computer), and can repeat the experiment many times to see what is, indeed, typical. So that you don't have to just take my word for this, I describe my procedure, and link to my simulation code, in a footnote [9].
Here is the first correlation matrix R produced for me after I debugged my code, for five variables:
| 1.000 | 0.399 | 0.683 | 0.774 | 0.241 |
| 0.399 | 1.000 | 0.403 | 0.251 | 0.002 |
| 0.683 | 0.403 | 1.000 | 0.823 | 0.336 |
| 0.774 | 0.251 | 0.823 | 1.000 | 0.665 |
| 0.241 | 0.002 | 0.336 | 0.665 | 1.000 |
The way to read this is that the number at the intersection of row number I and column number J is the correlation between variable number I and variable number J. All of the entries on the diagonal are 1, because everything is perfectly correlated with itself. Some of these variables are strongly correlated (e.g., the third and fourth, 0.823), while others are not (e.g., the second and fifth, 0.002). All of them, however, are positively correlated. If these variables represented actual observations, this pattern of correlations would rule out the possibility of some causal structures underlying the measurements, but would still be compatible with a huge range of different mechanisms. But remember, this is a completely random example, with no real causal factors behind it whatsoever.
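To see the "algebraic necessity" at work on this particular matrix, here is a small Python sketch that extracts the leading principal component by power iteration. This is an assumption-laden stand-in: it is a principal-components calculation on the quoted correlations, not the maximum-likelihood factor analysis used later, so its numbers will not match the factanal output exactly.

```python
import math

# Correlation matrix quoted above (five variables).
R = [
    [1.000, 0.399, 0.683, 0.774, 0.241],
    [0.399, 1.000, 0.403, 0.251, 0.002],
    [0.683, 0.403, 1.000, 0.823, 0.336],
    [0.774, 0.251, 0.823, 1.000, 0.665],
    [0.241, 0.002, 0.336, 0.665, 1.000],
]

def leading_component(matrix, iterations=500):
    """Power iteration: returns (largest eigenvalue, unit eigenvector)."""
    n = len(matrix)
    vector = [1.0 / math.sqrt(n)] * n
    eigenvalue = 0.0
    for _ in range(iterations):
        product = [sum(row[j] * vector[j] for j in range(n)) for row in matrix]
        eigenvalue = math.sqrt(sum(x * x for x in product))
        vector = [x / eigenvalue for x in product]
    return eigenvalue, vector

eigenvalue, weights = leading_component(R)

# Correlation of each variable with the leading component.
loadings = [math.sqrt(eigenvalue) * w for w in weights]

# Each standardized variable has variance 1, so the total variance is len(R).
share_of_variance = eigenvalue / len(R)

print([round(x, 3) for x in loadings], round(share_of_variance, 3))
```

Because every entry of the matrix is positive, every weight on the leading component comes out positive as well, whatever process produced the correlations.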
At this stage, I could have done a factor analysis of the correlation matrix, but to make things look more realistic, I instead generated "test scores" for 1000 "subjects" with these correlations, with each test having a mean of 100 and a standard deviation of 15 (just like an IQ test). I then used a completely standard piece of software (R's factanal function; a maximum-likelihood routine) to find the single factor which best accounted for the correlations in the measurements.
| variable | 1 | 2 | 3 | 4 | 5 |
| loading | 0.782 | 0.279 | 0.814 | 0.998 | 0.668 |
This one factor would describe more than half (0.559) of the variance in the results, which is really quite respectable by many social-science standards. For example, a typical value for the fraction of variance described by g on actual intelligence tests seems to be somewhere in the range of a quarter to two-thirds, and generally in the lower part of that range, say around a third.
From looking at the table of loadings, it appears that variable #2, whatever it is, is not well-described by the factor. If I think of the factor as something real — intelligence or athleticism or neuroticism or car-bigness, it doesn't matter — I might then drop variable #2 from my battery of tests. If I do so and re-calculate the factor loadings, they hardly change,
| variable | 1 | 3 | 4 | 5 |
| loading | 0.782 | 0.814 | 0.998 | 0.669 |
and now the single factor accounts for more than two thirds (0.679) of the variance. I will come back to this point later.
These results are no fluke. [10] Repeating this a thousand times, each with a different randomly-generated correlation matrix, the mean proportion of variance described by a single factor is 0.471, with a standard deviation of 0.079. (So my first random sample was a little on the high side, but not remarkably so.) If I repeat the experiment with six imaginary tests rather than five, then the mean proportion of variance described is 0.432, with a standard deviation of 0.065. If I stick to five dimensions, and let the correlations go over the whole range from -1 to 1, then I get a somewhat smaller mean proportion-of-variance-described, namely 0.400, with nearly the same standard deviation (0.070). If, on the other hand, I confine myself to correlations in the range from 0 to 1/2, then I get a much smaller mean (0.289±0.041), but still one many people would be able to publish proudly. If I force the correlation coefficients to all be negative, in the range -1 to 0, again it's smaller but not negligible (0.294±0.029). And so on, and so on.
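The repeated experiment can be sketched in Python along the following lines. This is not the simulation code linked in the footnote: the way the random matrices are drawn (uniform off-diagonal entries, redrawn until the matrix is positive definite) is an assumption made for illustration, and the leading principal component stands in for a maximum-likelihood factor, so the averages it prints should not be expected to reproduce the figures quoted above.

```python
import math
import random

def is_positive_definite(m):
    """Attempt a Cholesky factorization; failure means not positive definite."""
    n = len(m)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = m[i][j] - sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                if s <= 1e-12:
                    return False
                lower[i][i] = math.sqrt(s)
            else:
                lower[i][j] = s / lower[j][j]
    return True

def random_correlation_matrix(n, low, high, rng):
    """Draw off-diagonal entries uniformly; retry until positive definite."""
    while True:
        m = [[1.0] * n for _ in range(n)]
        for i in range(n):
            for j in range(i + 1, n):
                m[i][j] = m[j][i] = rng.uniform(low, high)
        if is_positive_definite(m):
            return m

def leading_share(m, iterations=200):
    """Share of total variance carried by the leading principal component."""
    n = len(m)
    # A uniform starting vector suits matrices of positive correlations.
    v = [1.0 / math.sqrt(n)] * n
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(row[j] * v[j] for j in range(n)) for row in m]
        eigenvalue = math.sqrt(sum(x * x for x in w))
        v = [x / eigenvalue for x in w]
    return eigenvalue / n

def simulate(repetitions=200, n=5, low=0.0, high=1.0, seed=1):
    rng = random.Random(seed)
    return [leading_share(random_correlation_matrix(n, low, high, rng))
            for _ in range(repetitions)]

shares = simulate()
mean_share = sum(shares) / len(shares)
spread = math.sqrt(sum((s - mean_share) ** 2 for s in shares) / (len(shares) - 1))
print(round(mean_share, 3), round(spread, 3))
```

Changing `n`, `low` and `high` gives rough analogues of the variations described above (more tests, weaker correlations, and so on).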
Why does this matter? Well, if you take people and give them pretty much any battery of tests of mental abilities, skills and knowledge you care to name, you will find positive correlations among the scores — especially if you exclude people who have received specialized training in skills relevant to one test or another, or the tests on which people have been trained. [11] In this situation, all that seeing a lot of variance described by the leading factor tells you is that, in fact, there are lots of positive correlations. This is what Thomson was pointing out, all those years ago, when he said that the apparent descriptive strength of the leading factor for test results was more a mathematical theorem than a psychological fact.
But — and I can hear people preparing this answer already — doesn't the fact that there are these correlations in test scores mean that there must be a single common factor somewhere? To which question a definite and unambiguous answer can be given: No. You can get strong positive correlations — even ones with vanishing partial correlations, so it looks like there's one factor — even when all the real causes are about equal in importance and completely independent of one another. This was, again, first demonstrated by Thomson — in 1914. I'll go over a slight variant of his original model, in the hope that it will lessen the odds that we have to spend the next 93 years debating what ought to be a closed issue.
The model goes like this: there are lots of different mental abilities, a huge number of them. (Thomson sometimes called them "factors", but I'll reserve that for the things found by factor analysis.) Any one given intelligence test calls on many of these abilities, some of which are shared with other tests, some of which are specific to that test (at least among those being analyzed). For each test, draw a number between 1 and 500; that is the number of shared abilities used in that test. Draw another number between 1 and 500; that is the number of test-specific abilities it uses. In my simulation, I used 11 tests, because one of the more widely used IQ measures, the revised Wechsler Adult Intelligence Scale, was a battery of 11 tests. (The latest incarnation, the WAIS-III, has 14 tests, but the results would be the same.) So here's what I got:
| variable | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
| shared abilities | 478 | 326 | 313 | 154 | 229 | 405 | 266 | 278 | 479 | 462 | 378 |
| specific abilities | 28 | 62 | 45 | 78 | 473 | 26 | 195 | 473 | 403 | 150 | 333 |
| total | 506 | 388 | 358 | 232 | 702 | 431 | 461 | 751 | 882 | 612 | 711 |
To determine which shared abilities go with which variable, I draw a sample of the specified size from my pool of 500 abilities. Now variable 1 (for example) is determined by 506 abilities, 478 of which it might have in common with other tests, and I know which abilities those 478 are. The total number of abilities invoked in this model is 2766. To make the result which is coming as stark as possible, Thomson assumed, as I will, that there is no dependence whatsoever among these abilities; they are totally and completely uncorrelated. For convenience, I'll assume that these abilities are not only independent but also identically distributed (IID); to keep things looking familiar, I made them normally distributed with a mean of 100 and a standard deviation of 15. Some abilities are involved in more than one test, but since there are 3 which are shared by all the tests, and 34 which are shared by at least ten of them, it's hard to say that there is a common ability. (Also, every shared ability is shared by at least three tests.) Since every test involves at least 232 distinct abilities, these widely-shared abilities are not overwhelming determinants of the test scores, either.
To generate test scores, I made up a random sample of 1000 independent individuals, and assigned them values of these 2766 abilities. I then summed the abilities, as prescribed, to get scores on the 11 tests. To add an air of verisimilitude, I topped each test score off with a little extra noise (mean zero, s.d. 15), so that if I re-tested my "subjects", I wouldn't get exactly the same results, but the tests would be highly reliable by the usual measures. Once again, let me emphasize that every ability contributing to the test scores is completely independent of every other, and none of them is preponderant on any of the tests, much less all of them.
When I do a factor analysis as before, I find that a single made-up factor, call it g, describes nearly half (0.478) of the standardized variance. The g loadings are as follows:
| variable | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
| loadings | 0.955 | 0.756 | 0.758 | 0.465 | 0.416 | 0.859 | 0.539 | 0.431 | 0.708 | 0.829 | 0.634 |
Even the smallest of these would be pretty respectable, and some of them are great; you can compare them to the factor loadings for some data from the children's version of the Wechsler scale obtained here by an advocate of g. The correlation between my variables' factor loadings and the number of shared abilities they draw on is +0.816. (As they used to say: This is no coincidence, comrades!) More, if I do a standard test for whether this pattern of correlations is adequately explained by a single factor, the data pass with flying colors (chi-squared is 38.41 on 44 degrees of freedom, p-value of 0.709). All of which is, by construction, a complete artifact.
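The whole experiment, from sampling abilities to extracting a leading factor, can be sketched in Python as follows. It is a hedged illustration, not the linked simulation code: the ability counts are the ones in the table above, but the random assignment of abilities to tests is redrawn, the function names are made up for the sketch, and the leading principal component again stands in for a maximum-likelihood factor, so the printed numbers will differ somewhat from those reported.

```python
import math
import random

# Counts of shared and test-specific abilities from the table above.
SHARED_COUNTS = [478, 326, 313, 154, 229, 405, 266, 278, 479, 462, 378]
SPECIFIC_COUNTS = [28, 62, 45, 78, 473, 26, 195, 473, 403, 150, 333]

def simulate_scores(shared_counts, specific_counts, pool_size=500,
                    n_subjects=1000, seed=0):
    """Test scores as sums of independent abilities, one list per test."""
    rng = random.Random(seed)
    # Each test uses a random subset of the common pool of abilities.
    shared_sets = [rng.sample(range(pool_size), k) for k in shared_counts]
    scores = [[] for _ in shared_counts]
    for _ in range(n_subjects):
        pool = [rng.gauss(100, 15) for _ in range(pool_size)]
        for t, abilities in enumerate(shared_sets):
            k = specific_counts[t]
            total = sum(pool[a] for a in abilities)
            # The sum of k independent N(100, 15^2) test-specific abilities
            # is N(100k, 15^2 k), so it is drawn in one step.
            total += rng.gauss(100 * k, 15 * math.sqrt(k))
            total += rng.gauss(0, 15)  # measurement noise
            scores[t].append(total)
    return scores

def correlation(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def leading_component(matrix, iterations=500):
    """Power iteration: returns (largest eigenvalue, unit eigenvector)."""
    n = len(matrix)
    vector = [1.0 / math.sqrt(n)] * n
    eigenvalue = 0.0
    for _ in range(iterations):
        product = [sum(row[j] * vector[j] for j in range(n)) for row in matrix]
        eigenvalue = math.sqrt(sum(x * x for x in product))
        vector = [x / eigenvalue for x in product]
    return eigenvalue, vector

scores = simulate_scores(SHARED_COUNTS, SPECIFIC_COUNTS)
R = [[correlation(a, b) for b in scores] for a in scores]
eigenvalue, weights = leading_component(R)
loadings = [math.sqrt(eigenvalue) * w for w in weights]
share = eigenvalue / len(R)

print(round(share, 3), [round(x, 2) for x in loadings])
print(round(correlation(loadings, SHARED_COUNTS), 3))
```

Every ability in the sketch is drawn independently, so whatever "general factor" the last lines report is produced entirely by the overlap in which abilities each test samples.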
Once again, this isn't a fluke. Repeating the simulation from scratch (i.e., coming up with completely new mappings between abilities and tests each time) a thousand times, I get a mean descriptive strength for the g factor of 0.290±0.066; so, again, my initial trial was on the high side, but, honest, it was the first trial after I got the bugs out. You can use my code for this simulation to play around with what happens as you vary the number of tests and the number of abilities.
Now, I don't mean to suggest this model of thousands of IID abilities adding up as a serious depiction of how thought works, or even of how intelligence test scores work. My point, like Thomson's, is to show you that the signs which the g-mongers point to as evidence for its reality, for there having to be a single predominant common cause, actually indicate nothing of the kind. Thomson's model does this in a particularly extreme way, where those signs are generated entirely through the imprecision of our measurements. There are other models, for instance the "dynamical mutualism" model of van der Maas et al. ("A Dynamical Model of General Intelligence: The Positive Manifold of Intelligence by Mutualism", Psychological Review 113 (2006): 842-861), which produce those signs from interacting processes, with nothing resembling a general factor in their causal structure. (This should surprise no one who's even casually familiar with distributed systems or self-organization.) Those supposed signs of a real general factor are thus completely uninformative as to the causes of performance on intelligence tests.
Someone will object that g is highly heritable, and say that this couldn't be true if it were just an artifact. But this also has no force: Thomson's model can easily be extended to give the appearance of heritability, too.
Having spent far too long, in a previous post, covering what heritability is, why estimating the heritability of IQ is difficult to meaningless, and why it tells us nothing about how malleable IQ is, I won't re-traverse that ground here. Determining the heritability of an unobserved variable like g raises a whole extra set of problems — there is a reason you see so many more estimates of the heritability of IQ than of g — though if you want to define "general intelligence" as a certain weighted sum of test scores, that is at least operationally measurable. Suppose that, mirabile dictu, all the problems are solved and we learn the heritability of g, and it's about the same as the best estimate of the narrow-sense heritability of IQ, which is 0.34. Does it make sense to go from "g is heritable" to "g is real and important"?
I have to say that I find it an extraordinarily silly inference, and I'm astonished that anyone who understands how to calculate a heritability has ever thought otherwise. Height, in developed countries, has a heritability around 0.8. Blood triglyceride levels have a heritability of about 0.5. Thus the sum of height and triglycerides is heritable. How heritable will depend on the correlations between the additive components of height and those of triglycerides; assuming, for simplicity, that there aren't any, the heritability of their sum will be anywhere between 0.5 and 0.8, depending on the units we measure each variable in. The fact that this trait is heritable doesn't make it any less meaningless.
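The arithmetic behind that range is a variance-weighted average, which a few lines of Python make explicit. The sketch assumes the two traits, and all of their genetic and environmental components, are independent; the variances are made-up numbers standing in for different choices of units.

```python
def heritability_of_sum(h2_x, var_x, h2_y, var_y):
    """Narrow-sense heritability of x + y, assuming x and y are independent."""
    additive_variance = h2_x * var_x + h2_y * var_y
    return additive_variance / (var_x + var_y)

# Heritabilities quoted above for height and blood triglycerides.
HEIGHT_H2, TRIGLYCERIDE_H2 = 0.8, 0.5

# Hypothetical variances: rescaling either trait changes its weight in the sum.
for var_height, var_trig in [(1.0, 1.0), (100.0, 1.0), (1.0, 100.0)]:
    h2 = heritability_of_sum(HEIGHT_H2, var_height, TRIGLYCERIDE_H2, var_trig)
    print(var_height, var_trig, round(h2, 3))
```

Whichever trait has the larger variance in the chosen units dominates the sum, so the answer slides between the two heritabilities without the sum ever becoming a meaningful trait.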
It'd still be embarrassing for the Thomson model if it couldn't produce its appearance, since after all no one is saying that the measured (or even real) heritability of IQ is always and exactly zero. But that's very easy, and the logic is the same as for combining height and triglycerides. Assume, as in classical biometric models, that the strength of each ability for each person is then the sum of three components, one purely genetic and additive across genes, one purely genetic and associated with gene interactions, and one purely environmental, and that these are perfectly independent of each other. Say that the strict-sense heritability of each ability, the ratio of the additive genetic variance to the total variance in the ability, is 0.5. The test scores, being linear combinations of abilities plus noise, will also be heritable. The g found by factor analysis, being a linear combination of the test scores, is itself a linear combination of the abilities and noise, and so, in turn, heritable. [12]
How heritable g would look would depend on whether the environmental contributions to the different abilities were correlated. If they are uncorrelated, then the heritability of the test scores will be slightly less than 0.5 (less, because of the extra measurement noise). If the environmental contributions to different abilities are positively correlated, the total environmental variance in the test scores will be larger, so their heritability will be lower. Since, to repeat, the meta-analysis of Devlin, Daniels and Roeder puts the heritability of IQ at around 0.34, that's fine.
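For the uncorrelated case, the "slightly less than 0.5" figure can be checked with a short Python sketch. It assumes what the simulation above used: each test score is an unweighted sum of independent abilities with standard deviation 15 and heritability 0.5, plus non-heritable measurement noise with standard deviation 15. The function name is made up for the illustration.

```python
def score_heritability(n_abilities, ability_h2=0.5,
                       ability_sd=15.0, noise_sd=15.0):
    """Heritability of a sum of independent abilities plus measurement noise."""
    ability_variance = n_abilities * ability_sd ** 2
    additive_variance = ability_h2 * ability_variance
    return additive_variance / (ability_variance + noise_sd ** 2)

# Smallest and largest totals of abilities per test in the table above.
for n_abilities in (232, 882):
    print(n_abilities, round(score_heritability(n_abilities), 4))
```

The noise term is one ability's worth of variance among hundreds, which is why it barely moves the result below 0.5; positively correlated environmental contributions would need a covariance term in the denominator and would pull the figure down further.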
In a sentence: Thomson's ability-sampling model not only creates the illusion of a general factor of intelligence where none exists, it can also make this illusory factor look heritable.
It might be the case that, while exploratory factor analysis isn't a generally reliable tool for causal inference, for some reason it happens to work in psychological testing. To believe this, I would want to see many cases where it had at least contributed to important discoveries about mental structure which had some other grounds of support. These are scarce. The five-factor theory of personality, as I mentioned above, is probably the best candidate, and it fails confirmatory factor analysis tests. As Clark Glymour points out, lesion studies in neuropsychology have uncovered a huge array of correlations among cognitive abilities, many of them very specific, none of which factor analyses predicted, or even hinted at. Similarly, congenital defects of cognition, like Williams's Syndrome, drive home the point that thought is a biological process with a genetic basis (if that needs driving). But Williams's Syndrome is simply not the kind of thing anyone would have expected from factor analysis, and for that matter a place where the IQ score, while not worthless, is not much help in understanding what's going on.
Stepping back a bit, the lack of success of factor analysis in psychology is actually surprising, because of the circularity in how psychological tests have come to be designed. The psychologists start with some traits or phenomena, which seem somehow similar to them, to exhibit a common quality, be it "intelligence" or "neuroticism" or "authoritarianism" or what-have-you. The psychologists make up some tests where a high score seems, to intuition, to go with a high degree of the quality. They will even draw up several such tests, and show that they are all correlated, and extract a common factor from those correlations. So far, so good; or at least, so far, so non-circular. This test or battery of tests might be good for something. But now new tests are validated by showing that they are highly correlated with the common factor, and the validity of g is confirmed by pointing to how well intelligence tests correlate with one another and how much of the inter-test correlations g accounts for. (That is, to the extent construct validity is worried about at all, which, as Borsboom explains, is not as much as it should be. There are better ideas about validity, but they drive us back to problems of causal inference.) By this point, I'd guess it's impossible for something to become accepted as an "intelligence test" if it doesn't correlate well with the Wechsler and its kin, no matter how much intelligence, in the ordinary sense, it requires, but, as we saw with the first simulated factor analysis example, that makes it inevitable that the leading factor fits well. [13] This is circular and self-confirming, and the real surprise is that it doesn't work better.
I don't want to be mis-understood as being on some positivist-behaviorist crusade against inferences to latent mental variables or structures. As I said, my deepest research interest is, exactly, how to reconstruct hidden causal structures from data. Furthermore, I think it's pretty plain that psychologists have found compelling evidence for many kinds of latent mental structure. For instance, I defy anyone to explain the experimental results on mental rotation without positing mental representations which act in very specific ways. But exploratory factor analysis is not a solution to this problem.
The end result of the self-confirming circle of test construction is a peculiar beast. To the extent g correlates with anything from actual cognitive psychology, it's working memory capacity (see this, and especially the conclusion). If we want to understand the mechanisms of intelligent thought, how they are implemented biologically, and how they grow and flourish or fail to do so, I cannot see how this helps at all.
Of course, if g was the only way of accounting for the phenomena observed in psychological tests, then, despite all these problems, it would have some claim on us. But of course it isn't. My playing around with Thomson's ability-sampling model has taken, all told, about a day, and gotten me at least into back-of-the-envelope, Fermi-problem range. In fact, the biggest problem with Thomson's model is that the appearance of g is too strong, since it easily passes tests for there being only a single factor, when real intelligence tests, such as the Wechsler, all fail them. If it wasn't a distraction from my real work, I'd look into whether weakening the assumption that tests are completely independent, uniform samples from the pool of shared abilities couldn't produce something more realistic. (In particular, I'd try self-reinforcing urn schemes.) If we must argue about the mind in terms of early-twentieth-century psychometric models, I'd suggest that Thomson's is a lot closer than the factor-analytical ones to what's suggested by the evidence from cognitive psychology, neuropsychology, functional brain imaging, general evolutionary considerations and, yes, evolutionary psychology (which I think well of, when it's done right): that there are lots of mental modules, which are highly specialized in their information-processing, and that almost any meaningful task calls on many of them, their pattern of interaction shifting from task to task. [14] There is, of course, no need to limit ourselves to early-twentieth-century psychometrics.
All of this, of course, is completely compatible with IQ having some ability, when plugged into a linear regression, to predict things like college grades or salaries or the odds of being arrested by age 30. (This predictive ability is vastly less than many people would lead you to believe [cf.], but I'm happy to give them that point for the sake of argument.) This would still be true if I introduced a broader mens sana in corpore sano score, which combined IQ tests, physical fitness tests, and (to really return to the classical roots of Western civilization) rated hot-or-not sexiness. Indeed, since all these things predict success in life (of one form or another), and are all more or less positively correlated, I would guess that MSICS scores would do an even better job than IQ scores. I could even attribute them all to a single factor, a (for arete), and start treating it as a real causal variable. By that point, however, I'd be doing something so obviously dumb that I'd be accused of unfair parody and arguing against caricatures and straw-men.
If, after looking at your watch, you say that it's 12 o'clock, and I point out that your watch has stopped at 12, I am not saying that it's not 12 o'clock, just that your watch doesn't actually give you any evidence about the time. Similarly, pointing out that factor analysis and related techniques are unreliable guides to causal structure does not establish the non-existence of a one-dimensional latent variable driving the success of almost all human mental performance. It's possible that there is such a thing. But the major supposed evidence for it is irrelevant, and it accords very badly with what we actually know about the functioning of the brain and the mind.
I am not sure what the oddest aspect of this situation is, because there are so many. It may be a statistician's bias, but the things I keep dwelling on are the failures of methodology, which are not, alas, confined to all-correlations-all-the-time psychologists, but also seen in the right (that is, wrong) sort of labor-market sociologist, economists who regress countries' growth rates on government policies, etc., etc. As the late sociologist Aage Sorensen said (e.g. here), the sort of social science which tries to identify causal effects by calculating regression coefficients or factor loadings stops where the scientist's work ought, properly, to begin. (A more charitable view would be that these researchers are piling up descriptions, and hoping that someone will come along, any decade now, with explanations.) Many psychometric and econometric theorists know much better, but they seem to have little influence on practice. To paraphrase Hume:
When we run over libraries, persuaded of these principles, what havoc must we make? If we take in our hand any paper; of macroeconomics or correlational psychology, for instance; let us ask, Does it draw its causal inferences from observations with consistent methods? No. Does it draw its causal inferences from experiments, controlled or randomized? No. Commit it then to the recycling bin: for it can contain nothing but sophistry and illusion.
If I want quick summaries of my data, then means, variances and correlations are reasonable things to use, especially if all the distributions are close to Gaussian. If I want to do serious analyses, I need to start comparing distributions, and it's not as if there aren't methods to do this. If I want to do data mining, then sticking to easily-manipulated linear models makes lots of sense; if I want to find causal relationships, at the very least I should test for nonlinearities (which hardly anyone ever seems to do in the IQ field), or, better yet, turn to non-parametric estimates. If there are lots of positive correlations and I want to summarize them, then finding some factors and checking them by decomposing the variance is one reasonable trick. If I want to argue that there must be a preponderant common cause, it's no good to keep pointing out how much of the variance that first factor describes, when plenty of other, incompatible causal structures will give me that too. (There is a name for this mode of reasoning.) An intelligent response to this criticism would be to look for other aspects of the data (including things other than correlation coefficients), or maybe even new experiments, which could tell apart different causal structures. The fact that, 103 years after Spearman, everyone is still just manipulating the correlation matrix shows the lack of such intelligence.
I have deliberately tried to avoid, here, the issues which make the argument about g and IQ so much more heated than ones about, say, labor-market sociology. But those issues do exist, and are heated, and so you might think that they might drive people to use better methods which could help settle the questions. This doesn't seem to happen. Some examples:
The psychologist Robert Abelson has a very nice book on Statistics as Principled Argument where he writes that "Criticism is the mother of methodology". I was going to say that such episodes cast that in doubt, but it occurred to me that Abelson never says what kind of mother. To combine Abelson's metaphor with Harlow's famous experiments on love in monkeys, observational social science has been offered a choice between two methodological mothers, one of them warm and cuddly and familiar and utterly un-nourishing (the old world of linear regression, analysis of variance, factor analysis, etc.), the other cold, metallic, hurtful and actually able to help materially (statistical methods which are at least not definitely unable to do what people want). Not surprisingly, social scientists, being primates, overwhelmingly go for the warm fuzzies. This, to me, indicates a deep failure on the part of the statistical profession to which I am otherwise proud to belong. It is never a good sign when your discipline's knowledge is the wire-mesh mother all the baby monkeys avoid if at all possible. Less metaphorically, the perpetuation of these fallacies decade after decade shows there is something deeply amiss with the statistical education of social scientists.
Building factors from correlations is fine as data reduction, but deeply unsuited to finding causal structures. The mythical aspect of g isn't that it can be defined, or, having been defined, that it describes a lot of the correlations on intelligence tests; the myth is that this tells us anything more than that those tests are positively correlated. It has been known for almost as long as factor analysis has been around that positive correlations can arise in many ways which involve nothing remotely like a general factor of intelligence. Thomson's ability-sampling model, with its myriad independent causes rather than a single general cause, is the oldest and most extreme counter-example, but it is far from the only one. It is still conceivable that those positive correlations are all caused by a general factor of intelligence, but we ought to be long since past the point where supporters of that view were advancing arguments on the basis of evidence other than those correlations. So far as I can tell, however, nobody has presented a case for g apart from thoroughly invalid arguments from factor analysis; that is, the myth.
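The claim that positive correlations need no general factor can be checked with a toy simulation. What follows is a minimal sketch of the sampling idea under simplifying assumptions that are not Thomson's own: abilities are independent Gaussians rather than binary, and each test simply sums a random half of them. All function names and parameter values are illustrative.

```python
import random


def thomson_scores(n_people=1000, n_abilities=200, n_tests=6, frac=0.5, seed=1):
    """Simulate test scores where each test sums a random subset of
    many mutually independent abilities (no general factor is built in)."""
    rng = random.Random(seed)
    subset_size = int(frac * n_abilities)
    # Each test draws on its own randomly chosen subset of abilities.
    subsets = [rng.sample(range(n_abilities), subset_size) for _ in range(n_tests)]
    scores = []
    for _ in range(n_people):
        abilities = [rng.gauss(0.0, 1.0) for _ in range(n_abilities)]
        scores.append([sum(abilities[i] for i in subset) for subset in subsets])
    return scores


def correlation(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def correlation_matrix(scores):
    """Correlation matrix of the columns (tests) of a list of score rows."""
    columns = list(zip(*scores))
    k = len(columns)
    return [[correlation(columns[i], columns[j]) for j in range(k)] for i in range(k)]


if __name__ == "__main__":
    for row in correlation_matrix(thomson_scores()):
        print(" ".join(f"{r:5.2f}" for r in row))
```

In this setup two tests correlate only because their subsets happen to overlap, so every off-diagonal entry comes out positive even though each ability is independent of every other.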
In primitive societies, or so Malinowski taught, myths serve as the legitimating charters of practices and institutions. Just so here: the myth of g legitimates a vast enterprise of intelligence testing and theorizing. There should be no dispute that, when we lack specialized and valid instruments, general IQ tests can be better than nothing. Claims that they are anything more than such stop-gaps — that they are triumphs of psychological science, illuminating the workings of the mind; keys to the fates of individuals and peoples; sources of harsh truths which only a courageous few have the strength to bear; etc., etc., — such claims are at present entirely unjustified, though not, perhaps, unmotivated. They are supported only by the myth, and acceptance of the myth itself rests on what I can only call an astonishing methodological backwardness.
The bottom line is: The sooner we stop paying attention to g, the sooner we can devote our energies to understanding the mind.
[1]: To be pedantic, Spearman didn't see that partial correlation coefficients were nearly zero, because I think partial correlations weren't invented yet. Instead he saw that certain "tetrad" equations involving the total correlations held, nearly; satisfying those equations is equivalent to the partial correlations vanishing.
[2]: It's worth noting a subtle point here: even if the two-factor theory is true, g cannot, under any circumstances, be calculated directly from observed test scores, so the idea that it gives us an operational and objective definition of general intelligence is, in a word, wrong.
Remember that each test score is supposed to be the sum of g, a test-specific factor, and a little bit of noise. If we repeated the tests many times on the same person and averaged the results, the noise terms would start to cancel each other out. Suppose we could even repeat the tests infinitely often, so that the noise went away completely. If we had k different tests, we would still be left with k equations, one for each test, and k+1 unknowns, those being the k specific factors and the one general factor. Such an underdetermined system of equations has no unique solution. (Unless, that is, one of the tests measures the general factor alone, or two of the tests have exactly the same "specific" factor, in which case there are only k unknowns.) This "underdetermination" appears to have been first pointed out by E. B. Wilson in Science in 1928, and has, unsurprisingly, never been resolved. Statistically speaking, the model is over-parameterized and so non-identifiable, because different combinations of the factors give exactly the same distribution of results. Scoring above average on all the tests, e.g., might be due to (i) an above-average general factor and average specific factors, (ii) an average general factor and above-average specific factors, (iii) really-above-average specific factors and a way-below-average general factor, etc. (More combinations arise if someone scores above average on some but not all tests.) Assuming a certain prior distribution for the general and specific factors themselves, Bayes's rule gives the posterior distribution for the factors, which will put more weight on some of these possibilities than others, but that ranking comes solely from the prior. If we decide a priori that case (i) is more probable than case (ii), then our posterior estimates will reflect this. It's hard to see what non-circular grounds we'd have for such a decision. (Empirical Bayes doesn't seem to help.)
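The counting argument can be made concrete with a few lines of code. This is only an illustrative sketch: it takes the footnote's description literally (score = general factor + specific factor, noise already averaged away), measures everything in arbitrary units above the average, and uses hypothetical names throughout.

```python
def noise_free_scores(general, specifics):
    """Score on each test = general factor + that test's specific factor."""
    return [general + s for s in specifics]


# Three different factor assignments for one person taking k = 3 tests,
# mirroring cases (i)-(iii) in the text. Units are arbitrary.
candidates = {
    "(i) high g, average specifics": (1, [0, 0, 0]),
    "(ii) average g, high specifics": (0, [1, 1, 1]),
    "(iii) very low g, very high specifics": (-2, [3, 3, 3]),
}

for label, (g, specifics) in candidates.items():
    print(label, "->", noise_free_scores(g, specifics))
```

Every line prints the same scores, [1, 1, 1]: adding any amount to the general factor and subtracting it from each specific factor leaves the observable results unchanged, which is the k equations in k+1 unknowns problem in miniature.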
Of course, if one makes up the general factor so that it's a linear combination of the observed test scores, as in modern exploratory factor analysis, then the general factor is, by definition, calculable from the data. Unfortunately, using a different battery of tests would give you a different common factor, and these will not, except by sheer luck, agree.
Peter Schonemann has written extensively on this problem, and may be the best starting point if you want to know more. (His paper "Factorial Definitions of Intelligence" [abstract, PDF] reproduces Wilson's review as an appendix.) I give this consideration less weight than he does. (So, apparently, does his sometime-collaborator J. H. Steiger.) It definitely rules out claims that g gives us an operational and measurable definition of general intelligence. It does not rule out factor analysis as a means of discovering causal structure. If factor analysis could do that, the fact that it couldn't also give us precise estimates (even in the limit) of the causal variables it found would be unfortunate, but we would still know they mattered, and that should encourage us to find ways of measuring them directly.
[3]: Spearman advocated making the right to vote contingent on demonstrating at least a certain minimum level of general intelligence, on the grounds that stupid people cannot exercise political judgment. Implicit in this is the idea that democracy is justified only to the extent that it makes the right decision or picks the best rulers, which I find repugnant; the point of democracy ought to be that it gives people power over their own lives, and makes rulers accountable to the ruled. (That it also tends to lead to better decisions and more social power helps it survive, but those aren't the point.) On his own theoretical principles, however, all Spearman should have cared about was the sum of the general factor and a specific factor of political judgment, not either factor alone.
[4] I recommend Loehlin's Latent Variable Models: An Introduction to Factor, Path, and Structural Analysis to students who need to know more, since it's clear, practical, decent on the strengths and limitations of the methods, sound on the need for statistical power, and written in an actual human voice. My copy is the second edition; I'm told, but haven't checked, that the fourth finally updated the vintage MacDraw diagrams. Loehlin has been contributing to the literature on IQ and its heritability since the 1970s, at least, and is far more of a hereditarian than I am, though not of the creepy (i.e., Jensenist) school — see e.g. Loehlin, Lindzey and Spuhler, Race Differences in Intelligence (W. H. Freeman, 1975), which contains some useful criticisms of, e.g., Leon Kamin.
— This paper by Bartholomew is a nice over-view from a statistician's perspective, but the implication (in the abstract) that graphical models neglect latent variables is more than a little boggling, and his description of Thomson's model (of which much more below) is not altogether accurate.
[5]: Somewhat more seriously, you can choose whether to make the constructed factors have correlations amongst themselves or be uncorrelated, to prefer factors which involve only a few observations or factors which involve many, etc. There are elaborate techniques ("rotations") for turning factor models which do well by one of these standards into ones which are better by another criterion, all the while giving you completely equivalent observational results, at least at the level of means and correlations. Stephen Jay Gould, in The Mismeasure of Man, made a big deal out of this point. This has met with a lot of objection, sometimes by people saying that one particular way of rotating the factors is correct, and sometimes by people saying that correlations among the factors would need to be explained by other factors in turn ("hierarchical factor analysis"), and doing so just amounts to admitting that there is a general factor g after all. Who was right about this is, however, quite irrelevant here.
As a footnote to this footnote, however, let me point out that even if (miraculously) hierarchical factor analysis gets at the causal structure, it can give you very different kinds of causes at different levels. Suppose for example, that there really were just three distinct skills tapped by intelligence tests — verbal ability, spatial ability, and problem-solving ability. Suppose further (for clarity, not plausibility) that these three abilities were localized in completely distinct parts of the brain, that genes which influenced one had no effect on the others, that they could be trained separately without any transfer of learning, etc. One would then say that these were, indeed, causally distinct and separable talents. They might nonetheless well be positively correlated in a factor analysis, because certain environmental influences would affect them all the same way (nutrition, disease, stress), and even if there was no transfer of training, social processes would tend to correlate verbal schooling with, say, schooling in problem-solving. The higher-level factor associated with their correlations would then just be something like "quality of the developmental environment", and not another, more general mental ability.
[6]: Technically, what I'm showing here is a principal component analysis, rather than a factor analysis in the strict sense. PCA finds linear combinations of the original variables which (1) are uncorrelated and (2) describe as much of the variance as possible, maximizing our ability to reconstruct the original variables. Factor analysis, in the Spearman-Thurstone-etc. sense, aims to describe the correlations of the variables, not their actual values. If I do a strict factor analysis of this data with two factors, the first looks more or less the same as the first principal component (which is not surprising), but the second factor does not, and it's very hard to come up with any meaningful interpretation of it. (This is so no matter what rotation algorithm I try.)
[7]: Which I only learned about from a chance encounter with his 1939 book on The Factorial Analysis of Human Ability at a library booksale; I'm not an expert on the history here. (One doesn't have to be to see the logical problems.) It is striking how many of even the more technical difficulties which Thomson raised there remain issues with factor analysis: underdetermination of factor values, the inability of correlations to distinguish between different analyses, the dependence of estimated factors and factor loadings on accidents of test construction and the population tested, the neglect of statistical power when evaluating analyses, etc.
Thomson's description in Factorial Analysis of his sampling model is mostly concerned with the case where the variables are binary, which simplifies some calculations but obscures the general point. In part this was because he was identifying his variables with the "bonds" of Thorndike's proto-connectionism; he does not explain this to readers, evidently supposing them to be familiar with it. (See note 14 below for more.)
Thomson's original paper ("A Hierarchy without a General Factor", British Journal of Psychology 8 (1916): 271--281), reporting results he obtained in 1914, does not seem to be available electronically, but a follow-up ("On the Cause of Hierarchical Order among the Correlation Coefficients of a Number of Variates Taken in Pairs", Proceedings of the Royal Society of London A 95 (1919): 400--408) is in JSTOR, and worth reading.
[8] The reason is more technical than the rest, so I'll stick it in this footnote. The correlations among the components in an intelligence test, and between tests themselves, are all positive, because that's how we design tests. But that means that the correlation matrix only has positive entries. This has implications for factor analysis, because the usual way of finding the factors is to take the eigenvectors of the correlation matrix (after an adjustment which reduces the diagonal entries but leaves them positive). The larger the corresponding eigenvalue, the more of the variance is described by that eigenvector. The Frobenius-Perron Theorem, however, tells us that any matrix with all-positive entries has a unique largest eigenvalue, and that the corresponding eigenvector only has positive entries. (If some of the correlations are allowed to be zero, there can be multiple eigenvectors which all have the largest eigenvalue, and all their components are non-negative.) Translated into factor analysis: there has to be a factor which describes more of the variance than any other, and every variable is positively loaded on it. So making up tests so that they're positively correlated and discovering they have a dominant factor is just like putting together a list of big square numbers and discovering that none of them is prime: it's a necessary side-effect of the construction, nothing more. However, the Frobenius-Perron Theorem doesn't say by how much the largest factor has to dominate. What my little simulations show is that in completely random cases it can dominate quite a lot, and this can happen even when the theorem doesn't apply (because some correlations are negative).
Similarly for hierarchical factor analysis. Wim Krijnen ("Positive Loadings and Factor Correlations from Positive Covariances", Psychometrika 69 (2004): 655--660) has algebraically proved what was intuitively clear, that an all-positive correlation matrix is itself sufficient to ensure that there will be a single higher-order factor, with positive loadings on the lower-order factors. Given this fact, the occasional dispute about whether g is a second-order or third-order factor seems not so much like debating how many angels can dance on the head of a pin, as like assuming that there must be an angel, since the pin after all comes to a single sharp point, and debating the angel's place in the celestial hierarchy.
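The Frobenius-Perron point in this footnote is easy to check numerically. The sketch below uses plain power iteration, which is an illustrative choice and not the procedure used for the simulations in the post; the matrix generator and all names are hypothetical. It does not require the matrix to be a valid correlation matrix, only symmetric with strictly positive entries.

```python
import random


def random_positive_symmetric(n, seed=0):
    """Symmetric matrix with unit diagonal and strictly positive off-diagonals."""
    rng = random.Random(seed)
    m = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            m[i][j] = m[j][i] = rng.uniform(0.05, 1.0)
    return m


def leading_eigenpair(matrix, iterations=1000):
    """Power iteration: repeatedly apply the matrix and renormalize.

    For a matrix with all-positive entries this converges to the largest
    eigenvalue and its eigenvector."""
    n = len(matrix)
    v = [1.0 / n ** 0.5] * n  # unit-length, all-positive starting vector
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        eigenvalue = sum(x * x for x in w) ** 0.5  # length of M v for unit v
        v = [x / eigenvalue for x in w]
    return eigenvalue, v


if __name__ == "__main__":
    m = random_positive_symmetric(6, seed=1)
    lam, vec = leading_eigenpair(m)
    print("largest eigenvalue:", round(lam, 3))
    print("share of the trace:", round(lam / len(m), 3))
    print("all loadings positive:", all(x > 0 for x in vec))
```

Whatever seed is used, the leading eigenvector has only positive components, as the theorem requires. How large a share of the trace it takes is the part the theorem leaves open.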
[9] The entries above the diagonal were picked uniformly and independently on the interval from 0 to 1; those below the diagonal are their mirror images (because correlations are symmetric); and the diagonal itself is 1 (because everything is perfectly correlated with itself). I had R generate matrices according to that procedure until it came up with one which was also positive definite, as a correlation matrix must be; this was just the first output of the random number generator which wasn't rejected. You can see how I implemented the test in my code. That code also includes an option for generating correlation matrices which are not just symmetric and positive-definite but also diagonally dominated.
The "test scores" of the subjects were Gaussian random vectors, with the prescribed correlations, a mean on each dimension of 100 and a standard deviation on each dimension of 15. Each vector was independent of all the other vectors.
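For readers who want to try the procedure without R, here is a rough Python sketch of the recipe described in this footnote. It is not the author's code. Using a Cholesky factorization as the positive-definiteness test is an assumption of the sketch, and the function names, the retry limit and the tolerance are all illustrative.

```python
import random


def cholesky(matrix, tol=1e-12):
    """Return lower-triangular L with L L^T = matrix,
    or None if the matrix is not (numerically) positive definite."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                pivot = matrix[i][i] - partial
                if pivot <= tol:
                    return None  # a non-positive pivot means not positive definite
                lower[i][j] = pivot ** 0.5
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def random_correlation_matrix(n, rng, max_tries=100_000):
    """Rejection sampling: uniform(0, 1) off-diagonals, unit diagonal,
    retried until the result is positive definite."""
    for _ in range(max_tries):
        m = [[1.0] * n for _ in range(n)]
        for i in range(n):
            for j in range(i + 1, n):
                m[i][j] = m[j][i] = rng.uniform(0.0, 1.0)
        factor = cholesky(m)
        if factor is not None:
            return m, factor
    raise RuntimeError("no positive-definite matrix found within max_tries")


def simulate_scores(factor, n_subjects, rng, mean=100.0, sd=15.0):
    """Independent Gaussian score vectors with the correlations encoded in `factor`."""
    n = len(factor)
    scores = []
    for _ in range(n_subjects):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        scores.append(
            [mean + sd * sum(factor[i][k] * z[k] for k in range(i + 1)) for i in range(n)]
        )
    return scores
```

A typical use would be `corr, factor = random_correlation_matrix(5, random.Random(42))` followed by `simulate_scores(factor, 1000, random.Random(43))`. Because the matrix has a unit diagonal, multiplying independent standard normals by the Cholesky factor gives unit-variance variables with the prescribed correlations, which are then rescaled to mean 100 and standard deviation 15.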
[10]: It's worth doing a quick check that I haven't, by chance, produced a set of correlations which "really" are all due to a single factor, whatever that would mean here. If that was the case, the correlation between any two variables should be the product of their factor loadings. Some are close, e.g., variables 1 and 3 have a correlation of 0.683, and that would predict (0.782)*(0.814) = 0.637, but others are way too far off, e.g., 1 and 5, where (0.782)*(0.669) = 0.523, more than twice the real value of 0.241. The chi-squared statistic for testing the hypothesis that there is only one factor, and all apparent partial correlations are due to chance, is 1023.81 on 5 degrees of freedom, which translates to a p-value of about 4 * 10^-219.
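Why a single factor implies the product rule: a short derivation, in notation that is an assumption of this sketch and not the post's. Suppose each standardized variable is a loading times a unit-variance common factor $g$, plus a noise term uncorrelated with $g$ and with every other noise term.

```latex
\begin{align*}
X_i &= \lambda_i g + \epsilon_i, \qquad
  \operatorname{Var}(g) = 1, \quad
  \operatorname{Cov}(g, \epsilon_i) = 0, \quad
  \operatorname{Cov}(\epsilon_i, \epsilon_j) = 0 \ (i \neq j), \\
\operatorname{Cov}(X_i, X_j)
  &= \lambda_i \lambda_j \operatorname{Var}(g)
   + \lambda_i \operatorname{Cov}(g, \epsilon_j)
   + \lambda_j \operatorname{Cov}(\epsilon_i, g)
   + \operatorname{Cov}(\epsilon_i, \epsilon_j)
   = \lambda_i \lambda_j .
\end{align*}
```

Since the variables are standardized, that covariance is also the correlation. With the loadings quoted above, the rule predicts about 0.637 for variables 1 and 3 (observed: 0.683) and about 0.523 for variables 1 and 5 (observed: 0.241).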
[11] I am also assuming that you don't do something silly like sample the tail of one of the variables. In my random data, if I confine myself to the samples where the first variable is > 120, I find that the correlation between variable 1 and variable 5 is not 0.241, but 0.017, and the correlation between variables 2 and 5 is not 0.002 but -0.161. This is, or ought to be, a duh. (If you want to see the point really driven home, or to see just how much it can screw up your factor analyses, consult the chapters on sampling of test-takers in Thomson's Factorial Analysis.)
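The effect of sampling a tail can be seen in a two-variable toy version. This sketch does not reproduce the five-variable data or the specific numbers in this footnote. It assumes a bivariate Gaussian with a chosen correlation, the same mean of 100 and standard deviation of 15, and a cutoff of 120 on the first variable; all names are illustrative.

```python
import random


def correlation(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate_pairs(rho, n, seed=0, mean=100.0, sd=15.0):
    """Draw n pairs from a bivariate Gaussian with correlation rho."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        x = mean + sd * z1
        y = mean + sd * (rho * z1 + (1.0 - rho ** 2) ** 0.5 * z2)
        pairs.append((x, y))
    return pairs


def tail_correlation(pairs, cutoff=120.0):
    """Correlation computed only on cases whose first variable exceeds the cutoff."""
    kept = [(x, y) for x, y in pairs if x > cutoff]
    return correlation([x for x, _ in kept], [y for _, y in kept])


if __name__ == "__main__":
    pairs = simulate_pairs(rho=0.6, n=20000, seed=7)
    print("full sample:", round(correlation(*zip(*pairs)), 3))
    print("x > 120 only:", round(tail_correlation(pairs), 3))
```

Restricting to the upper tail shrinks the spread of the first variable much more than that of the second, so the correlation in the selected subsample comes out well below the correlation in the full sample.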
[12]: Here is another example of the same effect. We now have complete DNA sequences for a number of people, including James Watson. So in principle we could do the experiment where we take a large group of people, and take samples of some 10 or 100 or 1000 different cell types from each, and identify which of, say, 5000 different genes are being expressed in each cell type in each person. The cell type gets one point for each expressed gene which is different from the version in Watson's genome. The score of each cell type will thus depend on many factors, some shared (genes expressed across many or all cell types) and some specific. Now extract the leading factor. Mathematically, it will exist, and it will even be heritable. It's also quite meaningless biologically, even as an index of not-James-Watson-ness. [This paragraph was written in the summer of 2007, well before the latest brouhaha over Watson's views on the stupidity of black people.]
[13]: To see the follies such circularity leads to, compare this post by Tyler Cowen with the scoldings he gets in his comments section. Cowen points out behaviors which call for intelligence, in the ordinary meaning of the word, and that these intelligent people would score badly on IQ tests. A reasonable counter-argument would be something like: "It's true that 'intelligence', in the ordinary sense, is a very broad and imprecise concept, and it's not surprising the tests don't capture it perfectly. But the aspects of 'intelligence' they do capture are ones which are vastly more important for economic development than the ones displayed by Cowen's friends in San Agustin Oapan, however amiable or even admirable those traits might be in their own right." This would be a position about which one could have a rational argument. (Indeed, I might even agree with that statement, as far as it goes, as might A. R. Luria.) Instead, Cowen gets told over and over that if it's not showing up in IQ it's not intelligence, and it's unscientific and sentimental of him to think otherwise.
And people wonder why I don't set up comments.
[14]: It's not really relevant to the question at hand, but I should say a little about the neural interpretation Thomson gave his model. He tended to think of what I've been calling the abilities as "on the bodily side ... neurone arcs" (Factorial Analysis, p. 271), and to think, by analogy with the all-or-none firing of individual neurons, that they should be binary variables (p. 54). Highly g-loaded questions and tests were thus ones which engaged large parts of the brain, or at least (pp. 50ff) large fractions of the part of the brain which could be probed with intelligence tests. Following the general trend of neuroscientific opinion in his day, he thought the brain had little, if any, important localization of function, though he allowed that there could be several "sub-pools". (There were actually good reasons, based on experiments, for people to take this view, which was nonetheless wrong. Anne Harrington's books — Medicine, Mind, and the Double Brain and Reenchanted Science — provide nice historical perspectives.) But, of course, all this is an interpretation of the stochastic model, not the model itself; if I really wanted to push for a revival of the model — which I don't — I'd prefer to interpret the abilities as how well different specialized cognitive modules function, along with some invocation of the distributed nature of neural information processing.
[15]: There is a very narrow set of technical conditions which let you out of this, i.e., which let you combine group differences, measurement unbiasedness and predictive unbiasedness. (See Millsap's papers for details.) That IQ tests satisfy them is highly implausible, and, to say the least, empirically unsupported. If someone wants to show that IQ tests are unbiased, that's what they need to be working on, not pounding their tables of regression coefficients.
[16] This is not to endorse Fagan's theory of intelligence, which seems to me far too simple as well. Also, that paper sticks to the same sort of linear regression/ANOVA/etc. techniques as the others in such journals, which, as I keep saying, have a lot of problems. But, again, if you find Arthur Jensen's methods acceptable, let alone Richard Lynn's or Philippe Rushton's or Charles Murray's, then you really have no right to quibble, and (unlike those three) Fagan and Holland at least do basic covariate matching.
Minor updates 19 October: Fixed typos, broken link to Millsap and Meredith's paper (thanks to Bill Raynor). 22 October: Fixed typos (thanks to Dave Kane). 19 November: Fixed typos (thanks to Jutta Degener).
Enigmas of Chance;
The Natural Science of the Human Species;
Minds, Brains, and Neurons;
IQ
Posted by crshalizi at October 18, 2007 17:20 | permanent link
Attention conservation notice: Links to more-or-less amusing and pleasant things to look at online, assembled with no particular order or design. It's very likely that you've already seen all the ones you'd really enjoy here. Mostly written before the root canal.
How to make a clockwork powerbook; the finished device (via Ectoplasmosis).
Scott McLemee on library culling, of which I need to do a lot. (Not that that should inhibit you from buying me books from my wishlist.)
Progress in the eradication of the guinea worm parasite. Also from Chapati Mystery: William Moorcroft, veterinarian-adventurer.
Color photographs of the pre-revolutionary Russian Empire, including a section from Central Asia (via Light Reading).
"Dear Mr. Turing: We regret to inform you that your submission 'On Computable Numbers, With an Application to the Entscheidungsproblem' was not accepted to appear in FOCS 1936..." (more).
BLDGBLOG is always reliable for wonders: building San Francisco on old ships; "A completely automated world of self-assembling machine-flowers"; Derinkuyu; Gunkanjima; Cecil B. DeMille's lost city.
On a grander scale: Astronomical alignments of the streets of Manhattan, sure to provoke intense speculation among future archaeoastronomers. Of course, as the global financial capital, what New York really needs is invincibility, and where shall invincibility be found other than through the Maharishi? (Via John Burke).
Doubtless it was owing to a lack of adequate understanding of unified field theory that Jerash, Jordan became an abandoned place. More abandoned places: Crumbling concrete dinosaurs in desert California. An old New England Mill. "Night photography of the abandoned west" (again via John Burke in e-mail). The Times runs a story about "urban exploration" and pleasures-of-modern-ruins photography, with pictures (some including one of the photographers in various stages of undress; not safe for work if your boss is more uptight about nekkid wimmen than the pages of the Times). A secret garden, not abandoned, in London.
Cats versus English professors (with conclusions generalizing to other professors); cats versus dogs.
Geo. Chaucer asks: I Can Hath Cheezburger? Also: LOLthulhu (of which this might be the best).
Speaking of Wikipedia and who writes it, Aaron Swartz discovers the lived value of Aufklärung, and explores the wonders of biotechnology. (The comments on the latter are... remarkable.)
A defiant daisy, no relation of Bob the Angry Flower. Bob's appropriately thoughtful sequel to Atlas Shrugged in turn calls to mind The 25 Most Inappropriate Things An Objectivist Can Say During Sex. (Number 1 is too obvious, but number 11 is almost sweet.)
Edward Gorey's adaptation of The Trouble with Tribbles (via Making Light).
Garance Franke-Ruta catches glimpses of a kind of Heaven in Iowa.
Inside the Large Hadron Collider (via Fed By Birds) — huge, striking wrap-around photos with sound.
The Miserable World of Prometheus.
An alternate history of Chinese science fiction (via Chrononautic Log).
Michael Dirda appreciates the Book of Imaginary Beings and Clark Ashton Smith.
Only one writer deals adequately with Svalbard's threat to peace.
What is best in life? Perhaps the New Mexico sky.
Ediacaran fossil embryos. "It appears that by the time the Doushantuo Formation was deposited ~550 million years ago, cnidarians had already split off from the lineage that evolved into cuter, fuzzier animals like nematodes and kitty-cats. So the discovery of a complete embryonic sequence of early cnidarian wouldn't help much in the race to figure out what the hell was happening with bilateral symmetry and the general body plan confusion of the late Vendian and early Cambrian. In that respect, these findings are a little disappointing - what we really want are bilaterian embryos. But y'know, cnidarians are awesome too, and deserve love just as much as any other fossil embryos."
NERO, the video game of evolutionary autonomous system design. Back in the realized world, how our soldiers in Iraq feel about their bomb-exploding robots.
Some people look at developments like those and start muttering darkly about how the Singularity is nigh. In fact, of course, the Singularity happened in the last years of the nineteenth century. (A recurring theme around here.)
Essential post-Singularity technologies: How JPEG image compression works (via Danny Yee). How spam filtering works. (Incidentally: Programming Language Inventor or Serial Killer?)
Spam can be interpreted for divinatory purposes, as can the 80s Tarot.
See the 80s Tarot come to life by watching the Talking Heads perform "Take Me to the River" in Rome in 1980.
Paul Krugman uses the Talking Heads to explain it all to you:
Now, as they survey the wreckage of their cause, conservatives may ask themselves: "Well, how did we get here?" They may tell themselves: "This is not my beautiful Right." They may ask themselves: "My God, what have we done?" But their movement is the same as it ever was.
On which note, I'll leave you with some tranquil photos of a road to nowhere.
Manual trackback: Danny Yee.
Posted by crshalizi at October 16, 2007 15:02 | permanent link
One root canal later, this no longer seems like a monstrous curiosity, but a sign on the way to the promised land, in which our descendants' teeth will be continuously replaced, like sharks'. Now excuse me while I go wallow in self-pity (more than usual).
Update, 17 October: reading this makes me want to pull into a protective crouch in sympathy. (Thanks, if that is the word, to Jay Han in e-mail.)
Biology; Self-Centered
Posted by crshalizi at October 15, 2007 19:20 | permanent link
Posted by crshalizi at September 30, 2007 23:59 | permanent link
Attention conservation notice: It's long, and it's about something which makes eyes glaze over even as tempers flare up, and it's not funny at all. Worse yet, there's a part II which is even more mathematical and boring. You could always read it later, but time spent now is gone forever.
Disclaimer: A decade ago, some of the senior faculty in my department, i.e., some of the people who will be voting on my contract renewal and tenure, helped put together a book called Intelligence, Genes and Success: Scientists Respond to The Bell Curve. Most, but not all, of the responses in that book were exceedingly negative. I cite some of that work below. Whether this should alter your evaluation of the case I make is for you to decide.
Thanks are due to (alphabetically) Carl Bergstrom, John Burke, Henry Farrell, Mark Liberman, Aryaman Shalizi and Aaron Swartz for many helpful suggestions. But, of course, I'm the only one responsible for this, all remaining errors are my own, and it's not in any sense authorized or endorsed by anyone (in particular not by them).
People seem to be experiencing more than the usual difficulty grasping what I was getting at in my posts on accent and intelligence. This is my fault, for trying to be cute rather than trying to be clear. (I realize I'm too murky even when I am trying to be clear.) I am already heartily sick of the subject, which is turning into the huge time-suck I was afraid it would be, and which presents a depressing prospect from every point of view, not least those which make it clear how rare it is for anyone to change their mind on any aspect of it for any cause at all. (I do wonder if I should've stuck with the original title of "Duet for Leo and Razib.") My aim here is to lay everything out cleanly and explicitly, and be done with this matter.
I was originally going to do just one post, explaining why I called the general factor of intelligence a "statistical myth", why I don't put any real faith in what I regard as even the best of the current estimates of IQ's heritability, and the evidence for IQ's malleability. But the thing grew unwieldy, and the only thing which I find more dreary, right now, than discussing heritability and malleability is explaining why factor analysis can't do what people want it to, so I'll save that for later, and stick to the heritability and plasticity of IQ here. [That post is now out.] Whether IQ means anything or not, it is, unlike general intelligence, unquestionably something we can measure, so we can consider how heritable and malleable it is. I am going to assume that you know what "variance" and "correlation" are, but not too much else.
To summarize: Heritability is a technical measure of how much of the variance in a quantitative trait (such as IQ) is associated with genetic differences, in a population with a certain distribution of genotypes and environments. Under some very strong simplifying assumptions, quantitative geneticists use it to calculate the changes to be expected from artificial or natural selection in a statistically steady environment. It says nothing about how much the over-all level of the trait is under genetic control, and it says nothing about how much the trait can change under environmental interventions. If, despite this, one does want to find out the heritability of IQ for some human population, the fact that the simplifying assumptions I mentioned are clearly false in this case means that existing estimates are unreliable, and probably too high, maybe much too high.
I should add that nothing I'm saying here is in any way original. Almost thirty years ago, Oscar Kempthorne, a man who knew a thing or two about statistical genetics, made pretty much all these points in a paper in Biometrics, working in swipes at Dick Lewontin while he was at it. (I would quibble with him some about the possibility of causal inference from observational data, but such inferences rely on methods which didn't then exist, and which are certainly not used by the parties to this dispute.) I do not, of course, pretend to be in Kempthorne's league, or anywhere close. For me, this is another episode of "Why oh why can't we have a better intelligentsia?". Kempthorne's exasperation, on the other hand, was that of someone seeing the tools of their life's work being wretchedly abused.
When we take our favorite population of organisms (e.g., last year's residents of the Morewood Gardens dorm at CMU), and measure the value of our favorite quantitative trait for each organism (e.g., their present zip code), we get a certain distribution of this trait:
| Genotype | ZIP |
| --- | --- |
| AATGAAATAAAAAAAAACGAAAATAAAAAA... | 15232 |
| AAGGCCATTAAAGTTAAAATAATGAAAGGA... | 15213 |
| AAGGCCATTAAAGTTAAAATAATGAAAGGA... | 48104 |
| CAATGATTAGGACAATAACATACAAGTTAT... | 15212 |
| GGGGTTAATTAATGGTTAGGATGGGTTTTT... | 87501 |
| CCTTCAAAGTTAATGAAAAGTTAAAATTTA... | 15217 |
| CCTTCAAAGTTAATGAAAAGTTAAAATTTA... | 15217 |
| TAAGTATTTGAAGCACAGCAACAACTAGGT... | 02474 |
If we are limited to the tools of early 20th century statistics (in particular, if we are the great R. A. Fisher, and so simultaneously forging those tools while helping to found evolutionary genetics), we summarize the distribution with a mean and a variance. We can inquire as to where the variance in the population comes from. In particular, assuming the organisms are not all clones, it is reasonable to suppose that some of the variation goes along with differences in genes. The fraction of variance which does so is, roughly speaking, the "heritability" of the trait.
The most basic sort of analysis of variance (see also: Fisher) would make this conceptually simple, though practically unsuccessful. Simply take all the organisms in the population, and group them by their genotypes. For each group of genetically identical organisms, compute the average value of the trait. Compare the variance of these within-genotype averages (that is, the across-genotype variance) to the total population variance; this is the fraction of variation associated with genotypes. In most mammalian populations, where clones (identical twins, triplets, ...) are rare and every organism otherwise has a unique genotype, this would tell you that almost all of the variance of any trait is associated with genetic differences. On such an analysis, almost all of the variance in zip codes in my example would be "due to" genetic differences, and the same would be true of telephone numbers, social security numbers, etc.
To see why, look at my table again. With one exception (the twins who live in 15213 and 48104), in this population changing zip code means changing your genotype. The vast majority (81%) of the variance in zip codes is between genotypes, not within them. With real human data, a quarter of the people wouldn't be twins living apart, and the proportion of variance in zip codes "due to" genotype would be even higher.
Naively, then, on this analysis we would say that the "heritability" of zip code, the fraction of its variance which goes along with genetic variations, is 81%. It is crucial to be clear on what this means, which is merely and exactly this: in this population, if we take a random group of genetically identical people, the variance within that group should be 19% (=100-81) of the total variance in the population.
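A minimal sketch of this naive analysis of variance, in Python, run on a made-up population rather than on the table above. The 1000 individuals, the five twin pairs, and the random five-digit "trait" are all assumptions chosen for illustration. The code groups individuals by genotype and reports the share of the total sum of squares that lies between genotype means.

```python
import random
from collections import defaultdict
from statistics import mean


def between_group_fraction(groups, values):
    """Share of the total sum of squares lying between group means."""
    by_group = defaultdict(list)
    for g, v in zip(groups, values):
        by_group[g].append(v)
    grand_mean = mean(values)
    total_ss = sum((v - grand_mean) ** 2 for v in values)
    between_ss = sum(
        len(vs) * (mean(vs) - grand_mean) ** 2 for vs in by_group.values()
    )
    return between_ss / total_ss


rng = random.Random(0)

# Hypothetical population: 990 people with unique genotypes,
# plus 5 pairs of identical twins (10 people sharing 5 genotypes).
genotypes = list(range(990)) + [g for g in range(990, 995) for _ in range(2)]

# The "trait" is an arbitrary five-digit number with no genetic basis at all.
trait = [rng.randint(10000, 99999) for _ in genotypes]

naive_heritability = between_group_fraction(genotypes, trait)
print(f"share of variance between genotypes: {naive_heritability:.3f}")
```

Only the twin pairs contribute any within-genotype variation, so the printed share should come out close to 1 even though the numbers were drawn at random.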
Of course, there is nothing special about the genotype, and one could do a completely parallel analysis of variance based on environmental histories. If those are captured at a fine-grained enough level, every organism has a unique history, so all variance is "due to" the environment. Clearly, the sense in which the phrases "due to", "explained by", "accounted for", etc., are used in analysis of variance and regression has nothing to do with their ordinary, causal meanings. I am going to try to avoid using the causal-sounding phrases, because I think they encourage confusion, and instead stick with the vocabulary of "associated with", or at most "described by", because anything stronger is unjustified here.
Nobody credible has ever seriously proposed doing this sort of ridiculous analysis of variance. It seems clear that not every aspect of the environment matters. (The number of snowflakes which hit my face during February was either odd or even, but it's hard to see how that could change my present weight.) Similarly, not every genetic distinction makes a difference to the trait. If we could somehow identify relevant distinctions, and group together organisms with relevantly-identical genotypes, we'd be doing something much more reasonable. The true heritability of the trait is defined to be the ratio between the variance associated with genetic differences and the total variance in the trait.
If you've taken any kind of statistics course at all, what I've just said may be enough to give you an idea of how to figure out heritability: identify the relevant environmental variables, measure them, regress the trait on them, and figure that the residual variance has to be genetic. Many people, I find, have the impression that heritability studies control for the environment, in the sense of regression. (Leave aside, for now, whether "controlling for" really does what people seem to think.) Some studies in experimental genetics on plants and animals do this, near enough, but that's basically never how it's done with human beings. Instead, the procedure is vastly more indirect and model-dependent, which matters a lot when evaluating the results, as we'll see.
Supposing that we somehow learn the genetic variance, the usual next step is to split it into two uncorrelated components, one associated with the distinct, additive contribution of each individual gene to the trait ("additive genetic variance"), and one associated with specific combinations of genes. (Making this split is a straightforward problem in linear algebra; I won't get into it.) The ratio of additive genetic variance to total trait variance is the "strict" or "narrow" heritability, as opposed to the earlier "broad" heritability. Conventionally, the symbol for heritability is h² (not h), with a subscript to indicate whether it's meant narrowly or broadly; I'll write h² for the narrow sense and H² for the broad. (The square is here for the same reason that the fraction of variance accounted for by a regression is R² and not R.)
Implicit in the last steps is the assumption that the value of the trait is, in each organism, just the sum of a genetic contribution and an environmental one, i.e., that there is no interaction between the relevant genes and the relevant environments; also the assumption that the genetic contribution to the trait is completely uncorrelated with the environmental contribution. If these assumptions fail, one can still calculate heritability-like quantities, somewhat like in the simple analyses of variance, which can play similar roles in some evolutionary calculations, but they become strongly context-dependent, so it no longer makes any sense to speak of the heritability of a trait. We also come back to absurdities, since the tendency of families to cluster geographically makes zip codes look heritable — and more heritable in larger and more representative samples of the nation than in more geographically localized ones!
Saying a trait is highly heritable is saying that, in a given distribution of genotypes and environments, most of the variance in that trait is associated with genetic differences. Maybe the most important point I'll make here is that this is not the same as most of the value of the trait being genetically controlled. The textbook example is that (essentially) all of the variance in the number of eyes, hearts, hands, kidneys, heads, etc. people have is environmental. (There are very, very few mutations which alter how many eyes people have, because those are strongly selected against, but people do lose eyes to environmental causes, such as accident, disease, torture, etc.) The heritability of these numbers is about as close to zero as possible, but the genetic control of them is about as absolute as possible. Similarly, heritability says nothing about malleability, about how much or how easily the trait changes in response to environmental manipulations: heritability is defined with respect to a given distribution of environments, and does not predict the response to environmental changes. (I will come back to this below.)
What heritability does predict is the response to selection, in a constant distribution of environments. This is why quantitative geneticists developed and retained the concept. If a population is subjected to directional selection on a trait, whether the selection is natural or artificial, and the trait follows the classical decomposition into additive, uncorrelated components, the degree to which the genetic component of the trait changes will depend on the intensity of selection, the variance in the trait, and its heritability. The response to selection, the phenotypic change in the next generation, will be large if the selection pressure, the trait's variance, and the trait's heritability are all high, assuming that the distribution of environments is held fixed and uncorrelated with genotype. In sexually-reproducing organisms, where the genes get reshuffled every generation, the relevant heritability is the narrow heritability, involving only the additive term for the genes, not the broad heritability, involving the total genetic variance. (Feel free to guess which number The Bell Curve obsesses over, despite supposedly being concerned with evolution.) To reinforce the context-dependence of heritability, note that selection will tend to reduce genetic variance, and so heritability, especially when selection is strong.
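The post does not write this relationship out. The standard textbook form, usually called the breeder's equation, is a reasonable sketch of it. The symbols are introduced here for illustration and are not the post's: R is the response to selection (the change in the population mean from one generation to the next), S is the selection differential (how far the mean of the selected parents lies from the mean of the whole parental population), i is the selection intensity, and σ_P is the phenotypic standard deviation.

```latex
R = h^2 S = h^2 \, i \, \sigma_P, \qquad i = \frac{S}{\sigma_P}
```

It holds only under the conditions just listed: additive, uncorrelated components and a fixed distribution of environments.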
(Fisher, incidentally, was very far from being enslaved by the product of his own labors, at least here, describing it [1] as "one of those unfortunate short-cuts, which have often emerged in biometry for lack of a more thorough analysis of the data", dismissing the denominator as a "hotch-potch", and complaining that "the same herd, measured in the same character, would give widely different estimates of 'heritability' according to the practical precision obtained by the care and skilful control of the experimenter.")
So how does one estimate the heritability? The typical tactic, which again goes back to the early days of Fisher and friends, is to measure the covariance among individuals who share some but not all of their genes, and some but not all of their environments, and use the differences in covariances to gauge the size of the sundry components of variance. The standard biometric model decomposes the trait, call it Q, as follows:
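In symbols, a decomposition consistent with the component names used later in the post (A, D, C and S), and with its description of the model as additive contributions from uncorrelated sources, would be the following. Reading A as the additive genetic part, D as the non-additive genetic part, C as the environment shared within a family, and S as the environment specific to the individual is an assumption based on standard usage, as is treating Q as measured from the population mean.

```latex
Q = A + D + C + S

\mathrm{Var}[Q] = \mathrm{Var}[A] + \mathrm{Var}[D] + \mathrm{Var}[C] + \mathrm{Var}[S]

h^2 = \frac{\mathrm{Var}[A]}{\mathrm{Var}[Q]}, \qquad
H^2 = \frac{\mathrm{Var}[A] + \mathrm{Var}[D]}{\mathrm{Var}[Q]}
```

The second line follows from the first only because the four components are assumed to be uncorrelated.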
If one takes identical (monozygotic) twins, raised together, and assumes that they are otherwise just like other members of the population, one obtains for the covariance
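A sketch of the covariances this implies, writing A and D for the additive and non-additive genetic contributions and C for the environment shared by children raised in the same family (the reading of C is an assumption based on standard usage). Identical twins have the same A and D, and twins raised together also have the same C. The subscripts MZT and MZA, for identical twins raised together and apart, are labels introduced here.

```latex
\mathrm{Cov}[Q_1, Q_2]_{\mathrm{MZT}} = \mathrm{Var}[A] + \mathrm{Var}[D] + \mathrm{Var}[C]

\mathrm{Cov}[Q_1, Q_2]_{\mathrm{MZA}} = \mathrm{Var}[A] + \mathrm{Var}[D]
\quad\Longrightarrow\quad
\rho_{\mathrm{MZA}} = \frac{\mathrm{Var}[A] + \mathrm{Var}[D]}{\mathrm{Var}[Q]} = H^2
```

The second line additionally assumes that twins raised apart share no environmental influences at all, which is why their correlation serves as the "direct" estimate of broad-sense heritability.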
Let's break for a moment to be sure we understand this estimate, which is easy to get wrong. Heritability tells us how much IQ should differ among people who all happen to be genetically identical (twins, triplets, clones...), but are otherwise randomly distributed across the population, as compared to how much it differs across the whole population. By a convention of the test-makers, the standard deviation of IQ for the whole population is 15 points. That is, if we take a totally random person, we should expect their IQ to be about 15 points from the population average, which another convention fixes at 100. If we take two totally random individuals, then, we'd expect them to differ in IQ by about 21 points [= 15 * sqrt(2)]. Now imagine we have one of these groups of genetically-identical-but-otherwise-random people. If the heritability of IQ is 0.75, we expect the variance of IQ scores within such a group to be only 0.25 (=1-0.75) of the population variance in IQ. The standard deviation is the square root of the variance (that's why it's h²), so the expected standard deviation of a genetically-identical group should be 0.5 of the population standard deviation. Thus, in the end, what h² amounts to is that two randomly chosen but genetically identical people should differ in IQ by about 11 points rather than 21. It is not the case that a heritability of .75 means that there is a three-quarters chance of having identical IQ scores, or that three-quarters of the value of the IQ score is set genetically: heritability is always about dividing up shares in the spread around an average level, not the level itself.
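The arithmetic in this paragraph can be checked in a few lines of Python. This is only a sketch: it takes "differ by about" to mean the root-mean-square difference between two independent draws, which is what 15 * sqrt(2) computes, and the function name is made up for illustration.

```python
import math

POPULATION_SD = 15.0   # IQ standard deviation, fixed by test-makers' convention
HERITABILITY = 0.75    # the illustrative value used in the text


def typical_difference(sd):
    """Root-mean-square difference between two independent draws with this SD."""
    return sd * math.sqrt(2)


# Variance within a genetically identical group is (1 - heritability) of the
# total, so the standard deviation shrinks by the square root of that factor.
within_group_sd = POPULATION_SD * math.sqrt(1 - HERITABILITY)

random_pair_gap = typical_difference(POPULATION_SD)        # 15 * sqrt(2), about 21.2
identical_pair_gap = typical_difference(within_group_sd)   # 7.5 * sqrt(2), about 10.6

print(f"within-group SD: {within_group_sd:.1f}")
print(f"typical gap, random pair: {random_pair_gap:.1f}")
print(f"typical gap, identical pair: {identical_pair_gap:.1f}")
```

The gap between genetically identical people is exactly half the gap between random people, because a variance share of 0.25 is a standard-deviation share of 0.5.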
How could this possibly go wrong? Well, one sign that all is not well comes from comparing such "direct" estimates with "indirect" ones. The full algebra is tedious, but I'll go through a little bit of it to show the logic of what's going on. If we take fraternal ("dizygotic") twins, or other siblings, they will share somewhat more than half of their genes, because assortative mating means that their parents are already apt to be more similar genetically than a random pair of individuals from the population. Taking this into account, the covariances are
This is a good place to remind ourselves that the different contributions to the variance are basically never, in humans, measured in any direct way, or "controlled for" by regression. C and S are not measured variables, nor are A and D. Rather, they are all inferred indirectly, by comparing the correlations predicted by the model with observed correlations. (Doing otherwise, in experiments on other organisms, involves doing things like breeding lineages of known genetic composition.) If the model's specification is wrong, then even the best estimator of a parameter like heritability can give you rubbish. (Incidentally, almost nobody who works with these models in psychology seems to use the kind of mis-specification testing or more elaborate specification analyses econometricians have developed; but, to be fair, most economists don't.) In particular, to get the right values for the genetic contributions to the variance, it is essential to account for all environmental sources of correlations, otherwise we are back to the genetics of zip codes. It is also crucial that the form of the model (additive contributions from uncorrelated sources) be correct. If there are correlations between genes and environments, or there are interactions between genetic contributions and environmental ones, then all bets are off.
To see why gene-environment interactions matter, consider one of the best-established links between genetic variations and intelligence, phenylketonuria. This is a recessive genetic disease which interferes with the normal metabolism of the amino acid phenylalanine. If someone with one of the defective forms of the gene for phenylalanine hydroxylase consumes too much dietary phenylalanine, it leads, among other problems, to serious mental retardation. Under suitable diets low in phenylalanine, however, they grow up mentally normal. Assigning shares of this effect to the genes and to the environment is exactly as sensible as trying to say how much of the fact that a car can go is due to its having an engine and how much is due to there being fuel in the tank. The best the usual biometric model could do here would be to predict that having the gene always reduced intelligence, as did consuming phenylalanine (which would be bad news for makers of artificial sweeteners); the fact that it's the combination, and only the combination, which is a problem would be missed, and the predicted size of the effect would be badly wrong. (The situation is similar for, say, hypothyroidism and a lack, rather than an excess, of iodine, but the genetics there are messier.) So while everyone piously says that genes and environments interact in development, they typically use models which assume that they do so only in trivial ways, and hope that any actual interactions are small enough to be treated as noise.
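A toy illustration of that failure, in Python. The scores are invented (100 for normal development, 50 for the affected combination) and the design is a balanced two-by-two table with one value per cell, which is an assumption made so that the least-squares additive fit reduces to row mean plus column mean minus grand mean.

```python
from itertools import product

# Hypothetical outcome scores for a balanced 2x2 design:
# (has_pku_genotype, high_phenylalanine_diet) -> score.
# Only the combination of genotype and diet lowers the score.
observed = {
    (0, 0): 100.0,
    (0, 1): 100.0,
    (1, 0): 100.0,
    (1, 1): 50.0,
}


def fit_additive(cells):
    """Least-squares fit of score = mean + gene effect + diet effect.

    For a balanced 2x2 table this is row mean + column mean - grand mean.
    """
    grand = sum(cells.values()) / len(cells)
    gene_mean = {g: (cells[(g, 0)] + cells[(g, 1)]) / 2 for g in (0, 1)}
    diet_mean = {d: (cells[(0, d)] + cells[(1, d)]) / 2 for d in (0, 1)}
    return {
        (g, d): gene_mean[g] + diet_mean[d] - grand
        for g, d in product((0, 1), repeat=2)
    }


predicted = fit_additive(observed)
for cell in sorted(observed):
    print(cell, "observed:", observed[cell], "additive fit:", predicted[cell])
```

The additive fit credits the genotype with a 25-point drop on either diet, and the diet with a 25-point drop for either genotype, and it mispredicts every one of the four cells.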
The basic biometric model predicts that direct and indirect estimates of heritability should agree. Since they do not, the model cannot be right. It needs to be modified to include additional sources of variance, or correlated components, or interactions, or some combination of these. Which?
The best estimate of IQ's heritability that I've seen, at least within the usual biometric framework, is that of Devlin, Daniels and Roeder (Nature 388 (1997): 468--471), and they take the first tack, identifying a neglected source of variance. (Disclaimer: Roeder is one of the aforementioned senior members of my department. On the other hand, this is one of the few studies of the question by actual statistical geneticists.) They performed a meta-analysis of all the usable correlation studies they could find, including the justly famous Minnesota twins study, which came to 212 estimated correlations covering 50,470 different pairs of people. This gave them data on identical twins raised together and apart, fraternal twins raised together, non-twin siblings raised together and apart, parents and children living together and apart, and adoptive parents and their adopted children. It's worth noting that they were able to find only five studies on identical twins raised apart (none with more than 60 pairs), two studies on non-twin siblings raised apart (none with more than 150 pairs), and three studies of separated parents and children (none with more than 400 pairs). Even assuming the data quality is excellent, there isn't a lot of it that isn't "contaminated" with shared family environments.
What Devlin et al. took from the studies were the estimated correlations and the sample sizes, not the heritability estimates. Then they tried to fit them all into a single coherent picture. If you assume any particular magnitudes for all the different components of the variance, you can work out the expected correlation you'd see for any one of the nine kinds of pairings, and, using the sample sizes, the likelihoods of getting the observed configuration of sample correlations. If I'd been the one doing it, I'd have tried to find the likelihood-maximizing value of the parameters, and then done some bootstrapping to get confidence regions for this estimate. Since I suspect that would turn out to be horridly intractable, I am happy with what they did instead, which is to do Bayesian estimation with a thoroughly uninformative prior, one basically indifferent to all decompositions of the variance which weren't arithmetically impossible. (The paper I ought to be finishing, instead of writing this, is about when my fellow frequentists should find such procedures unobjectionable.) In other words, they did a perfectly standard Bayesian meta-analysis.
Beyond that, though, they changed the model, so it looked like this:
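The modified model is not displayed here. Assuming the change amounts to adding one maternal (prenatal) effects term to the additive decomposition, alongside the genetic terms A and D and the environmental terms C and S that the post names elsewhere, a sketch would be the following. The symbol M is introduced here for illustration and is not taken from the paper.

```latex
Q = A + D + M + C + S
```

On this reading M is still one more additive, uncorrelated component, so the model keeps the classical structure and only gains a source of variance.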
Devlin et al. fit several permutations of this model, including ones with and without the maternal effect term, and allowing more or less shared environmental influence on people with different sorts of relationships. What they found was that including the maternal effects substantially improved the fit, despite the fact that just about everyone previously had ignored it as negligible. To summarize, in their preferred model, the narrow-sense heritability (h²) was 0.34 (with a 95% credible interval of 0.27 to 0.40), the broad-sense heritability (H²) was 0.48 (with a CI of 0.43 to 0.54; coincidentally, very close to the estimate in Jencks's old book [2]), and the share of the maternal term was 0.20 for twins (CI of 0.15 to 0.24) and 0.05 for non-twin siblings (CI of 0.01 to 0.08). They also compared their model with maternal effects to an alternative which embodied the often-repeated claim that the heritability of IQ rises with age: maternal effects fit much better. In retrospect, it's kind of astonishing that everyone had ignored maternal effects before, if only because including them is quite standard in animal quantitative genetics. Once, like Devlin et al., you do include them, voila, the discrepancy between direct and indirect estimates of heritability goes away. There's more to this study (read it, if you're not sick of these matters already), but that's the core of it.
So, if I like this study so much, and it puts the narrow-sense heritability of IQ at 0.34 and the broad-sense heritability at 0.48, why do I say that we presently know squat about the heritability of intelligence? Partly this is because I deeply object to the confusion of "IQ" with "intelligence", but that's a subject for the sequel. Even if we stick to IQ, whatever that might be, though, I don't see how this study, or any similar one, can really answer the question on technical grounds. It's about as far as you can go within the classical assumptions, but those assumptions are horridly shaky.
Let me point out three things that Devlin et al. didn't do (limitations they're fully aware of, I hasten to add, but which couldn't be helped given the data available). First, they had to assume that the magnitude of the different additive components was the same across all the studies in their meta-analysis. In particular, the magnitudes of the environmental components not included in maternal or within-family terms were assumed to be the same for everyone. Second, they neglected any correlations between the components. Third, they neglected any interactions between the components, and in particular any gene-environment interactions. Not only did they neglect the last two possibilities, they did not compare their model to models incorporating them, so they give no reason to think that they are negligible. I think there are pretty good reasons to believe these things would make a big difference. I am not going to say much more about the first problem, of heterogeneity or heteroskedasticity, but I want to expand on the issues of missing environmental correlations, of interactions and nonlinearities, and of cultural transmission and gene-environment correlations.
Many twin data sets show that the correlations in twins' IQs actually change with the environment, in a pretty crude way which nonetheless goes beyond what people generally include in their models. I will quote from an old paper by Bronfenbrenner [3, pp. 159--160], because it's handy and it makes the point:
The importance of degree of environmental variation in influencing the correlation between identical twins reared apart, and hence the estimate of heritability based on this statistic, is revealed by the following examples.
a. Among 35 pairs of separated twins for whom information was available about the community in which they lived, the correlation in Binet IQ for those raised in the same town was .83; for those brought up in different towns, the figure was .67.
b. In another sample of 38 separated twins, tested with a combination of verbal and non-verbal intelligence scales, the correlation for those attending the same school in the same town was .87; for those attending schools in different towns, the coefficient was .66. In the same sample, separated twins raised by relatives showed a correlation of .82; for those brought up by unrelated persons, the coefficient was .63.
c. When the communities in the preceding sample were classified as similar vs. dissimilar on the basis of size and economic base (e.g. mining vs. agricultural), the correlation for separated twins living in similar communities was .86; for those residing in dissimilar localities the coefficient was .26.
d. In the Newman, Holzinger, and Freeman study, ratings are reported of the degree of similarity between the environments into which the twins were separated. When these ratings were divided at the median, the twins reared in the more similar environments showed a correlation of .91 between their IQ's; for those brought up in less similar environments, the coefficient was .42.
(Let us pause a moment to contemplate the sense in which identical twins, growing up in the same town and attending the same school, are "raised apart".)
By the time you're done partitioning the twin pairs into classes in this way, n is pretty small, and the sampling errors in the correlations are going to be large, so I wouldn't give the 0.86 vs 0.26 contrast a lot of credence, but the fact that the differences are all in the same direction, and get pretty big, ought to be hard to ignore. (And it's worth remembering that n is never very large for identical-twins-raised-apart studies.) The obvious explanation for such results is that the developmentally-relevant environment of twins raised apart, but in similar towns, is much more highly correlated than that of twins raised apart in dissimilar towns. This means that a substantial chunk of the correlation you thought was genetic is actually due to shared environment, which pushes your heritability estimate down. Alternatively, you could abandon the assumption that genetic and environmental contributions are uncorrelated, or the strictly additive nature of the model, by including a very substantial interaction between genes and environment, so that identical genotypes respond very differently to those differences in environment. However you slice it, your estimate of heritability was too high.
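To get a feel for how large those sampling errors are, here is a rough sketch in Python using the standard Fisher z-transform interval for a correlation, which assumes roughly bivariate-normal data. The subgroup size is hypothetical: the quoted passage gives 38 pairs in total for that sample but not how many fell into each class, so n = 19 per class is an assumption.

```python
import math


def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% confidence interval for a correlation (Fisher z-transform)."""
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# Hypothetical subgroup size; the true split of the 38 pairs is not reported.
n_pairs = 19
for r in (0.86, 0.26):
    low, high = fisher_ci(r, n_pairs)
    print(f"r = {r:.2f}, n = {n_pairs}: roughly {low:.2f} to {high:.2f}")
```

With subgroups this small, the interval around the lower correlation stretches from below zero to well above one half, which is why any single contrast deserves limited credence while the consistent direction of all the contrasts is harder to dismiss.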
This is, of course, an old criticism; also a correct one. Kempthorne put it like this:
What, indeed, is the "grip" on environment in the human IQ area? It is no more than "reared together" versus "reared apart", and what does "reared apart" mean? Nothing more than at some age two related individuals, e.g., identical twins or full-sibs were separated by adoption, and then placed in homes that could be related familially and/or, of similar economic and social nature. I can only comment: Really, how naive can one be? The Burt study was characterized in the literature as the "only experiment". Some experiment!
As soon as one turns to any behavioral measurements, the need to incorporate intra-uterine, family and community environment is obvious. I have the view that the "hereditarians" are utterly naive. It is obvious that parental IQ influences offspring environment. It is obvious that there is cultural transmission. To ignore the existence of this is merely stupid. I see no point in mincing words. If non-scientists sometimes have scorn for some supposed "scientific work", they should not be faulted.
To this naivete in model formulation, must be added a statistical naivete. Any statistical test has, within its conceptual underpinning, a sensitivity or power function. It needs essentially no deep thought to realize that the sensitivity of statistical tests for maternal effects, for genotype-environment interaction, etc., etc., is so low that to say, as has been said often, that such and such a model modification has been examined and found unnecessary is utter naivete. [p. 18, his italics, my links]
Devlin et al. were only able to detect maternal ("intra-uterine") effects because their meta-analysis had a very large sample, which made their tests more sensitive to the presence of such effects, and because it is fairly easy to tell when people had the same birth-mother.
When people have included interactions between genetic and environmental variables, and done so in an even half-way decent manner, the results are quite dramatic, and make it impossible to talk about a value of heritability at all. For instance, allowing a (sucky) measure of socioeconomic status to interact with genetic and environmental variables, Turkheimer and co. found a massive dependence of the broad-sense heritability of IQ on status, running from nearly zero at low status to about 0.8 at high status. (Since the data, very unusually for this sort of study, actually included a lot of poor people, the former number should be rather more precise than the latter.) Of course, this is another one of the uses of regression which is merely descriptive, and drawing any causal inferences from it is very risky indeed (a point I'll expand on in the sequel; for now, you can read Clark Glymour explaining why this is Bad and Wrong). But if you're willing to believe, say, Edward Prescott's statistics, let alone Arthur Jensen's, you have no right to complain about Turkheimer et al. Anyone who tells you that the heritability of IQ has any particular value needs to explain away findings like this.
Let us consider trying to estimate the heritability of something which is transmitted culturally. Its real heritability is zero, since there is no genetic component to its variance, but the question is rather what the estimated heritability would be, employing the usual methods. In particular, suppose I came up with some quantitative measures of, say, accent. (Talk to some phoneticians if you think that wouldn't be possible.) Now I attempt to estimate their heritability. I'd find that identical twins reared apart had more similarity in accent than random members of the general population. They were born at the same time (and population-wide accents drift over time: see, again, Labov), in the same place (and so they will tend to grow up near each other, even though raised apart). Moreover, the kind of families receiving children being raised apart are not random samples of the general population. (To pick an example, in one of the Minnesota studies on IQ, race, and adoption, the average IQ of the adoptive fathers, all of them white, was over 120, while the state average for white males was 105. For a more formal and detailed version of this critique, see, e.g., Mike Stoolmiller.) The more geographic and social (and, to a lesser extent, temporal) variability I include in my sample, the more such twins will stand out as more similar in accent than the general population. Of course children born and raised in the same family are going to be even more similar, through obvious non-genetic processes. Applying the usual direct estimate based on twins raised apart, or even the kind of analysis done by Devlin et al., I will estimate a non-zero heritability, which is entirely an artifact of neglecting cultural transmission. Another direct estimate of the narrow heritability is proportional to the correlation between separated parents and children. 
(It's twice that correlation if there's no assortative mating, shading down to just equal to the correlation when mating is perfectly assortative.) This, too, will be positive, if only owing to geographic clustering. The typical indirect estimate, based on comparing the correlation of identical and fraternal twins raised together, is the only one which should work, if they experience environments of equal variance. If identical twins experience more similar environmental influences on accent than fraternal twins, then even it will conclude that accent is, in fact, heritable.
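The argument about twins raised apart can be sketched as a small simulation in Python. Everything in it is an assumption made for illustration: the trait is purely environmental, the rearing environments of separated twins are given a correlation of 0.4, and the function names are invented. The "direct" estimate is simply the correlation between the twins' trait values.

```python
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def separated_twin_traits(n_pairs, env_corr, rng):
    """Simulate a purely cultural trait in twins reared in correlated environments."""
    w_common, w_own = math.sqrt(env_corr), math.sqrt(1 - env_corr)
    first, second = [], []
    for _ in range(n_pairs):
        # Influences both twins still share despite separation
        # (birth cohort, region, the kind of family that adopts).
        common = rng.gauss(0, 1)
        # No genetic term anywhere: the true heritability is zero by construction.
        first.append(w_common * common + w_own * rng.gauss(0, 1))
        second.append(w_common * common + w_own * rng.gauss(0, 1))
    return first, second


rng = random.Random(1)
ENV_CORR = 0.4  # hypothetical correlation between the twins' rearing environments
twin_a, twin_b = separated_twin_traits(5000, ENV_CORR, rng)
direct_estimate = pearson(twin_a, twin_b)
print(f"'direct' heritability estimate: {direct_estimate:.2f} (true value: 0)")
```

The estimate recovers the environmental correlation, not any genetic share, so whatever similarity the separated environments retain is booked as heritability.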
(In fact, the usual methods should lead us to conclude that latitude and longitude are heritable. [To be concrete, imagine measuring this by averaging 1000 GPS readings taken at random times over the course of a week, commencing when the subjects are exactly 17 years and 17 days old.] Take identical twins who are adopted out right after birth — what should be ideal cases. Being born in the same place, at the same time, and placed in new households by the same mechanism, they will wind up physically closer than randomly-selected infants born at the same time. [Remember, on top of the tendency to wind up near where they were born, the kind of households which adopt are not socially representative and so will tend to cluster in space, a phenomenon demographers refer to as "neighborhoods".] Their latitude and longitude will thus be positively correlated, and since this correlation is the "direct" estimate of broad-sense heritability, that will be non-zero. I'd even be willing to bet, modestly, that identical twins adopted out will be more correlated in location than fraternal twins adopted out, if only because some of the fraternal twins will be of different sexes and some adoption processes will tend to treat them differently. Since identical twins raised together tend to be emotionally and physically closer than fraternal twins raised together, even the indirect method will tell us that location is heritable. If they applied the same principles here as they do in formally-similar cases, the hereditarian psychologists would recommend inflating the sample correlations before using them to estimate heritability, to compensate for the restricted geographic range of such studies, making the error worse.)
One of the sound tenets of a lot of conservative social and political thought is an insistence on the importance of tradition and tacit knowledge, its transmission through families and communities, and the difficulty of making up for the absence of early immersion in a tradition with later explicit instruction. The fact is, however, that if I studied anything which is transmitted via tradition in the way people estimate IQ's heritability, I'd conclude that it had a genetic component. If, in particular, there are traditions which affect IQ, the estimated genetic component of the variance is going to actually include at least some of the variance in traditions.
It's not hard to see how to do a proper study of heritability for accent. What you would need to do is take separated twins, adopted into other families, and see if they were more or less similar to each other than randomly-chosen children adopted into the same family, after matching on covariates (age at adoption, time with the family, time spent in one place, etc.). Similarly, the requirements for a really sound twin study of IQ have been known at least since Flynn laid them out in his 1980 book on Race, IQ, and Jensen: twins from a representative sample of the genetic variation in the population would need to be adopted into families with a representative sample of the environmental variation in the population, with no gene-environment correlations introduced by the adoption process itself. Even so, we would need an indirect method, or lots of standardized uterine replicators, to remove maternal effects, and the problem of post-adoption processes creating such correlations would remain. I am unable to discover anyone, in the last twenty-seven years, actually doing such a study, though I'd be interested to learn of one which even comes close.
This leads us to the topic of gene-environment correlation, as opposed to interaction. It is very easy to think of ways in which genes and environments can become correlated during the life cycle. A standard example is the child who shows some early proclivity for music, perhaps genetically primed, which results in music lessons, more exposure to music, praise and interest in their efforts, support for them practicing, etc., the flip side being the child who seems dull early on, and so gets written off. The well-known paper by Dickens and Flynn shows how far you can push this idea, especially if you include some social amplification. But making this the main scenario for such correlations is indulging in that optimistic individualism which is one of our more admirable national traits (really!), but here gets in the way of thinking clearly.
All hitherto-existing civilized societies are divided, more or less sharply, into classes or groups, which tend to reproduce themselves through non-genetic means. (Bowles and Gintis give an excellent review of the contemporary situation, though even they use too high an estimate of the heritability of IQ.) The members of these groups differ systematically in their access to material and cultural resources. These are a large part of what is meant by "environment" in the biometric models. (Throwing "socioeconomic status" into a regression does not make these variables go away.) Social classes also have a tendency towards endogamy, of variable strength. The result will be a tendency to create genetic differences between groups, and so correlation between genes and environment. If we look at traits which respond favorably to material resources (like height) or cultural resources (like cognitive skills), and ignore this correlation, we will get a systematically upward-biased estimate of heritability, because genetic similarity will predict the value of the trait. Moreover, selection on the trait will alter the frequency of the associated genes. This is true even if the genetic differences have no causal influence on the trait at all, i.e., if there would be no systematic differences were the distribution of environments equalized (hopefully, up).
To the best of my knowledge, we currently have no idea of the magnitude of this effect, because:
I want to expand a bit on the importance of cultural resources, and how it plays in here. Consider a society where (for the sake of argument) the middle classes have dramatically expanded within living memory, and where our sample is disproportionately biased towards those classes. (Most of the family and adoption studies used for IQ are, in fact, biased in that way.) Many middle class families will then have parents one or two generations removed from poverty; others will have been in the middle classes for much longer. While the two groups may have comparable incomes and even formal educational credentials, any conservative thinker (Oakeshott, Barzun, Loury...) will be happy to explain to you that the latter will be more likely to have traditions which are fitted to their station in life, and adaptive in society more generally. (The difference will attenuate over time.) There will also, because of previous endogamy, tend to be genetic differences between the two sub-groups. Even if those genetic differences are causally irrelevant, genes and environment will be correlated, and genetic differences will predict trait values.
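The claim that causally irrelevant genes can still predict trait values is easy to check in a toy simulation. What follows is only a sketch under invented assumptions: two endogamous classes, one causally inert locus whose allele frequency differs between them (0.2 versus 0.8), and a trait set entirely by class environment plus noise. None of these numbers come from any study.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation, written out to stay dependency-free."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def simulate(n_per_class=5000, seed=0):
    """Return one (genotypes, traits) pair per class."""
    rng = random.Random(seed)
    # (allele frequency at the inert locus, mean environmental contribution);
    # both columns are hypothetical.
    classes = [(0.2, 95.0), (0.8, 105.0)]
    data = []
    for freq, env_mean in classes:
        genotypes, traits = [], []
        for _ in range(n_per_class):
            # Count of the marker allele (0, 1 or 2) at the inert locus.
            allele_count = int(rng.random() < freq) + int(rng.random() < freq)
            genotypes.append(allele_count)
            # The trait depends on class environment only; genotype never enters.
            traits.append(rng.gauss(env_mean, 10.0))
        data.append((genotypes, traits))
    return data

data = simulate()
pooled_r = pearson(data[0][0] + data[1][0], data[0][1] + data[1][1])
within_r = [pearson(g, t) for g, t in data]
print(f"pooled genotype-trait correlation: {pooled_r:.2f}")
print("within-class correlations:", [round(r, 2) for r in within_r])
```

Pooled across classes, the inert locus correlates clearly with the trait; within either class the correlation is indistinguishable from zero, which is what "correlate, not cause" looks like in data.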
(An aside, because I'm a glutton for punishment: Ashkenazi Jews in the United States are represented in high-average-IQ professions far out of proportion to their share in the population, and are also more educated than the general population. It would be astonishing if they did not have above-average IQs. They are systematically different, genetically, from the rest of the population. Some genetic diseases, e.g., Tay-Sachs, are more common among them, while others, e.g., cystinosis, are rarer. More trivially, and so more suitably for my purposes, they are much more likely to be able to wear their hair in what one of my Israeli colleagues calls his "Isro". It follows that genetic variation at Isro-relevant loci has some ability to predict IQ, at least in the US. [It would not be so predictive in the Sitka District, another sign that we are dealing with correlates and not causes.] Ashkenazi Jews are also, by definition, systematically different culturally from the rest of the US population, so the environment in which children grow up differs systematically too. It's been argued, with some plausibility, that many of those cultural traditions are highly adapted to modern life, while not, of course, developed for that purpose — "pre-adaptations" or "exaptations", in the jargon. Now postulate — I hope this will not be a stretch — some scientists who are very sophisticated about molecular biology, but very naive about tradition, and for that matter about statistical methods for structured populations. It is only too easy to imagine them establishing quite precise degrees of genetic commonality by tracking the Isro loci or ones linked to them, but neglecting gene-culture covariation, and concluding that those loci are pleiotropic, affecting both hair curliness and IQ. Similar remarks apply to, say, the association, at least in the US, between the Scots-Irish and violence in defense of personal honor. 
Designing a study which could handle this kind of covariation is left to the reader.)
I could go on with other reasons why, even if the genetic variance-component of IQ was always zero (which, for the record, I doubt it is), we should expect a higher correlation in the IQ of twins raised in the same family than of other siblings raised together. For starters, any fluctuations in the family's resources, quality of available schools, etc., are going to hit the twins at the same point in their development, which will not be the case with other siblings. But this grows (or has grown) tedious.
Does a trait's heritability tell us anything about its malleability, about how easy it is to change the trait with environmental manipulations? The answer is "no, of course not", even assuming (1) the basic biometric model holds, and (2) we are talking about true heritability and not biased-to-nonsensical estimated heritabilities.
It's banging on an often-sounded drum, but it's worth doing because it makes the point clearly: height is heritable, and estimates for the population of developed countries put the heritability around 0.8. Moreover, tall people tend to be at something of a reproductive advantage. Applying the standard formulas for response to selection, we straightforwardly predict that average height should increase. If we select a population without a lot of immigration or emigration to mess this up, say 20th century Norway, we find that that's true: the average height of Norwegian men increased by about 10 centimeters over the century. But that's much more than selection can account for. Doing things by discrete generations, rather than in continuous time, height grew by 2.5 centimeters per generation. (The conclusion is not substantially altered by going to continuous time.) If the heritability of height is 0.8, for this change to be due entirely to selection, the average Norwegian parent must have been 3 centimeters taller than the average Norwegian. This, needless to say, was not how it happened; the change was almost entirely environmental. The moral is that highly heritable traits with an indubitable genetic basis can be highly responsive to changes in environment (such as nutrition, disease, environmental influences on hormone levels, etc.).
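The height arithmetic can be spelled out. This is a sketch assuming the "standard formulas for response to selection" reduce to the breeder's equation, R = h^2 * S, and that a generation is twenty-five years, so the century holds four of them; the function name is mine.

```python
def required_selection_differential(response_per_generation, heritability):
    """Breeder's equation R = h^2 * S, solved for S."""
    if not 0 < heritability <= 1:
        raise ValueError("heritability must lie in (0, 1]")
    return response_per_generation / heritability

total_gain_cm = 10.0       # rise in mean height over the century (from the text)
generation_years = 25.0    # assumed generation length
generations = 100.0 / generation_years
gain_per_generation = total_gain_cm / generations

s_height = required_selection_differential(gain_per_generation, 0.8)
print(f"gain per generation: {gain_per_generation} cm")
print(f"parents would need to average {s_height:.3f} cm above the population mean")
```

The required differential comes out just over 3 centimeters in every generation, which is the figure used above.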
Conversely, the very low heritability of eye number does not tell us that it is easy to increase how many eyes someone has by exercise, education and training, manipulating diet, manipulating ambient light, trepanation, etc.
So, does this apply to IQ? Well, if we couldn't find any environmental interventions which affected IQ, that would indeed be strange and suspicious. But in fact it's really not hard. Winship and Korenman (pp. 215--234 of Intelligence, Genes, and Success; large PDF reprint) re-analyzed the National Longitudinal Survey of Youth data used by Herrnstein and Murray, but did it without their technical blunders. (I omit a blow-by-blow.) They found that the impact of each extra year of schooling beyond 8th grade is somewhere between 2 and 4 IQ points, depending on exactly how the model is specified, with most specifications giving estimates of 2.5 to 3 points per year. (This is in line with other studies, including some that Herrnstein and Murray cite and claim, contrary to fact, show negligible impact.) The difference in IQ between someone who drops out of school in the 8th grade and someone who finishes college, all else being equal, would thus be somewhere between 20 and 24 points. Now, you might object that maybe this is because a higher IQ makes you stay in school longer, but Winship and Korenman use two measures of IQ, at different ages, and control for the direct effect of early IQ on later IQ, the direct effect of early IQ on years of education, and sundry other covariates. This study isn't perfect, methodologically, because regression generally sucks as a tool for causal inference, and here in particular there could be all kinds of unmeasured influences, hiding in their structural equations, confounding their results. But if you start discounting studies on these grounds — which, let's be honest, you should — you soon find that you can not only recycle The Bell Curve, but also safely ignore the bulk of what's published in Intelligence, The American Economic Review, etc. If you are not prepared to do that, it's hard to see how you can object to Winship and Korenman's methods.
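For the record, the 20-to-24-point figure above is just the per-year estimate multiplied out. A small sketch, assuming eight extra years of schooling (four of high school plus four of college) separate the 8th-grade dropout from the college graduate:

```python
EXTRA_YEARS = 8  # assumed: grades 9-12 plus four years of college

def iq_gap(points_per_year, extra_years=EXTRA_YEARS):
    """Implied IQ difference if each extra year adds a fixed number of points."""
    return points_per_year * extra_years

typical_range = (iq_gap(2.5), iq_gap(3.0))  # most model specifications
widest_range = (iq_gap(2.0), iq_gap(4.0))   # full range across specifications

print("typical specifications:", typical_range)
print("all specifications:", widest_range)
```

The quoted range corresponds to the typical specifications; taking the full 2-to-4-point range instead would widen it considerably.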
There are, however, other studies of education's impact on IQ where it's hard to see how endogeneity could creep in. (I draw these from Wahlsten, pp. 71--88 of Intelligence, Genes, and Success, because it's to hand, but it would be easy to multiply examples.) A cute one comes from Alberta, where there used to be an arbitrary birth-day cut-off for starting schooling; children born just before it thus start school a year before those born just a day later. The mean difference in IQ between such children is significant, positive, and about four points in favor of the early starters. (The same thing happens with many sports, where the discrepancies grow with age, perhaps due to a positive feedback loop from practice to success to motivation to practice.) A smaller effect, but broadly applicable, is that children's test scores are higher at the end of one school year than at the beginning of the next, after which they recover. There are also randomized studies of interventions with "at risk" families, e.g. ones with unusually low birth-weight children. Depending on the study, the treated groups had IQs 10 to 15 points above the controls. Because of the random assignment, not only is there no problem of endogeneity, it's also idle to worry about placebo effects — it would be fantastic if a placebo raised IQ by 10 points. Another nice example (not from Wahlsten) comes from Heber's work on Rehabilitation of Families at Risk for Mental Retardation. Rather than summarizing it in my own words, I'll quote someone else's summary (though no doubt I'll be told he understood neither genetics nor experimental methods):
It describes an experiment on ghetto children whose mothers had IQs of below 70. Some of these children received special care and training, while others were a control group. Four years after the training period the IQs of the former averaged 127 and those of the latter 90, a spectacular difference of 37 points. The fact that the control children had a 20-point advantage over their mothers is not unexpected [because of regression toward the mean]. [4, pp. 14--15]
At this point, the ritual is for people to begin saying things like "there's nothing you can do if the environment is already decent", "the changes didn't last long after the program" (which would equally show that exercise can't really change physical fitness; see below), or to raise irrelevancies. (My favorite, among the last, is to point to adoption studies showing that adoptees' IQs are more correlated with their biological than their adoptive parents' IQs, conveniently side-stepping which set of parents had scores closer to the adoptees'.) But now "we've established what you are, we're just haggling over your price".
On top of all this, there is, to repeat what I said in the dialogue, the Flynn Effect. (See the links there for references.) The population average IQ rose monotonically, and pretty steadily, over the 20th century in every country for which we can find suitable records, including ones where we can definitely rule out immigration or emigration as significant contributory causes. (If it really is global, and I think we don't know enough yet to say either way, then the idea that it could be due to migration is — peculiar.) The magnitude of the gains is, as these things go, huge: two to three IQ points per decade. As I said in the earlier post, this puts the average 1900 IQ at 70 to 80 in 2000 terms. Let's check how intense natural selection would have to be to explain this. Over a twenty-five-year generation, we're looking at an IQ change of 5 to 7.5 IQ points. Sticking with the usual biometric model, and taking the best estimate of heritability within that model, namely 0.34, we'd have to see a reproductive differential of between 14 and 22 points, i.e., the average parent would have to have an IQ that much higher than the average person. (I am neglecting correcting for assortative mating and for continuous time, which don't change things much.) Since 15 IQ points is one standard deviation, this would imply a huge bias in reproductive rates towards those with higher IQs. Needless to say, nothing of the kind is observed in any of the countries where the Flynn Effect has been documented.
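The same check for the Flynn Effect, as a sketch: it again assumes the "usual biometric model" amounts to the breeder's equation, R = h^2 * S, and takes the figures from the paragraph above (two to three points per decade, a twenty-five-year generation, heritability 0.34). The variable names are mine.

```python
HERITABILITY = 0.34           # estimate quoted in the text
DECADES_PER_GENERATION = 2.5  # a twenty-five-year generation
IQ_SD = 15.0                  # one standard deviation, in IQ points

def required_differential(gain_per_generation, h2):
    """Breeder's equation R = h^2 * S, solved for S."""
    return gain_per_generation / h2

results = []
for points_per_decade in (2.0, 3.0):
    gain = points_per_decade * DECADES_PER_GENERATION
    differential = required_differential(gain, HERITABILITY)
    results.append((gain, differential, differential / IQ_SD))
    print(f"{points_per_decade} pts/decade -> gain {gain} per generation, "
          f"parents {differential:.1f} points ({differential / IQ_SD:.2f} SD) above the mean")
```

The differentials come out at roughly 14.7 and 22.1 points, i.e. parents averaging about one to one and a half standard deviations above the population mean in every generation.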
(If you follow Herrnstein and Murray's account of the data they used, which is always hazardous, you should conclude that the mean IQ of parents is about 1 point below that of the general population; the result of this adaptive advantage of stupidity would be to lower the mean IQ by about a third of a point per generation. For comparison, the black-white IQ gap is about 15 points, and even, or rather especially, such raving hereditarians will tell you it is at least 20 to 50 percent "environmental". Following such people in placing a completely unwarranted causal interpretation on the models, this would mean that bringing the environment of black Americans up to the present white standard would raise the former's IQ by 3 to 7 points, and so the national average by somewhere in the range of one-third to two-thirds of a point. It is hard to understand why anyone who values average IQ so much they care about such small differences would worry more about dysgenic pressure than racial injustice.)
Nobody doubts that athletic abilities have genetic causes. Bodily shape is strongly influenced genetically, as, undoubtedly, are all manner of things like lung capacity, the properties of muscle fibers, reflexes, visual acuity, etc. I can say this with complete confidence because these traits have clearly evolved, and so must be under substantial genetic control. (Even so, careful attempts to find genetic bases for even very striking group differences in high-level performance fail to identify any. [Thanks to Leo Kontorovich for that link.]) On the other hand, it is plainly insane to suppose that athletic performance is not very largely learned, and a result of interaction with the environment.
To be really concrete, think about distance running. With practice, just about everyone can increase the distance they can run, the speed they can sustain, etc. Presumably there are physiological limits on what any one body can attain, even with an ideal training regimen, but performance (which is all one sees on any test) is malleable. Practice can take someone from being winded after sprinting two blocks to being able to run a marathon. Very basic physiological parameters, like the rate of oxygen uptake, demonstrably respond to training, and over a matter of weeks at that.
Of course, the flip side of this is that not practicing reduces ability, and sufficiently drastic lack of practice takes someone from being able to run marathons to being winded after two blocks. It is not enough to have practiced at some point in the past. There needs to be continuing practice, which means continuing opportunity and motivation, as well as sheer physical capacity. If we take a bunch of kids from an environment where physical exercise is discouraged, and make them run laps every day for a year, at the end of that they will (on average) be better at running than their peers. (They may have acquired other issues, but they will be better at running.) If we now return them to their environment, with a pat on the back and perhaps a souvenir pair of sneakers, is there anyone who doubts that in, say, five years most of them will be pretty much as sluggish as their peers? Is there anyone who would look at the result of such an experiment and conclude that exercise cannot, in fact, alter the ability to run?
Of course, not every kind of physical performance is as malleable as is distance running. No amount of training is ever going to let anyone hold their breath under water for an hour. Similarly, I do not expect any sort of learning will be able to alter some fairly basic aspects of the mind, e.g., to force the capacity of short-term working memory up to twenty chunks (rather than "the magical number seven plus or minus two"). We have evolved certain kinds of physical adaptability — feel free to speculate on why running might've been more useful than breath-holding, but not so useful as to be automatic — and similarly we have evolved some kinds of mental adaptability, but not others.
Let me sum up.
I realize I'm inviting the suspicion that I'm protesting too much. If I really think heritability is irrelevant to malleability, why shouldn't I be happy to accept, say, Jensen's old favored value of 0.8 for IQ's broad-sense heritability, which puts it in the same range as highly-malleable height? Why go on at such length about an irrelevancy? I can only offer two replies. One is that I am trying to meet people half way: even if I can't persuade you that heritability has nothing to do with malleability, I hope to persuade you that the current estimates are not reliable, that the notion of a value for IQ's heritability is silly, and that we do, indeed, know squat about that question. The other and more basic reply, however, is that these people are wrong in ways I find intensely irritating.
So: Do I really believe that the heritability of IQ is zero? Well, I hope by this point I've persuaded you that's not a well-posed question. What I hope you really want to ask is something like: Do I think there are currently any genetic variations which, holding environment fixed to within some reasonable norms for prosperous, democratic, industrial or post-industrial societies, would tend to lead to differences in IQ? There my answer is "yes, of course". I've mentioned phenylketonuria and hypothyroidism already, and many other in-born errors of metabolism also lead to cognitive deficits, including lower IQ, at least in certain environments. More interestingly, conditions like Williams Syndrome, Down Syndrome, etc., are genetically caused, and lead to reasonably predictable patterns of cognitive deficits, affecting different abilities in different ways. In many of these cases, it seems very likely (but is not yet established) that these variants cause problems with the signaling pathways which set how gene expression responds to environmental cues. Manipulating those signaling pathways during the right time windows would change what kind of mind the organism has later. The fact that different genetic disorders lead to different patterns of cognitive deficits, rather than just generally making people duller all around, suggests ways of disentangling which genes are relevant to which abilities through which molecular mechanisms. (Cf.) At a popular level, I've still not run across a better description of the way the regulation of gene expression couples genotypes and environments during mental development than Gary Marcus's writings, but if you want details there is a whole rapidly-growing field of molecular developmental neurobiology (as I'm not-infrequently reminded).
I suspect this answer will still not satisfy some people, who really want to know about differences between people who do not have significant developmental disorders. Here, my honest answer would be that I presently have no evidence one way or the other. If you put a gun to my head and asked me to guess, and I couldn't tell what answer you wanted to hear, I'd say that my suspicion is that there are, mostly on the strength of analogy to other areas of biology where we know much more. I would then — cautiously, because you have a gun to my head — suggest that you read, say, Dobzhansky on the distinction between "human equality" and "genetic identity", and ask why it is so important to you that IQ be heritable and unchangeable.
[1]: "Limits to Intensive Production in Animals", British Agricultural Bulletin 4 (1951): 217--218, reprinted on pp. 219--223 of volume 5 of his Collected Papers (ed. J. H. Bennett, Adelaide: University of Adelaide Press, 1974). My quotations are all from p. 221 of the reprint.
[2]: Christopher Jencks et al., Inequality: A reassessment of the effect of family and schooling in America (New York: Basic Books, 1972). I have not gone back over Jencks's calculations to see if they are sound, so this may indeed just be coincidence.
[3]: Urie Bronfenbrenner, "Nature with Nurture: A Reinterpretation of the Evidence", pp. 153--183 of Ashley Montagu (ed.), Race and IQ (second edition, New York: Oxford University Press, 1999; first edition 1975), ISBN 0-19-510220-7.
[4]: Theodosius Dobzhansky, Genetic Diversity and Human Equality (New York: Basic Books, 1973).
Update, 23 November: Typo correction (thanks to Matt Bonakdarpour).
The Natural Science of the Human Species; Enigmas of Chance; IQ
Posted by crshalizi at September 27, 2007 14:55 | permanent link
If James Wimberley can invoke Timur-i-Lang for a discussion of climate change, I feel free to resurrect his old interlocutor, the great historian and pioneering social scientist ibn Khaldun, in regards to our strategy in Iraq.
Ibn Khaldun's theory of culture and society was appropriately complicated; it was, in fact, a science in the proper Aristotelian mode, starting from certain premises regarded as secured by other sciences, observation, etc., and concerning itself with the formal, material, efficient and final causes of human societies, especially their growth, their decay, and their built-in drives to attain certain ends. (I refer the reader curious about this notion of entelechy to the brilliant two-part exegesis due to G. Clinton.) This is not the place for a full discussion, which I'm not really qualified to give, anyway, but for the present purposes what concerns us is the core of the historical cycle ibn Khaldun thought he had observed. This concerns the inter-relationships between economic life, social solidarity, cultural refinement, and military effectiveness.
The goal of human society, ibn Khaldun thought, was the development of culture and the sciences. For the arts and sciences to become developed and refined, specialists must train and practice for long periods of time, in order to develop the necessary habits to a high pitch. (Ibn Khaldun, a noted poet in his time, nicely described poetry as "a technical habit of the tongue".) For these specialists to be able to make a living while doing so, they must live in cities, and those cities must be flourishing economically, so that there is enough demand for their specialties, and so that there is a surplus to pay for such luxuries as poetry, craftwork and astronomy. This is only possible if there is government and the state --- ibn Khaldun, rather more realistically than Weber, defined the state as that institution whose function is to suppress all such injustices as it does not itself commit. For the state to be able to do this, it must be militarily effective. Military effectiveness, he thought, depends not just on individual courage, but also on the solidarity ('asabiyya) of the soldiers with one another and with their leaders. People raised in conditions of luxury do not (reliably, or for the most part) have such feelings of solidarity, nor do ordinary townsmen and peasants, since their safety and survival are guaranteed for them by the state. It is only barbarians living in mountains and deserts, whose survival is crucially dependent on mutual support against the elements and against other tribes, who will develop the feelings of solidarity on which military power rests.
Fortunately enough, men are naturally ambitious for power, wealth and a life of ease. Thus, the leaders of tribal groups which possess the necessary size and solidarity to have military power will desire to seize control of cities and their states, and become governing powers. The size of the state they will be able to found will depend on their degree of solidarity and the size of their armies. Initially, the rulers will be vigorous, expansive, and uncultured. Gradually their descendants, raised in the luxury and security of cities, will grow more refined and improve their patronage of the arts and sciences; this condition, at the peak of a dynasty, is in ibn Khaldun's view the natural end (telos) of human society. Everything that grows must decay, however, and for ibn Khaldun this decay takes the conjoined form of the dynasty losing the feelings of tribal solidarity which were the basis for its power, owing to the dynasts' new, softer mode of life, and at the same time hastening their economic decline through corruption and excessive taxation. This sets the stage for a new dynasty to emerge from the hills or deserts.
Why is the United States government unable to impose its will on Iraq? It is because it has too few soldiers, too far from home, among too alien a population. (If our army of occupation was a million soldiers strong, the fact that almost none of them can make themselves understood would be much less of a problem.) Some have suggested that the problem is insufficient will or solidarity on the American side, but this seems implausible; assuming we actually want there to be an Iraqi population to govern, simply killing more of them is unlikely to work (never mind the moral issues). What ibn Khaldun would advise, I think, is to find an Iraqi group which is numerous, has the solidarity needed to dominate the rest of Iraqi society, and can be brought into alliance with us; and he would advise us to look at either the deserts or the mountains. I submit that there is exactly one group which fulfills the necessary conditions: the Kurds.
They comprise a reasonable fraction of the Iraqi population; their effective 'asabiyya is demonstrated by the fact their militias, a.k.a. peshmerga, already control Iraqi Kurdistan militarily; and they have, notwithstanding the unpleasantness of the 1970s and 1980s, a by-now long-standing alliance with us. Our strategy, then, should be to offer them our support in a bid for military and political domination over the rest of Iraq — with the understanding that they are to leave Turkey strictly alone. That is, they not only get Kirkuk, they get Baghdad and Basra, and not just the north's oil but all of Iraq's oil. Of course, this will be horribly undemocratic and bloody, and it will make anyone even remotely sympathetic with Arab nationalism hate us even more, but I suspect many in Washington would view those attributes as features rather than bugs in any policy.
Update (25 September): No, I am not actually advocating this. (Note the "modest proposal" category below.) No, I do not expect to see this happen. Yes, I agree that if this did happen it would be very bad for just about everyone in Iraq. Yes, I am seriously saying that Iraq has long since run out of merely bad options. Yes, there are more direct ways to make that last point.
Manual trackback: Egregious Moderation; MetaFilter; Paperpools (I am not worthy!)
Writing for Antiquity; The Continuing Crisis; Modest Proposals
Posted by crshalizi at September 15, 2007 17:54 | permanent link
Attention conservation notice: 1200-odd words on two old papers on the foundations of economics, one of them by two of my fellow SFI external faculty members. Only of interest if you are the kind of person who'd care about the summer's Methodenstreit: The Extended Blogospheric Remix, but posted too late to contribute to that discussion, which in fact it does not address. Mostly written in November 2005 and then abandoned to the gnawing criticism of the mice (that is, to finish a grant proposal).
Who knew that Reinhard Selten was the author of a dialogue on the foundations of economics?
The participants are a Bayesian, an economist, an experimental psychologist, an "adaptationist" (i.e., an evolutionary game theorist), a population geneticist, an ethologist, and a "Chairman" standing in for Selten. The topic, rather than the nature of love or justice, or even the ideal city, is, in the Chairman's words, "What do we know about the structure of human economic behavior?" The ultimate answer is "squat all, but it's not Bayesian; we need lots of experiments" (Prof. Dr. Dr. h.c. mult. Selten phrases this somewhat more elegantly).
This is a lot more empirical than I, for one, expected from the inventor of the idea of sub-game perfect equilibria, but it turns out, from his autobiographical sketch, that Selten was doing experimental economics back in the early 1960s, before anyone recognized there was such a field...
For that matter, who knew that Reinhard Selten maintains a page about his cats?
On a not-unrelated note, Bowles and Gintis are also looking at the foundations, and bringing to mind a pair of contractors who are telling the homeowner that, sadly, sadly, everything will have to be replaced.
Walras helped inaugurate the modern form of mathematical economics in many ways, but one of his contributions which has really stuck is a picture of the market economy as a kind of centralized system of simultaneous auctions. In this picture, everyone starts with some initial allocation of goods and other resources, and a utility function which tells them how satisfied they would be with some other basket of goods. The auctioneer calls out a vector of prices, and everyone indicates to the auctioneer how much of each good they would be willing to buy or sell at those prices. The auctioneer then adjusts the prices to try to make supply equal demand; when they do, and the market clears, trades take place at those prices. Finis. Now, this is clearly not at all how a real market system works --- though I have been told that the Paris stock exchange did use a system along these lines back in the day --- but that particular scheme is not the important part of Walrasian economics. The bits that really matter are the assumptions that (1) everyone has a well-behaved, fixed, completely self-regarding utility function, and (2) there are markets in everything, and participating in these markets (finding counter-parties, discovering prices, etc.) is automatic and costless.
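For readers who like to see the auctioneer at work, here is a minimal sketch of that price-adjustment loop. Everything specific in it is my assumption for illustration and not anything from Walras or from Bowles and Gintis: the agents have Cobb-Douglas utilities (so each spends a fixed share of its wealth on each good), the auctioneer nudges each price in proportion to excess demand, and the names `excess_demand`, `tatonnement`, `step` and `tol` are made up.

```python
def excess_demand(prices, endowments, weights):
    """Total demand minus total endowment for each good, at the called-out prices."""
    n_goods = len(prices)
    z = [0.0] * n_goods
    for endowment, w in zip(endowments, weights):
        wealth = sum(p * e for p, e in zip(prices, endowment))
        for j in range(n_goods):
            # Assumption: a Cobb-Douglas agent spends the share w[j] of its wealth on good j.
            z[j] += w[j] * wealth / prices[j] - endowment[j]
    return z


def tatonnement(endowments, weights, step=0.1, tol=1e-9, max_rounds=100_000):
    """The auctioneer's loop: call prices, collect excess demands, adjust, repeat."""
    n_goods = len(endowments[0])
    prices = [1.0 / n_goods] * n_goods
    for _ in range(max_rounds):
        z = excess_demand(prices, endowments, weights)
        if max(abs(v) for v in z) < tol:
            break  # supply equals demand (to tolerance); only now would trades happen
        # Raise the price of goods in excess demand, lower it for goods in excess supply.
        prices = [max(p + step * v, 1e-12) for p, v in zip(prices, z)]
        total = sum(prices)
        prices = [p / total for p in prices]  # only relative prices matter
    return prices


# Two agents, two goods: each starts with one unit of a different good.
endowments = [(1.0, 0.0), (0.0, 1.0)]
weights = [(0.3, 0.7), (0.6, 0.4)]
prices = tatonnement(endowments, weights)
print(prices)
```

For these particular numbers the market-clearing price ratio can be worked out by hand (good 2 should cost 7/6 as much as good 1), which gives a check on the loop. Note how much the sketch takes for granted: fixed self-regarding utilities, and costless participation in a market for every good, which are exactly assumptions (1) and (2).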
What Bowles and Gintis say makes a lot of sense to me:
[E]conomic analysis must become more social and psychological in its treatment of the human actor, more institutional in its description of the exchange process, yet no less analytical in its model-building and no less dedicated to the construction of general equilibrium models.
Though I don't completely agree with the very last bit. The point of a general equilibrium model is that it depicts the economy as an interconnected system, and lets us model the global effects of locally-applied changes. They're quite right that we should want to keep doing that; but they don't seem to have a strong argument that the only way to do this is through an equilibrium model.
[W]e must judge policies and institutions not by how closely they approximate the assumptions of the fundamental theorems of welfare economics, but rather according to their ability to function effectively in the second-best world of ineradicable state and market failures.
They do insist, correctly, that, "Contrary to the claims of many of its critics, Walrasian economics never had a policy agenda." I'd go further and say that many forms of heterodox economics --- in particular, the Austrian school --- are much better suited to ideological mystification in the service of capitalism. (But many of them also can't, as my mother would say, think their way out of a wet paper bag, which may limit their effectiveness.)
They have some fun with the absurdity of trying to use the Walrasian framework to explain the differing fates of, say, East Asia and sub-Saharan Africa. To give a concrete example, though not one they mention: South Korea and Ghana had nearly the same per-capita GDP in 1958, but by 1998 South Korea's was about seven times larger. Differences in investment, which is nearly all that the Walrasian framework can appeal to, can account for less than half of this difference; and this includes the somewhat dubious notion of investing in human capital. (See chapter 1 [PDF] of the World Bank's 1998--1999 World Development Report.) This is not unrelated to what they describe as an "equally telling failure":
[Walrasian economics'] surprising inability to understand the shortcomings of the main competitor to capitalism in this century, state ownership and central planning. The basic problem with the Walrasian model in this respect is that it is essentially about allocations and only tangentially about markets — as one of us (Bowles) learned when he noticed that the graduate microeconomics course that he taught at Harvard was easily repackaged as "The Theory of Economic Planning" at the University of Havana in 1969.
This is a point elaborated on at book length in Joseph Stiglitz's Whither Socialism? (a much better guide to recent work in the economics of imperfect information and imperfect competition, especially Stiglitz's work on those subjects, and the light they shed on how not to run a socialist or capitalist economy, than it is to feasible, market-based socialisms), and is quite correct.
Were I less lazy, I'd describe Bowles and Gintis's plans for the new structure, and how it would actually deal with things like markets, competition, and economic growth, but, well, I'm lazy, so go read them.
Manual Trackback: Thoughts on Economics.
Posted by crshalizi at September 15, 2007 17:08 | permanent link
Attention conservation notice: A rant, running to over 6000 words, about the horrors of physicists trying to do economics, by someone who used to be more sympathetic, but has since left physics, and has no credentials in economics. Some may detect an unpleasant sour, musty odor, not entirely due to my having begun writing in June 2006. Includes lots of amateur sociology of science, unsupported by evidence, or, again, any credentials on my part. Even if you care about these fields, wouldn't you rather read science than science criticism?
The occasion for the present rant was being asked by a number of people what I thought about a news piece by Philip Ball, in Nature, on the state of what's come to be called "econophysics". (It lurks behind a pay-wall, without free access, alas.) This in turn was prompted by a paper in Physica A, which sometimes seems like the house organ of the movement, titled "Worrying Trends in Econophysics", by the economists Mauro Gallegati, Steve Keen, Thomas Lux and Paul Ormerod. (Unfortunately, it's not at arxiv.org, but Keen provides a copy, which is more up to date than others on the web.) What follows is some discussion of the paper, shading into the wider theme given by my title.
Back in May of 2005, when there was some dispute at Crooked Timber about whether physicists had contributed anything worthwhile to the study of networks, Henry Farrell asked me to comment. Not being one to miss a chance to procrastinate by pontificating, I did so. Afterwards, my physicist friends told me they appreciated my defense of the honor of statistical mechanics, while my social-scientist friends told me they were glad to see physicists put in their place. Somehow, I doubt that this post will pull off the same trick of serving both duck and rabbit in a single dish.
Ball's piece has a lot of "he said, she said" — or rather, the mathematical sciences being what they are, "he said, he said". (It's also a touch misleading about background issues, though not in very consequential ways [1].) Basically, it reports that some of the economists who had been OK with econophysics are saying "enough, already!", and some of the econophysicists are wringing their hands, while others are charging ahead unperturbed.
The charges made in the "Worrying Trends" paper are reasonably summarized in the abstract:
Econophysics has already made a number of important empirical contributions to our understanding of the social and economic world. These fall mainly into the areas of finance and industrial economics, where in each case there is a large amount of reasonably well-defined data. More recently, econophysics has also begun to tackle other areas of economics where data is much more sparse and much less reliable. In addition, econophysicists have attempted to apply the theoretical approach of statistical physics to try to understand empirical findings.
Our concerns are fourfold. First, a lack of awareness of work that has been done within economics itself. Second, resistance to more rigorous and robust statistical methodology. Third, the belief that universal empirical regularities can be found in many areas of economic activity. Fourth, the theoretical models which are being used to explain empirical phenomena.
The latter point is of particular concern. Essentially, the models are based upon models of statistical physics in which energy is conserved in exchange processes. There are examples in economics where the principle of conservation may be a reasonable approximation to reality, such as primitive hunter-gatherer societies. But in the industrialized capitalist economies, income is most definitely not conserved. The process of production and not exchange is responsible for this. Models which focus purely on exchange and not on production cannot by definition offer a realistic description of the generation of income in the capitalist, industrialized economies.
Before we go on, it's worth noting that the authors of the "Worrying Trends" paper are, in fact, coming from positions which make them about as predisposed to favor econophysics as possible. While they are economists, none of them, so far as I can tell, has much time for orthodox, neo-classical economics. (Ormerod, for instance, has written at least one popular book about how it's junk, and Keen's personal website is debunkingeconomics.com.) In fact, they have been, as it were, fellow travelers, publishing in Physica A, etc. (E.g., Ormerod, whose work I happen to know better than the others, has several papers arguing that the distribution of the duration of recessions follows a power-law, and that this reflects a kind of critical cascade of failures among firms. Paul Krugman made a similar suggestion in his book The Self-Organizing Economy, but doesn't seem to have followed it up.) So, the people lodging these complaints are not just very familiar with econophysics, they have public commitments which should, if anything, bias them in its favor.
All of their charges seem fair to me; I would only say that they do not go far enough. Some of these remarks from the abstract are amplified in the body of the paper. E.g., the first sentence of the paper proper repeats the first sentence of the abstract, but the second sentence adds "Many of these were anticipated in two truly remarkable papers written in 1955 by Simon and in 1963 by Mandelbrot, the latter in a leading economic journal." Nonetheless, they do give a fair-minded summary of what econophysics has actually positively achieved, so I'll quote that, too (omitting references).
The evidence for the fat-tailed distribution of asset price changes noted [by Mandelbrot] has now been established beyond doubt as a truly universal feature of financial markets. A genuinely original and very important contribution of econophysics, using the technique of random matrix theory, has been the discovery that the empirical correlation matrix of price changes of different assets or classes of assets is very poorly determined. [This finding] undermines Markowitz portfolio theory and the capital asset pricing model, still regarded as powerful and valid theories by many economists. There have been very few extensions of the random matrix approach outside financial markets, though there are many potential applications....
Another active area of empirical investigation for econophysics has been industrial structure and its evolution. As with financial markets, large amounts of generally reliable data are available in this area, too. It should be said that some of the econophysics literature is perhaps less original and/or well established than physicists might appreciate. Decisive evidence on the right-skew distribution of firm sizes, for example, has been both available and well known in industrial economics for many years. Plausible candidates in the economics literature to represent the empirical size distribution are the lognormal, the Pareto and the Yule. The main problem is in capturing the coverage of small firms. Recent attempts to do this ... lend support to a power-law distribution... However, this may well be an as yet unexplained outcome of aggregation... A more decisive finding by econophysicists is that the variance of firm growth rates falls as firm size increases, although this too was anticipated in the early 1960s.
A further discovery is that the size-frequency relationship, which describes the pattern of firm extinctions, appears to be very similar to that which describes biological extinctions in the fossil record.
(For reasons to be skeptical of the extinctions stuff, however, see Newman and Palmer.)
From here on, the "Worrying Trends" authors imply, it's all downhill, and I have to agree. When you are dealing with a very, very large amount of very high quality data, you can get away with using sloppy, not-too-reliable means of analysis, because your data will save you unless your analytical procedures are actively malicious. (That goes double if the data are generated by active processes over which you have considerable control, as in real physics experiments.) It also helps if you are merely trying to describe patterns in the data, without framing serious hypotheses about the mechanisms which produce them. Note, also, that none of the genuine accomplishments really draw on any of the distinctive concepts or mathematical structures of theoretical physics, as opposed to the more elementary reaches of probability and stochastic processes. In other words, these are things economists could have done themselves, if they'd read better books on random processes (e.g.) early in their training.
If econophysicists were content to stick to this sort of thing, to the phenomenology of financial time series (etc.), the situation would be almost OK. Almost, because even there it is far too common to see (for example) claims that such-and-such a distribution is a power law, because there's a straightish bit on a log-log plot. I have discussed that particular fallacy ad nauseam elsewhere, and will just note in passing that even here, the refusal to do statistics dooms people to talk nonsense. The real point is that econophysicists are not content with phenomenology based on overwhelmingly large data sets, but want to get at mechanisms, in all kinds of social and economic systems, and there things get really ugly. There is a large literature in econophysics, for instance, on agent-based models of financial markets and (supposedly) other forms of collective behavior, including the formation of public opinion, but not even the authors of "Worrying Trends", who are about maximally sympathetic to the movement, list it as a success. (I will return to the minority game and why I'm so down on it below.) This is a shame, because it's actually in the area of mechanistic models that one might imagine (as I have) that physicists could make a distinctive contribution.
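To make the log-log complaint concrete, here is a small sketch, using only the Python standard library. It draws a sample from a lognormal distribution, which is not a power law, and fits a straight line to the upper tail of its empirical survival function on log-log axes. The function name `loglog_tail_fit`, the choice of the top 10% as "the tail", and the lognormal parameters are all my assumptions for illustration.

```python
import math
import random


def loglog_tail_fit(sample, tail_fraction=0.1):
    """Least-squares line through the upper tail of the empirical CCDF, on log-log axes.

    Returns (slope, r_squared). This is the naive procedure being criticized,
    not a recommended way of estimating anything.
    """
    xs = sorted(sample)
    n = len(xs)
    start = int(n * (1 - tail_fraction))
    # For the i-th smallest point (0-based), the fraction of the sample above it
    # is (n - 1 - i) / n; skip the largest point, where that fraction is zero.
    pts = [(math.log(xs[i]), math.log((n - 1 - i) / n)) for i in range(start, n - 1)]
    mx = sum(x for x, _ in pts) / len(pts)
    my = sum(y for _, y in pts) / len(pts)
    sxx = sum((x - mx) ** 2 for x, _ in pts)
    syy = sum((y - my) ** 2 for _, y in pts)
    sxy = sum((x - mx) * (y - my) for x, y in pts)
    return sxy / sxx, sxy ** 2 / (sxx * syy)


rng = random.Random(1)
lognormal_sample = [rng.lognormvariate(0.0, 2.0) for _ in range(10_000)]
slope, r2 = loglog_tail_fit(lognormal_sample)
print(f"slope = {slope:.2f}, R^2 = {r2:.3f}")
```

The tail of a broad lognormal bends only gently on these axes, so the straight line fits well even though no power law is present. A good-looking fit of this kind says very little about which distribution generated the data; that takes an actual comparison between candidate distributions.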
So then: why oh why don't we have better econophysics?
The first reason which occurs to me, now that I'm a dues-paying, card-carrying statistician, is that almost all econophysicists are theoretical physicists, and moreover statistical physicists. (I'm one myself, or at least was through my Ph.D.) Modern physics began, in the 17th century, by fusing mathematical theorizing and artisanal craft, but one result of our progress has been to impose a specialized division of labor, sharply separating theory and experiment; Fermi was probably the last physicist to be both a great theorist and a great experimenter. (Perhaps this is connected to his invention of Monte Carlo?) This means that it is very rare for a theoretical physicist to analyze actual empirical data (say, measurements of magnetic susceptibility), which is what the experimentalists do. Theorists instead deal with experimental results (say, that the susceptibility depends on temperature in such-and-such a way). In high energy physics, theorists are actually so remote from contact with experimentalists that a separate guild of interface specialists ("phenomenologists") has arisen to mediate between them. As a natural consequence of this division of labor, theorists receive no instruction at all in data analysis, let alone statistical inference.
Statistical physics hasn't gone to the same extreme of separation from experiment as particle physics, but in some ways its theorists are especially badly prepared to analyze empirical data. Despite the name, statistical physicists do not learn any statistics. The founders of statistical mechanics, namely Maxwell and Boltzmann, were influenced by then-contemporary ideas on social statistics, as found in e.g. Quetelet and Buckle, but that was long ago, and the lesson was that small-scale random processes can produce stable, nearly-deterministic patterns at the large scale, which we would now say is probability theory, not statistics. (Not that we do such a good job of teaching or learning probability theory, either.) However, more perhaps than other parts of theoretical physics, statistical mechanics relies heavily on simulations. Interpreting Monte Carlo results is like interpreting empirical data, but with the crucial difference that the volume and quality of the pseudo-data are limited only by your patience. If your results are unsatisfactory, you are almost always better off either refining your simulation, or just running it longer, than you are improving your analytical procedures. Real data is not like this.
(I doubt it helps matters that many statistical physicists are in the grip of sub-Bayesian ideas about maximum entropy, but I can't, honestly, say that this has played a direct role in the development of econophysics. It has, however, contributed to susceptibility to some dubious ideas, like Tsallis statistics, but that is really another story for another time.)
Worse, over the last half century or so, the statistical mechanics community has devoted much of its research energy, and its pedagogy, to the theory of phase transitions, such as those between solids and liquids, or liquids and liquid crystals, or between magnetic and non-magnetic materials. (Onsager, incidentally, contributed fundamentally to the theory of phase transitions for both liquid crystals and ferromagnets.) I don't mean that this is bad for statistical mechanics, because phase transitions are important and the theory developed around them is one of the jewels of the field. But phase transitions have the weird property of "universality". In the vicinity of the critical point, the behavior of the system comes to depend only on a few key parameters, so that any two systems in the same "universality class" are quantitatively similar near the transition, even if they are otherwise as different as chalk and cheese. If what you are interested in is this behavior near the critical point, then, you can get away with analyzing or simulating ridiculously over-simplified models, if only you get their universality class right. The implicit lesson is that details don't matter, and results on toy models should generalize directly to real systems. (Of course, details can matter a lot, even with toy models.) I can't make myself believe it's coincidence that so many of the people active in econophysics come from a background in the theory of critical phenomena.
What statistical mechanics does give its students is a way to approach the "many-body problem", a set of techniques for deducing the macroscopic patterns which are produced by the interactions of large numbers of material bodies. These techniques do not try to exactly find the consequences of all the interactions and feedbacks, but rather make some probabilistic assumptions and then work out the typical consequences at the macroscopic level. Thanks to the laws of large numbers, the odds are overwhelmingly good that the actual behavior of the system will be incredibly close to this typical result, at least, as we say, "in the thermodynamic limit" where the number of bodies goes to infinity (but the density stays finite). On a smaller scale, these techniques let us calculate fluctuations around the typical values; one of the things that makes phase transitions interesting is that there the fluctuations dominate. (The probabilistic basis of fluctuation theory is not the law of large numbers, but rather the large deviations principle.) Let me emphasize, as a once and future statistical physicist, that this is a really exceptionally powerful and beautiful body of theory, which is capable of explaining everything from why sugar cubes dissolve in coffee and the machinery of cells to the evolution of the stars, and the heart of it, again, is solving many-body problems, which means calculating the macroscopic consequences of microscopic interactions.
If econophysics is dignified enough to have a tragic flaw, it is this. I have lost count of the number of times I have heard other statistical physicists insist, or explain, or just assume, that ecology, or evolution, or neuroscience, or social networks, or, yes, economics, "is, after all, just another many-body problem", so of course it must yield to the insights of statistical mechanics. This is why our conquistador spirit leads us to make assaults on these disciplines, and not, say, classical philology. I don't even think that this is wrong. I think the problem is that we have a drastically impoverished notion of bodies, and how they might interact.
Let me illustrate this by talking about the minority game, which I mentioned a while ago. This is a simplification of a problem originally posed by the economist Brian Arthur, called the El Farol Bar Problem, which goes (in my own phrasing) as follows:
There are 100 people who like to go out in Santa Fe, and there is only one bar worth going out to, namely El Farol. Sadly, if more than 40 people show up on any given night, it gets too crowded and a fight breaks out, and everyone who went would have been happier staying home and looking up at the stars. How do you decide, on any given night, whether or not you should go, without knowing who has already decided to go?
Two econophysicists --- Challet and Zhang --- simplified this into the minority game. There are still only two choices in each round of the game, imaginatively called "0" and "1"; whoever picks with the minority gets rewarded, while the majority gets nothing. The utility-maximizing outcome would be to have fifty-percent-less-one in the minority in each turn, so the collective efficiency can be measured by how far the group departs from this. Arthur's original paper on the El Farol problem included a rather complicated model of inductive learning. Challet and Zhang instead assumed that each agent remembers which action won on the last m rounds, and feeds this sequence of 0s and 1s into a look-up table, which in turn tells it whether to choose 0 or 1. Their original paper had a complicated evolutionary scheme for updating these look-up tables or strategies, but almost all subsequent work on the minority game has just said that agents have k different look-up tables, keep track of which one would have done the best, and use that one.
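The simplified version of the game is short enough to write down in full. What follows is a sketch of the "k look-up tables, use the best one" variant as I have just described it, not a reproduction of Challet and Zhang's code. The function name, the default parameter values, the random initial history, and the rule of breaking ties in favor of the lowest-numbered table are all my assumptions.

```python
import random


def play_minority_game(n_agents=101, memory=3, n_strategies=2, rounds=500, seed=0):
    """Return the number of agents choosing action 1 in each round."""
    rng = random.Random(seed)
    n_histories = 2 ** memory  # the last `memory` winning actions, read as a binary number
    # strategies[a][s][h]: the action that agent a's s-th look-up table prescribes after history h
    strategies = [[[rng.randint(0, 1) for _ in range(n_histories)]
                   for _ in range(n_strategies)]
                  for _ in range(n_agents)]
    # scores[a][s]: how many rounds table s would have put agent a in the minority
    scores = [[0] * n_strategies for _ in range(n_agents)]
    history = rng.randrange(n_histories)
    attendance = []
    for _ in range(rounds):
        ones = 0
        for a in range(n_agents):
            # Use the table with the best record so far (ties go to the lowest index).
            best = max(range(n_strategies), key=lambda s: scores[a][s])
            ones += strategies[a][best][history]
        # With an odd number of agents there is always a strict minority.
        winner = 1 if ones < n_agents - ones else 0
        for a in range(n_agents):
            for s in range(n_strategies):
                if strategies[a][s][history] == winner:
                    scores[a][s] += 1
        # Slide the memory window: drop the oldest outcome, append the newest.
        history = ((history << 1) | winner) % n_histories
        attendance.append(ones)
    return attendance


n = 101
attendance = play_minority_game(n_agents=n)
# Mean squared departure from an even split: smaller means the minority is
# closer to fifty-percent-less-one, i.e. the group is collectively more efficient.
volatility = sum((a - n / 2) ** 2 for a in attendance) / len(attendance)
print(volatility)
```

Varying `memory` and `n_agents` and watching `volatility` is, in miniature, what a great deal of the literature does. Notice that nothing in the decision rule was chosen because people behave this way.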
The literature on this little model is immense. Considerable effort and ingenuity have been put into deriving all manner of supposedly "universal" results, about the length of memory (m) which maximizes global efficiency, for example, the formation of "crowds and anti-crowds", etc., etc. Some of this work is actually quite clever, and worthy of approbation. Much less worthy are the claims that these findings apply to financial markets, congestion problems, and even a general theory of what all systems of multiple adaptive agents must be like. (The latter claim actually exists in several different versions, at least one of which is simply a rediscovery of the first elements of large deviations theory.) Much of the appeal of the model has rested on the particular form of the decision rule, which is comparatively easy to analyze if you know a lot about disordered spin systems, one of the core topics of modern statistical mechanics.
What appears to have entirely escaped the notice of everyone working on the minority game is that the way people make choices and predictions in competitive situations has been extensively studied in experimental psychology and experimental economics (which has existed for some time now). The results look nothing at all like their choose-a-look-up-table models. If there were some kind of result saying that any plausible model of human choice was in the same universality class as these models, and we only cared about scaling relations near the critical point, this wouldn't matter, but of course no one has any such theorem, nor do we only care about critical scaling. No one has even looked into whether these sorts of look-up tables produce behavior which is even remotely similar to more realistic models. There is exactly no reason to believe that they do --- if anything there's evidence that they don't --- and every reason to suspect that this whole sub-sub-field is at best a mathematical curiosity. The closest thing we have to a study of the statistical mechanics of even vaguely-realistic adaptive agents is, in fact, the work of economists. So the fact that something is, or can be treated as, a many-body system doesn't mean that the bodies are of the kind we know from our elementary textbooks.
At this point, if I have any readers left (which I admit is unlikely), the economists among them will be nodding complacently, especially the smug neo-classical ones. (To be clear, it is the smugness which causes complacency, not neo-classicism.) "After all", they are saying to themselves, "we've been building a science of 'micromotives and macrobehavior' ever since 1776. How arrogant of these physicists to ignore our works, and how unsurprising that their own have come to nothing." (The math of neo-classical economics is different from that of statistical mechanics, but it's not harder. If you can learn how to do renormalization group calculations, you can learn the Arrow-Debreu model, or game theory.) The arrogant, cultivated ignorance of physicists is indeed reprehensible, but I don't want to give the economists a pass.
To begin with, mainstream economics is clearly false. I don't say this just because perfectly competitive markets aren't the only economic institution in this world; the neo-classical framework now includes very sophisticated theories of imperfect competition, imperfect information and non-market institutions, and these developments are mainstream enough to result in Nobel Prizes (in, e.g., 1993, 1994 and 2001). The foundation on which the neo-classical framework is raised, though, is an idea about rational agents: rationality means maximizing expected utility, where expectations come from maintaining a coherent subjective probability distribution, updated through Bayes's rule; moreover, the utility function is strictly self-regarding. This is a very well-specified idea, readily formalized in clean and elegant mathematics. Moreover, there's pretty much only one way to formalize it, which makes the mathematical modeler's life much easier. All of this appeals to certain temperaments, mine very much included. Alas, experimental psychology, and still more experimental economics, amply demonstrate that empirically it's just wrong. We are boundedly rational, and, for good or for ill, we give a damn about others. Moreover, there are very general reasons, having to do with the computational intractability of optimization problems, and the severe limitations of computationally feasible Bayesian learners, to think that no creature could ever be a "rational agent" in the neo-classical sense. Bounded rationality is the only kind we encounter, and the only kind we are going to encounter. (Yes, yes, there are selectionist arguments a la Friedman or Alchian: they fail. But this post is supposed to say mean things about physicists, not the Chicago School, so another time. Also, I'd say it's an abuse of the language to describe our deviations from the impossible neo-classical ideal as "failures of rationality", but, again, another time.) 
But, as I said, all the rest of the neo-classical framework rests on this conception of individual decision-making; remove it and all the models are standing on air. So: neo-classical economics is false.
Why then is it not totally crazy to pursue mainstream economics, and why do I fault physicists for not bothering to learn it? Well, classical physics is false, too: the combination of Newton's laws of motion with Maxwell's equations for electromagnetism straightforwardly predicts that ordinary atomic matter, composed as it is of moving charged particles, should be unstable. The theory predicts that every material body should rapidly collapse in an ultraviolet flash; manifestly, the theory is false. [2] Nonetheless, physicists know very well that classical physics is an extremely good theory in the right limits, and that a lot of the situations of practical or intellectual interest in the world are in those limits.
In a similar way, I think, the neo-classical ideal is a tolerable approximation in certain limits. Since I am writing a blog post, and not a treatise on economic methodology, I will be vague about specifying those limits, but one aspect ought to be low computational complexity, relative to the cognitive capacity of the decision-maker. (It's no accident that Herbert Simon combined general arguments about bounded rationality with both experimental studies of choice and expertise and computational models of cognition.) A lot of the organization of society, especially modern society, can be usefully seen as ways of drastically simplifying and restricting decision problems, so that bounded human rationality can cope with them. (This, at least as I understand it, is a big part of Hayek's argument in "Economics and Knowledge" and "The Use of Knowledge in Society".) Homo economicus would also easily solve these problems, so neo-classical models don't do a bad job, provided we also don't have to worry about other aspects of human motivation. (All over the world, neo-classical predictions fail in the ultimatum game not because the game is hard but because people care about fairness.) It's possible, I think, to get a glimpse of what a real economics, going beyond neo-classicism in both its assumptions about individuals and about institutions, would look like; to get that glimpse, one reads Sam Bowles's Microeconomics.
So much for all of the ways in which there isn't enough "econo-" in econophysics. Let me also complain that there isn't enough physics: the repertoire of ideas taken from physics is very impoverished. Basically, we see random walks, power laws, and spin systems over and over again. These are important ideas, but they're just a small part of theoretical physics! To give an example, Eric Smith and Duncan Foley have a fun paper working through detailed mathematical analogies between the axiomatic versions of utility theory and thermodynamics, leading to a reversible "engine" that runs on credit. (Disclaimer: Eric is a friend.) Or: it's obvious, once pointed out (by K. Ilinski), that prices and discount rates form a gauge connection, a mathematical object we study to death in field theory. These ideas have met with, so far as I can tell, absolute indifference from econophysicists.
Or again: a large part of modern statistical physics is concerned with pattern formation (in no small measure inspired by Turing, though anticipated by the great, semi-crazy Nicholas Rashevsky [3]). There are now very interesting models of urban morphogenesis, explicitly drawing on this work, and at the same time grounding these large-scale patterns in the market interactions of individual decision-makers. These are not, however, due to econophysicists, but to economists, notably Fujita, Venables, and Krugman. (This is from the forgotten days before Krugman's encounter with the eldritch horror that is the mendacity, malevolence, incompetence and disconnection from reality of the Bush Administration turned him into the Grand Heresiarch of the Ancient and Hermetic Order of the Shrill.) So far as I can tell [June 2006], this work has been cited exactly three times on the arxiv (once by an economist, and once by me).
In fact, the most interesting physics-inspired idea now going in economics is due to a rather orthodox, if imposingly erudite, economist, namely John Sutton. In his work on industrial organization since the mid-1990s, he's actually come up with a genuine innovation in economic methodology, explicitly inspired by an analogy with thermodynamics. Consider, he says, trying to calculate the efficiency of a heat engine. One way would be to set up a very detailed mechanistic model, incorporating condensation, combustion, friction, etc. You'd have a huge number of parameters, but if you could estimate them you could, in principle, solve the model and calculate the efficiency of the engine. The thermodynamic approach is instead to come up with a result which is valid for any heat engine, and depends on only a few (two) parameters — but is only an inequality, an upper bound on the efficiency. This is the lesson Sutton takes from thermodynamics, not a desire to find an enthalpy. In his work on the evolution of industrial structure, accordingly, he avoids the usual econometric path of specifying a detailed model, fitting all the parameters, etc.; instead he finds simple inequalities which must hold across large classes of models, if certain basic assumptions are right, and then shows that those inequalities are in fact satisfied. It is, if a non-economist can say so, brilliant stuff. (If you want to know more, at great length, with abundant historical and technological detail, apt yet obscure quotations from R. A. Fisher, etc., read his Technology and Market Structure; if you want to know more, with methodological reflections and amusing anecdotes, read Marshall's Tendencies; and if you just want a review paper, read this.) Naturally, this has made no impact at all on econophysics, though Sutton has published a paper in Physica A, which sometimes gets cited as a reference for Gibrat's Law.
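For concreteness, the two-parameter inequality for heat engines described above is presumably the Carnot bound. A sketch, with notation introduced here rather than taken from Sutton ($\eta$ for efficiency, $W$ for work extracted per cycle, $Q_h$ for heat drawn from the hot reservoir, $T_h$ and $T_c$ for the absolute temperatures of the hot and cold reservoirs):

```latex
\eta \;=\; \frac{W}{Q_h} \;\le\; 1 - \frac{T_c}{T_h}
```

Nothing about condensation, combustion, or friction appears on the right-hand side; the price of that generality is that the result only bounds the efficiency from above rather than predicting it.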
To sum up my rant so far: our current econophysics is not very good. Nonetheless, we could have a better econophysics, if the physicists in question would bother to learn more about the kind of bodies whose macroscopic behavior they are supposed to be modeling. So why don't we have this better econophysics? This is a very hard question, because it's essentially a causal, counter-factual one: what would have to have been different in order to lead to the better econophysics? At best, I can make semi-informed guesses here, and I shouldn't pretend that they're anything more than that, though my tendency to strident over-statement may make it sound like I think I've got the one true explanation.
I have already suggested some aspects of the disciplinary culture of physics, especially theoretical statistical physics, which contributed to the problem. That is, if statistical physicists knew more about statistics, had fewer presumptions about universality, and were willing to learn more from others, they could do better econophysics. This may go some way towards explaining why they do bad econophysics, but I don't think it explains why they do econophysics. Here I'd speculate that there are two forces at work, one pushing physicists to do things which were not-physics-as-we-know-it, the other pulling them towards doing stuff about financial markets.
The first is the "ecological" crisis of physics as an academic discipline: the rate of production of new physics Ph.D.s has for some time greatly exceeded the rate at which professorial jobs for them arise. The classical Malthusian resolutions of such a dilemma are war, disease, emigration, misery and "vice", i.e., birth control. Physicists have not taken to slaughtering each other to secure faculty positions (outside of Dorothy Sucher's very amusing mystery novel, Dead Men Don't Give Seminars), nor are there many illnesses which reduce our numbers. There are deep institutional incentives which keep us from limiting our reproduction, but physicists who go into industry, as opposed to academia, do not train up new physicists in turn, so they are effectively sterile, and one might regard this as the analog of Malthusian vice.
But what about emigration? That would correspond to entering a new field, where there was less competition. But, once you have gotten all the way through a Ph.D. in one field, retraining yourself in another, starting from the basics, is a deep pain on many levels. (Trust me on this one.) Much easier, emotionally and practically, to decide that the skills and ideas you have acquired, at so much cost, are actually just what's needed to make progress in the other field, the one which seems so attractive. In fact, if those ideas are fairly new ones, which seem very powerful and have been received with a lot of excitement in the field, it is very natural to want to push them even further, to see just how far they can go. ("Too much of a good thing is wonderful".) If they seem to succeed in the new area — either because they really work, or because the people applying them don't know enough about the area to recognize failure (as has been suggested) — then the usual processes of social learning will lead to more "emigration" into the new area. Thus, in part, the rise of "complex systems" within statistical physics, and the many conversations I recall with colleagues on the theme that the appropriate domain of physics was whatever could be studied using our methods, not just "matter and motion".
(Incidentally, a very similar story could be told of the rise of "cultural studies" in literature departments: responding to over-crowding in the original problem-area by deciding that existing, but still newish and cool, tools and ideas should be applied over a much broader domain. Network analyses of super-heroes (1, 2) or of EuroVision are, as it were, our equivalent of the proverbial deconstructions of Batman. But one could argue that classical deconstruction itself was much more like string theory.)
So, we have ecological pressures forcing physicists out of physics and into complex systems. Econophysics is a part of this general movement. But why econo-physics? Why not, say, eco-physics, following the pattern set by Bob May in the 1970s (to say nothing of Lotka in the 1920s)? Well, we have some ecophysics too, of course, but on nowhere near the same scale as econophysics, despite the great scientific and practical importance of ecological problems. Something seems to make econophysics more attractive.
This could just be chance. Certainly I haven't constructed a good neutral model of the situation, which would have to include herding, and shown that the growth of econophysics has been greater than it predicts, which is what I ought to do before trying to explain a phenomenon which might need no explanation. (At least, that is what I ought to do according to myself.) But, with that caveat noted, let's suppose that there is something to be explained here.
One thing which might make econophysics especially attractive among complex systems fields is that, over the last few decades, we've seen a new sink for industrial employment of physicists, in finance. Following the rise of the Black-Scholes formula for option pricing, which is closely tied mathematically to path integrals of Gaussian processes, there has been a lot of demand for physicists as "quants" or "rocket scientists" in the financial industry. (A truly astonishing, and depressing, fraction of the people I went to graduate school with are now working for banks or investment funds.) This may have made physicists interested in complex systems especially apt to think of applications to finance; if so, it's at least a little ironic, because Black-Scholes is all about neo-classical equilibrium, efficiency, Gaussian distributions, etc.
Personally, and on the basis of no systematic studies whatsoever, I tend to discount this in favor of another, even less edifying factor. I have posted about this before, but it bears explicit repetition. Wolfgang Beirl put it very nicely: "cointegration of multi-agent research networks with financial markets and in particular the Nasdaq stock market bubble".
It will be based on a simple search of the word "market" in abstracts of papers stored at xxx.lanl.gov for the years 1998--2004 with the following results:
1998 25
1999 49
2000 70
2001 108
2002 94
2003 94
2004 79
(Extending the series, I get 105 for 2005 and 124 for 2006.) Less formally, the rise of econophysics coincides with a period when the whole damn culture went ga-ga over the financial markets (or at least the relevant stratum of the culture did). As Robert Shiller can explain to you, this was part of the "naturally-occurring Ponzi process" which gave us the bubble, or more accurately bubbles. Physicists began obsessing over the stock market when everyone began obsessing over the stock market; it's just that their obsession took the form of papers rather than day-trading.
I don't know about you but it leaves me somewhat depressed. We don't, yet, have a really good general way, or even a decent one, of understanding how the interactions of lots of adaptive agents produce social phenomena. It would be very nice to understand this, both intellectually and because it might (sometimes, a little bit) help to keep us from making such a mess of things. (Do not take that for a plea for social planning, or rule by an enlightened elite of scientists, or some such bullshit.) Economics seems like a good place to start building such a science, largely for reasons of data and concreteness. Physicists have as much right to contribute to such a project as anyone else, and statistical physics really has discovered a lot of useful things about many-body problems. I would love to see a statistical mechanics of social and economic behavior. Even partial solutions of the social problem would drastically transform and improve our mathematical theory of many-body systems. For all these reasons and more, I would really like there to be a good and successful econophysics. Nonetheless, despite a lot of activity by many smart physicists, there isn't. This seems like a waste, but also not something I can do much about, except to hope piously that the usual self-correcting mechanisms of the scientific community will come into play.
[1]: Bachelier's random-walk theory of financial market prices actually predates Einstein's work on Brownian motion, which was done independently. Bachelier's work was rediscovered in the mid-20th century by the economists who elaborated the efficient market hypothesis, since it predicts just this "random character of stock market prices". For that matter, Ball runs smack into one of the endemic confusions in this area, which is over the word "equilibrium". In dynamics, "equilibrium" just means a steady state, or fixed point, where no variables of the system are changing. (In statistical mechanics, this sense is strengthened: only steady states of minimum free energy, obeying the principle of detailed balance, are true equilibria.) The economic concept is related but more subtle: an equilibrium is a fixed point of strategies, where no agent would want to unilaterally change its decision rule. Prices can certainly fluctuate in economic equilibrium, either in response to new information and external shocks, or without them, along a path which ensures that at no point would anyone be better off by changing their actions. (More exactly, no agent would think they would be better off by deviating.) One can even build equilibrium models of economic growth.
[2]: The stability of matter actually depends on quantum effects, but remains a very hard problem. There is a nice account in Krieger's Constitutions of Matter, with about as little technical detail as possible (which is still a lot).
[3]: The full story of Nicholas Rashevsky and his "mathematical biophysics" movement, centered at the University of Chicago in the 1930s and 1940s, is an important subterranean theme in the development of artificial intelligence, neural networks, social networks, quantitative modeling in biology and social science, etc., as well as being a fore-runner of such later movements as cybernetics, general systems theory and complex systems. Unfortunately, and curiously, I can't find any full-length studies of the man or his movement, and he doesn't even have a Wikipedia entry. Dover Books used to publish a reprint of his Mathematical Biophysics, but that, too, has been out of print for a long time now.
Manual Trackback: The Statistical Mechanic; Nanopolitan; Crooked Timber; Uncertain Principles; Cosmic Variance; Ars Mathematica; The Abstract Factory; Thoughts on Economics; Vlorbik on Math Ed; Juan de Mairena
Posted by crshalizi at September 15, 2007 16:40 | permanent link
Via fellow ex-physicist Kristina Lerman (one of the few people who's said something mathematically sensible about swarm systems), the call for papers of what promises to be a very interesting symposium, on a topic whose interest needs, for the present audience, no elaboration:
Social Information Processing
March 26-28, 2008
Stanford University, California, USA
The label 'social media' has been attached to a quickly growing number of Web sites, such as blogs, wikis, Flickr, and Del.icio.us, whose content is primarily user-driven. In the process of using social media sites, users are creating content and adding metadata in the form of: (1) tags: content annotations using free-form keywords; (2) ratings: passive or active evaluation of content; and (3) social networks: where users designate others as friends so as to track their activities. The connections between content, users and metadata create layers of rich interlinked data that will revolutionize information processing. New applications will include personalized information discovery; applications that exploit the 'wisdom of crowds,' for example, emergent semantics and collaborative information evaluation; deeper analysis of community structure to identify trends and experts, and many others.
Social media facilitate new ways of interacting with information - what we call social information processing. Social information processing allows users to collaborate implicitly by leveraging the opinions and knowledge generated by others. In addition to collaborative problem solving, social information processing may lead to wholly new kinds of knowledge, that emerge from the distributed activities of many users.
The symposium will bring together researchers from academia and industry, who are interested in the emergent field of social information processing. We are soliciting papers that present recent results, as well as more speculative presentations that discuss research challenges, define new applications, propose methodologies for evaluating and the roadmap for achieving the vision of social computing.
Important dates
- October 5, 2007: Papers due
- November 2, 2007: Notifications of acceptances mailed out
- March 26-28, 2008: Spring Symposium Series, Stanford University
If this sounds intriguing, check out Kristina's page on the symposium for more details, submission instructions, etc.
The Collective Use and Evolution of Concepts; Networks; Complexity; Incestuous Amplification
Posted by crshalizi at September 14, 2007 11:30 | permanent link
I'm giving two talks in New York state in the first week of October.
I'll also give a "how to get a job like mine" talk to Chris's class for advanced undergraduate applied math majors. I will try to limit the amount of wrongthink I dispense, but I doubt I'll be able to resist the temptation to show them questionable pictures.
I have, to my embarrassment, gone this far in life without ever having been to New York City. Drop me a line if you'd like to get in touch, or have some suggestions on what a traveler from the provinces should do on his first visit to the world capital.
Posted by crshalizi at September 14, 2007 11:15 | permanent link
Posted by crshalizi at August 31, 2007 23:59 | permanent link
No, this is not another rant about people who hallucinate power laws; this is about social networks which are constituted by sharing the same (or, rather, similar) delusional beliefs:
The abstract doesn't leave very much for me to say about this. The initial sample of web-pages were presented to psychiatrists who didn't know about the larger purpose of the study, without identifying information, in standardized format, etc., and the mind-control pages were very clearly and easily distinguished from the others as delusional. Having established that, Bell et al. then did snow-ball sampling to gather a network of pages, starting from the ones on mind control, and compared three aspects of that network (average distance, clustering coefficient, and degree centralization) to four controls: three of the classic small-scale social networks, and one randomized network of the same size and link-density as their mind-control network. The latter looked rather more like the real social networks than the random control. Conclusion: these people are, in fact, organizing themselves around their shared delusional beliefs. (This is not a surprise to anyone who has looked at these sites, or even read reasonable newspaper articles.) Absent a really good way to argue that this is not a sub-culture, which is not forthcoming, it follows that, under the DSM criteria, these are no longer delusions.
(Technically, an Erdos-Renyi network makes a poor control, because it's just so different from anything people put together either online or in the realized world. I would be happier if they had used randomized networks with matched degree distributions. Even better would have been to pick ten random, unrelated websites and do snowball sampling from them, and use that as the control. While this would be nicer, I doubt it would change the result.)
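To make the suggestion about degree-matched controls concrete, here is a minimal sketch in plain Python. It is not the code Bell et al. used. It computes two of the three statistics mentioned above (average distance and clustering coefficient) and builds a randomized control by repeated double-edge swaps, which keep every node's degree fixed. The function names, the adjacency-set representation, and the convention that nodes of degree below two contribute zero clustering are all assumptions made for illustration.

```python
import random
from collections import deque


def from_edges(n, edges):
    """Build an undirected graph on nodes 0..n-1 as a dict of neighbor sets."""
    nbrs = {v: set() for v in range(n)}
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    return nbrs


def avg_clustering(nbrs):
    """Mean local clustering coefficient (degree < 2 counts as zero)."""
    total = 0.0
    for ns in nbrs.values():
        k = len(ns)
        if k < 2:
            continue
        ns_list = list(ns)
        links = sum(1 for a in range(k) for b in range(a + 1, k)
                    if ns_list[b] in nbrs[ns_list[a]])
        total += 2.0 * links / (k * (k - 1))
    return total / len(nbrs)


def avg_distance(nbrs):
    """Mean shortest-path length over all connected ordered pairs (BFS)."""
    total = 0
    pairs = 0
    for src in nbrs:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            v = queue.popleft()
            for w in nbrs[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else float("nan")


def degree_preserving_rewire(nbrs, n_swaps, seed=0):
    """Randomize by double-edge swaps: a-b, c-d becomes a-d, c-b.

    Every node keeps its degree; swaps that would create a self-loop
    or a duplicate edge are skipped. Needs at least one edge.
    """
    rng = random.Random(seed)
    new = {v: set(ns) for v, ns in nbrs.items()}
    edges = [(u, v) for u in new for v in new[u] if u < v]
    for _ in range(n_swaps):
        i, j = rng.randrange(len(edges)), rng.randrange(len(edges))
        (a, b), (c, d) = edges[i], edges[j]
        if len({a, b, c, d}) < 4 or d in new[a] or b in new[c]:
            continue
        new[a].remove(b); new[b].remove(a)
        new[c].remove(d); new[d].remove(c)
        new[a].add(d); new[d].add(a)
        new[c].add(b); new[b].add(c)
        edges[i], edges[j] = (a, d), (c, b)
    return new


if __name__ == "__main__":
    # Toy graph: two triangles joined by one edge.
    g = from_edges(6, [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)])
    control = degree_preserving_rewire(g, n_swaps=200, seed=1)
    print("observed:", avg_clustering(g), avg_distance(g))
    print("control: ", avg_clustering(control), avg_distance(control))
```

In practice one would generate many rewired controls and compare the observed statistics to the resulting distribution, rather than to a single draw.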
I've long been interested in fringe beliefs and how people use stories to sustain communities, so this is really interesting to me. If someone somehow acquires a very rare belief, very far from the mainstream of their neighbors, back in the old days there were fairly few circumstances under which they could express it and develop it without being shot down and/or put away. It was pretty hard to find a social space which would shelter such thought. This is one reason religious innovations often (but, of course, not always) come from socially marginal and/or isolated settings, where it has been easier to get away with saying wild things and get a critical mass of like-minded people together. With the combination of cheap, anonymous electronic communications and decent search engines, this is changed drastically: the holders of very rare beliefs can find like-minded partners, and create social spaces where those beliefs can be cultivated. (Thus, they do not have to find social spaces which are generally accepting of arbitrary unconventional thoughts or acts.) Not surprisingly, then, the stories these people tell about their experiences with mind control are much more similar to each other than are the delusions of control expressed by lone psychotics. So the same principles which make the Internet a haven for every conceivable fetish, even those which are (literally) exponentially unlikely, also contribute to the social elaboration of heresies and delusions. Mind-control fetishists even prove that the intersection of these sets is non-empty. (Go ahead, the link's just to Salon.)
Of course, we have long had social networks where membership depends on commitment to certain rather rare beliefs, and which revolve around the cultivation of those beliefs and making them even more outlandish; we call them "scholarly disciplines". The latter, however, at least most of the time, are actually pursuing cumulative processes of inquiry generally yielding reliable (if approximate) knowledge, and my hunch is that this is because they are responding to things like "evidence", as well as the social forces within the group and whatever physical communities its members happen to reside in. It's early days, of course, but I would be very surprised if the community of mind-control believers also develops reliable knowledge about mind control, if only because they are not actually being mind-controlled. So there should be fascinating opportunities for philosophers and sociologists of science to see what difference it makes to an epistemic community whether what it studies is real or not. To repeat a joke, psychoceramics is to social epistemology what lesion studies are to neuropsychology...
Manual trackback: Chrononautic Log; Mind Hacks
Psychoceramica; Networks; The Collective Use and Evolution of Concepts
Posted by crshalizi at August 30, 2007 17:20 | permanent link
Ladies and gentlemen, Buffon, the greatest of 18th century naturalists (and pioneer of Monte Carlo estimation), on the cat:
The cat is an unfaithful domestic, and kept only from the necessity we find of opposing him to other domestics still more incommodious, and which cannot be hunted; for we value not those people, who, being fond of all brutes, foolishly keep cats for their amusement. Though these animals, when young, are frolicksome and beautiful, they possess, at the same time, an innate malice, and perverse disposition, which increase as they grow up, and which education learns them to conceal, but not to subdue. From determined robbers, the best education can only convert them into flattering thieves; for they have the same address, subtlety, and desire of plunder. Like thieves, they know how to conceal their steps and their designs, to watch opportunities, to catch the proper moment for laying hold of their prey, to fly from punishment, and to remain at a distance till solicited to return. They easily assume the habits of society, but never acquire its manners; for they have only the appearance of attachment or friendship. This disingenuity of character is betrayed by the obliquity of their movements, and the duplicity of their eyes. They never look their best benefactor in the face; but, either from distrust or falseness, they approach him by windings, in order to procure caresses, in which they have no other pleasure than what arises from flattering those who bestow them.
Clearly, a profound difference in episteme separates the author of Natural History, General and Particular from ourselves.
(Via Light Reading)
Posted by crshalizi at August 24, 2007 16:00 | permanent link
Last summer, I was conned into arranging (strike that: happily volunteered to organize) a small session on complex networks at the Joint Statistical Meeting, and naturally invited people whose only real connection was that they study networks and I found their work interesting. I am immensely pleased to be able to say that one outcome of this is a new joint paper between two of the speakers, who I dare say would not have collaborated if I hadn't brought them together in Seattle:
Roughly speaking, the idea behind the (horribly-named) "generalized block models" of networks is that the nodes (people, countries, companies, proteins) can be broken up into a certain number of discrete classes, or blocks. All of the nodes in a given block have the same kind of pattern of connectivity, meaning that they connect to other nodes in other blocks, or within the block in the same kind of ways. Put like this, it sounds (like many network concepts) viciously self-referential, but then so does page-rank. Still, its resemblance to some aspects of structuralist anthropology is no accident, as many sorts of kinship-groups and castes are similarly defined by how they relate to other kinship-groups and castes, which in turn are defined by their relations. [Update: see below.] The conventional ethnographic approach to figuring out such groupings and their relationships might be caricatured as a combination of asking your native informants how they see the matter, combined with more or less inspired guess-work on the part of the ethnographer. This does not work so well when dealing with, say, protein networks, since it is hard to find a native informant among the proteins. Even with social networks, it is not at all obvious that the members understand all the different roles people play in the network (though they may think they do), so block-modelers generally rely less on self-presentations, and more on trial-and-error efforts to find blocks that fit the connectivity patterns.
Rather than floating off into the empyrean, what Jörg and Doug have done here is to develop an algorithm for automatically extracting a set of roles and their inter-relations from the observable ties in the network. While their idea of structurally equivalent roles comes from mathematical theories of social structure (see below), the algorithm grows out of earlier work by Jörg and Stefan Bornholdt on decomposing networks into nearly-independent communities or modules (cond-mat/0402349 and cond-mat/0603718), itself part of a recent burst of work on that problem by statistical physicists.
What Jörg and Stefan realized is that the problem of finding modules could be mapped on to a problem in statistical mechanics, called the Potts model. Imagine that each node can have one of a certain number of colors. When two nodes share the same color, they are in the same module or community. The Potts model also has nodes with multiple colors, which can interact in one of two ways. Two nodes can have an attractive ("ferromagnetic") interaction, which means it is energetically favorable for them to have the same color; or they can have a repulsive ("antiferromagnetic") interaction, which means it is energetically favorable for them to have different colors. The problem then is to find the assignment of colors to nodes which gives the best over-all value of the energy. This is hard to do exactly, because different interactions can push the same node in different directions (cf.), but one can quite rapidly find configurations which come close to the minimum, through the magic of simulated annealing. (If you play around with this Java applet, cooling it slowly, you get a bit of a sense of how that works.)
In the original modularity papers, what Jörg and Stefan did was introduce an attractive interaction between two nodes if they were more tightly linked to each other than average, and a repulsive interaction if they were less tightly linked. (See their papers for a more exact statement.) This let them very quickly discover significant modular structure in very large networks. It also nicely adapts to more information-theoretic notions of "connection".
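The mapping just described can be sketched in a few lines of plain Python. This is a toy illustration, not Jörg and Stefan's code. It assumes a particular form for the coupling between nodes i and j, namely A_ij - gamma * k_i * k_j / (2m), where A is the adjacency matrix, k the degrees, and m the number of edges; that is one common way of making "more tightly linked than average" precise, and the papers should be consulted for the exact statement. The function names, cooling schedule, and parameter defaults are likewise hypothetical.

```python
import math
import random


def potts_energy(adj, colors, gamma=1.0):
    """Energy: minus the sum, over same-color pairs, of the coupling
    J_ij = A_ij - gamma * k_i * k_j / (2m)  (assumed form)."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    two_m = sum(deg)
    energy = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            if colors[i] == colors[j]:
                energy -= adj[i][j] - gamma * deg[i] * deg[j] / two_m
    return energy


def anneal(adj, n_colors, sweeps=200, t_start=2.0, t_end=0.01,
           gamma=1.0, seed=0):
    """Metropolis simulated annealing with geometric cooling.

    Starts with every node the same color and returns the lowest-energy
    coloring seen, together with its energy. Requires sweeps >= 2.
    """
    rng = random.Random(seed)
    n = len(adj)
    deg = [sum(row) for row in adj]
    two_m = sum(deg)
    colors = [0] * n
    energy = potts_energy(adj, colors, gamma)
    best, best_energy = colors[:], energy
    for s in range(sweeps):
        temp = t_start * (t_end / t_start) ** (s / (sweeps - 1))
        for _ in range(n):
            i = rng.randrange(n)
            new = rng.randrange(n_colors)
            old = colors[i]
            if new == old:
                continue
            # Energy change from moving node i out of `old` and into `new`.
            delta = 0.0
            for j in range(n):
                if j == i:
                    continue
                coupling = adj[i][j] - gamma * deg[i] * deg[j] / two_m
                if colors[j] == old:
                    delta += coupling
                elif colors[j] == new:
                    delta -= coupling
            if delta <= 0 or rng.random() < math.exp(-delta / temp):
                colors[i] = new
                energy += delta
                if energy < best_energy:
                    best, best_energy = colors[:], energy
    return best, best_energy


def two_cliques():
    """Toy network: two 4-node cliques joined by the single edge 3-4."""
    n = 8
    adj = [[0] * n for _ in range(n)]
    edges = [(i, j) for i in range(4) for j in range(i + 1, 4)]
    edges += [(i, j) for i in range(4, 8) for j in range(i + 1, 8)]
    edges.append((3, 4))
    for i, j in edges:
        adj[i][j] = adj[j][i] = 1
    return adj


if __name__ == "__main__":
    colors, energy = anneal(two_cliques(), n_colors=3)
    print(colors, energy)
```

On the toy network, the coloring that puts each clique in its own module has lower energy than the coloring that lumps everything together, which is the sense in which the energy rewards modular structure. Raising gamma in this sketch penalizes large modules more heavily and so tends to favor finer partitions.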
In the new paper, they employ essentially the same idea. Given a "role model" of blocks and their connections, links, or lacks of links, which follow what one expects from the model are energetically rewarded; those which deviate from their assigned roles are energetically dis-favored. Optimizing the total energy then gives the best-fitting assignment of nodes to roles. (Community discovery re-emerges as a special case, when each block ideally connects only to itself.) This still leaves the problem of discovering the roles and their relations from the data; here they use what I can only describe as a very clever trick, which I will not attempt to explain in this space.
As an illustration, they apply this new method to analyzing a part of the international flow of commodities. Amusing though it is to see patterns of combined and uneven development pop out of adjacency matrices, I keep wondering what could be done with this technique and good data on brains...
Incidentally, while Jörg is a statistical physicist, Doug has been studying social networks, and doing anthropological fieldwork, for longer than I've been alive; so I'd very much like to know whether their paper would be a black dot or a white one in the graph showing the near-disconnection of the two approaches to networks.
Update, that afternoon: Just to clarify, block modeling, and the notions of structural and regular equivalence underlying it, are autonomous developments of sociology and anthropology in the 1970s and 1980s, and they do in fact grow out of structuralist anthropology. In Doug's words (from an e-mail which he was kind enough to let me quote):
The first definition that I know of for structural equivalence, which is what we measure, was in an MA thesis by François Lorrain, 1968, Ecole des Hautes Etudes en Sciences Sociales, Paris, under Guilbaud and Lévi Strauss, showing up in an English version in Parts I and II, 1968 and 1969 MS, Harvard, "Tools for the Formal Study of Networks." and in Lorrain's Harvard PhD thesis (1969). From there it travelled to Lorrain and Harrison C. White's 1971 article while Lorrain was a graduate student at Harvard ["Structural Equivalence in Social Networks", Journal of Mathematical Sociology 1: 49--80], then to H. White et al under the name of blockmodeling... and on through a long series of sociological publications and those in related fields, only a few of which we cite in our paper. ** Lorrain was working on a more general algorithm for "correlative classes" defined not in their own terms, or in terms of attributes, but in terms of relations that connect their elements, a concept found in Ossowski (1963, Class Structure in the Social Consciousness, Routledge and Kegan Paul). Now called "regular equivalence" as in my 1983 paper with Karl Reitz (a term linked to earlier work of Claude Flament), algorithms for computing regular equivalence (or "correlative") classes were also developed in the fields of logic and computer science (e.g. by Higgens (1971), working with groupoids and category theory, similar to the work on Lorrain, and again by Stark (1972, in automata theory) under the name of dynamic logic (Harel et al. 2002)). Mark and Masuch's (2003) article in Social Networks provides a review and further formalization.
Significantly, the weightings used in Reichardt and my Potts' method for finding role model structures in a network of multiple relations can be adjusted to identify regular equivalence of positions.
So, never let it be said that structuralist anthropology was completely barren of fruit; it's just the productive part had to do with social organization and not with culture — as Ernest Gellner used to say, who you can marry, as opposed to what to wear to the wedding.
Posted by crshalizi at August 22, 2007 12:10 | permanent link
Attention conservation notice: Unpleasant recent bookmarks, in no particular order.
Health insurance companies: more morally culpable than flesh-eating zombies or blood-sucking vampires; for example. (But really, zombies should eat brains.)
Mississippi: sacrificing sixty-five black babies a year to the Moloch of tax cuts, the abomination of the sons of Grover Norquist. You may guess which of our parties opposes suppressing this practice, and why.
"Deceiving us has become an industrial process": your bought-and-paid-for climate-change denial operation at work. When you drive alone, you drive with the Competitive Enterprise Institute. (And, because I get tired of pointing people to it individually, read the damned IPCC report already.)
An entire city "bids for the Darwin Award".
As earlier remarked, a post-apocalyptic wasteland now covers approximately one-sixth of the habitable Earth, and what would one of those be without a multitude of insane new cults?
The Supreme Court has decided that, in fact, Strom Thurmond had the right take on Brown vs. Board of Education. I eagerly await their reinterpretation of Loving vs. Virginia.
In a late-breaking development, The Party of Fear, the Party without a Spine, and the National Surveillance State. Which, apparently, does not go far enough.
An example of our "national security letters" at work. (Via.)
Our government's renewed authorization of torture. Our proud precedents. (Via, and.) "So in summary, what they've hit upon is a protocol based on the best practices developed by Soviet and medieval torturers alike to accomplish torture's traditional goal -- the extraction of false confessions -- and seem to have wound up with a bunch of false confessions. Which, of course, is precisely what you'd expect to wind up with if you thought for a minute about why governments have, historically, resorted to the systemic deployment of torture." These are war crimes.
Some scenes from within our latest experiment in utopian social engineering. More. Counting the eggs broken to make this lovely omelet. (Did I mention war crimes?) Those who threw in their lot with us are going to be shamefully betrayed, of course.
The Army isn't even planning to fight the last war. Some diagnoses from someone who's never going to make colonel, and someone who got out. The blame will, of course, be placed elsewhere.
"An engaged couple reflects on their future together on a beautiful Memorial Day." (Further to the theme.)
The War Prayer. (Via.)
All of this, of course, is part of the decay of the Republican Party's once-proud traditions of statesmanship.
An incredibly depressing look at a tiny sliver of the Yugoslav civil wars and their fallout. (Via.)
A member of the foreign policy community — evidently smart and well-intentioned — is "SHOCKED" when one of the senior members of her field gets questioned in an open forum. If that's really an accurate reflection of their intellectual norms, well, pardon me if I'm not impressed, and not surprised that lessons go unlearned, and we are somehow debating the wisdom of using nuclear weapons against a bunch of criminals in caves. But, of course, the real problem is unauthorized people complaining about catastrophically bad decisions, not the bad decisions themselves. The idea that some of the strengths of democracies are that everyone can criticize policies and mistaken policies can be changed by argument and voting appears to have sunk from view. Similarly, what can one say about Anne-Marie Slaughter, who is certainly no dummy, pointing favorably to the continuing public role of John Negroponte, a man who helped implement our policy of support for death squads in Central America, describing him as a "seasoned moderate"? What, for that matter, can one say about our hiring (or re-hiring) of Latin American mercenaries for the war in Iraq? "They know what we like", perhaps?
It is a further sign of our intellectual depravity that people take Bryan Caplan seriously, even when he is obviously a cheap imitation of The Onion (via). Economics does, however, have some scientific content, and does not consist entirely of rationalistic myth-making and elitist visions of the radical reconstruction of society according to abstract plans.
I sometimes think those who lament the weakening of the American moral fiber since the 1960s are right — and that the best evidence is the popularity of conservatism, or what passes for it these days. It didn't used to need reminders that "infrastructure is patriotic" (unlike torture, unchecked surveillance powers, indefinite detention without trial or charge, kangaroo courts, the aforementioned sacrifices to Moloch, etc.).
A noted English author records his observations of the domestic manners of the Americans.
Manual trackback: Amygdala.
Posted by crshalizi at August 07, 2007 12:07 | permanent link
I think I have mentioned here, before, that my mother and brother are both experimental biologists, leading to a certain "that's not real science" inferiority complex on my part. Like any schoolchild, I know about the Hodgkin-Huxley equations for nerve impulse propagation, and vaguely remember that this was derived from studies of the squid's "giant axon". But I've never actually prepared the giant axon, or clamped the voltage across its membrane to measure the current flow, etc. To help people like me keep it real, then: a nice series of video clips taken from The Squid and its Giant Nerve Fiber, made at the Plymouth Marine Laboratory in the mid-1970s, featuring performances by Alan Hodgkin on the voltage clamp and J. Z. Young on the dissecting table — actually it's a squid on the dissecting table, rather than the author of Doubt and Certainty in Science, but you understand what I mean.
(Via Light Reading.)
Manual Trackback: Pharyngula; The Futile Cycle; Neurophilosophy; Mind Hacks
Posted by crshalizi at August 04, 2007 19:40 | permanent link
The scepticism that I advocate amounts only to this: (1) that when the experts are agreed, the opposite opinion cannot be held to be certain; (2) that when they are not agreed, no opinion can be regarded as certain by a non-expert; and (3) that when they all hold that no sufficient grounds for a positive opinion exist, the ordinary man would do well to suspend his judgment.
Books to Read While the Algae Grow in Your Fur; The Pleasures of Detection
Posted by crshalizi at July 31, 2007 23:59 | permanent link
I will be speaking at this workshop, sponsored by the British Antarctic Survey, in Cambridge next month; my talk, on complexity measures, will be on Friday the 17th. I am looking forward to the conference (including, among other things, hearing Sandra Chapman talk about ecosystems and turbulence, Jörn Davidsen talk about spatio-temporal clustering, and Nick Watkins talk about albatrosses), and to finally visiting Cambridge. But the news leads me to wonder whether England will not have sunk beneath the waters by then, like another Lyonesse...
(Conference notes from Montreal forthcoming, sooner or later.)
Posted by crshalizi at July 26, 2007 19:40 | permanent link
I am currently holding my lovely, brand-new copy of John Emerson's Substantific Marrow. If you enjoy this blog, then it is very likely that you, too, want to buy a copy of Substantific Marrow, though you may not be aware of this fact. In saying this, I am not trying to sway your mind, merely to help bring to consciousness one of your own intrinsic desires.
Posted by crshalizi at July 24, 2007 13:47 | permanent link
Mohammad Zahir Shah died yesterday in Kabul. Lawyers, Guns and Money provides a capsule summary of the history of the dynasty.* This will, in all likelihood, have absolutely no impact on Afghan politics, which had long since passed him by. A well-intentioned man, he was not a strong ruler, and had long since become merely a symbol of a better time, an era before the birth of much, perhaps most, of the current population of Afghanistan and its diaspora.
It was good that he got to go home, at the end. May he rest in peace.
*: Farley omits the fact that the man who overthrew Zahir Shah in 1973 and declared himself president of the new republic of Afghanistan was his cousin, the former premier Mohammad Daoud Khan. One may thus quibble as to whether Zahir or Daoud was the last member of the family to rule from Kabul.
Posted by crshalizi at July 24, 2007 13:13 | permanent link
Mark Kleiman, among others, is being naive:
The more I think about it, the more convinced I am that the real action is on the appropriations side. By de-funding a bunch of units the average voter never heard of and doesn't care about, Congress can bring the Bush Administration to its knees. And unlike the Iraq pullout measures, this doesn't require 60 votes in the Senate.
In fact, it doesn't even require 50 votes in the Senate: just a majority in the House and a Senate Majority Leader willing to play rough.
Assume the House passes a General Government appropriation bill zeroing out the White House press office, political office, personnel office, and counsel's office. And assume that Lieberman defects and that all the Senate Republicans stand with Bush, so the Senate votes to restore those cuts. (That seems unlikely, but assume it for the sake of argument.) The House stands firm and asks for a conference. Reid and Pelosi make sure that all the Democratic conferees support the cuts. So the bill comes out of conference with no funding for those offices.
A conference report can't be amended. So Bush's supporters in the Senate couldn't put the money back. Their choices would be to (1) vote for the bill, thus de-funding those offices or (2) vote against the bill, thus de-funding the entire White House, Treasury Department, and several other agencies. The same goes for Bush: he can veto the bill, but that doesn't keep the money coming. As long as the House majority stands firm, the Democrats hold all the cards.
What on Earth makes him think that these offices will cease to have money just because Congress has defunded them? The natural response would be to denounce this as an unconstitutional interference with the unitary executive, and order them to go on as before.
Are taxes collected by the executive branch? Yes. Are Treasury bonds sold by the executive branch? Yes. Is money printed by the executive branch? Yes. Are the checks to pay for government operations cut by the executive branch? Yes. For Congress rather than the President to control the budget, executive branch employees must be so unwilling to break the law in response to orders that those orders will not be given, or, having been given, not be followed — and those employees must not just be purged until the remnant get with the program.
I'm sure most of the relevant civil servants are praiseworthy bureaucrats, but it would be interesting to know how many of the people running the OMB, for instance, think they swore fealty to the President's person.
Update, 22 July: Fixed sentence fragment.
Manual trackback: Unfogged
Posted by crshalizi at July 20, 2007 17:50 | permanent link
Los Alamos, March 1943: a Primer on how to build a "gadget".
Alamogordo, New Mexico, 16 July 1945: a gadget.

We were lying there, very tense, in the early dawn, and there were just a few streaks of gold in the east; you could see your neighbor very dimly. Those ten seconds were the longest ten seconds that I ever experienced. Suddenly, there was an enormous flash of light, the brightest light I have ever seen or that I think anyone has ever seen. It blasted; it pounced; it bored its way right through you. It was a vision which was seen with more than the eye. It was seen to last forever. You would wish it to stop; altogether it lasted about two seconds. Finally it was over, diminishing, and we looked toward the place where the bomb had been; there was an enormous ball of fire which grew and grew and it rolled as it grew; it went up into the air, in yellow flashes and into scarlet and green. It looked menacing. It seemed to come toward one.
A new thing had just been born.... — Rabi
At the instant of the explosion I was looking directly at it, with no eye protection of any kind. I saw first a yellow glow, which grew almost instantly to an overwhelming white flash, so intense that I was completely blinded.... By twenty or thirty seconds after the explosion I was regaining normal vision.... The grandeur and magnitude of the phenomenon were completely breath-taking. — Serber
From ten miles away, we saw the unbelievably brilliant flash. That was not the most impressive thing. We knew it was going to be blinding. We wore welder's glasses. The thing that got me was not the flash but the blinding heat of a bright day on your face in the cold desert morning. It was like opening a hot oven with the sun coming out like a sunrise. — Morrison
We waited until the blast had passed, walked out of the shelter and then it was extremely solemn. We knew the world would not be the same. A few people laughed, a few people cried. Most people were silent. I remembered the line from the Hindu scripture, the Bhagavad-Gita: Vishnu is trying to persuade the Prince [Arjuna] that he should do his duty and to impress him he takes on his multi-armed form and says, "Now I am become Death, the destroyer of worlds." I suppose we all thought that, one way or another. — Oppenheimer
Trinity site is open the first weekend in April and the first weekend in October. You don't see much. You should go.
If atomic bombs are to be added as new weapons to the arsenals of a warring world, or to the arsenals of nations preparing for war, then the time will come when mankind will curse the names of Los Alamos and Hiroshima.
The peoples of the world must unite, or they will perish. This war, that has ravaged so much of the earth, has written these words. The atomic bomb has spelled them out for all men to understand. Other men have spoken them, in other times, of other wars, of other weapons. They have not prevailed. There are some, misled by a false sense of human history, who hold that they will not prevail today. It is not for us to believe that. By our works we are committed, committed to a world united, before the common peril, in law, and in humanity. — Oppenheimer, 16 October 1945
Posted by crshalizi at July 16, 2007 21:45 | permanent link
Via Jay Han in e-mail comes an hour-long video of a very cool talk Marjorie Shapiro recently gave at Google on what physicists hope to learn from the eagerly-awaited LHC at CERN, in particular the ATLAS experiment, and on the way data from any high-energy experiment of this type gets made and massaged. (There are also PDF slides.) Given the audience, the emphasis is on the latter, which might sound duller than talking about strings or loops or even the Higgs boson, but which I think is really deeply impressive. Incredible challenges in "data engineering" arise when you need to design your system to keep less than one observation in a hundred thousand, and that still produces petabytes of data, which must be analyzed by a collaboration of two thousand physicists and engineers. The ways physicists achieve all this are worth pondering by anyone who has, or hopes to have, huge bodies of data concealing rare, relevant pieces of information.
Prof. Shapiro taught Jay and me particle physics when we were both undergrads at Cal. Watching the talk reminded me of why I liked her class so much, while still making me glad I wound up in statistics.
Posted by crshalizi at July 11, 2007 22:30 | permanent link
Attention conservation notice: I know nothing about music and have no taste.
Unsurprising observation: all the performers here are, in fact, really good. Two highlights, neither of which I'd heard before:
— I am liking Montreal on my first twenty-four hours' acquaintance; it seems friendly, civilized and enjoyable. The fact that my hotel has a combination bookstore-bistro on the first floor, and there is a 24/7 fresh produce stand down the block, has something to do with this. (This looks like it will be useful.) So does anticipation of the conference, conveniently not interfering with the jazz festival, which ends tonight.
Update, 8 July: two additional recommendations from the last day of the festival: Manouche (locals) and the Elizabeth Shepherd Quartet (not trio!). Disappointingly, the on-site record store did not have CDs from any of these groups (though they did have some Didier Lockwood).
Posted by crshalizi at July 08, 2007 12:00 | permanent link
But I don't know how else to feel, when dubiously legal and definitely undemocratic programs of spying on domestic political dissenters get shopped to private companies through a profoundly corrupt contracting process, and records conveniently disappear without causing any official comment. (Via Laura Rozen, who has been following this story from the beginning.) — The really depressing thing is that even if, inshallah, the GOP loses the White House, and doesn't gain the House or the Senate, in 2008, it's not clear how much of this will change. If the last sixty years of the military-industrial complex is anything to go by, the rapidly-growing espionage-industrial complex of spooks and contractors will be very hard indeed to uproot. Wasting money on jets and battle-ships for never-going-to-happen wars is one thing, and might even be excused as Keynesianism-that-dare-not-speak-its-name, but making money out of classifying peaceful political opponents of the current administration as enemies of the state seems, not to put too fine a point on it, like a danger to the republic.
Further scenes of your semi-privatized national surveillance state at work:
(Updated 11 July to fix sloppy phrasing about the upcoming elections; thanks to "shelby" at Crooked Timber for pointing this out.)
Manual trackback: Crooked Timber; Wintry Smile; Log; Three Quarks Daily
Posted by crshalizi at July 06, 2007 23:55 | permanent link
The generalissimo and president-for-life of a banana republic is talking with some favored reporters from the tame press. One of them was slightly more brave, more honest, or perhaps just more stupid than the rest.
Reporter: Your Excellency, many people are curious about the way so many of your friends have become so extremely... fortunate... since the formation of your government of national salvation.
President: For friends, anything!
Reporter: And for your enemies, Excellency?
President: For enemies, the law!
Posted by crshalizi at July 04, 2007 12:52 | permanent link
Ladies, Gentlemen, and Distinguished Others, I present for your consideration the incomparable Ursula K. Le Guin's
On Serious Literature
`Michael Chabon has spent considerable energy trying to drag the decaying corpse of genre fiction out of the shallow grave where writers of serious literature abandoned it.'* Ruth Franklin (Slate, 8 May 2007)
Something woke her in the night. Was it steps she heard, coming up the stairs -- somebody in wet training shoes, climbing the stairs very slowly ... but who? And why wet shoes? It hadn't rained. There, again, the heavy, soggy sound. But it hadn't rained for weeks, it was only sultry, the air close, with a cloying hint of mildew or rot, sweet rot, like very old finiocchiona, or perhaps liverwurst gone green. There, again -- the slow, squelching, sucking steps, and the foul smell was stronger. Something was climbing her stairs, coming closer to her door. As she heard the click of heel bones that had broken through rotting flesh, she knew what it was. But it was dead, dead! God damn that Chabon, dragging it out of the grave where she and the other serious writers had buried it to save serious literature from its polluting touch, the horror of its blank, pustular face, the lifeless, meaningless glare of its decaying eyes! What did the fool think he was doing? Had he paid no attention at all to the endless rituals of the serious writers and their serious critics -- the formal expulsion ceremonies, the repeated anathemata, the stakes driven over and over through the heart, the vitriolic sneers, the endless, solemn dances on the grave? Did he not want to preserve the virginity of Yaddo? Had he not even understood the importance of the distinction between sci fi and counterfactual fiction?
Could he not see that Cormac McCarthy -- although everything in his book (except the wonderfully blatant use of an egregiously obscure vocabulary) was remarkably similar to a great many earlier works of science fiction about men crossing the country after a holocaust -- could never under any circumstances be said to be a sci fi writer, because Cormac McCarthy was a serious writer and so by definition incapable of lowering himself to commit genre? Could it be that that Chabon, just because some mad fools gave him a Pulitzer, had forgotten the sacred value of the word mainstream? No, she would not look at the thing that had squelched its way into her bedroom and stood over her, reeking of rocket fuel and kryptonite, creaking like an old mansion on the moors in a wuthering wind, its brain rotting like a pear from within, dripping little grey cells through its ears. But its call on her attention was, somehow, imperative, and as it stretched out its hand to her she saw on one of the half-putrefied fingers a fiery golden ring. She moaned. How could they have buried it in such a shallow grave and then just walked away, abandoning it? "Dig it deeper, dig it deeper!" she had screamed, but they hadn't listened to her, and now where were they, all the other serious writers and critics, when she needed them? Where was her copy of Ulysses? All she had on her bedside table was a Philip Roth novel she had been using to prop up the reading lamp. She pulled the slender volume free and raised it up between her and the ghastly golem -- but it was not enough. Not even Roth could save her. The monster laid its squamous hand on her, and the ring branded her like a burning coal. Genre breathed its corpse-breath in her face, and she was lost. She was defiled. She might as well be dead. She would never, ever get invited to write for Granta now.
*NOTE: The rest of Ruth Franklin's review of Michael Chabon's The Yiddish Policemen's Union is quite thoughtful, generally positive, and not dismissive of his longing to destroy phony divisions between "genre" and "literature." I just couldn't resist the all too familiar image of her first sentence.
Comments:
(Via Ansible, via Sidelights.)
Posted by crshalizi at July 03, 2007 20:18 | permanent link
Speaking, as we were, of the Reformation, one point I neglected in my description of MacCulloch's book is the stress he lays, at several points, on the (veridical) perception that Latin Christendom was under attack by the Ottoman Turks, and the way the need to defend against the Ottomans forced the Catholic powers into compromises with Protestants, establishing at least some measure of toleration, much against everyone's will (since the Protestants would have preferred to rule, not be tolerated). Now, via Dani Rodrik, some statistical evidence:
Now, the statistics here are the kind of thing you'd expect to see from an economist (who is not an econometrician), so I'm not entirely happy with them. I should make it clear that the things I'm about to complain about are things I could complain about for essentially any paper I read in quantitative social science. That being the case, I'll stick my grumblings down below, and just talk about why this matters. (What follows makes no pretense to originality, being merely warmed-over William McNeill, Ernest Gellner and Marshall Hodgson.)
The breakthrough to auto-catalytic intellectual, social and economic growth first took place in a fairly small part of northwest Europe, where the outcome of the wars of religion had been to establish a social space with some degree of secularism, mutual toleration and individual autonomy. Moreover, it took place in the context of intricate and intense rivalries between the different European states, where the new kinds of social power that the breakthrough made possible were eagerly seized upon as a strategic advantage, and where it was possible for unpopular or deviant thinkers (practical or theoretical) to move from one state to another, playing them off against each other. This was weird. If you look at other world civilizations, which were contemporary and certainly comparable, the trend was towards the formation of massive "gunpowder empires", which may have been more or less tolerant within certain limits, depending on the whim of the emperor (compare Akbar to Aurangzeb), but certainly did not encourage this kind of disruptive innovation. Had one of the Christian dynasties — e.g., the Hapsburgs — succeeded in establishing that kind of empire in Europe, it is hard to imagine modernity actually taking off. Had the Catholic powers at least been able to crush the Protestants, even if they remained dynastically divided, it's hard to see it being much more innovative than, say, the Muslim world, divided between the Ottoman, Safavid and Mughal empires, with assorted smaller principalities around them. Had the Protestants, by some miracle, been able to actually get what they wanted (perhaps under Gustavus Adolphus?), it would have been just as repressive. The situation in Amsterdam, London and their dependencies wasn't what anyone had set out to create; it was for the most part regarded as a regrettable realization that neither side could pound the other into submission, and so establish a proper Christian commonwealth.
One of the reasons none of these scenarios took place, and so we enjoy our current approximations to open societies, was that all the Christians had to fight off the Ottomans, which in particular meant coming to those regrettable compromises which established toleration. This is actual quantitative evidence that the Ottoman threat really did have a substantial impact, by way of making the Christians fight each other less and put up with each other more.
One could also imagine an alternative history where the Christians so completely failed to ignore the fact that the other side were damned dirty heretics who deserved to burn (in this world and the next) that the Ottomans were actually able to conquer much or most of Latin Christendom. This scenario also doesn't look good for the great transformation, but it's worth remembering Ernest Gellner's remarks on a related counter-factual:
I like to imagine what would have happened had the Arabs won at Poitiers and gone on to conquer and Islamise Europe. No doubt we should all be admiring Ibn Weber's The Kharejite Ethic and the Spirit of Capitalism which would conclusively demonstrate how the modern rational spirit and its expression in business and bureaucratic organisation could only have arisen in consequence of the sixteenth-century neo-Kharejite puritanism in northern Europe. In particular, the work would demonstrate how modern economic and organisational rationality could never have arisen had Europe stayed Christian, given the inveterate proclivity of that faith to a baroque, manipulative, patronage-ridden, quasi-animistic and disorderly vision of the world. A faith so given to seeing the cosmic order as bribable by pious works and donations could never have taught its adherents to rely on faith alone and to produce and accumulate in an orderly, systematic, and unwavering manner. Would they not always have blown their profits in purchasing tickets to eternal bliss, rather than going on to accumulate more and more? A M