The Volokh Conspiracy - Judge Sonia Sotomayor: What the Data Show

Much has been written about Jeff Rosen's New Republic piece, which cited anonymous sources who disparaged Judge Sotomayor's judicial ability. It is never easy to evaluate judges, or to evaluate their evaluators, especially when those evaluators insist on anonymity. Fortunately, data on judicial performance exist, and although the data have problems as well, they provide a firmer basis for evaluation.

The most complete effort so far to evaluate federal appellate judges is this paper by Stephen Choi and Mitu Gulati. Choi and Gulati use data from Lexis to measure three aspects of the judge's performance—productivity, opinion quality, and independence. They then rank judges according to how they do along these three dimensions. The data set comes from the years 1998-2000 and includes 98 judges. Unfortunately, Choi and Gulati excluded Judge Sotomayor from the data set because she was appointed in 1998 and thus does not have complete data for that year. Judge Wood is in their data set and ranks eighth in the composite ranking. (Disclosure: Judge Wood teaches at my law school.)

To determine how Sotomayor would do in the ranking, I had some research assistants collect her data for the years 1999-2001. To address the "freshman effect" (the possibility that her statistics are worse for her earliest years because of inexperience), we also looked at her data from 2006. I provide a brief summary below. If you want more detail about the methods, you will need to read the variable definitions in the paper. And if you want a defense of these measures, you will need to read the paper, but in any event should know that they are controversial.

Productivity. Judges write opinions, which provide guidance to lawyers and the public. All else equal, a judge who writes more opinions is more productive, and provides a greater social benefit. Over the three-year period from 1998 to 2000, the most productive judge published 269 opinions, the least productive judge published 38 opinions, and the mean was 98.1. For the comparable period from 1999-2001, Judge Sotomayor published 73 opinions. She would have ranked 68th out of 98. However, she was substantially more productive in 2006, publishing 36 opinions in one year (though that also is higher than in other recent years). (Judge Wood was fourth; then-Judge Alito was 72nd.)
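To make the productivity comparison concrete, here is a minimal sketch of the arithmetic, using only the figures quoted above. Multiplying the 2006 count by three is an illustrative assumption (it pretends the 2006 pace held for a full three-year window), not something Choi and Gulati do, and the variable names are invented for this example.

```python
# Figures quoted in the post.
MEAN_OPINIONS_3YR = 98.1   # mean across the 98 judges, 1998-2000
SOTOMAYOR_3YR = 73         # Judge Sotomayor, 1999-2001
SOTOMAYOR_2006 = 36        # Judge Sotomayor, 2006 alone


def ratio_to_mean(count, mean=MEAN_OPINIONS_3YR):
    """Return a three-year opinion count as a fraction of the mean."""
    return count / mean


# Early-career output relative to the mean judge.
early_ratio = ratio_to_mean(SOTOMAYOR_3YR)

# ASSUMPTION: project the 2006 pace over three years for comparison.
projected_3yr = SOTOMAYOR_2006 * 3
late_ratio = ratio_to_mean(projected_3yr)

print(f"1999-2001: {SOTOMAYOR_3YR} opinions, {early_ratio:.2f} of the mean")
print(f"2006 pace x 3: {projected_3yr} opinions, {late_ratio:.2f} of the mean")
```

On these numbers her early output is about 0.74 of the mean, while the 2006 pace projects to 108 opinions, about 1.10 of the mean. Since 2006 was higher than her other recent years, the projection is best read as an upper bound.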

Quality (1). Choi and Gulati measure quality by counting citations to a judge's top twenty opinions. This approach avoids penalizing productive judges whose marginal opinion addresses narrower issues (as a citation-per-opinion measure would), while also avoiding excessive credit to productive judges who garner high citations by writing a lot. Judge Wood's top twenty cases over three years received 327 outside-circuit citations, putting her 26th. The range is 96 to 734, with a mean of 277.9. Judge Sotomayor's statistic is 231, which would place her 59th. (Alito was 70th.)

Quality (2). Judge Sotomayor's opinions from 1999-2001 were cited 289 times in law reviews and other legal periodicals through May 31, 2004. Judge Wood's opinions from 1998-2000 were cited 513 times through May 31, 2003 (16th). (Alito's were cited 240 times (73rd).) Sotomayor would have ranked 65th.

Quality (3). Choi and Gulati also check what they call "invocations"—the frequency with which opinions written by other judges refer to the judge in question by name. They argue that invocations are most likely when the judge in question either has a good reputation or has written a particularly helpful opinion. Invocations range from 0 to 175 (excluding two outliers, the highest is 23), with a mean of 32. Judge Sotomayor was invoked 0 times (tied for last). Judge Wood was invoked 10 times (9th), and Judge Alito was invoked 5 times (28th).

Independence. Judges should decide cases in a non-partisan way; "independence" refers to the probability that a judge will dissent from a majority opinion written by a co-partisan or will write a majority opinion from which a co-partisan dissents. Choi and Gulati use a complicated variable that attempts to measure this tendency, and I simplify here. A score of 0 means that a judge is just as likely to disagree as agree with a co-partisan (or opposite-partisan). Negative scores mean that a judge is more likely to agree with co-partisans. Judge Sotomayor's score is -0.153 (read the paper if you want to know how this score is calculated), which would have placed her 55th. Judge Wood has a score of -0.018, placing her eighth in terms of independence. (Alito was 10th.)

Choi and Gulati report a composite ranking—but for technical reasons (they controlled for various factors, for example, variation between circuits) it is difficult to put Judge Sotomayor into such a ranking. In addition, Choi and Gulati's basic approach, which is to give equal weight to productivity, quality, and independence, is contestable, as is their decision to base quality on invocations and citations to top-twenty opinions. They check for robustness by reporting how rankings change as the weightings are adjusted, but their robustness checks are too elaborate to repeat here. Wood ranked eighth; Alito sixteenth. Sotomayor would be in the bottom half. However, I would not put too much weight on this conclusion. Choi and Gulati intended their composite ranking as an experiment to provoke discussion on how best to evaluate and rank judges.

One can, however, make some rough judgments based on the disaggregated rankings. The bottom line is that Judge Sotomayor is about average, or maybe a bit below average, for a federal appellate judge. These results are far from conclusive, but one might think that they put the burden on Judge Sotomayor's defenders to come forward with stronger reasons for her nomination than they have so far. Judge Wood is stronger; I would say that she is impressive, but others might weight the factors differently.
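For readers who want the disaggregated rankings side by side, the sketch below tabulates the ranks reported above for the four measures where all three judges have a numerical rank. Invocations are left out because Judge Sotomayor's rank there is given only as "tied for last." The median is a crude summary chosen for this illustration; it is not Choi and Gulati's composite, which controls for factors such as variation between circuits. The dictionary keys and function name are made up for the example.

```python
import statistics

TOTAL_JUDGES = 98  # size of the Choi and Gulati data set

# Ranks as reported in the post (1 = best). Sotomayor's are hypothetical
# placements, since she was not in the original data set.
ranks = {
    "Sotomayor": {"productivity": 68, "outside_citations": 59,
                  "law_review_citations": 65, "independence": 55},
    "Wood":      {"productivity": 4, "outside_citations": 26,
                  "law_review_citations": 16, "independence": 8},
    "Alito":     {"productivity": 72, "outside_citations": 70,
                  "law_review_citations": 73, "independence": 10},
}


def in_bottom_half(rank, total=TOTAL_JUDGES):
    """True if the rank falls below the midpoint of the ranking."""
    return rank > total / 2


# Median rank per judge, plus a count of bottom-half placements.
summary = {}
for judge, by_measure in ranks.items():
    values = list(by_measure.values())
    summary[judge] = {
        "median_rank": statistics.median(values),
        "bottom_half_count": sum(in_bottom_half(r) for r in values),
    }

for judge, stats in summary.items():
    print(judge, stats)
```

On these four measures Judge Sotomayor's median rank is 62 with all four placements in the bottom half, against a median of 12 for Judge Wood. Judge Alito's median of 71 alongside a composite rank of sixteenth shows how far a simple median can diverge from the composite, which is one reason to treat any single summary number with caution.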

Judicial ability is not the only thing that matters for Supreme Court nominations, of course. People also care about the nominee's politics (euphemistically called "judicial philosophy"), meaning how she will vote on abortion rights and related issues. For those poring over Judge Sotomayor's and Judge Wood's opinions for clues about their political leanings, a word of advice. Research shows that appellate judges don't like to dissent, and they don't like responding to dissents, and so authors shade their opinions to the political center of gravity of the panel. At the time when the data were collected, the Seventh Circuit consisted mostly of Republicans, while the Second Circuit consisted mostly of Democrats. That means that, in the case of panels with mixed partisan membership, Judge Wood's actual views are probably somewhat to the left of her majority opinions, while Judge Sotomayor's views are to the right of her majority opinions. Focus on dissents and concurrences, which better reflect the judges' actual views, or unanimous opinions when all members of the panel belong to the same party.

A final note. I have done my best to put these statistics together but it is possible that I have made errors. I am reasonably confident about my overall conclusions but not the exact rankings. The data are in any event too crude to allow for precise discrimination. Feel free to take a look at the data yourself and draw your own conclusions. If someone wants to gather data on the reversal rates of the judges, that would also be useful.

	[Eric Posner, May 13, 2009 at 11:40am]