"Accuracy" is not a property of a test itself. Accuracy is a property of the test AND the group of people the test is applied to.

Claimed Accuracy

This web site gives the opinions of Dr. Greg Kane. Everything you read here is expressed only as my personal opinion.


Decision analyses found that officers’ estimates of whether a motorist’s BAC was above or below 0.08 or 0.04 percent were extremely accurate. Estimates at the 0.08 level were accurate in 91 percent of the cases...

Validation of the Standardized Field Sobriety Test Battery at BACs Below 0.10 Percent, Final Report, Stuster and Burns, 1998, pg. iii

Fundamental claim, fundamental trick
THE GOVERNMENT'S fundamental claim about FSTs is that if you know a person's Field Sobriety Test score you know their Blood Alcohol Concentration. The National Highway Traffic Safety Administration (NHTSA) has paid for several "validation" studies claiming to discover that officers "using" FSTs are "extremely accurate" (about 90% accurate) at identifying drivers with high BACs.

These studies generally measure police officers' "arrest accuracy." What the NHTSA doesn't let on is that the "arrest accuracy" statistic is open to manipulation. Simply by manipulating the group of drivers you choose to “study,” you can set up your "validation" study beforehand so it is certain to “discover” whatever arrest accuracy you’ve been paid to validate. This page shows you how it can be done.

How to “discover” whatever accuracy you’re being paid to discover

This example uses actual government data for the Walk And Turn test. I'd rather use published data for the FST itself. I can't. The government keeps FST data secret.

When one test (Walk And Turn) acts as a stand-in for a second, gold standard test (Blood Alcohol Concentration), scientists summarize the stand-in's performance with 2 x 2 boxes like the ones here. (Scientists call these things contingency tables. NHTSA contractors call them decision matrices.) Gray squares sum each row and column. Percentages keep track of various accuracies.

 

The four squares in a matrix tally a validation study’s results. Every driver in a validation study goes in one of the four squares, depending on their combination of BAC and WAT results. BAC results are separated by row. Drivers with BACs above the legal limit go in the top row. Drivers with BACs less than the legal limit go in the bottom row. Which column a driver goes in depends on whether the WAT said to release or arrest them.

This first matrix gives the Walk And Turn test’s performance in an imaginary validation study of 120 drivers. The study is imaginary, but the outcomes are real—calculated according to accuracies reported in the San Diego FST validation study (Stuster and Burns, 1998, pg 21, fig 5). The San Diego study found that on impaired drivers, the WAT test got the correct answer 98% of the time. On innocent drivers WAT was correct 47% of the time; let's pretend the WAT is more accurate than it really is, and round that up to 50%.

Impaired
One hundred drivers were impaired: 2 were wrongly released and 98 were correctly arrested. The accuracy of the WAT on impaired drivers is 98%.

Innocent
Twenty innocent drivers were tested: 10 were correctly released and 10 were wrongly arrested. The accuracy of the WAT on innocent drivers is 50%.

These two accuracies, impaired driver accuracy and innocent driver accuracy, are fundamental properties of any test. (Scientists call them "sensitivity" and "specificity.") In this validation study, 83% of the drivers tested were impaired (100 of 120) and the accuracy of the WAT’s arrest decision was 91% (98 correct arrests out of 108 arrests).
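The arithmetic behind the first matrix can be checked with a few lines of Python. This is only a sketch: the variable names are mine, and it assumes that "arrest accuracy" means the share of arrest decisions that turned out to be correct, which is the reading that matches the 91% figure above.

```python
# Counts from the imaginary 120-driver study described above.
impaired_arrested = 98   # impaired, WAT said arrest (correct)
impaired_released = 2    # impaired, WAT said release (wrong)
innocent_arrested = 10   # innocent, WAT said arrest (wrong)
innocent_released = 10   # innocent, WAT said release (correct)

total = (impaired_arrested + impaired_released
         + innocent_arrested + innocent_released)

# "Impaired driver accuracy" (sensitivity) and "innocent driver accuracy" (specificity).
impaired_accuracy = impaired_arrested / (impaired_arrested + impaired_released)
innocent_accuracy = innocent_released / (innocent_arrested + innocent_released)

# Mix of drivers in the study group.
share_impaired = (impaired_arrested + impaired_released) / total

# Assumption: arrest accuracy = correct arrests / all arrests.
arrest_accuracy = impaired_arrested / (impaired_arrested + innocent_arrested)

print(f"impaired driver accuracy: {impaired_accuracy:.0%}")  # 98%
print(f"innocent driver accuracy: {innocent_accuracy:.0%}")  # 50%
print(f"share of drivers impaired: {share_impaired:.0%}")    # 83%
print(f"arrest accuracy: {arrest_accuracy:.0%}")             # 91%
```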

 

In the second matrix, look what happens to the NHTSA’s arrest accuracy statistic as the mix of innocent and impaired drivers changes. In this and all other examples on this page, the only thing that changes is the mix of impaired and sober drivers in the study group. All the results here reflect exactly the same highly skilled officers doing exactly the same NHTSA-standardized WAT test. The fundamental accuracies of the test stay the same. In each example, the impaired driver accuracy is 98%, and the innocent driver accuracy is 50%.

 

When the same highly skilled officers do exactly the same WAT test, only this time on a group of drivers who are all innocent (the church parking lot study), the innocent driver accuracy is still 50%, but the NHTSA’s arrest accuracy statistic is 0%, because every arrest is a wrong arrest.

   

 

When all the drivers are impaired (the drunken consultants study), the fundamental impaired driver accuracy is still 98%, but the NHTSA’s arrest accuracy statistic is 100%, because every arrest is a correct arrest.
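The same effect can be sketched as a small Python function that holds the two fundamental accuracies fixed and varies only the mix of drivers. The function name and parameters are illustrative, and it assumes, as the 91%, 0% and 100% figures on this page imply, that "arrest accuracy" is the share of arrests that were correct.

```python
def arrest_accuracy(share_impaired, impaired_accuracy=0.98, innocent_accuracy=0.50):
    """Share of arrests that are correct, for a given mix of drivers.

    Assumption: arrest accuracy = correct arrests / all arrests.
    """
    correct_arrests = share_impaired * impaired_accuracy
    wrong_arrests = (1 - share_impaired) * (1 - innocent_accuracy)
    arrests = correct_arrests + wrong_arrests
    if arrests == 0:
        raise ValueError("no arrests at all, so arrest accuracy is undefined")
    return correct_arrests / arrests


# Same officers, same test, same fundamental accuracies; only the mix changes.
for label, share in [("all innocent (church parking lot)", 0.0),
                     ("100 impaired out of 120", 100 / 120),
                     ("all impaired (drunken consultants)", 1.0)]:
    print(f"{label}: {arrest_accuracy(share):.0%}")
```

Nothing about the test itself is touched between the three calls; only `share_impaired` changes.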

 

Every time the study group changes, the NHTSA’s arrest accuracy statistic changes. Which means that simply by manipulating the mix of impaired and innocent drivers you choose to “study,” you can set up your field test beforehand to “discover” whatever arrest accuracy you’ve been paid to validate. Like this:

If you are being paid to discover an accuracy of 98%, set up a study group 96% of whose drivers are impaired.

If you are being paid to discover an accuracy of 91%, set up a study group 83% of whose drivers are impaired.

If you are being paid to discover an accuracy of 80%, set up a study group 68% of whose drivers are impaired.

If you are being paid to discover an accuracy of 70%, set up a study group 54% of whose drivers are impaired.

If you are being paid to discover an accuracy of 60%, set up a study group 43% of whose drivers are impaired.

If you are being paid to discover an accuracy of 50%, set up a study group 34% of whose drivers are impaired.

If you are being paid to discover an accuracy of 40%, set up a study group 25% of whose drivers are impaired.

If you are being paid to discover an accuracy of 30%, set up a study group 18% of whose drivers are impaired.

If you are being paid to discover an accuracy of 20%, set up a study group 11% of whose drivers are impaired.

If you are being paid to discover an accuracy of 10%, set up a study group 6% of whose drivers are impaired.

If you are being paid to discover an accuracy of 5%, set up a study group 2% of whose drivers are impaired.

If you are being paid to discover an accuracy of 1%, set up a study group 0.5% of whose drivers are impaired.
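The pairs in the list above can be reproduced, approximately, by solving the arrest accuracy calculation backwards for the share of impaired drivers. The Python below is a sketch under the same assumption used throughout this page's figures (arrest accuracy = correct arrests / all arrests); the function name is mine, and the computed shares agree with the list to within about one percentage point.

```python
def required_share_impaired(target, impaired_accuracy=0.98, innocent_accuracy=0.50):
    """Share of impaired drivers needed to 'discover' a target arrest accuracy.

    Obtained by solving
        target = s*p / (s*p + f*(1 - p))
    for p, where s is the impaired driver accuracy and
    f = 1 - innocent driver accuracy is the wrongful arrest rate.
    """
    wrongful_arrest_rate = 1 - innocent_accuracy
    numerator = target * wrongful_arrest_rate
    return numerator / (impaired_accuracy * (1 - target) + numerator)


# Target accuracy -> share impaired, as listed on this page.
listed = {0.98: 0.96, 0.91: 0.83, 0.80: 0.68, 0.70: 0.54,
          0.60: 0.43, 0.50: 0.34, 0.40: 0.25, 0.30: 0.18,
          0.20: 0.11, 0.10: 0.06, 0.05: 0.02, 0.01: 0.005}

for target, page_value in listed.items():
    computed = required_share_impaired(target)
    print(f"target {target:.0%}: computed {computed:.1%}, listed {page_value:.1%}")
```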

These examples represent highly skilled DUI officers performing NHTSA-standardized WAT tests flawlessly. I didn't make up the WAT accuracies. I took them directly from the pages of the NHTSA's most recent, most up-to-date FST validation study. In every example the accuracy of the WAT on innocent drivers is the same, 50%. In every example the accuracy of the WAT on impaired drivers is the same, 98%.

From example to example, only one thing changed: the percentages of impaired and sober drivers. Manipulating only the mix of drivers in your study group lets you manipulate the accuracy you "discover" in your "validation study" to any number between 0% and 100%.

All NHTSA claims that the FST is "extremely accurate" depend on this statistical trick.
It is not merely possible to pick a skewed study group that inflates the "accuracy" the study "discovers" compared with the accuracy in the general population. That's actually how it's done in real life. That's exactly how NHTSA validation studies "discover" a high accuracy for the FST.

What the in-depth articles show you [Two Statistical Tricks Let NHTSA Contractors Validate Any FST as "Extremely Accurate," page 35, Table 1] is that NHTSA contractors do set up their study groups in a way that leads to the application of this statistical trick. Every NHTSA FST validation study that "discovers" a high FST accuracy studies a non-random group of test subjects that inflates the accuracy the study "discovers." Every validation study that fails to use a skewed, non-random study group also fails to "discover" a high FST accuracy.

Let me be clear. By "statistical trick" I mean exactly that: an odd trick of statistics. I do not mean the NHTSA or its contractors are deliberately deceptive. Nothing here is a statement about the knowledge or intentions of the NHTSA or its contractors. It's all about the math.

Attorney note this
The prosecution has no basis for any assertion tying the "accuracy" of the FST to your client's probability of impairment.

The prosecution asserts that the FST is a scientific test with high accuracy. Its reasoning runs like this: the FST is 90% accurate; your client failed the FST; therefore the probability your client was impaired is 90%.

The prosecution has provided no evidence to support this scientific claim. It can't. The claim is false.
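One way to see why is to write out the standard calculation (Bayes' theorem) for the probability of impairment given a failed test. The symbols are my own shorthand, not anything from the NHTSA studies: s is the impaired driver accuracy, c is the innocent driver accuracy, and p is the probability of impairment before the test result is known.

```latex
P(\text{impaired} \mid \text{failed})
  = \frac{s\,p}{s\,p + (1 - c)\,(1 - p)}
```

The answer depends on p, which the test itself does not supply. As an illustration only, with the WAT figures used on this page (s = 0.98, c = 0.50) and an assumed p = 0.50, the result is 0.49 / 0.74, or about 66%, not 90%.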

The defense may:

1. Insist the prosecution provide scientific evidence that its method of interpreting the FST result is scientific. It can't. It isn't.

2. Provide an expert's affidavit or testimony as to how science correctly interprets FST results.

In Colorado this defense has been used successfully to exclude officer testimony as to the "accuracy" of the FST.

 
