Oregon Appeals Court Says Bullet Cartridge Matching Is Just More Junk Science
from the facts-don't-care-about-your-bullet-casing-feelings dept
This isn’t the first state court to reach this conclusion, but so few courts bother to examine the science-y sounding stuff cops trot out as “evidence” that this decision is worth noting.
There’s no shortage of junk science that has been (and continues to be) treated as actual science during testimony, ranging from the DNA “gold standard” to seriously weird shit like “I can identify a suspect by the creases in his jeans.”
Anyone who’s watched a cop show has seen a detective slide a pen into a shell casing and place it gently in an evidence bag. At some point, a microscope gets involved and the prosecutor (or witness) declares affirmatively that the markings on the casing match the barrel of the murder weapon. Musical stings, ad breaks, and tidy episode wrap-ups ensue.
Maryland’s top court dismantled these delusions back in 2023 by actually bothering to dig into the supposed science behind bullet/cartridge matching. When it gazed behind the curtain, it found AFTE (the Association of Firearm and Tool Mark Examiners) and its methods more than a little questionable.
To sum up (a huge task, considering this was delivered in a 128-page opinion), AFTE’s science was little more than confirmation bias. When trainees were tested, they knew one of the items they examined came from the gun used in the test. When blind testing was used, the nearly 80% “success” rate in matches dropped precipitously.
He observed, however, that if inconclusives were counted as errors, the error rate from that study would “balloon[]” to over 30%. In discussing the Ames II Study, he similarly opined that inconclusive responses should be counted as errors. By not doing so, he contended, the researchers had artificially reduced their error rates and allowed test participants to boost their scores. By his calculation, when accounting for inconclusive answers, the overall error rate of the Ames II Study was 53% for bullet comparisons and 44% for cartridge case comparisons—essentially the same as “flipping a coin.”
From “pretty sure” to a coin flip. Not exactly the standard expected from supposed forensic science. And that’s common across most cop forensics. When blind testing is used, error rates soar and stuff that’s supposed to be evidence looks a whole lot more like guesswork.
The same conclusion is reached here by the Oregon Court of Appeals, which ultimately reverses the lower court’s refusal to suppress this so-called evidence.
This opinion [PDF] only runs 43 pages, but it makes the same points, albeit a bit more concisely. As the lead-in to the deep dive makes clear, cartridge matching isn’t science. It’s just a bunch of people looking at stuff and drawing their own conclusions.
As we will explain, in this case, the state did not meet its burden to show that the AFTE method is scientifically valid, that is, that it is capable of measuring what it purports to measure and is able to produce consistent results when replicated. That is so because the method does not actually measure the degree of correspondence between shell cases or bullets; rather, the practitioner’s decision on whether the degree of correspondence indicates a match ultimately depends entirely on subjective, unarticulated standards and criteria arrived at through the training and individualized experience of the practitioner.
For a similar reason, the state did not show that the method is replicable and therefore reliable: The method does not produce consistent results when replicated because it cannot be replicated. Multiple practitioners may analyze the same items and reach the same result, but each practitioner reaches that result based on application of their own subjective and unarticulated standards, not application of the same standards.
That’s a huge problem. Evidentiary standards exist for a reason. No court would allow people to take the stand and speculate wildly about whether or not any evidence exists that substantiates criminal charges. Tossing a lab coat over a bunch of speculation doesn’t suddenly make subjective takes on bullet markings “science.” And continuing to present this guesswork with any level of certainty perverts the course of justice.
[W]hen presented as scientific evidence, AFTE identification evidence—an “identification” purportedly derived from application of forensic science—impairs, rather than helps, the truthfinding process because it presents as scientific a conclusion that, in reality, is a subjective judgment of the examiner based only on the examiner’s training and experience and not on any objective standards or criteria.
In an effort to salvage this evidence, the government claimed the AFTE Journal was self-certifying. In other words, the fact that AFTE published this journal was evidence in and of itself of scientific rigor. Both the trial court and the appeals court disagreed:
The court rejected the idea that the AFTE Journal, which the government argued shows that the method is subject to peer review, satisfies that factor for two reasons: because the AFTE Journal “is a trade publication, meant only for industry insiders, not the scientific community,” and, more importantly, because “the purpose of publication in the AFTE Journal is not to review the methodology for flaws but to review studies for their adherence to the methodology.”
The ruling quotes many of the same studies cited by the Maryland court in its 2023 decision — the blind studies that made it clear cartridge matching is mostly guesswork. This court arrives at the same conclusion:
[T]he AFTE method, undertaken by a trained examiner, may be effective at identifying matches, but the problem is that, from what was in the record before the court, the analysis is based on training and experience— ultimately, hunches—not science…
To sum up, this method lacks anything that could be considered sound science:
Neither the AFTE theory nor the AFTE method prescribes or quantifies what the examiner is looking for; the examiner is looking for sufficient agreement, which is defined only by their own personal identification criteria.
Having arrived at this conclusion, the court does what it has to do. It reverses the lower court’s denial of the suspect’s suppression motion. The “error” of putting this “evidence” on the record was far from harmless. The state has already announced it plans to appeal this decision, but for now, investigators hoping shell markings will help them close some cases might want to dig a little deeper in the evidence locker.
Filed Under: bullshit, due process, evidence, forensics, junk science, police


Comments on “Oregon Appeals Court Says Bullet Cartridge Matching Is Just More Junk Science”
I’m glad it’s finally becoming a mainstream position. The people seriously into firearms and the science and engineering behind them have known for ages that it was junk.
Re:
Yeah, blind trust in police forensic evidence in general has rightfully been eroded over the past 30 years, thanks to much closer scrutiny.
Fingerprint matching was also found lacking; even DNA forensics suffers from big sources of possible error.
The “science” is often just unproven “common knowledge” and long-term custom.
Police/FBI labs were found to be loaded with human error and fairly frequent falsification.
Re:
Don’t shoot people, just keep throwing whole bullets at the receiver, so to speak, and hope one goes up their nose, or in their mouth, and jams in their throat and asphyxiates them to death.
Hey presto, death by bullet without having to fire a single shot.
I’ve been sticking screwdrivers down my barrels to mar up the rifling between shootings for nothing?
This comment has been flagged by the community. Click here to show it.
More appeals court nonsense. I studied criminology and know for a fact that bullet matching works. Tiny grooves in gun cause tiny grooves in bullet. Simple as that.
Re:
Hello, did you read the article? They are talking about the shell casing, not the projectile. We know, reading is hard…
Re: As well as
They have proven that barrel markings on bullets really don’t match.
It’s soft material like lead and copper (so you don’t destroy your barrel). It may take an impression, but out of 12-16″ of markings, all you see is the end of the track. Then the soft bullet, which was compressed inside the barrel, changes its shape a bit in travel, and impact ruins it all.
Why do you think the bullet is smaller than the shell firing it? The bullet expands to engage the grooves. Now look at a sled: it’s two runners in the snow. There are a few dings in the runners, but not in the snow.
Sounds like More appeals court nonsense. I studied criminology and know for a fact that bullet matching works. Tiny grooves in gun cause tiny grooves in bullet. Simple as that.
Re: Third Time's the Charm, OP
This comment was wrong the first two times you posted it, but maybe it’s magically correct now? Let’s ask the ATFE…
I disagree respectfully
Sounds like More appeals court nonsense. I studied criminology and know for a fact that bullet matching works. Tiny grooves in gun cause tiny grooves in the bullet. It is as Simple as that.
Re:
Sorry to learn that your hearing is going.
Re:
When gun barrels were rifled using single-point cutters, there may have been enough variation to associate a fired bullet with a particular gun, sometimes. Modern manufacturing aims for maximum consistency, which means hundreds of barrels made by cold hammer forging will have identical rifling, and button rifling and electrochemical machining will be equally consistent.
This doesn’t even address the ease of changing barrels in a semi-auto pistol. I can take a .40 S&W Glock 23 and install a 9mm conversion barrel, or a .357 SIG barrel, do crime, ditch the odd caliber and revert to .40 before the cops come knocking.
Re:
Respectful people don’t spout the same obvious and off-topic bullshit 3 times in a row. Maybe you should experience a conviction by coin flip?
Re:
NO THANKS
I don’t know if it’s sufficient for trial evidence, but it seems weird to call inconclusive an error.
If an examiner can correctly identify that they don’t have enough evidence to be sure, that seems… fine? It doesn’t hurt a defendant if it’s labeled inconclusive.
Re:
Admittedly, “failure rate” may be colloquially more accurate than “error,” but the methodology is intended to determine whether “patterns and markings on bullets are consistent or inconsistent with those on bullets fired from a particular known firearm.” If the results of using that methodology are inconclusive, the methodology fails just as much as if it got the wrong answer, because reality is binary: either the bullets at issue came from the gun in question or they didn’t, and the methodology’s inability to make that determination should not be counted in its favor.
Re: It matters bc there is no objective standard for "inconclusive"
A forensic scientist can reject all the tough cases they want when they know that their success rate is being measured. When it comes time to put someone in jail, they are free to opine on those tough cases. Their measured success rate on easy cases has no relation to whatever success rate they have when pressed to provide an opinion on tough cases. Because there is no standard for “inconclusive” and the analyst is free to include them when providing an opinion, they should be included as errors so that the success rate cannot be artificially inflated.
Re: Re:
That makes sense, thanks. I can see there being pressure to resolve a case skewing things.
I wonder if you could do “in the field” tests, so they wouldn’t know a synthetic test from a real case? Basically, the equivalent of a secret shopper. But I suppose you wouldn’t be able to simulate the court testimony, so that’d be a giveaway.
Re:
If you have 8 correct, 4 incorrect and 8 inconclusive in a test, does it make more sense to say your score is 8/20 or 8/12? If you leave your answer blank on a test in school, the teacher doesn’t just remove it from the scoring.
Re: Re:
Let’s say you have 100 cold cases where you have 2 suspects. You give the evidence to two DNA experts. The first one correctly matches 49 and incorrectly matches 1, but the evidence is too degraded in the other 50, so he labels them inconclusive. The second one correctly matches 40, incorrectly matches 10, and guesses for the 50 degraded ones, getting half of those right.
So, are you going to say the second guy is better at his job since he got 65% correct while the first guy only got 49% correct? Or is the more relevant stat here the correctness on matches, which for the first guy was 98% and for the second guy was 65%?
Being able to say “I don’t know” in a criminal case is important.
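The arithmetic in that hypothetical can be checked directly. A quick sketch (using the made-up numbers from the comment above, not real study data) showing how counting inconclusives as misses versus excluding them produces the two different rates:

```python
def rates(correct, incorrect, inconclusive):
    """Return (overall accuracy counting inconclusives as misses,
    accuracy on only the cases where a conclusion was offered)."""
    total = correct + incorrect + inconclusive
    concluded = correct + incorrect
    overall = correct / total
    on_conclusions = correct / concluded if concluded else 0.0
    return overall, on_conclusions

# Expert 1: 49 correct, 1 incorrect, 50 labeled inconclusive
print(rates(49, 1, 50))   # -> (0.49, 0.98)

# Expert 2: guesses the degraded cases, ending with 65 correct, 35 incorrect
print(rates(65, 35, 0))   # -> (0.65, 0.65)
```

Which number matters depends on what you’re measuring: the first column rewards guessing, while the second only scores the conclusions an examiner actually offers.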
Re: Re: Re:
A good point in the context of performance evaluations (or rather, competence evaluations). Not so much in the context of criminal trials.
Re: Re: Re:2
“Inconclusive” has to be a valid result. That’s just reality.
Re: Re: Re:3 Where is the field guide?
Of course a forensics examiner should be able to give “inconclusive” as a conclusion rather than “matching” or “not matching”. That is separate from how correct the examiner’s conclusion is. How do we know whether an examiner’s conclusion is correct in the first place? The answer is: the examiner followed a reliable standard. How do we know whether a standard is reliable? The answer is: when used by the same person repeatedly and when used by different people, the standard produces the same results most of the time or similar results almost all the time. Keep in mind that the objective of the prosecutor in a criminal trial is to convince the jury to find guilt “beyond a reasonable doubt”. If the standard is unreliable, any evidence produced using the standard becomes grounds for doubt, and any evidence produced without a documented, quantifiable standard is automatically doubtful.
Regarding the example from a few comments ago:
The example conveniently has two examiners work with the same evidence. In real life, there will be one examiner in most cases (because hiring examiners costs the state money). In any case, but especially when there’s only one examiner, each examiner should follow a standard that any examiner can use. In the example, (error 1) at least one examiner wasn’t following the standard correctly, (errors 2 and 3) the standard used by the examiners was incomplete/unreliable (red flag in a criminal trial), or (error 4) there is no standard (MASSIVE red flag). What we have in reality is, at best, a great deal of errors 1 and 3: the procedures of the bullet/cartridge matching studies were flawed (incentivized “inconclusive” over “not matching”, weren’t double blind, Hawthorne effect, lack of quantifiable protocols for examiners to follow, etc.), and different results between studies that had and didn’t have known correct answers. I think bullet/cartridge matching in general suffers from error 4: real bullet/cartridge examiners don’t use or don’t have a shared, documented, quantifiable standard.
Tangentially, an examiner’s track record of conclusions is not evidence of how correct the conclusions are in any particular case. What really matters is whether the examiner follows a standard in the particular case. Suppose that I’m a forensics examiner. If I never follow a standard but have a history of making 90% correct conclusions, all of my conclusions should be cast into doubt, and my conclusions should be inadmissible as evidence moving forward. If in my past cases I followed the standard 100% of the time with a history of making 90% correct conclusions, but in a new case I don’t follow the standard, my conclusion in the new case should be inadmissible as evidence. Otherwise, the jury would be basing their decision on subjectivity, chance, and ethos – vibes for short. If I followed the standard 100% of the time and fully intend to do so this time, then my conclusion is admissible, and the jury should consider whether I really did follow the standard. (Briefly, the jury should be told how the standard works.)
Re: Re:
It depends what you’re trying to measure. They tell you different things, both of which can be useful (“out of all incidents, how many did it get correct” vs. “out of the cases where it attempted an answer, how many did it get correct”).
For a normal school test, we specifically want to test broad based knowledge on the subject, so it doesn’t make sense to restrict it. If a student aces a specific subset of the topic, that isn’t sufficient to show broad knowledge. But a tool can still be useful even if it’s only relevant on a subset.
You could imagine the extreme case where it goes 99 inconclusive and 1 match. That is very different from (and potentially more useful than) 99 mismatches and 1 match.
Re: the point
An inconclusive means the bullet markings don’t match the gun it was fired from. If it doesn’t match the gun it was fired from, then it’s possible it could match some other gun it was not fired from. Too many inconclusives proves that markings aren’t conclusive in general.
But just dig those groovy cartridge casings, man! They are FAR out! Of evidence, that is. Case dismissed.
Why should you mark an inconclusive as a negative?
Because this stuff isn’t about who took the last cookie from the cookie jar.
Bite mark and hair analysis have both been deemed “bullshit” by the FBI.
Forensics isn’t magic, CSI and Law and Order and NCIS are fictional entertainment.
Because a life hangs in the balance (and make no mistake, innocent men and women have died from this), we are supposed to hold our supposedly SCIENCE-BASED EVIDENCE to a higher standard than a coin flip and a shrug from career testi-liars.
Well, you see, trials are about conclusive evidence… no shadow of a doubt. So, you’ve either proven something or you haven’t. Since you haven’t, your evidence is suspect, as in, not evidence at all. Reasonable doubt becomes absolute doubt, which equates to not guilty. Apparently, too much of the lab work done for cases is all about finding ways to make juries believe suspects are guilty, not so much about finding guilty people to prosecute. (Prosecutors seem to most want to look good for their future careers, mostly in politics.)
The CSI Effect
Good article, thank you for it. Now do one on drug dogs.
Re:
Ahh, good ol’ probable cause on four paws.
Re:
Try searching for those articles. You’ll find too many to read in a sitting.