from the this-is-good dept
We recently wrote about how Senators Lindsey Graham and Richard Blumenthal are preparing for FOSTA 2.0, this time focused on child porn — which is now being renamed as “Child Sexual Abuse Material” or “CSAM.” As part of that story, we highlighted that these two Senators and some of their colleagues had begun grandstanding against tech companies in response to a misleading NY Times article that seemed to blame internet companies for the rising number of reports to NCMEC of CSAM found on the internet, when that should be seen as more evidence of how much the companies are doing to try to stop CSAM.
Of course, working with NCMEC and other such organizations takes a lot of effort. Being able to scan for shared hashes of CSAM isn’t something that every internet site can do. It’s mostly just done by the larger companies. But last week Cloudflare (one of the companies that Senators are demanding “answers” from) did something quite fascinating: it enabled all Cloudflare users, no matter what level of service, to start using Cloudflare’s CSAM scanning tools for free, even allowing them to set their own rules and preferences (something that might become very, very important if the Graham/Blumenthal bill becomes law).
I highly recommend reading Cloudflare’s entire announcement, because it’s a clear, interesting, and easy-to-read explanation of how fuzzy hashing works (including pictures of dogs and bicycles). As the Cloudflare post notes, those who use such fuzzy hashing tools have intentionally kept at least some of the details secret, because being too public about it would allow those who are producing and distributing CSAM to make changes that “dodge” the various tools and filters, which would obviously be a problem. However, that secrecy also results in two potential issues: (1) a lack of transparency in how these filtering systems really operate and (2) an inability for all but the largest players to make use of these tools, which would be disastrous for smaller companies if they were required to make use of such things.
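For readers who want a feel for the difference between a traditional hash and a fuzzy hash, here is a toy sketch in Python. To be clear, this is not Cloudflare’s algorithm (the details of that are deliberately not public). It uses a simple “average hash” as a stand-in, and the tiny 4x4 grayscale “images” and the function names are made up purely for illustration.

```python
import hashlib


def average_hash(pixels):
    """Toy fuzzy hash: one bit per pixel, set when the pixel is brighter than the mean.

    This is an illustrative stand-in, not the algorithm any real scanning tool uses.
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return [1 if p > mean else 0 for p in flat]


def hamming_distance(a, b):
    """Count the bit positions where two equal-length hashes differ."""
    return sum(x != y for x, y in zip(a, b))


def sha256_hex(pixels):
    """Traditional cryptographic hash of the raw pixel bytes."""
    return hashlib.sha256(bytes(p for row in pixels for p in row)).hexdigest()


# Hypothetical 4x4 grayscale images (values 0-255).
original = [
    [200, 210, 40, 30],
    [190, 220, 35, 25],
    [50, 45, 205, 215],
    [40, 30, 195, 225],
]
# The same picture, very slightly brightened (as a re-encode or filter might do).
tweaked = [[min(255, p + 3) for p in row] for row in original]
# A visually different picture (checkerboard).
unrelated = [
    [10, 240, 10, 240],
    [240, 10, 240, 10],
    [10, 240, 10, 240],
    [240, 10, 240, 10],
]

# A traditional hash treats the tweaked copy as a completely different file.
print("SHA-256 equal:", sha256_hex(original) == sha256_hex(tweaked))

# The fuzzy hash barely moves for the tweaked copy, but moves a lot for the unrelated image.
print("fuzzy distance, tweaked:  ", hamming_distance(average_hash(original), average_hash(tweaked)))
print("fuzzy distance, unrelated:", hamming_distance(average_hash(original), average_hash(unrelated)))
```

The point of the sketch is only that a fuzzy hash produces a distance rather than a yes/no answer, which is why someone has to decide how close is “close enough.”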
And that’s where Cloudflare’s move is quite interesting. In providing the tool for free to all of its users, it keeps the proprietary details of the tool secret, while also letting those users set the thresholds. As the post explains:
If the threshold is too strict, meaning that it’s closer to a traditional hash and two images need to be virtually identical to trigger a match, then you’re more likely to have many false negatives (i.e., CSAM that isn’t flagged). If the threshold is too loose, then it’s possible to have many false positives. False positives may seem like the lesser evil, but there are legitimate concerns that increasing the possibility of false positives at scale could waste limited resources and further overwhelm the existing ecosystem. We will work to iterate the CSAM Scanning Tool to provide more granular control to the website owner while supporting the ongoing effectiveness of the ecosystem. Today, we believe we can offer a good first set of options for our customers that will allow us to more quickly flag CSAM without overwhelming the resources of the ecosystem.
Different Thresholds for Different Customers
The same desire for a granular approach was reflected in our conversations with our customers. When we asked what was appropriate for them, the answer varied radically based on the type of business, how sophisticated its existing abuse process was, and its likely exposure level and tolerance for the risk of CSAM being posted on their site.
For instance, a mature social network using Cloudflare with a sophisticated abuse team may want the threshold set quite loose, but not want the material to be automatically blocked because they have the resources to manually review whatever is flagged.
A new startup dedicated to providing a forum to new parents may want the threshold set quite loose and want any hits automatically blocked because they haven’t yet built a sophisticated abuse team and the risk to their brand is so high if CSAM material is posted — even if that will result in some false positives.
A commercial financial institution may want to set the threshold quite strict because they’re less likely to have user generated content and would have a low tolerance for false positives, but then automatically block anything that’s detected because if somehow their systems are compromised to host known CSAM they want to stop it immediately.
This is an incredibly thoughtful and nuanced approach, recognizing that when it comes to any sort of moderation, one size can never fit all. And, by allowing sites to set their own thresholds, it actually does add in a level of useful transparency, without exposing the inner workings that would allow bad actors to game the system.
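To make the three scenarios from Cloudflare’s post a bit more concrete, here is a rough Python sketch of how a per-site threshold and a block-or-review setting might combine. Everything in it is an assumption for illustration: the `ScanPolicy` and `decide` names, the idea of expressing the threshold as a maximum hash distance, and the specific numbers are all hypothetical and are not Cloudflare’s actual settings or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanPolicy:
    """Hypothetical per-site settings; not Cloudflare's real configuration."""
    max_distance: int  # larger = looser threshold (more matches, more false positives)
    auto_block: bool   # True: block matches outright; False: queue them for human review


def decide(distance, policy):
    """Return what happens to an upload whose fuzzy hash is `distance` from a known hash."""
    if distance > policy.max_distance:
        return "allow"
    return "block" if policy.auto_block else "flag_for_review"


# Illustrative values only, loosely following the three examples quoted above.
POLICIES = {
    # Loose threshold, but a human abuse team reviews whatever is flagged.
    "mature_social_network": ScanPolicy(max_distance=10, auto_block=False),
    # Loose threshold and automatic blocking, accepting some false positives.
    "new_parent_forum": ScanPolicy(max_distance=10, auto_block=True),
    # Strict threshold, but anything that does match is blocked immediately.
    "financial_institution": ScanPolicy(max_distance=2, auto_block=True),
}

for name, policy in POLICIES.items():
    outcomes = [decide(d, policy) for d in (0, 6, 20)]
    print(f"{name}: distances 0, 6, 20 -> {outcomes}")
```

The same upload (say, one that sits at distance 6 from a known hash) is reviewed by one site, blocked by another, and allowed through by a third, which is the “one size can never fit all” point in miniature.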
That said, I can almost guarantee that someone (or perhaps multiple someones) will come along before too long and spin Cloudflare’s efforts to help all of its users combat CSAM, incorrectly or misleadingly, into a claim that Cloudflare is somehow helping sites to hide or enable CSAM. No good deed goes unpunished.
However, if you want to support actual solutions (not grandstanding nonsense) to try to deal with CSAM, approaches like Cloudflare’s are ones worth paying attention to. This is especially true if Graham/Blumenthal and others get their way. Under proposals like the one they’re suggesting, it will become virtually impossible for smaller companies to take the actions necessary to meet the standards to avoid legal liability. And that means that (once again) the big internet companies will end up getting bigger. They all have access to NCMEC and the necessary tools to scan and submit CSAM. Smaller companies don’t. Cloudflare offering up its scan tool for everyone helps level the playing field in a really important way.