Alan Kyle's Techdirt Profile

About Alan Kyle

Posted on Techdirt - 10 July 2024 @ 01:11pm

California AI Bill Tells GenAI Startups To Nerd Harder

There’s a stunning degree of fearmongering and lack of humility about what California AI bill SB 942 can or can’t do. Honest conversation about this bill’s limitations is essential to ensuring we don’t pass this ineffective law. But its proponents have obstructed reasoned policy development by injecting panic into that conversation and pretending it will solve various multifaceted GenAI abuses that it simply cannot.

This article is a follow-up to my first one, where I explained why SB 942’s forced disclosures and inclusion of AI-generated text were unworkable. Despite recent amendments that fix those two issues, the remaining requirement that nascent AI companies create “AI detection tools” makes it still a fundamentally flawed bill.

SB 942 vaguely aims to “tackle the issue of GenAI-produced content” by requiring three things:

AI providers must offer a free AI detection tool that identifies content generated using their service.
AI providers must offer users the option to include a conspicuous disclosure on content generated with their service.
AI providers must embed specific metadata into files and the generated content itself, including company name, version number, timestamp, and a unique identifier.

A theme that runs through nearly every aspect of this bill’s journey through the legislature is a mismatch between any given GenAI abuse and how this bill would address it. For example, a June 28 committee analysis clumsily tries to explain the danger of open-source AI and incorrectly implies that SB 942 will help counter it:

“ChatGPT is an example of an open-sourced tool, meaning it is accessible to the public. Researchers and developers can also access its code and parameters. This accessibility increases transparency, but it has downsides: when a tool’s code and parameters can be easily accessed, they can be easily altered, and open-source tools have the potential to be used for nefarious purposes.”

While OpenAI has released some model weights, ChatGPT is famously not open-source. Here, open-source has been conflated with “widely available.” Ironically, this bill only applies to AI businesses, and not situations where actual open-source AI software may be abused.

The analysis continues:

“The need for this bill is further highlighted by various instances and research underscoring the threats posed by unregulated GenAI.”

It goes on to describe three examples of GenAI abuse and again falsely implies this bill is the fix. Let’s talk about why that’s not true.

Example 1:

In January, voters in New Hampshire received phone calls from an AI-generated voice clone of Joe Biden telling them not to vote in the primary election in order to save their vote for the upcoming general election. Authorities caught the man responsible and he’s now facing a $6 million fine and 13 felony charges. The FCC has also now made AI robocalls illegal in response.

The bill’s three main provisions would not prevent this from happening again: 1) It would be impractical for consumers to record robocalls and then upload them to a detection tool. 2) Nefarious actors would opt out of the optional disclosures. 3) Metadata for the fraudulent audio file would be irrelevant in the context of an ephemeral phone call.

Example 2:

A finance worker at a Hong Kong firm was persuaded to transfer $25 million to thieves when they were invited to participate in a video call with several other AI-generated colleagues.

This case shares the same pitfalls as the previous example: 1) An employee who thinks they may be in the middle of a fraudulent video call won’t have any way to upload that information to a detection tool during that session. 2) Bad actors will opt out of disclosures. 3) Metadata will again be irrelevant.

Example 3:

California high school students have been caught generating nonconsensual nude images of their classmates.

1) Determining the authenticity of this content is useless and doesn’t address the harm it creates. 2) Even if a user opts for a disclosure, the content is still extremely harmful. 3) A hidden disclosure embedded into content might be helpful only in cases where no other evidence or context leads to the perpetrator. But this assumes that all companies can implement this still-developing idea across audio, video, and images. Requiring by law that this be implemented by startups is heavy-handed and unrealistic.

The analysis ends with a hypothetical that highlights yet another fundamental flaw:

“In theory, a person who views a video circulating on social media conveying President Joe Biden telling voters not to vote in the primary election could use a provider’s AI detection tool to upload and analyze the video. By examining the embedded machine-readable disclosures, the user could identify the provider’s name, the GenAI system used, and the creation date, concluding that the video was produced by AI. This process would reveal that the video is not genuine, thus, in theory, helping to prevent the spread of misinformation.”

That sounds nice, in theory. But in reality, it’s more complicated than that.

A social media user who uploads content to an AI detection tool may not get back any useful indication of its authenticity, because the law only requires AI providers to determine whether their own service was used. The user would then need to upload to every AI provider’s tool until they get a hit, or give up before they do. The user will also need to understand that each tool has a different level of accuracy across video, image, and audio, and will need to account for the fact that there may be false positive and false negative results.

One alternative to this rigmarole is for users to do what they’ve always done when inquiring about suspicious online content: google it.

But seriously, how many different AI detection websites will one have to potentially go to? Is it 5? 15? Maybe 50? I doubt anyone has contemplated this number. AI providers with over 1M monthly users will have to comply with this law, making the user threshold arbitrarily overinclusive given that there are at least 1,500 GenAI startups poised for growth. And the unlucky startups that already have 1M users will have only four months to develop this unproven technology before the law takes effect in January of next year.
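To put rough numbers on the problem, here is a back-of-the-envelope sketch in Python. The 2% false positive rate and the assumption that each tool errs independently are illustrative only; neither comes from the bill or from any measured detector.

```python
def chance_of_false_alarm(num_tools: int, false_positive_rate: float) -> float:
    """Probability that at least one of num_tools detectors wrongly flags
    a genuine (non-AI) piece of content, assuming independent errors."""
    return 1 - (1 - false_positive_rate) ** num_tools


# Hypothetical per-tool false positive rate; real tools vary widely.
ASSUMED_FALSE_POSITIVE_RATE = 0.02

for num_tools in (5, 15, 50):
    probability = chance_of_false_alarm(num_tools, ASSUMED_FALSE_POSITIVE_RATE)
    print(f"{num_tools} tools: {probability:.0%} chance of at least one false hit")
```

Under those assumed numbers, checking a genuine video against 5, 15, or 50 tools would produce at least one false "AI-generated" hit roughly 10%, 26%, and 64% of the time.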

By passing this law, California lawmakers would be telling an untold number of GenAI startups to nerd harder.

And now we get to the weird part.

In the July 2 hearing (at the 2:55 hour mark) for which our much-discussed committee analysis was prepared, Tom Kemp, a co-drafter of the bill, again testified in support.

One statement stood out in particular:

“Recent changes to the bill have addressed many opposition concerns coming out of the privacy committee. For example, a critic just recently wrote that the recent changes have quote, ‘fixed the biggest issues I’ve had with the bill.’ ”

Who is this unnamed critic? Well, Kemp appears to quote a comment that I made while summarizing several amendments in a LinkedIn post. Read next to my own commentary, the two phrases are almost identical:

“These changes fix the biggest issues I had with the bill, link in comments.”

These two phrases don’t appear anywhere else on the internet, suggesting Kemp did in fact quote me. That’s a shame because while amendments did fix two of three issues I originally pointed out, this short phrasing doesn’t reflect my full thoughts on the bill. SB 942, in my opinion, is still a hot mess.

The amendments didn’t address the issue of compelled AI detection tools. As we’ve already discussed, requiring AI providers to create detection tools does not guarantee that the tools will actually work well. The fact that we can’t rely on them 100% of the time means there will be false positives and negatives, which have already proved to be highly damaging. Mandating these tools by law, with no consideration for how inaccurate they may be, is tech solutionism and will only confuse people more about the authenticity of content they encounter.

I regret that my choice of a few short words was used to promote this piece of legislation. To avoid any doubt, I’ve edited my LinkedIn post to read:

“These changes fix [some of] the biggest issues I had with the bill, [but it still has many, many issues.]”

Alan Kyle is a tech policy professional available for hire in AI Governance, Trust & Safety, and Privacy.

Posted on Techdirt - 1 May 2024 @ 01:38pm

Regulations For Generative AI Should Be Based On Reality, Not Hallucinations.

In their haste to do something about the growing threat of AI-fueled disinformation, harassment, and fraud, lawmakers risk introducing bills that ignore some fundamental facts about the technology. For California lawmakers in particular, this urgency is compounded by the fact that they preside over the world’s most prominent AI companies and are able to pass laws more quickly than Congress can.

Take, for example, California SB 942, which is an attempt to regulate generative AI, but which appears to have hallucinated some of the assumptions on which it’s built.

In short, SB 942 would:

Require a visible disclosure on images, video, and text that the content is AI-generated; a disclosure in their metadata; and an imperceptible disclosure that is machine readable.
Require AI providers to create a detection tool where anyone can upload content to check if it was made using that provider’s service.

Sounds pretty good, right? Wouldn’t this maybe help fight AI-generated abuse?

It’s unlikely.

In a hearing last week, tech and policy entrepreneur Tom Kemp testified that we need this bill. He opened by pointing out how Google CEO Sundar Pichai believes AI will be more profound than the invention of fire or the internet. Then, holding up a pack of gum, he said that if we can require a food label for a pack of gum, we should at least also require labels on something as profound as AI.

He concluded by saying:

“In summary, this bill puts AI content on the same level as a pack of gum in terms of disclosures, which is the transparency that Californians need.”

Huh? It’s a fun analogy, but not a useful one. The question we should ask is not whether generative AI is like food. After all, the regulation of food products has different legal considerations than the regulation of expressive generative AI content.

What we should ask is: Will this policy solve the problems we want to solve?

Visible Disclosures:

SB 942 requires disclosures for AI-generated text, but there is no effective method for flagging and detecting content as being AI-generated. Unlike, say, a watermark for an image, a disclosure for text would need to fundamentally alter the message to communicate its synthetic nature. A written disclosure could precede the generated text, but users could simply cut that portion out.
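A minimal Python sketch shows how little effort that takes. The label wording and function names are hypothetical, not anything specified in SB 942.

```python
# Hypothetical disclosure wording; the bill does not prescribe this text.
DISCLOSURE = "[This text was generated by AI.]"


def add_disclosure(generated_text: str) -> str:
    """What a provider might do: prepend a written notice to the output."""
    return f"{DISCLOSURE}\n{generated_text}"


def strip_disclosure(labeled_text: str) -> str:
    """What any user can do: drop the notice and keep the rest."""
    if labeled_text.startswith(DISCLOSURE):
        return labeled_text[len(DISCLOSURE):].lstrip()
    return labeled_text


labeled = add_disclosure("Vote early and bring a friend.")
print(strip_disclosure(labeled))
```

Once the notice is gone, nothing in the remaining text marks it as synthetic.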

This part of the bill made my Trust & Safety senses tingle. Platform policies that are unenforceable erode trust when there is a mismatch between the rules and what consumers expect. Similarly, if there is a law requiring disclosures of AI-generated text, it may give consumers a false sense of protection when there is no way to reliably communicate these notices.

The bill also assumes that generative AI can only be used for malicious purposes. There are many cases where having a disclosure simply doesn’t matter or is even undesirable. For example, if I want to generate an image of myself playing basketball on the moon, there won’t be any question about its inauthenticity. Or if I want to use Photoshop’s generative fill tool for a piece of marketing, I surely don’t want a watermark interrupting my design. To require by law that it all be labeled is a heavy-handed approach that seems unlikely to withstand First Amendment scrutiny.

Detection Tools:

AI detection tools are actively being researched and developed, but at this point can’t offer definitive answers to questions of inauthenticity. They give answers with widely varying degrees of uncertainty. This nuance sometimes gets ignored to great consequence, as with the cases where students were falsely accused of plagiarism.

In fact, the technology is so unreliable that last year OpenAI killed its own detection tool, citing its low rate of accuracy. If a safety-conscious AI company is pulling down its own detection tool because it does more harm than good, what incentive does a less conscientious business have to make their detection tool any less harmful?

There are already several generative AI detection services, many offered for free, that are competing for this niche market. If detection tools make big advancements in reliability, it won’t be because we required generative AI companies to also push one out just to comply with the law.

It’s worth mentioning that during last week’s hearing, the bill’s author, Senator Becker, acknowledged that it’s a work in progress and promised to continue collaborating with industry to “strike the right balance.” I appreciate his frankness, but I’m afraid that would essentially mean scrapping it. I expect he’ll remove the mention of AI-generated text and hope he gets rid of the detection tool requirement, but that would still leave us with a vague and hard-to-comply-with requirement to label all AI-generated images and video.

The law should try to account for new and developing technology, but it also needs to operate based on fundamental ground truths about it. Otherwise, it will be no more useful than an AI hallucination.

Alan Kyle is a tech governance enthusiast who is looking for his next work opportunity at the intersection of trust & safety and AI.
