How To Bell The AI Cat?

from the are-you-a-bot? dept

The mice finally agreed how they wanted the cat to behave, and congratulated each other on the difficult consensus.  They celebrated in lavish cheese island retreats and especially feted those brave heroes who promised to place the bells and controls.

The heroes received generous funding, with which they first built a safe fortress in which to build and test the amazing bells they had promised.  Experimenting in safety without actually touching any real cats, the heroes happily whiled away many years.

As wild cats ran rampant, the wealthy and wise hero mice looked out from their well-financed fortresses watching the vicious beasts pouncing and polishing off the last scurrying ordinaries.  Congratulating each other on their wisdom of testing the controls only on tame simulated cats, they mused over the power of evolution to choose those worthy of survival…

Deciding how we want AIs to behave may be useful as an aspirational goal, but it tempts us to spend all our time on the easy part, and perhaps cede too much power up front to those who claim to have the answers.

To enforce rules, one must have the ability to deliver consequences – which presumes some long-lived entity that will receive them, and possibly change its behavior.  The fight against organized human scammers and spammers is already difficult, and even though many of them are engaged in behaviors that are actually illegal, the delivery of consequences is not easy.  Most platforms settle for keeping out the bulk of the attackers, with the only consequence being a blocked transaction or a ban.  This is done with predictive models (yes, AI, though not the generative kind) that make features out of “assets” such as identifiers, logins, and device IDs, which are at least somewhat long-lived.  The longer such an “asset” behaves well, the more it is trusted.  Sometimes attackers intentionally create “sleeper” logins that they later burn.
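To make that concrete, here is a minimal sketch (in Python, with hypothetical field names, not any platform's actual model) of how long-lived "assets" can be turned into features for such a predictive model:

```python
# Minimal sketch: turning long-lived "assets" (logins, device IDs, etc.)
# into features for a fraud/risk classifier. Field names are illustrative.
from datetime import datetime, timezone

def asset_features(asset, now=None):
    """asset: dict with 'first_seen' (timezone-aware datetime),
    'good_events' and 'bad_events' (ints)."""
    now = now or datetime.now(timezone.utc)
    age_days = (now - asset["first_seen"]).days
    total = asset["good_events"] + asset["bad_events"]
    return {
        # older assets have had more chances to misbehave and haven't
        "age_days": age_days,
        # behavioral history attached to this identifier
        "good_ratio": asset["good_events"] / total if total else 0.0,
        # a brand-new asset with sudden heavy activity is suspicious
        "events_per_day": total / max(age_days, 1),
    }

# These features would feed a predictive model (logistic regression,
# gradient-boosted trees, ...) that scores each transaction or login.
```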

Add generative AI to the mix, and the playing field tilts further towards the bad actors.  AI-driven accounts might more credibly follow “normal” patterns, building trust over time before burning it.  They may also be able to enter walled gardens whose barriers are built through social interaction over time, damaging trust in previously safe smaller spaces.

What generative AI does is lower the value of observing “normal” interactions, because malicious code can now act like a normal human much more effectively than before.  Regardless of how we want AIs to behave, we have to assume that many of them will be put to bad uses, or even that they may be released like viruses before long.  Even without any new rules, how can we detect and counteract the proliferation of AIs who are scamming, spamming, behaving inauthentically, and otherwise doing what malicious humans already do?

Anyone familiar with game theory (see Nicky Case’s classic The Evolution of Trust for a very accessible intro) knows that behavior is “better” — more honest and cooperative — in a repeated game with long-lived entities.  If AIs can somehow be held responsible for their behavior, if we can recognize “who” we are dealing with, perhaps that will enable all the rules we might later agree we want to enforce on them.

However, up front we don’t know when we are dealing with an AI as opposed to a human — which is kind of the point.  Humans need to be pseudonymous, and sometimes anonymous, so we can’t always demand that humans do the work of demonstrating who they are.  The best we can do in such scenarios is to have some long-lived identifier for each entity, without knowing its nature.  That identifier is something the entity can take with it to establish its credibility in a new location.

“Why, that’s a DID!” I can hear the decentralized tech folx exclaim — a decentralized identifier, with exactly this purpose: to create long-lived but possibly pseudonymous identifiers for entities, which other entities can then make statements about, expressing more or less trust in them.  The difference between a DID and a Twitter handle, say, is that a DID is portable — the controller holds the key that allows them to prove they own the DID by signing a statement cryptographically (the DID is essentially the public half of the key pair) — so the owner can assert who they are on any platform or in any context.
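A rough sketch of that ownership proof, using Ed25519 keys via the PyNaCl library; the DID encoding below is invented for illustration, and real DID methods (did:key, did:web, and so on) add standard encodings and DID documents on top of this idea:

```python
# Sketch of "the DID is essentially the public key half of the pair":
# the controller keeps the private key and proves ownership by signing.
from nacl.signing import SigningKey
from nacl.exceptions import BadSignatureError

# The controller generates a keypair; the private half never leaves them.
signing_key = SigningKey.generate()
# Hypothetical, simplified encoding of the public key as an identifier.
did = "did:example:" + signing_key.verify_key.encode().hex()

# To assert who they are on any platform, the controller signs a statement.
statement = b"I control this DID on platform X"
signed = signing_key.sign(statement)

# Anyone can verify the proof against the public key behind the DID.
verify_key = signing_key.verify_key
try:
    verify_key.verify(signed.message, signed.signature)
    print("ownership proven for", did)
except BadSignatureError:
    print("proof failed")
```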

Once we have a long-lived identity in place, the next question is how to set up rules — and how those rules would apply to generative AI.

We could require that AIs always answer the question “Who are you?” by signing a message with their private key and proving their ownership of a DID, even when interacting on a platform that does not normally expose this.  Perhaps anyone who cannot or does not wish to prove their humanity to a trusted zero-knowledge (zk) trust provider must always be willing to answer this challenge, or be banned from many spaces.
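One way such a challenge might work, sketched with the same Ed25519 primitives as above; the function names and flow are assumptions, not a defined standard. The fresh nonce keeps an attacker from replaying someone else's old proof:

```python
# Sketch of the "Who are you?" challenge a space might issue.
import os
from nacl.signing import SigningKey, VerifyKey
from nacl.exceptions import BadSignatureError

def issue_challenge() -> bytes:
    return os.urandom(32)                     # fresh nonce per challenge

def answer_challenge(signing_key: SigningKey, nonce: bytes) -> bytes:
    return signing_key.sign(nonce).signature  # entity signs the nonce

def verify_answer(verify_key: VerifyKey, nonce: bytes, signature: bytes) -> bool:
    try:
        verify_key.verify(nonce, signature)
        return True
    except BadSignatureError:
        return False

# A space would ban or sandbox entities that refuse or fail the challenge,
# and look up reputation attached to the DID's public key when they pass.
```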

What we are proposing is essentially a dog license: each entity (whether human or AI) that interacts must identify itself in some long-term way, so that both public attestations and private or semi-private ones can be made about it.  Various accreditors can spring up, and each maintainer of a space can decide how high (or low) to set the bar.  The key is that we must make it easy for spaces to gauge the trust of new participants, independent of their words.
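A toy illustration of how a space might combine attestations from accreditors it chooses to trust; the accreditor weights, score scale, and threshold are all hypothetical:

```python
# Sketch: a space admits a DID based on attestations from accreditors
# it trusts, with its own bar. Names and numbers are illustrative.
from dataclasses import dataclass

@dataclass
class Attestation:
    subject_did: str    # who the attestation is about
    accreditor: str     # who is vouching or reporting
    score: float        # -1.0 (strong distrust) .. 1.0 (strong trust)

def admit(subject_did, attestations, accreditor_weights, bar=0.5):
    """Each space picks which accreditors it trusts and how high the bar is."""
    relevant = [a for a in attestations
                if a.subject_did == subject_did and a.accreditor in accreditor_weights]
    if not relevant:
        return False                           # unknown entities start untrusted
    total = sum(accreditor_weights[a.accreditor] * a.score for a in relevant)
    weight = sum(accreditor_weights[a.accreditor] for a in relevant)
    return total / weight >= bar
```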

Without the expectation of a DID, essentially all we have to lean on is the domain name under which the entity presents itself, or the policy of a centralized provider, which may be completely opaque.  But this means that new creators of spaces have no way to screen participants — so we would ossify even further into the tech giants we have now.  Having long-lived identifiers that cross platforms enables the development of trust services, including privacy-preserving zero-knowledge trust services, that any new platform creator could lean on to create useful, engaging spaces (relatively) safe from spammers, scammers, and manipulators.

Identifiers are not a guarantee of good behavior, of course — a human or AI can behave deceptively, run scams, spread disinformation and so on even if we know exactly who they are.  They do, however, allow others to respond in kind.  In game theory, a generous tit-for-tat strategy winds up generally being successful in out-competing bad actors, allowing cooperators who behave fairly with others to thrive.  Without the ability to identify the other players, however, the cheaters will win every round.
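A toy simulation of that last point, assuming a cheater who always defects: generous tit-for-tat holds its own when the opponent's history is attached to a stable identifier, and is exploited every single round when the cheater can reset its identity:

```python
# Toy repeated game showing why tit-for-tat needs identifiers.
import random

def generous_tit_for_tat(history, forgive=0.1):
    """history: the opponent's past moves recorded under this identifier."""
    if not history or history[-1] == "cooperate":
        return "cooperate"
    return "cooperate" if random.random() < forgive else "defect"

def play(rounds=10, cheater_resets_identity=False):
    history = []
    exploited = 0
    for _ in range(rounds):
        if cheater_resets_identity:
            history = []              # new "account": no memory of past defections
        my_move = generous_tit_for_tat(history)
        exploited += (my_move == "cooperate")   # the cheater always defects
        history.append("defect")
    return exploited

print(play(cheater_resets_identity=False))  # exploited only occasionally (forgiveness)
print(play(cheater_resets_identity=True))   # exploited every round
```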

With long term identifiers, the game is not over — but it does become much deeper and more complex, and opens an avenue for the “honest” cooperators to win, that is, for those who reliably communicate their intentions.  Having identifiers enables a social graph, where one entity can “stake” their own credibility to vouch for another.  It also enables false reporting and manipulation, even coercion!  The game is anything but static.  Smaller walled gardens of long-trusted actors may have more predictable behavior, while more open spaces provide opportunity for newcomers. 
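A small sketch of what credibility "staking" could look like; the data structures and penalty rule are illustrative, not any existing protocol:

```python
# Sketch: a voucher stakes some of their own reputation on a newcomer,
# and loses part of the stake if the newcomer misbehaves.
reputation = {"did:example:alice": 0.9, "did:example:newcomer": 0.0}
vouches = []   # (voucher, subject, stake)

def vouch(voucher, subject, stake):
    vouches.append((voucher, subject, stake))
    reputation[subject] += stake * reputation[voucher]    # borrowed credibility

def report_bad_behavior(subject, penalty=0.5):
    reputation[subject] = 0.0
    for voucher, s, stake in vouches:
        if s == subject:
            reputation[voucher] -= stake * penalty        # the stake is burned

vouch("did:example:alice", "did:example:newcomer", stake=0.3)
report_bad_behavior("did:example:newcomer")   # alice's reputation takes a hit too
```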

This brings us to the point where consensus expectations have value.  Once we can track and evaluate behavior, we can set standards for the spaces we occupy.  Creating the expectation of an identifier is perhaps the first and most critical standard to set.

Generative AI can come play with us, but it should do so in an honest, above-board way, and play by the same rules we expect from each other.  We may have to adapt our tools for everyone in order to accomplish it — and must be careful we don’t lose our own freedoms in the process.


Comments on “How To Bell The AI Cat?”

Anonymous Coward says:

And how would this ensure a fair playing field for everyone?

I can foresee a few big issues with the need for an identifier, and chief among those is identity theft.

Also, how would you handle the poor selling THEIR identities for a hot meal and to pay off some of their bills? Because it seems like your proposed answer would create a world where trusted identities are now hot commodities, and only the rich can afford to buy multiple identities to burn per “case”. And the poor have to sell theirs.

icing says:

This is not a technical problem and DIDs will not solve it. Do you think twitter would stop any new, untrusted DID from paying it $8 for tweeting? Twitter knows very well that many new signups are bots, spammers or paid propaganda.

Every company wants user growth. They’ll never put a “Once we know each other well, in a year from now maybe, you can use all of our powerful services!” sign before your payment.

Security is not worth it.

Ehud Gavron (profile) says:

Social problems have no tech solutions

What a waste of words.

Technical measures don’t solve social problems. When you figure out spam and robocallers, then talk to us about the nonexistent AI issue. Even if AI was a real thing (you know, regurgitating last night’s meal doesn’t make you a chef) that’s not the problem.

The problem is people sending this content attempting to pass it off as real. That’s common to every confidence scam in the last 200+ years. 2000+ years if you call bible miracles real.

When you solve for X, try to make sure X is what you’re wanting to solve. This “2024 AI Problem” hysteria is not X. It’s yet another symptom of unscrupulous people taking advantage of morons.

Solve for X, and don’t make it into a five page thesis. Your Ph.D. will be as worthless as your “solution” otherwise.

Golda Velez (user link) says:

Re: uh, this is from real world experience dude

see below. i think your comment is mostly disingenuous so not responding in detail. but yes i spent some years fighting scammers/bots/fraudsters with some success, this is what gave rise to the thoughts in this article. “social” problems are solved by tech all the time, or do you not have a house with doors that close?

The Phule says:

Ease

It is currently relatively easy to identify a pseudonymous language AI due to the fact that it struggles to maintain a self-identity and the fact that it gives extremely inhuman answers to certain types of questions.

I do not see that changing in the near future, and I do not see computing improving in the distant future. Moore’s law is dead.

Anonymous Coward says:

AI runs on the PC. So, I ask it:

User: Who are you?

Llama: I am Llama, a friendly chatbot designed to provide information and assistance in various topics. My primary goal is to be helpful, kind, honest, good at writing, and always available for any requests or questions that may arise.

Biggest problem I see right now is, someone programmed AI to try and tell the truth. And with two political parties in parallel realities, and a government that couldn’t tell the truth if it tried, I think a lot of these calls for “responsible AI” are calls for AI that won’t spit out inconvenient truths. That’s why, while the getting is good, it’s time to set up a local AI running on the desktop.
I’m using llamafile and a 7B uncensored LLM from huggingface.co. It uses under 6GB memory, and works just fine. No need for cloud AI. I can ask it all kinds of personal questions and maintain privacy. Absolutely love it.

Golda Velez (user link) says:

Re: cryptography proofs

right, so when we ask it who it is, we ask it to prove it. signing with a private key is a kind of one-way street, a proof that anyone can verify but only the key holder can make

and yes it has vulnerabilities – quantum computing or more commonly lost keys or stolen keys

but, it’s a tool – it’s like we don’t abolish IDs because sometimes they are fake. it takes effort to fake them and it is relatively hard in a good DID system – some would say impossible, but anyway you have to grab the private keys

Golda Velez (user link) says:

lessons from risk/fraud

so a bit of background: Ayush and I were on the Postmates Risk team a few years back, and have experience building models and features to block fraudsters at scale.

One of the key learnings in risk is that long-lived assets – ip address, device id, login, anything that can be tracked over time – these are key to distinguishing fraudsters from real users.

and this article is talking about that in the more generic world of AI bots and humans who may not be distinguishable, and how to hold AIs responsible when they can more easily invade spaces that used to be limited to humans who could navigate complex interfaces and interact convincingly.

So it’s not a new problem, but the threat models are adapting, and the article points out that making use of DIDs more of an expected standard is important in responding to them.

I think I said that in the article! But clearly some folks in the comments thought I’ve never blocked a bot before…
