AI Checkers Forcing Kids To Write Like A Robot To Avoid Being Called A Robot
from the delve-devoid-underscore dept
Can the fear of students using generative AI, combined with the rise of questionable AI “checker” tools, create a culture devoid of creativity? It’s a topic worth delving into a bit more deeply, in part because of something that happened this weekend.
Earlier this year, we had a post by Alan Kyle about California bill SB 942. That bill would require AI companies to offer a free AI detection tool, despite the fact that such tools are notoriously unreliable and prone to nonsense. As Kyle wrote, the bill takes a “nerd harder” approach to regulating technology its backers don’t understand.
SB 942 has continued to move forward and just passed in the California Assembly. It’s now on Governor Newsom’s desk, where he could potentially sign it.
I was thinking about that this weekend after a situation at home. One of my kids* had an English homework assignment. They had to read Kurt Vonnegut’s famous short story, Harrison Bergeron, and write a short essay about it. Since I do a fair bit of writing, my kid asked me to review the essay and see if I had any pointers. I gave a few general suggestions on how to think about improving the flow of the piece, as it read very much like a standard first draft: a bit stilted. My kid went off to work on a rewrite.
If you’re unfamiliar with the story of Harrison Bergeron, it’s about a society that seeks to enforce “equality” by placing “handicaps” on anyone who excels at anything to bring them down to the least common denominator (e.g., ugly masks for pretty people, having to carry around extra weights for strong people). One of the morals to that story is on the perils of seeking to force equality in a manner that limits excellence and creativity.
Later in the day, the kid came by with their school-issued Chromebook, which has Grammarly Pro pre-installed. The students are encouraged to use it to improve their writing. One thing that the tool has is an “AI Checker” in which it tries to determine if the submitted text was written by AI.
This is similar to “plagiarism checkers” that have been around for a few decades. In fact, Grammarly’s “check” covers both AI and plagiarism (or so it says). Those systems have always had problems, especially around false positives. And it seems that the AI checkers are (unsurprisingly) worse**.
It turns out that Grammarly only just introduced this feature a few weeks ago. Thankfully, Grammarly’s announcement states quite clearly that AI detection is pretty iffy:
AI detectors are an emerging—and inexact—technology. When an AI detector definitively states whether the analyzed content contains AI, it’s not acting responsibly. No AI detector can conclusively determine whether AI was used to produce text. The accuracy of these tools can vary based on the algorithms used and the text analyzed.
Anyway, the kid wanted to show me that when the word “devoid” was used, the AI-checker suggested that the essay was “18% AI written.” It’s a bit unclear even what that 18% means. Is it a “probability this essay was written by AI” or “percentage of the essay we think may have been written by AI”? But, magically, when the word “devoid” was changed to “without” the AI score dropped to 0%.
In Grammarly’s announcement, it claims that because these tools are so flaky, it “does things differently” than other AI checker tools. Namely, it says that its own tool is more transparent:
Grammarly’s AI detection shows users what part of their text, if any, appears to have been AI-generated, and we provide guidance on interpreting the results. This percentage may not answer “why” text has been flagged. However, it allows the writer to appropriately attribute sources, rewrite content, and mitigate the risk of being incorrectly accused of AI plagiarism. This approach is similar to our plagiarism detection capabilities, which help writers identify and revise potential plagiarism, ensuring the originality and authenticity of their work.
I can tell you that this is not true. After the kid continued to work on the essay and reached a point where they thought it was in good shape, the AI checker said it was 17% AI, but gave no indication of what might be AI-generated or why.
Now, to be clear, the essay can still be turned in. There is no indication that the teacher is relying on, or even using, the AI checker. When I mentioned all this on Bluesky, other teachers told me they know to basically ignore any score under 60% as a likely false positive. But my kid is understandably flustered by the thought that if the AI checker suggests the essay sounds like AI wrote it, it might mean there’s a problem with the essay.
At that point, the hunt began to figure out what could possibly be causing the 17% score. The immediate target was more advanced vocabulary (the issue that had already been identified with “devoid.”)
The essay did use the word “delve,” which has now become something of a punchline as showing up in every AI-generated work. There’s even a study showing the massive spike in the use of the word in PubMed publications:

Even crazier is the use of both “delve” and “underscore.” However, my kid’s essay did not use “underscore.”

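The counting behind that kind of chart is simple to sketch. Here is a rough illustration in Python; the abstracts below are made-up sample data, not the actual PubMed dataset the study used:

```python
# Rough sketch: tally how often flagged words ("delve", "underscore")
# appear per year across a corpus. Sample data is invented, not from PubMed.
from collections import Counter

def word_counts_by_year(abstracts, words):
    """abstracts: iterable of (year, text) pairs; returns {year: Counter}."""
    counts = {}
    for year, text in abstracts:
        tokens = text.lower().split()
        yearly = counts.setdefault(year, Counter())
        for w in words:
            yearly[w] += tokens.count(w)
    return counts

abstracts = [
    (2021, "we examine the data and underscore one caveat"),
    (2023, "we delve into the mechanism and delve further"),
]
print(word_counts_by_year(abstracts, ["delve", "underscore"]))
```

On real data, plotting each year's counts (normalized by total papers) is what produces the spike the study found.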
The main theory I’ve seen is that “delve” is so popular in AI output because some of the training data work for AI systems was done in Nigeria and Kenya, where the word “delve” is more common. This has resulted in some arguments online, such as when online pontificator Paul Graham tweeted that receiving an email with “delve” in it indicated it was written by ChatGPT, leading a bunch of Nigerians to call him out, mocking him and highlighting that other cultures use language differently than he might.
Either way, the “delve” in my kid’s essay was not written by AI. But, just to be safe, the word was replaced. As were some other words. It made no difference. The AI checker still said 17%.
At one point, we looked at a slightly oddly worded sentence and tested removing it. The score went up to 20%. At that point, the kid just started removing each sentence, one at a time, to see what changed the score. Nothing actually seemed to do it, and despite Grammarly’s promise of transparency and clarity, no further information was provided.
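For anyone curious, that leave-one-sentence-out experiment is easy to reproduce in code. This is a minimal sketch assuming a hypothetical `ai_score(text)` function; no real checker exposes such an API, and the `toy_score` stand-in below just reacts to a single word, the way the real tool reacted to “devoid”:

```python
# Minimal sketch of the leave-one-sentence-out experiment, assuming a
# hypothetical ai_score(text) -> float scorer. toy_score is a stand-in;
# no real AI-checker API is used here.

def ablate_sentences(essay, ai_score):
    """Score the essay with each sentence removed, one at a time."""
    sentences = [s.strip() for s in essay.split(".") if s.strip()]
    results = []
    for i, removed in enumerate(sentences):
        remaining = ". ".join(s for j, s in enumerate(sentences) if j != i)
        results.append((removed, ai_score(remaining)))
    return results

def toy_score(text):
    # Stand-in scorer: flags any text containing "devoid".
    return 18.0 if "devoid" in text else 0.0

essay = "The society is devoid of excellence. Handicaps weigh people down. Vonnegut warns us."
for sentence, score in ablate_sentences(essay, toy_score):
    print(f"without {sentence!r}: {score}% AI")
```

With a real checker, of course, the scores never moved in any explainable way, which is rather the point.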
All of this struck me as quite a series of lessons. First, it points out the absolute stupidity of bills like SB 942, which will only increase, rather than decrease, this kind of AI dowsing-rod woo-woo divination.
But, the bigger lesson has to do with AI and schools. I know that many educators are terrified of generative AI tools these days. Plenty of educators talk about how they know kids today are turning in essays generated by ChatGPT. Sometimes it’s obvious, and sometimes less so. And many are not sure what to do about it.
I’ve seen a few creative ideas (and forgive me for not remembering where I saw these) such as having the students create a prompt to get ChatGPT to write an essay related to a class topic. Then, the real homework is having the student edit and correct the ChatGPT output. The students are then told to hand in the prompt, the original ChatGPT essay, and also their corrections.
A similar idea was to have the students write their own essay and then also have ChatGPT write an essay on the same prompt. Then, the students had to hand in both essays, along with a short explanation of why they thought their own essay was better.
In other words, there are some ways of approaching this, and as time goes on, I expect we’ll hear of more.
But, simply inserting a sketchy “AI checker” into the process seems likely to do more harm than good. Even if the teacher may not be using the tool at all, just the fact that it’s there creates a challenge for my kid, who doesn’t want to risk it. And it’s teaching them to diminish their own writing skills in order to convince the AI checker that the writing was done by a human.
And that seems, ironically, quite like the lesson of what “Harrison Bergeron” was supposed to teach us to avoid. Vonnegut was showing us why trying to stifle creativity is bad. Now my kid feels the need to stifle their own creativity just to avoid being accused of being a machine.
I’m not against AI as a tool. I’ve talked about how I use it here as a tool to help edit my (human) writing, to challenge me, and to push me to be a better (human) writer, even as those tools tend to be awful writers themselves. But I fear that with so much fear swirling around “AI writing,” the end result might be people writing less with human creativity, and more to simply avoid being called out as a machine.
* In case you’re wondering, I checked first to make sure they were okay with me writing about this before telling this story and have kept details to a minimum to protect their privacy.
** After reading through a draft of this piece, the kid suggested we run it through an AI checker as well, and it tells me (falsely) that 3.7% of this article appears to be written by AI (it specifically calls out my description of Harrison Bergeron, as well as my description of plagiarism checkers, as likely written by AI).
Filed Under: ai, ai checker, generative ai, grammar, school, writing
Companies: grammarly


Comments on “AI Checkers Forcing Kids To Write Like A Robot To Avoid Being Called A Robot”
Sounds like what is happening with captchas lately
https://techxplore.com/news/2023-08-bots-captcha-humans.html
It is really getting much harder “To Tell Computers and Humans Apart”.
And now it is becoming the opposite of trying to stop bots and allow humans.
Re:
@GHB, Captcha checkers aren’t really captcha checkers anymore. They are history and device fingerprinting.
The puzzle is just used as a bootstrap to run obfuscated code that normally wouldn’t run without a click event. It runs hardware/device fingerprinting locally in your browser, including some of your recent history and submitting it in obfuscated form.
There was a write-up/proof of concept of how this works at a site called varun a while back.
While the server-side generation code is no longer available, sufficient information is extracted to uniquely identify a device (and map the local network, through the 0.0.0.0 loophole), which is then anonymously identified and tied back to a dossier using a building-a-bridge strategy: building out from the origin and backwards from the destination until reaching the connection point.
This is fairly common knowledge in some places.
Re: Re:
I was under the impression reCAPTCHA also assessed how you interacted with the prompt, tracking your mouse cursor and such (using the existing data that supports accessibility options), to check for human behavior. Was that short-lived?
Re: Re: Re: Re:
Mouse cursor tracking was used, I think; but at this point, surely recording human cursor tracking, adding random jitter, and replaying it would be almost trivial.
Both "creative ideas" to combat AI essays mentioned here are bad
Both suggestions could still be completed 100% by AI without any understanding of the material. The only two viable solutions I’ve seen are a) have the work done entirely in class, or b) conduct an oral exam where the student defends the essay. Both of these have significant challenges.
Re:
I really don’t think so. Or at least not well. I actually really like the first idea, because if you want to cheat with it, it would almost certainly not come out well.
But, more importantly, I always found that the best way for me to learn something was to have to teach it to someone else. And that’s what that first idea really simulates.
Re: Re: 2x2 inch cheat sheet
You remind me of the teachers that told everyone they were allowed a 2×2 inch cheat sheet for the test which they could turn in. A most memorable way to get the homework done.
Re: Re: Re: The Danny Dunn homework computer solution
With a magnifying glass and a 2″x2″ microfiche card, you can get a lot in.
At which point the teacher gives you credit because of the work you did to put the card together.
Re: Re: Re:2
Well, that, and such tests are usually timed. And finding information on such a card can take comparatively a lot of time.
Re: Re: Re:3
Plus, in some cases, needing to understand which information is actually relevant to the question, and how to use it properly.
—
For example, I remember how in first year chemistry and physics courses (especially units on thermodynamics) a fair number of “straight ‘A'” and “reliable ‘B'” students were suddenly running into a lot of difficulty they didn’t know how to deal with.
These students were well-practiced in finding “the right values” for the right variables in a formula, and if necessary juggling the formula a bit to isolate the variable they were trying to calculate, but this was no longer sufficient to obtain the right answer.
Because there was now more than one formula with the same variables, and they now needed to really understand/have insight into what was going on in all those formulas, a cheat sheet (and these were allowed) with all the formulae they were expected to use wasn’t all that much help to them.
Re: Re:
Yeah, trying to use ChatGPT to sound like something better than ChatGPT is like using the Enterprise computer to invent a villain capable of out-thinking Data. Interesting dramatic concept, pretty much impossible in execution. If it is capable of being better than it is, why doesn’t it just…do it all the time?
Re:
This is assuming understanding the material is the point of the essay. In school I had plenty of assignments (mostly in English/Writing classes) where the teacher wouldn’t have a way to gauge my understanding of the material – the point of the assignment was to assess my understanding of the writing process. The assignments Mike described were more about understanding the writing process and the limits and issues of generative LLMs.
started with Turn It In
My kids were in school when the district mandated the use of turnitin.com, late ’90s/early 2000s, when the biggest IP issue was really “stealing music.”
That company’s proposition was to store everything written by students and cross-check it, looking for plagiarized or paid-for material.
I hit the roof because my kid’s work and all other students’ work is owned by them, and they are minors, so they cannot agree to a contract to assign the work for uncompensated use by anyone else.
A lawyer friend (not an IP lawyer) offered that perhaps student homework is a work for hire, so is already assigned to the teacher/school/district.
Anyway, I went to the school board meeting and told them what they were doing was abusive, non-compensatory, confiscatory… against the 13th Amendment!
Anyway, I asked: wtf is wrong with asking the teachers to do their jobs? They know the kids, they have finely honed senses and experience, they are really good at detecting plagiarism.
The board sure got an earful from me, and no other parents were at all concerned.
Pathetic.
I didn’t move the needle.
This comment has been flagged by the community.
Re:
Dorky TD readers posting their Ls.
Re: Re:
While your existence in itself is an L.
Re: Re: Re:
Well, don’t expect too much of the troll, because if there’s one thing I learned in life, it’s that the people who own up to failures and mistakes voluntarily are also the ones who possess integrity.
The rest, like the lame troll above, will spend not an inconsiderable amount of effort to hide any failures/mistakes they made, and especially so for trolls, since they think doing it shows weakness.
Re: Re: Re:
Our L (H-E-double hockey sticks).
Why not just pass a law requiring some arbitrary tech company to solve all our problems with magic? Stop the whole “death by a thousand magic cuts” bullshit.
Re:
Sounds good. but can we just make it a requirement that the regulatory bodies have to directly be the ones to magic away the problems?
Cut those pesky tech companies out of the equation. Why should they get any credit for magicking away problems?
Re: Re:
That’s been tried.
The results have consistently ranged from ‘ludicrous’ to ‘horrific’.
Re: Re: Re:
Silly me. I was sure the word “magicking”, as applied to real world things, would tip people off that this was not a serious and literal conversation.
Re: Re: Re:2
It’s the result of magical thinking, where people who write laws think a law will magically solve a problem that either doesn’t actually exist, or that the law doesn’t actually solve while introducing a host of other problems, aka “magicking.”
For a reference how lazy some of these lawmakers can be and the problems they create, see The Sorcerer’s Apprentice.
I remember when I was taking an online college course that I had to rewrite some dumb half page or page summary and it got flagged because it was too similar to a previous one of my own because they were both generic bs I had to write about the same thing. Because automated software decided I was plagiarizing myself I had to purposely write differently just to get any grade.
That was before AI. Reliably determining whether something was written by AI is simply not possible. All these systems will do is cause pain for students and waste taxpayer money.
artifice
Once AI pedagogy has been sold to every school district, and once all the ensuing litigation is over, nothing will have benefited except tech bros and sales bros. It’s the American way.
Re:
It will also have provided administrators plenty of opportunity to pretend they’re earning their salary.
Re:
You forgot the lawyers. They will also win big from all the litigation.
Re: Re:
All of the lawyers? Or just the ones that managed to keep the use of AI strictly restrained in their own practice?
γνῶθι σεαυτόν
I have my doubts that training basic AI engines to distinguish texts written by humans from texts written by advanced AI engines trained with human-written texts with the goal of being hard to distinguish from texts not written by humans is going to work all that well.
In a nutshell, meta means harder. An advanced AI might recognize a less advanced AI’s work. But those AI checkers running on a normal computer work with the reversed odds.
Maybe I'm an AI
I ran some content I wrote through a few AI checkers, and most gave me a high confidence score that the text was AI-generated.
I then created a block of text with ChatGPT, removed the more obvious AI buzzwords, and submitted it. The AI checkers all concluded that it was most likely created by a human.
If schools are planning to use these tools to evaluate students’ work, they need to consider the high likelihood of false positives.
Of course, there’s still the chance that I’m an AI, and no one bothered to tell me.
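The buzzword-stripping step described above is trivial to automate. Here’s a rough sketch; the replacement mapping is my own guess at commonly flagged words, not taken from any actual checker:

```python
# Rough sketch of the "remove the obvious AI buzzwords" step.
# The replacement mapping is a guess, not taken from any real checker.
import re

def strip_buzzwords(text, replacements):
    """Replace each flagged word (case-insensitively) with a plainer one."""
    for word, plain in replacements.items():
        # \b keeps the match to whole words only.
        text = re.sub(rf"\b{word}\b", plain, text, flags=re.IGNORECASE)
    return text

print(strip_buzzwords(
    "We delve into the results, which underscore the trend.",
    {"delve": "dig", "underscore": "highlight"},
))
# prints: We dig into the results, which highlight the trend.
```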
Re:
Within the year, all the students subjected to AI supervision will know how to do this.
Re: Re:
Orwell would claim it’s designed to diminish the range of thought – or of complaint.
Re:
Keep joking but one day, AI will write AI to better detect other AI, and forget what human was supposed to be.
Stand back! I possess a vocabulary and am cognizant of its utility and deployment!
Holy crap. Educate a child such that they’re reading at 5, enjoy reading, and have a high school vocabulary at 10, and you might as well have the AI meter ringing alarms in the Pentagon!
This comment has been flagged by the community.
Column AI check
Congrats to the kid on a fantastic idea, doubling the key data at zero cost.
This comment has been flagged by the community.
AI ethics and writing
As a teacher of writing at a university where English is not the native language, I have been wringing my hands over the use of AI for months. Before that it was plagiarism and Google Translate. My instincts reject such artificial help, and some of the questions raised here help me to focus.
Mike’s son of course uses words he has read/learned, and this highlights a point several students have made concerning Turnitin, which the university relies on heavily for thesis work. Students complain that everyday words/phrases are flagged, as well as engineering/technical phrasing; but as they point out, all their input has been from textbooks and journal articles, so this absorbed content flavors their output. The comment on “delve” and its possible source was revealing.
I rely on my own experience over several years (reading/writing/editing) and can sense (stylistically) direct student input, translated input, copied input, and the bland output that AI, Grammarly, et al. tend to produce.
It reminds me of that proto-AI website “is it porn”. Fun stuff.
Unless someone comes along and says “lower number better” for the AI checker I wouldn’t even begin to worry about it.
"Digital Age"
Ask an AI anything about the modern internet and it will desperately try to add “Digital Age” to whatever it is writing.
“Because no human being ever used the word ‘devoid’, which was obviously created by software and not human beings,” claimed the CEO of Grammarly, Rahul Roy-Chowdhury, as he is clearly devoid of common sense.
I honestly think culture is already most of the way dead and simply hasn’t fallen over yet. 99% of “people” (quoted ’cause I’ve no idea how many are bots) I see writing anything online could be switched out for anyone else from those 99% and I wouldn’t be able to tell.
Add the fact all algorithms are geared toward “engagement”, and crown it with American “sensibilities” (read: their deathly fear of words arbitrarily deemed “bad” – like ‘fuck’ and ‘suicide’) and we end up with the cesspit that is YouTube, for example, and that’s before shit like a half-assed LLM ruining kids’ futures even enters the picture.
This’ll only get worse until US corporations are regulated. Which is to say it won’t, until and unless children of corpo-rats and/or politicians are affected.
Obviously, since you are a reporter, you are AI.
Your kid wrote an article; that is literally just the AI making a new AI to write an article (in this situation, called an “essay”).
Are you that new “Turkey Sammich” LLM?
🙂
WE KNOW WHATS GOIN ON HERE!!!!!!!!!!
(This entire post is entirely a work of my own goofiness, any resemblance to reality or anyone else’s goofiness is coinkydinkimus)
When I was in high school about a decade ago, we had to submit most of our essays via turnitin.com to check for plagiarism. Most teachers knew this was a joke, and we all quickly learned to upload a paper, change any seemingly random sentences that it highlighted, then submit it. I think its real use was catching students in the same class copying each other, which I do recall happening a couple of times. But shouldn’t teachers be able to do that on their own?
Anyway, I get why companies are trying to duplicate this mechanism for AI, but it really makes no sense to me, as it will likely be just as “effective” as general plagiarism detection (as this post proves) and has no intra-class benefit.