Is there a way to detect if a given piece of writing is written with the help of AI writing tools? Are there any AI content detection tools available? Can they reliably determine if the author of a given text is human or machine?
What is Google's advice on the use of AI written content on your websites? Can Google detect AI generated content? Will Google penalize and derank your website if you have a substantial amount of AI written articles on it? Do AI writing tools help or harm your SEO strategy?
Read this article to find out answers to these questions and more!
AI Generated Writing is Growing Everywhere
It is becoming common to encounter AI generated text, art, music and videos in the 21st century. DALL-E 2 is able to generate original high-quality images from text prompts. With large language models like GPT-3, individual creators and large organizations are automating content creation on a large scale. Bloggers, copywriters, and content marketing teams are regularly using AI content writing software in their projects. Such automatic writing applications must feel like magical gifts, especially for smaller teams. Finally, they can scale their content pipeline and libraries at affordable budgets and in a short time.
There are many bot accounts on Twitter, Facebook, and other social media platforms. They comment, reply, and interact with a large number of users automatically and at scale. While some projects use such bots for creative and helpful purposes, there are many bot armies that spread spam content and try to scam unwitting users. Be it crypto scams or fake news to influence public opinion during elections, such bot accounts can convincingly apply natural language processing to fool people.
This phenomenon will only grow over time. How will online platforms deal with AI generated content? What rubric and criteria will they use to separate good uses of AI from bad ones? Only time will tell, but Google already issued a warning about this in April 2022.
Google Does Not Like AI Generated Content, How Can Google Detect AI Content?
In an online webinar in April 2022, Google search advocate John Mueller commented on this topic. He said that content generated automatically with AI writing tools goes against Google's Webmaster Guidelines. For a while, the SEO community has been debating the proliferation of GPT-3 powered AI content generators. Answering a question from a user, Mueller said that Google uses manual reviews to detect AI content and penalizes sites where such content is found.
“For us these would, essentially, still fall into the category of automatically generated content which is something we’ve had in the Webmaster Guidelines since almost the beginning.
And people have been automatically generating content in lots of different ways. And for us, if you’re using machine learning tools to generate your content, it’s essentially the same as if you’re just shuffling words around, or looking up synonyms, or doing the translation tricks that people used to do. Those kind of things.
My suspicion is maybe the quality of content is a little bit better than the really old school tools, but for us it’s still automatically generated content, and that means for us it’s still against the Webmaster Guidelines. So we would consider that to be spam.”
— John Mueller, Google Search Advocate (April 2022)
A follow-up question from the community asked whether Google can automatically detect AI generated content at scale. Mueller said at the time that this is not possible and that Google relies on human reviewers, but we found something different.
Are there any free AI Content Detection Tools?
There are AI content detection tools like the GPT-2 Output Detector, hosted on HuggingFace and based on code released by OpenAI, and GLTR, built by a small team of AI researchers from the MIT-IBM Watson AI Lab and Harvard NLP. They take your input text and estimate how likely it is to be AI generated. The GPT-2 detector can often detect GPT-3 writing as well.
So, there is no reason why Google cannot build and deploy automated and advanced AI content detection tools. It seems like only a matter of time before Google and other search engines start to penalize and derank websites using AI content detection at scale.
How to Detect AI Generated Content?
We found two basic tools that can be used to see if a piece of writing is classified as human-written or AI-written. Both tools are free and are provided as demos rather than finished products for end users, so their results are only suggestive. They can produce false positives and false negatives at times as well. But even so, seeing them automatically distinguish manually written text from AI written text is alarming. Let us briefly understand how you can use these two tools to detect AI content.
The GPT-2 Output Detector tool hosted on HuggingFace is based on the code released by OpenAI itself. It analyzes your input text and reports the probability, as a percentage, that it is Real (written by humans) or Fake (generated by AI).
- Go to https://huggingface.co/openai-detector/
- Copy 50–700 words from any piece of text, and paste it in the input box.
- The tool will immediately give results as Real or Fake. If it is not showing any results within a couple of seconds, it means the word count of your copy-pasted text is more than its limit.
- See this result where the text is classified as 99% Fake, with Red color on the bar.
- See this result where the text is classified as 99% Real, with Blue color on the bar.
- For some samples, it can give weak results, such as a 30–70 or 50–50 split between Real and Fake. This can happen when you take AI generated text and then edit it substantially.
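The steps above can be sketched in a few lines of Python. This is only an illustration, not the detector's own code: the function names are hypothetical, the 50–700 word window comes from the steps above, and the 0.3 and 0.7 cut-offs are an assumption taken from the "30–70" example of a weak result. The Fake probability itself would still come from the online tool.

```python
def word_count(text: str) -> int:
    """Count whitespace-separated words in the text."""
    return len(text.split())


def fits_detector_window(text: str, min_words: int = 50, max_words: int = 700) -> bool:
    """Check the text against the 50-700 word range suggested in the steps above."""
    return min_words <= word_count(text) <= max_words


def interpret_score(fake_probability: float,
                    weak_low: float = 0.3,
                    weak_high: float = 0.7) -> str:
    """Turn the detector's Fake probability (0.0 to 1.0) into a rough verdict.

    The weak band [weak_low, weak_high] is an assumed cut-off, not a value
    published by the tool.
    """
    if not 0.0 <= fake_probability <= 1.0:
        raise ValueError("fake_probability must be between 0 and 1")
    if fake_probability > weak_high:
        return "likely AI-generated"
    if fake_probability < weak_low:
        return "likely human-written"
    return "inconclusive"


if __name__ == "__main__":
    sample = "word " * 120  # stand-in for a pasted article excerpt
    print(fits_detector_window(sample))
    for score in (0.99, 0.50, 0.01):
        print(score, interpret_score(score))
```

Treating the middle band as inconclusive reflects the point that heavily edited AI text can land anywhere between the two extremes.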
Giant Language model Test Room, or GLTR, is a forensic tool that analyzes a body of text and colors each word green, yellow, red, or violet. The color indicates how highly a language model ranked that word among its predictions for that position, given the preceding text: green for the most likely words, then yellow and red, and violet for the least likely. If most of your text is shown as green, it is a strong sign that it was generated by a language model. And if your text has a healthy mix of all colors, it is more likely that a human wrote it.
- Go to http://gltr.io/dist/index.html
- You can select from their sample texts by clicking on the boxes. e.g. click on the first demo text box "machine: GPT-2 small top_k 5 temp 1".
- Or, you can copy-paste any body of text, such as from your own blog. And then click on the Analyze button.
- The GLTR tool analyzes and shows you the result by coloring the words.
- You can see how it colors the demo text "machine: GPT-2 small top_k 5 temp 1" entirely Green. This means it is machine generated text. Note that you will usually not get an all-green result like in this example. There will be some yellow and red too. But the higher the share of green words, the more likely it is that the text was generated automatically in a predictive fashion.
- You can see how the human written "human: NYTimes article" demo sample produces a colorful analysis. This indicates that the text contains several low probability words in the given sequence. Thus, the likelihood that a real human wrote this text is higher.
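The coloring logic can be sketched as follows. This is a minimal illustration rather than GLTR's actual code: it assumes you already have, for each word, the rank the language model gave it (1 means the model's top prediction), and it assumes the thresholds used in the GLTR demo (top 10 green, top 100 yellow, top 1000 red, everything else violet). The function names and the sample rank lists are made up for the example.

```python
from collections import Counter
from typing import Dict, Iterable

# Assumed thresholds, following the GLTR demo: (upper rank limit, color).
BUCKETS = [(10, "green"), (100, "yellow"), (1000, "red")]
COLORS = ("green", "yellow", "red", "violet")


def color_for_rank(rank: int) -> str:
    """Map a word's rank in the model's predictions to a GLTR-style color."""
    if rank < 1:
        raise ValueError("rank must be 1 or greater")
    for limit, color in BUCKETS:
        if rank <= limit:
            return color
    return "violet"


def color_shares(ranks: Iterable[int]) -> Dict[str, float]:
    """Return the fraction of words that fall into each color bucket."""
    colors = [color_for_rank(rank) for rank in ranks]
    if not colors:
        raise ValueError("need at least one rank")
    counts = Counter(colors)
    return {color: counts[color] / len(colors) for color in COLORS}


if __name__ == "__main__":
    # Hypothetical ranks: predictable text vs. text with surprising words.
    machine_like = [1, 1, 2, 3, 1, 5, 2, 1]
    human_like = [1, 40, 3, 2500, 12, 700, 2, 150]
    print(color_shares(machine_like))
    print(color_shares(human_like))
```

A green share close to 1.0 corresponds to the all-green demo text, while a spread across all four colors corresponds to the colorful human-written sample.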
Are AI Writing Tools Good or Bad for SEO?
While there is no clear consensus, nor a large scale study on this topic, there are many voices advising caution. Bloggers who used AI writing tools for their websites have experienced rejections from Google AdSense. On YouTube, SEO experts are sharing that Google delisted their clients' sites because 70% of their content was AI written. And now, they are rewriting their entire content library with the help of human writers.
Earlier, many positive voices described how AI writing tools helped them rank higher. Now the opposite views are surfacing. Our guess is that this is a cat-and-mouse game. Sometimes you will reap benefits from using AI tools. And sometimes you will incur losses. The SEO and AI writing tools landscape is constantly changing.
Another user suggested that rather than using AI content generator software, they have been using AI paraphrasers. Their workflow involves manually researching and compiling accurate and reliable information, and then using a paraphrasing tool like QuillBot to reword the sentences and paragraphs. Their experience has been that their readers see their content as high quality. Search engines are also ranking their pages well.
But such instances are anecdotal. It could be only a matter of time until Google, Bing and other search engines flag their content as plagiarized or AI generated. So, the safest bet is to stick to human writing workflows and ensure high quality of work. You can always use AI tools to automate and speed up other parts of your workflow.