Tapping Google's AI for SEO
by Bob Sakayama
The CEO of NYC-based consultancy TNG/Earthling, Bob Sakayama has managed the search performance strategies of this and thousands of other websites. He specializes in large & multi-site systems where search is critical, and has long been a leader in the remediation of Google penalties - see Google-Penalty.com. He's been quoted in Forbes, addressed SEO gatherings as guest speaker, been called as an expert witness, and given rare interviews. He created Protocol, the first search-enabled content management system, currently running on several thousand websites. He serves both very large and very small businesses, along with SEO agencies in multiple countries, and investors seeking search risk evaluation.
Updated 10 October 2022
AI Content Triggers Penalties
As predicted earlier this year, there are now a good number of SEO products using artificial intelligence to assist in ranking websites. These tools can reveal information and competently write English language content. The most popular is probably Jasper, a writing tool that targets SEOs but whose actual market is much larger because it's basically a content generator for any topic. The most sophisticated SEO tool is on-page.ai, which uses Google's API to reveal detailed information about improving semantic optimization, along with generating readable text. These tools can very quickly create original content from seeded topics and keywords, and that content is good enough to cut and paste. The creators of these tools encouraged cutting and pasting because they knew the AI content was actually original - created by the AI engine, not copied from another source - so it wouldn't be flagged as duplicate. And it was fast. Writers could rapidly generate much more content and become super productive. It was too good to be true. The popularity of these tools and their widespread use as content generators eventually triggered blowback from Google.
The thing about AI generated content is that even when it is factually accurate, even the most detailed passage is written generically. Google launched its "Helpful Content Update" to detect AI written content, and it has harmed the rankings of sites relying on it. From the penalized sites we've seen, the issue can be remediated by avoiding generic stats and facts and by tightly connecting the content to the purpose of the website. From Google's point of view, facts about certain legal problems and remedies do not help you understand how a specific law firm would handle your need. It's really interesting to see Google's AI outing AI generated content.
Our note to SEOs: It's OK to use the facts generated by the AI writer in your content, but make those facts useful, helpful, personal, and specifically relevant to the motive behind the search query.
Some Of Our Old Tools Just Got Deprecated
Google has been training its AI to improve the methods by which it determines semantic relevancy, and this has been impacting the search results for a while. Extremely valuable knowledge can be derived from (1) the information revealed in their 2020 patent application combined with (2) access to their Natural Language Processor & Entities Data. Tools running their AI can now provide us with very specific, actionable information, in real time, about what directly influences their relevancy decisions. Their algorithm is constantly changing, and those changes are instantly reflected in the output of their AI processes, which we can observe.
Optimizing The Semantic Signature of Content With Help From Google
You can now purchase API access to Google's Natural Language Processing (NLP) AI, which analyzes unstructured text using Google machine learning. This, combined with their Entities data, is used to determine the relevancy of content for a given search term. It's been available for a while, and access has quickly become a marketable service - more SEO tools are on their way. The self-improving nature of AI means that, in principle, the ability to accurately understand content and assign relevance will get better with time. We can now be guided by Google's own standards toward the most effective optimization at the moment, creating the ideal semantic signature for a url.
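For readers who want to see what this access looks like in practice, here is a minimal sketch of an entity analysis request. It assumes the v1 REST endpoint of Google's Cloud Natural Language API (documents:analyzeEntities) and an API key from a project with that API enabled. The function names and the placeholder key are ours, for illustration only, and nothing is sent until analyze_entities is called.

```python
import json
import urllib.request

# Assumed endpoint: Cloud Natural Language API, v1, entity analysis.
ENDPOINT = "https://language.googleapis.com/v1/documents:analyzeEntities"


def build_entities_request(text, api_key):
    """Build (but do not send) a POST request asking for entity analysis."""
    body = {
        "document": {"type": "PLAIN_TEXT", "content": text},
        "encodingType": "UTF8",
    }
    return urllib.request.Request(
        url=f"{ENDPOINT}?key={api_key}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze_entities(text, api_key):
    """Send the request and return the decoded JSON response (needs network)."""
    with urllib.request.urlopen(build_entities_request(text, api_key)) as resp:
        return json.load(resp)


# Inspect the request without sending it; "YOUR_API_KEY" is a placeholder.
request = build_entities_request("Our venue hosts private parties.", "YOUR_API_KEY")
payload = json.loads(request.data)
```

Keeping the request builder separate from the network call makes it easy to check exactly what text is being submitted before paying for an API call.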
The rapid adoption of artificial intelligence by Google, along with recent revelations from their patent applications, has created an opportunity to use this technology to improve search performance by addressing in detail a factor that is known to be important. Google's search engine patent applications demonstrate the organizational semantic concepts that provide the foundation of their logic. For example, all sites are assigned Categories or Topics, and each may have multiple child categories - e.g. /Shopping/Apparel/Casual Apparel. This, and the thinking that lies behind the effort to understand semantics, is available from their patents of March 2020, which go into much more detail.
SEOs read these patents and started applying the same organizational structures to their own content. A year later, the results of the first successful experiments using a strict semantic rule set, following the principles set forth in Google's patent, started coming in. In February 2021, Koray Tuğberk GÜBÜR posted the details of impressive results achieved by focusing only on the semantic organization of the content, and many others followed. This is the leading edge of a major shift that is underway - powered by AI and signaled by the recent focus on semantic SEO and by data terms of art, like "entity", entering the SEO vernacular.
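The parent and child relationship in a category like /Shopping/Apparel/Casual Apparel can be made explicit with a few lines of code. This is only an illustrative sketch: the helper name category_lineage is ours, and it assumes nothing beyond the slash-separated path format shown in the example.

```python
def category_lineage(path):
    """Return every level of a slash-separated category path, broadest first."""
    parts = [part for part in path.split("/") if part]
    return ["/" + "/".join(parts[: i + 1]) for i in range(len(parts))]


lineage = category_lineage("/Shopping/Apparel/Casual Apparel")
# lineage == ["/Shopping", "/Shopping/Apparel", "/Shopping/Apparel/Casual Apparel"]
```

Expanding a category this way lets you check whether two urls share a parent topic even when their most specific categories differ.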
There are new benchmarks being set for the requirements to fully optimize a site because we can now know much more about what Google is looking for. For example, the self improving semantic model Google uses to determine Categories relies on AI to analyze a site's content. It's still reading the words, but technology has enabled Google to move from matching words and text strings, to efforts to match intent and motive of a search query, greatly improving the quality of their results. So if you're trying to rank a website, the knowledge of how this evolving semantic process works, including what is valued, and how relevancy is determined, is super valuable.
The Importance of Entities
Google's NLP AI is focused on analyzing words to understand the meaning of content and search queries. The output of the tools can reveal what Google sees when it reads content in terms of "entities". These are keywords that have a specific meaning and convey an understanding as a result of usage. For example, the word "party" has a completely different meaning when used in content for an entertainment venue as opposed to content representing a law firm. The number and types of these entities determine the Categories assigned to a url. Just knowing what these entities are is huge. If you purchase API access, you can download all the current Google entities.
But the real power of AI is revealed in the comparisons of the semantic signatures of competing sites. Comparing the data from the top competitors for a given keyword exposes missing entities that can be easily remedied via simple content revision. The tools can reveal all of the most popular entities from the most successful sites. Many of these entities are keywords associated with valuable targets, but some are indications of non-semantic relationships that expose intent or motive. These are the connections that permit a site to rank for a keyword not in its content because the AI understands the meaning of the entities from context.
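To make "what Google sees" a little more concrete, here is a small sketch that ranks the entities in an analysis result by salience, the score the Cloud Natural Language API attaches to each entity to indicate its importance within the text. The response below is invented sample data shaped like that API's entity output (name, type, salience), and the helper top_entities is our own.

```python
def top_entities(response, limit=5):
    """Return (name, type, salience) tuples, most salient first."""
    entities = response.get("entities", [])
    ranked = sorted(entities, key=lambda e: e.get("salience", 0.0), reverse=True)
    return [
        (e["name"], e.get("type", "UNKNOWN"), e.get("salience", 0.0))
        for e in ranked[:limit]
    ]


# Hypothetical response for an entertainment venue page (not real API output).
sample_response = {
    "entities": [
        {"name": "party", "type": "EVENT", "salience": 0.12},
        {"name": "venue", "type": "LOCATION", "salience": 0.61},
        {"name": "DJ", "type": "PERSON", "salience": 0.27},
    ]
}

strongest = top_entities(sample_response, limit=2)
# strongest == [("venue", "LOCATION", 0.61), ("DJ", "PERSON", 0.27)]
```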
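The competitor comparison described above boils down to a set difference. The sketch below is not how any particular tool works. It simply assumes you already have a list of entity names for your own page and for each top competitor, and it reports entities that at least half of the competitors use but your page does not. The function name, the threshold, and the sample law firm data are all hypothetical.

```python
from collections import Counter


def missing_entities(own, competitors, min_share=0.5):
    """List (entity, competitor_count) pairs absent from `own`.

    An entity qualifies when it appears on at least `min_share` of the
    competitor pages. Results are sorted by count, then alphabetically.
    """
    own_set = {entity.lower() for entity in own}
    counts = Counter()
    for page in competitors:
        counts.update({entity.lower() for entity in page})  # count once per page
    threshold = min_share * len(competitors)
    gaps = [
        (entity, count)
        for entity, count in counts.items()
        if count >= threshold and entity not in own_set
    ]
    return sorted(gaps, key=lambda item: (-item[1], item[0]))


# Hypothetical entity lists for a law firm page and three competitors.
own_page = ["personal injury", "attorney"]
competitor_pages = [
    ["personal injury", "settlement", "attorney"],
    ["settlement", "free consultation"],
    ["settlement", "attorney", "free consultation"],
]

gaps = missing_entities(own_page, competitor_pages)
# gaps == [("settlement", 3), ("free consultation", 2)]
```

The output is a short list of candidate terms to work into a content revision, ordered by how many of the top competitors already use them.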
The Value of Situational Awareness
While this new approach is hugely valuable, it is just one element of many that influence search results. The reality is you can handily beat the competition semantically but still underperform in the rankings. What we notice when we run existing client sites is that we were already using most of the entities, because most of them are common sense and many are revealed by existing tools. And it is common to see excellent semantic optimization overwhelmed by content from a more powerful brand or one receiving large numbers of authoritative links.
But in the super competitive atmosphere that most businesses have to contend with, these tools provide a welcome edge. They are very likely to expose weaknesses and present opportunities that previously went unseen. And it definitely makes sense to run them on an existing url that has not been semantically optimized. This would automatically include any important target url that has fewer than 1,000 words. It might also help with diagnostics on an underperforming keyword or url.
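Finding the pages under the 1,000 word mark is easy to automate. The sketch below assumes you already have the visible text of each target url in a dictionary. The helper names and sample pages are ours, and the whitespace-based word count is a rough approximation, not how Google counts words.

```python
def word_count(text):
    """Approximate word count by splitting on whitespace."""
    return len(text.split())


def thin_pages(pages, minimum=1000):
    """Return the urls whose text has fewer than `minimum` words, sorted."""
    return sorted(url for url, text in pages.items() if word_count(text) < minimum)


# Hypothetical pages: one long, one short.
pages = {
    "/services": "word " * 1200,
    "/contact": "short page",
}

candidates = thin_pages(pages)
# candidates == ["/contact"]
```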
The obvious reason to run the AI is to see how Google views your content in comparison to your top competitors while it exposes targeting flaws in a way that is immediately addressable. Knowing what Google considers the most important words is obviously hugely valuable, and Google's AI tells us exactly this.
Putting It To Work
I expect to see a growing number of subscription tools that enable running Google's AI engine on your own content. They're pretty hard to find right now, but they're coming. Because they're paying Google for API access, these tools can be expensive, but the revelations are impressive. At the simplest level, they can confirm that your site has been assigned the proper category and is recognized for its most important keywords, and if not, point to the fix. In a more sophisticated scaled environment where relevancy is created across large numbers of pages, knowledge of the entities permits you to create intelligent internal link strategies to effectively optimize very competitive terms.
We brought this technology into our practice and have been running it on selected sites. We can see Google's algorithm recognizing the improvement in our content immediately. I don't expect these improvements to always result in rapid rank advances. Instead, the goal for the enterprise is to always be as optimized as possible - the payoffs come with time as these efforts are recognized.
Some of the best opportunities to apply these new tools lie with legacy pages that already convert but have never really been semantically optimized, with any highly focused page that has comparatively little content, or with a new target in a competitive field. With these new tools, we can quickly act on information directly from Google's algorithm and implement content that closely aligns with their relevancy models. This has become the new standard for semantic optimization.