What Google Doesn't Want You to Know About Their Search Algorithm
uahBcx6vTf8 — Published on YouTube channel Ahrefs on October 8, 2024, 11:00 AM
Summary
This summary is generated by AI and may contain inaccuracies.
- Thousands of leaked documents revealed secrets to their search engine algorithm they never wanted us to see.
- The leaked documents reveal a metric called site authority. In the context of search engine optimization, site authority often boils down to backlinks.
Video Description
The leaked Google documents reveal secrets into how the Google search algorithm may work. Watch as we go on a journey to see if Google lies to us and why.
***************************************
Additional Resources
The SEO fundamental EVERYONE gets wrong ► https://www.youtube.com/watch?v=WLVMVCwWJxk
The Google Update That Crushed His Business Overnight ► https://www.youtube.com/watch?v=FozpcXclAtU
SEOs: Do this to make Google's algorithm love you ► https://www.youtube.com/watch?v=ngekIXiFN3
Complete SEO Course for Beginners: Learn to Rank #1 in Google ► https://www.youtube.com/watch?v=xsVTqzratPs
***************************************
Follow along as we take a look at leaked Google documents that reveal some surprising facts about how their search algorithm may work. For years, Google has denied certain ranking factors, but these leaks suggest a different story — one that could change the way we approach SEO and ranking in Google.
If you’re looking to stay ahead of the game and understand what might really be influencing search rankings, this video breaks down the key findings and what they may mean for your website.
#googleleak #googlerankings #seo
Be sure to subscribe for more actionable marketing and SEO tutorials.
https://www.youtube.com/AhrefsCom?sub_confirmation=1
STAY TUNED:
Ahrefs ► https://ahrefs.com/
YouTube ► https://www.youtube.com/AhrefsCom?sub_confirmation=1
Facebook ► https://www.facebook.com/Ahrefs
Twitter ► https://twitter.com/ahrefs
Transcription
This video transcription is generated by AI and may contain inaccuracies.
Speaker A: A few months ago, thousands of leaked documents surfaced from Google's vault, revealing secrets to their search engine algorithm they never wanted us to see. Buried within were four discoveries that, if true, change everything. For over a decade, Google has denied that these things exist. And while lone wolves have called out Google for spreading misinformation, they were often ignored and neglected. After polling people on Twitter, 72% believe Google is actively deceiving us. I had to uncover the truth. Why would Google lie? What could they gain by misleading millions of us who rely on their guidelines to optimize our websites? Is it just a misunderstanding? Or could something more sinister be at play? Because if Google, the company that once vowed "don't be evil," is manipulating us, then who can we trust? The first piece of this puzzle takes us back to 2012, centered around a product you're probably using right now: Google Chrome.
Speaker B: They're moving toward taking over every part of our computing life.
Speaker A: At a search engine marketing conference, Matt Cutts, former engineer and head of webspam at Google, was asked if Chrome browser data is used in their search ranking algorithm, which he denied. Cutts eventually left Google, and with him, the rumors about Chrome data faded away. But a decade later, John Mueller, a senior analyst at Google, faced a similar question.
Speaker C: What data does Google Chrome collect from users for ranking?
Speaker A: I don't think we use anything from Google Chrome for ranking. These denials planted the seeds for what would become the first of Google's four lies exposed in the leak. The documents revealed a module called Chrome. In total, according to the source of the leak, Google wanted the full clickstream of billions of Internet users. Clickstream data is basically like a map of everything you do online. It tracks the websites you visit, the links you click, where you pause, and how long you stay. And it's a goldmine of data for businesses because it helps them understand how users interact with their apps and websites so they can improve user experience and ultimately boost conversions. And Chrome handed it to them on a silver platter. The thing is, clickstream data is usually limited to sites and apps you own. And since Google owns Chrome, the app where 63% of Internet users do their browsing, Chrome data can be used to understand how over half the planet interacts, not just with search, but with the entire web. But why would Google lie about using Chrome data?
Speaker B: My strong suspicion is it started because when Google started, they were competing for market share in the search world with Dogpile and AltaVista and Lycos and HotBot and MSN Search and Ask Jeeves. And they rose to the top of that pile fairly quickly. But I think internally at Google, they believed for decades after that they had to keep all this stuff secret. I also would strongly suspect Google is using their monopoly power in search to dominate video, to dominate maps, to dominate flight search, to dominate finance search, to dominate news search. Leveraging your monopoly power in one sector to unfairly compete in other sectors, that is what's specifically called out in US antitrust law.
Speaker A: I mean, look, the most basic takeaway is if somebody has data to make a product better, they're probably going to use it. And the motivation? Money.
Speaker C: Google's a public company. They have to make more money every single quarter. The data that's in Chrome, it allows them to profile users. It gives them all kinds of competitive intelligence, data that just no one else has. And they can use that to keep other competitors out of the market.
Speaker A: Contrary to popular belief, advertising isn't Google's main product; search is. And for a company that sells ads within search, they need their search engine to deliver the best results every time. Chrome data may have been the secret weapon that's kept them miles ahead of the competition, making them nearly impossible to catch. And arguably the most valuable data point that comes from Chrome is click data. Clicks are the bridge between searchers and search results, and they're key to understanding intent at scale. So I just googled "running shoes" and I'm trying to decide which result to click on. You can see that the majority of results are ecommerce category pages. But I'm actually interested in reading a comparison guide. This one happens to be the only one, so I'll click it. It's pretty good. I read it, and I didn't return to the search results because I got what I was looking for. But this is just my experience. What if 70% of searchers that enter this query each month click on the same page and behave in a similar way? That's a clear signal to Google that most searchers prefer a comparison guide. So logically, Google should rank that page higher and probably feature more comparison guides in the top ten results, right? All of that insight comes from a single click, the very thing Google has spent billions trying to master. Yet despite common sense, and even a 2006 patent where Google explicitly states that results on which users often click will receive a higher ranking, Google denies using clicks in their ranking algorithm. In fact, in a 2015 AMA, Gary Illyes, an analyst at Google, said that using clicks directly in ranking would be a mistake. Four years later, he was asked if user experience signals like click-through rates are used in RankBrain, a deep learning system in Google's algorithm. His response: "Dwell time, CTR, whatever Fishkin's new theory is: those are generally made up crap."
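The reasoning above, that if most searchers for a query click the same result and stay there, that result probably satisfies the intent, can be illustrated with a toy example. This is purely a hypothetical sketch of aggregating clickstream logs into per-query click shares; it is not Google's actual system, and the data and URLs are made up.

```python
from collections import Counter

def click_share(click_log):
    """Given (query, clicked_url) pairs, return each URL's share of
    clicks per query. A toy model of how aggregate click behavior
    could signal which result type searchers prefer."""
    by_query = {}
    for query, url in click_log:
        by_query.setdefault(query, Counter())[url] += 1
    shares = {}
    for query, counts in by_query.items():
        total = sum(counts.values())
        shares[query] = {url: n / total for url, n in counts.items()}
    return shares

# Hypothetical log: two of three searchers clicked the comparison guide.
log = [
    ("running shoes", "example.com/comparison-guide"),
    ("running shoes", "example.com/comparison-guide"),
    ("running shoes", "shop.example/category"),
]
shares = click_share(log)
```

At real scale, a dominant click share for one result type across thousands of searches is the kind of aggregate pattern the video argues Google could read as a preference signal.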
Speaker B: So I did that test like 16 times between 2011 and 2013, whatever event I was at. And I'd be like, hey, let's all do it. It's really fun, right? Like, let's go search for wedding dresses. We're all gonna click on the 6th result. Everybody disconnect from Wi-Fi. And sure enough, you know, you could get it from number six to number three, or number five to number two. Or at MozCon, I think the famous one was like page two to position one in the 45 minutes I was on stage.
Speaker A: The leaked documents also support that click-through rate is used in a system called NavBoost, which helps Google learn and understand patterns leading to successful searches, just as Rand had theorized and tested. Now, Google isn't just using CTR in NavBoost; they also look at different types of clicks to understand user experience. The documents show different attributes like good clicks, bad clicks, and last longest clicks, among others. And to put this discussion to rest, Google's VP of Search, Pandu Nayak, confirmed in his DOJ testimony the existence of NavBoost and how it helps find and rank the stuff that ultimately shows up in the SERP. But here's what doesn't add up. Google doesn't deny the importance of CTR for YouTube videos. They even give us CTR data for keywords in Google Search Console, which is telling of its importance. So why lie about using clicks in search rankings? My best guess is to stop people from manipulating their search rankings and data.
Speaker B: Back in the day, when I had 400,000 Twitter followers who were active there, you know, I could put up a link to a Google search, get a click, Google shut that down, which I think is smart. You don't want people like me or anybody else being able to manipulate the rankings, even for a short period of time.
Speaker A: If Google publicly acknowledged clicks as a ranking factor, click farms and bots would flood the system trying to manipulate the algorithms, just as Fishkin and others had tested. But many of these tactics became less effective and for some, stopped working. And that's where Chrome data may have come to the rescue yet again.
Speaker B: Essentially, they are using click data, which is a very elegant solution, right? Cause they know which Chrome browsers are real, they know which ones have history. You can see in the docs, right? They've got stuff around like, did this Chrome browser visit YouTube and watch YouTube videos? How much history does it have? Does it have a normal click pattern?
Speaker A: As I pieced together this puzzle, a pattern began to emerge. Everything was connected to two key elements: Chrome and Rand Fishkin. In fact, Rand Fishkin pioneered a metric called domain authority that became a cornerstone for SEO professionals and ignited widespread debate. This controversy only fueled the next lie revealed in the leaked documents: that Google doesn't use site authority in their rankings. Both Gary and John have stated on multiple occasions that Google doesn't use domain authority, but the leak reveals a metric literally called site authority. But it's not as straightforward as it might seem. As Mike King pointed out in his analysis, Google could be playing with semantics. They might specifically deny using Moz's domain authority metric, or claim they don't measure authority based on a website's expertise in a particular subject area. This confusion by wordplay allows them to dodge the question of whether they calculate or use sitewide authority metrics. While these metrics may not be exactly what Google uses internally, the bottom line is this: site authority, whatever it means to Google, does exist. So why deny it?
Speaker C: The site authority metric they had was in a section about quality. It wasn't in a section about links. So I don't think it's site authority as SEOs see it.
Speaker A: In the context of search engine optimization, site authority often boils down to backlinks. If you have plenty of quality links from authoritative sites, your authority score rises. But what about new sites that don't have authority? Well, according to some SEO tinfoil-hat theories, it might be worse than you think, thanks to something called the sandbox. The Google sandbox was a theorized place where new and young websites lacking trust signals like backlinks were placed and forced to wait it out. It was like SEO purgatory. The idea was that Google needed time to evaluate the quality of these sites, preventing spam from infiltrating search results. For instance, if someone bought 10,000 new domains and flooded them with spammy content and purchased a ton of links, the sandbox would spare Google from having to crawl those pages and potentially pollute their search results. But this also made it unfair for legitimate new businesses trying to gain visibility, which is probably why every Googler in this story has denied its existence: Cutts in 2005, Gary in 2016, and John Mueller in 2019. In fact, the tweet you're looking at right now is from the Wayback Machine, because Mueller deleted it. But the leak changed everything. In the per-doc module of the Google algorithm leak, the documentation reveals an attribute called hostAge, used specifically to "sandbox fresh spam in serving time." As I continued piecing this puzzle together, I became convinced that Google was intentionally lying to us. But the deeper I dug, the more I realized that I was solving the wrong puzzle. There was a key piece: context.
Speaker C: These weren't ranking factors. We don't know what anything on that document is actually used for, could be used for quality testing or evaluation. We just don't know.
Speaker A: People like Matt Cutts, John Mueller, and Gary Illyes have devoted much of their careers to helping the SEO community better understand and optimize for Google search. And there's no logical reason to believe that they had ill intent. Maybe they weren't even aware of the full picture. As Mike King put it, Google's public statements probably aren't intentional efforts to lie, but rather to deceive potential spammers and throw us off the scent of how to impact search results. So how should we, as content creators, webmasters, and SEO professionals, move forward knowing that we can't take Google's every word at face value?
Speaker D: We've got to test things. And you know, there's this fear right now that we're not going to hear as much from the Google spokespeople because of what has happened here. And I think that's fine. I think it's better if we spend more time using the power of our community to be doing this, sharing what we've learned and continuing to refine our understanding, because anything that they're going to tell us is always going to be aiming for the middle anyway, because they can't be specific about everybody's use case.
Speaker A: Test, verify, and share with the community, which is what we're trying to do here. But with Google contradicting themselves time and time again, can we really trust them? Watch this video here to see how one of their core updates crushed a small business overnight.