Once upon a time, search engines looked at words. Over the years the movement has been towards concepts. Yes, the words on the page still come into play, but more often than not it is about identifying the concepts behind a page or a website. This helps search engines deliver better results in an increasingly personalized world.
How Search Engines Rank Web Pages
Search engines are extraordinarily complicated. Basically, search engines exist to connect users with information. There is an amazingly huge amount of information out there on the Web, with more being added every single day. How do search engines connect, in a meaningful way, this vast array of information with users who are looking for something? It's a complex process involving a wide variety of factors, and this process evolves as technology, and the way we use search engines, changes over time.
We've all used search engines, without giving much thought to what's going on behind the scenes as we see our results retrieved within a matter of milliseconds.
The goal of this article is to give you a sense of the core concepts used in modern search. This is not a guide to Google. Nor Bing. It is a starting point to better understand the landscape so that you might venture out to discover more.
While this journey is more about the elements of ranking a page or a site, one really can’t get to that point without the page actually being found. The two obvious elements are:
Crawling: The ability of the search engine to get around the site.
Indexing: Actually getting pages into the search engine’s index.
Search engines do this by analyzing words and other content on web pages, placing special emphasis on words that appear in specific locations on the web page: the title, headlines, image attributes, overall content emphasis, outbound and inbound links, etc.
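To make the idea of "special emphasis" on certain page locations concrete, here is a minimal sketch of field-weighted term scoring. The field names, the weights in FIELD_WEIGHTS, and the helper functions are all assumptions made up for illustration; no search engine publishes its actual weights, and real systems are far more involved.

```python
import re
from collections import Counter

# Hypothetical weights: how much one term occurrence counts in each location.
FIELD_WEIGHTS = {"title": 3.0, "headings": 2.0, "body": 1.0}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def field_weighted_score(page, query):
    """Sum the weighted occurrences of each query term across the page's fields."""
    query_terms = set(tokenize(query))
    score = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        counts = Counter(tokenize(page.get(field, "")))
        score += weight * sum(counts[term] for term in query_terms)
    return score


# Toy page, invented for this example.
page = {
    "title": "How Search Engines Rank Web Pages",
    "headings": "Crawling and Indexing",
    "body": "Search engines crawl pages and add them to an index.",
}
print(field_weighted_score(page, "search index"))
```

In this toy, a match in the title counts three times as much as a match in the body. Note that "indexing" in the heading does not match the query term "index", because the sketch does no stemming.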
Every search engine can offer a drastically different experience to the user, and there are major differences depending on where you’re located geographically. For example, search engines that operate in both English- and German-speaking countries may offer descriptions of search results in both English and German.
These days most of this can be handled by automated tools provided by the major search engines, at least to the point of letting them know you exist.
What’s not quite as evident is the level and depth of the crawling and indexation given to a website.
Understanding Signals: How Do Search Engines Process Searches?
Please note: search engines are not simple. They include incredibly detailed processes and methodologies, and are updated all the time. This is a bare-bones look at how search engines work to retrieve your search results.
All search engines follow this basic process when handling a search, but because the engines differ, you are bound to get different results depending on which one you use.
First things first: signals. Strangely, it is a term that gets bandied about a lot in the world of SEO.
Google has fairly consistently spoken of evaluating more than 200 major ranking signals, which in turn might have up to 10,000 variations or sub-signals. RankBrain is one of the “hundreds” of signals that go into the algorithm that determines which results appear on a Google search page and where they are ranked, Corrado said.
A search engine can use signals for many things besides ranking, including categorization, geo-localization, behavioral and demographic analysis, and more. Some might be used as signals of quality (task completion), while others are used in display elements in the search results.
Where things get interesting is with the various page-level and site-wide signals, that is, how a search engine “views” your site on the web. In the strictest understanding of “ranking factors”, these might not always be considered, but they are important concepts.
Internal link ratios
Page Level Signals:
Classifications (and Localization)
Authority/trust (external links)
Linguistic indicators (language and nuances)
Prominence factors (bold, headings, italics, lists, etc.)
Link related signals
Trust elements (known by the company you keep)
Entity/Authority; citations, co-citation, etc.
Social graph signals
Spam signals (that might incur dampening)
Semantic relevance (of the other signals)
Please do bear in mind that ‘link related’ doesn’t mean PageRank. Links can send a variety of signals, with methods like PageRank being just one.
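For readers who have not seen PageRank itself, the textbook version can be sketched in a few lines using power iteration. This is only an illustration of the published idea, not how any search engine computes link signals today. The function name, the toy link data, the damping value of 0.85, and the handling of pages without outbound links are all assumptions of this sketch.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank by power iteration over a dict of page -> outlinks."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's damped rank evenly among its outlinks.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling page (no outlinks): spread its rank over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Toy link graph, invented for this example.
toy_links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_links)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

In this toy graph, page "c" ends up with the highest score because three pages link to it, while "d", which nothing links to, keeps only the baseline share.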
While we’ll avoid the specifics, you can get a sense that there are a variety of signals a search engine might use to understand what a site and/or a web page is about, and, of course, what types of elements might be used in scoring search results.
Next stop in our journey is to start understanding some of the various classifications, categorizations, relations, and correlations that a search engine might employ. These days we tend to think of them as “graphs”.
Some of these include:
Link graph: The one best known to SEO professionals; all the links between sites, which help determine relevance, authority, and trust.
Social graph: Connections, topicality, behavioural data, etc.
Entity graph: People, places, things, events, etc. (named entities).
Knowledge graph: Information related to entities.
Term and taxonomy graphs.
Where this can play into rankings is that Google categorizes the relations and can score and re-rank web pages through these graph relations. Consider things such as co-citation, topical link graphs, social associations toward authority, and much more.
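Co-citation, mentioned above, is one of the simpler graph relations to picture: two pages are co-cited when a third page links to both of them. The sketch below counts co-citations over a toy link graph. The page names and the function are hypothetical, and how (or whether) any search engine turns such counts into a re-ranking score is not something this sketch claims to show.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(links):
    """Count, for each pair of pages, how many source pages link to both."""
    counts = Counter()
    for targets in links.values():
        # Sorting gives each unordered pair one canonical key;
        # set() ignores repeated links from the same source.
        for pair in combinations(sorted(set(targets)), 2):
            counts[pair] += 1
    return counts


# Toy data, invented for this example: each source page and the sites it links to.
toy_links = {
    "blog-1": ["site-a", "site-b"],
    "blog-2": ["site-a", "site-b", "site-c"],
    "blog-3": ["site-b", "site-c"],
}
counts = cocitation_counts(toy_links)
for pair, count in counts.most_common():
    print(pair, count)
```

Pairs that are frequently linked together by the same sources can be read as topically related, which is the kind of association the graph view makes visible.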