Chapter 2: Search Engine Basics
page-level keywords and content
describes the use of the keyword term/phrase in particular parts of the HTML code on the page (<title> tag, <h1>, alt attributes, etc.)
vertical navigation
each engine offers the option to search different verticals, such as images, news, video, or maps. Following these links will result in a query against a more limited index.
page-level traffic/query data
elements of this factor are click-through rate to the page in the search results, bounce rate of visitors to the page, and other similar measurements.
Google +1 Button
enables users to vote for a page on the page itself.
query deserves freshness (QDF)
ex: when there is breaking news, such as an earthquake, search engines begin receiving queries within seconds, and the first articles begin to appear on the web within 15 minutes. In these situations, there is a need to discover and index new information in near real time. QDF takes several factors into account, such as search volume, news coverage, and blog coverage. QDF applies to up-to-the-minute news coverage, but also to other scenarios such as hot new discount deals or new product releases that get strong search volume and media coverage.
-keyword
excludes the keyword from the search results.
page-level features other than keywords
factors included here are page elements such as the number of links on the page, number of internal links, number of followed links, number of "nofollow" links, and other similar factors.
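The link counts listed above can be tallied from a page's HTML. The following is a minimal sketch using Python's standard html.parser module. The LinkCounter class, the sample markup, and the rule that treats relative links as internal are illustrative assumptions, not a description of how any search engine counts links.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCounter(HTMLParser):
    """Tally links on a page (hypothetical helper, for illustration only)."""

    def __init__(self, site_host):
        super().__init__()
        self.site_host = site_host
        self.total = 0
        self.internal = 0
        self.nofollow = 0

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        href = attrs.get("href")
        if not href:
            return
        self.total += 1
        # Assumption: a link with no host, or the site's own host, is internal.
        host = urlparse(href).netloc
        if host == "" or host == self.site_host:
            self.internal += 1
        # rel can hold several space-separated values, e.g. "nofollow noopener".
        rel_values = (attrs.get("rel") or "").lower().split()
        if "nofollow" in rel_values:
            self.nofollow += 1


html = (
    '<a href="/about">About</a> '
    '<a href="https://example.com/blog">Blog</a> '
    '<a href="https://other.example/x" rel="nofollow">Other</a> '
    '<a href="https://partner.example/">Partner</a>'
)

counter = LinkCounter("example.com")
counter.feed(html)
followed = counter.total - counter.nofollow
print(counter.total, counter.internal, followed, counter.nofollow)
```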
interaction with web search results
for example, if a user clicks through on a SERP listing and comes to your site, clicks the back button, and then clicks on another result in the same set of search results, that could be seen as a negative ranking signal. Likewise, if the results below you in the SERPs are getting clicked on more than you are, that could be seen as a negative ranking signal for you and a positive ranking signal for them.
Google Toolbar
for users of the Google Toolbar, Google can track their entire web surfing behavior. Unlike Google Analytics, the Google Toolbar can measure the time from when a user first arrives on a site to the time when she loads a page from a different website. It can also get measurements of bounce rate and page views per visitor.
pages on the sites with links for sale
Google has a strong policy against paid links, and sites that sell links may be penalized.
rate of acquisition of links
if your site has acquired an average of X links per day and that number suddenly increases, it could be seen as a positive ranking signal; a sudden decrease could be seen as a negative one. However, a massive sudden increase could mean either that your site has become more relevant or that it is resorting to spammy ways to gain links. The origin of the new links is one of the most important details in telling these cases apart.
search engine results pages (SERPs)
in the search marketing field, the pages the engines return to fulfill a query
page-level social metrics
include mentions, links, shares, likes, and other social media site-based metrics.
domain-level brand metrics
includes search volume on the website's brand name, mentions, whether it has a presence in social media, and other brand-related metrics.
blended search
integrating content other than web links (images, video, etc.) into main web search results (Google calls this Universal Search)
page-level link metrics
refers to the links as related to the specific page, such as the number of links, the relevance of the links, and the trust and authority of the links received by the page
importance
refers to the relative importance, measured via citation, of a given document that matches the user's query. The importance of a given document increases with every other document that references it
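As a rough illustration of importance measured via citation, the sketch below counts how many distinct documents reference each document. The citation_counts function and the sample data are hypothetical; real engines weight citations rather than simply counting them.

```python
from collections import Counter


def citation_counts(references):
    """Count distinct citing documents per cited document.

    `references` maps a document ID to the IDs it cites (made-up data).
    """
    counts = Counter()
    for source, targets in references.items():
        for target in set(targets):  # a repeated citation counts once
            if target != source:     # ignore self-references
                counts[target] += 1
    return counts


references = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": [],
    "d": ["c", "c"],
}

counts = citation_counts(references)
for doc_id, n in counts.most_common():
    print(doc_id, n)
```

In this sample, document "c" is referenced by three other documents, so under this simple count it is the most important of the four.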
vertical results (or instant answers)
results that can be derived from different data sources or presented on the results page in a different format. They include more than just links to other sites, to help answer a user's questions (ex: business address, directions, details; weather for a city; related image results; etc.).
reading level
search engines can also analyze the reading level of documents. One popular formula for doing this is the Flesch-Kincaid Grade Level Readability Formula, which considers factors like the average word length (in syllables) and the number of words per sentence to determine the level of education needed to understand the text.
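The Flesch-Kincaid Grade Level is computed as 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59. The sketch below applies that formula. The syllable counter is a crude vowel-group heuristic assumed here for illustration, so its scores will differ somewhat from tools that use a pronunciation dictionary.

```python
import re


def count_syllables(word):
    """Rough estimate: count groups of consecutive vowels (a heuristic)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_kincaid_grade(text):
    """Return the Flesch-Kincaid grade level for a block of English text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


simple = flesch_kincaid_grade("The cat sat on the mat.")
harder = flesch_kincaid_grade(
    "Comprehensive organizational restructuring necessitates considerable deliberation."
)
print(round(simple, 2), round(harder, 2))
```

Short sentences of one-syllable words can produce a grade below zero, which simply means the text is very easy to read.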
cloaking
the practice of showing different content to the search engine than to users. Search engines want publishers to show the same content to the search engine as is shown to users.
malware being hosted on the site
search engines will act rapidly to penalize sites that contain viruses or Trojans
crawlers (or spiders)
the search engines' automated robots, which can reach the many trillions of interconnected documents on the web.
Boolean searches
searches that use Boolean terms such as AND, OR, and NOT. This type of logic is used to expand or restrict which documents are returned in a search.
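Boolean operators map naturally onto set operations over the documents that contain each term. The following is a minimal sketch under that assumption. The three sample documents and the build_index and postings helpers are made up for illustration.

```python
def build_index(docs):
    """Map each term to the set of document IDs that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index.setdefault(term, set()).add(doc_id)
    return index


docs = {
    1: "german mustard recipe",
    2: "sweet german mustard",
    3: "sweet potato recipe",
}
index = build_index(docs)


def postings(term):
    """Return the set of documents containing the term (empty if none)."""
    return index.get(term.lower(), set())


# AND restricts results to documents containing both terms.
and_result = postings("sweet") & postings("mustard")
# OR expands results to documents containing either term.
or_result = postings("sweet") | postings("mustard")
# NOT removes documents containing the excluded term.
not_result = postings("recipe") - postings("mustard")

print(sorted(and_result), sorted(or_result), sorted(not_result))
```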
user data
showing personalized results, based on geolocation, profile settings, search history (also used in adaptive search), etc. Users can reduce the amount of data collected from them by logging out of their Google account, using Chrome's Incognito mode, or choosing to disable customizations based on web history under the SERP's settings.
keyword1 OR keyword2
shows results for at least one of the keywords
"key phrase"
shows search results for the exact phrase, can also be used to force the inclusion of a specific word. Useful for including stopwords in a query, or if your keyword is getting converted into multiple keywords through automatic stemming.
fuzzy logic
technically refers to logic that is not categorically true or false. ex: whether a day is sunny (is 50% cloud cover a sunny day?). In search, fuzzy logic is often used for misspellings.
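One way to picture fuzzy matching for misspellings is to score how similar a typed term is to each known term and accept matches above a threshold, rather than demanding an exact match. The sketch below uses difflib from Python's standard library. The vocabulary, the suggest helper, and the 0.75 cutoff are assumptions chosen for illustration, not a search engine's actual method.

```python
import difflib

# Hypothetical list of terms known to the index.
vocabulary = ["mustard", "custard", "master", "german", "sweet"]


def suggest(term, vocabulary, cutoff=0.75):
    """Return up to three known terms whose similarity meets the cutoff."""
    return difflib.get_close_matches(term.lower(), vocabulary, n=3, cutoff=cutoff)


print(suggest("mustrad", vocabulary))  # a transposed-letter misspelling
print(suggest("zzzzzz", vocabulary))   # nothing similar enough
```

The similarity score is a value between 0 and 1, not a yes-or-no answer, which is the "not categorically true or false" idea in the definition.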
citation/citation analysis
the act of one work referencing another, as often occurs in academic and business documents. Can come in the form of links to the document or references to it on social media sites. Determining how to weight these signals is known as citation analysis.
page views per visitor
the average number of pages viewed per visitor on your site
anchor text
the clickable text of a link. The value of a link depends in part on how it is implemented and where it is placed, and the text used in the link itself (the text the user clicks on to reach your web page) is also a strong signal to the search engines. If this text is keyword-rich (with keywords relevant to your targeted search terms), it can potentially do more for your ranking in the search engines.
query deserves diversity (QDD)
the concept of altering the results because of the need for diversity, which can elevate certain pages' rankings.
link neighborhood
the concept of grouping sites based on who links to them and whom they link to. The neighborhood you are in says something about the subject matter of your site, and the number and quality of the links you get from sites in that neighborhood say something about how important your site is to that topic.
relevance
the degree to which the content of the documents returned in a search matches the user's query intention and terms. The relevance of a document increases if the page contains terms relevant to the phrase queried by the user, or if links to the page come from relevant pages and use relevant anchor text.
Bounce rate
the percentage of visitors who visit only one page on your website
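Bounce rate, and the related page views per visitor metric, can both be computed from a list of page counts per visit. This is a minimal sketch with made-up numbers. The function names are hypothetical and the sample treats each visit as one visitor.

```python
def bounce_rate(visits):
    """Percentage of visits that viewed exactly one page."""
    if not visits:
        return 0.0
    bounces = sum(1 for pages in visits if pages == 1)
    return 100.0 * bounces / len(visits)


def page_views_per_visitor(visits):
    """Average number of pages viewed per visit."""
    if not visits:
        return 0.0
    return sum(visits) / len(visits)


# Pages viewed in each of eight visits (illustrative data).
visits = [1, 3, 1, 5, 2, 1, 4, 1]

print(bounce_rate(visits))             # 50.0
print(page_views_per_visitor(visits))  # 2.25
```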
Google Analytics
Google is able to collect detailed data about what is taking place on a large percentage of the world's websites. This provides Google with a rich array of data on that site, including bounce rate, time on site, and page views per visitor
Chrome Personal Blocklist Extension
Google offers this Chrome extension, which enables users of the Chrome browser to indicate a search result they don't like.
Goo.gl
Google's own URL shortener. This tool allows Google to see what content is being shared, and which content is being clicked on, even in closed environments where Google web crawlers are not allowed to go.
Navigation to more advertising
Only Yahoo! shows this in the search results. Clicking on these links will bring you to additional paid search results related to the original query.
Query refinement suggestions
Query refinements are offered by Google, Bing, and Yahoo!. The goal of these links is to let users search with a more specific and possibly more relevant query that will satisfy their intent.
document analysis
Search engines look at whether they find important search terms in the title, the metadata, the heading tags, and the body of the text. They also attempt to automatically measure the quality of the document based on document analysis, as well as many other factors.
search query box
Shows the query you've performed and allows you to edit or reenter a new query from the search results page. Google gives you a list of suggested searches if you begin typing (autocomplete suggestions feature); useful for targeting keywords. Next to the search query box, the engines also offer links to the advanced search page. A microphone icon on the right of the search box allows you to speak your query. In Google image search, a camera icon allows you to upload an image or get similar images back.
page speed
a negative factor for pages that are exceptionally slow
vertical search
a term sometimes used for specialty or niche search engines that focus on a limited data set. ex: image, video, news, and blog searches
horizontal navigation
all three engines (Google, Bing, Yahoo!) used to have some form of horizontal navigation, but as of June 2015 only Yahoo! continues to include it.
content that advertises paid links on the site
as an extension of "pages on the sites with links for sale," promoting the sale of paid links may be a negative ranking factor
Domain-level link authority features
based on a cumulative link analysis of all the links to the domain. This includes factors such as the number of different domains linking to the site, the trust/authority of those domains, the rate at which new inbound links are added, the relevance of the linking domains, and more.
algorithms/ranking factors/algorithmic ranking criteria
careful, mathematical equations crafted by the engines to sort the "wheat from the chaff" and then rank the "wheat" in order of quality. These algorithms often comprise hundreds of components. In the search marketing field, they are often referred to as ranking factors or algorithmic ranking criteria.
stopwords
keywords that are normally stripped from a search query because they usually do not add value, such as the word "the."
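Stripping stopwords amounts to filtering the query's terms against a list of low-value words. The list below is a tiny sample assumed for illustration. Real stopword lists are longer and differ between engines.

```python
# Small illustrative stopword list (an assumption, not any engine's list).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in"}


def strip_stopwords(query):
    """Return the query's terms with stopwords removed, in original order."""
    return [term for term in query.lower().split() if term not in STOPWORDS]


print(strip_stopwords("the history of the internet"))  # ['history', 'internet']
```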
domain-level keyword-agnostic features
major elements of this factor include the number of hyphens in the domain name, number of characters in the domain name, and domain name length
Frames and iframes
methods for incorporating the content from another web page into your web page. Iframes are more commonly used than frames to incorporate content from another website. ex code: <iframe src="..........." width="100%" height="300%"> <p> insert alternate text here </p> </iframe> Frames are typically used to subdivide the content of a publisher's website, but they can be used to bring in content from other websites.
information retrieval (IR)
modern commercial search engines rely on this science, which has existed since the middle of the 20th century, when retrieval systems powered computers in libraries, research facilities, and government labs. Early in the development of search systems, IR scientists realized that two critical components comprised the majority of search functionality: relevance and importance.
index
a massive database of terms that catalogs all the significant terms on each page crawled by the search engine.
semantic search
overlaps Google's Knowledge Graph to some degree, but also takes into account many other factors to personalize results for the searcher.
spammers
people who attempt to manipulate search engine results in violation of the search engine guidelines
plug-ins
programs located on the user's computer, not on the web server of your website. The embed tag (<embed>) is often used to incorporate movies or audio files into a web page; it tells the plug-in where it should look to find the data file to use. Content included through plug-ins may or may not be visible to search engines.
results information
provides a small amount of meta-information about the results that you're viewing, including an estimate of the number of pages relevant to that particular query (these numbers can be, and frequently are, wildly inaccurate and should be used only as a rough comparative measure)
domain-level keyword usage
refers to how keywords are used in the root or subdomain name, and how impactful that might be on search engine rankings
term weighting
refers to the importance of a particular search term to the query. The idea is to weight particular terms more heavily than others to produce superior search results. ex: the word "the" in a query will receive very little weight in selecting results because it appears in nearly all English language documents.
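A common textbook way to express this idea is inverse document frequency (IDF), which gives a term less weight the more documents it appears in. The sketch below uses IDF as a stand-in. The source does not say which weighting scheme any engine uses, and the three sample documents are made up.

```python
import math


def inverse_document_frequency(term, documents):
    """Return log(N / df), where df is the number of documents containing term."""
    containing = sum(1 for doc in documents if term in doc.lower().split())
    if containing == 0:
        return 0.0  # term appears nowhere, so it cannot help select results
    return math.log(len(documents) / containing)


documents = [
    "the sweet german mustard",
    "the history of mustard",
    "the weather in hawaii",
]

weight_the = inverse_document_frequency("the", documents)
weight_mustard = inverse_document_frequency("mustard", documents)
weight_hawaii = inverse_document_frequency("hawaii", documents)
print(weight_the, round(weight_mustard, 3), round(weight_hawaii, 3))
```

Because "the" appears in every sample document, its weight is zero, while rarer terms such as "hawaii" receive the highest weight.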
semantic map
the search engine performs a detailed analysis of all the words and phrases that appear on a web page, and then builds a map of that data for it to consider showing your page in the results when a user enters a related search query. This map seeks to define the relationships between those concepts so that the search engine can better understand how to match the right web pages with user search queries. If there is no semantic match of the content of a web page to the query, the page has a much lower possibility of showing up.
crawling
the search engines start with a seed set of sites that are known to be very high quality, and then visit the links on each page of those sites to discover other web pages.
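The discovery process described above can be sketched as a breadth-first traversal from the seed set. To stay self-contained, this example walks an in-memory link graph instead of fetching real pages. The crawl function, the max_pages limit, and the example hosts are all hypothetical.

```python
from collections import deque


def crawl(seeds, link_graph, max_pages=100):
    """Visit pages breadth-first from the seeds, following links once each.

    `link_graph` maps a URL to the URLs it links to (a stand-in for fetching
    and parsing the page).
    """
    seen = set()
    queue = deque()
    for seed in seeds:
        if seed not in seen:
            seen.add(seed)
            queue.append(seed)

    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)
        for link in link_graph.get(url, []):
            if link not in seen:  # never queue the same page twice
                seen.add(link)
                queue.append(link)
    return visited


link_graph = {
    "seed.example/": ["a.example/", "b.example/"],
    "a.example/": ["c.example/", "seed.example/"],
    "b.example/": ["c.example/"],
    "c.example/": [],
    "orphan.example/": ["seed.example/"],  # no page links to this one
}

visited = crawl(["seed.example/"], link_graph)
print(visited)
```

Note that the orphan page is never discovered, because a crawler that only follows links cannot reach a page that nothing links to.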
PPC (a.k.a. paid search) advertising
the text ads are purchased by companies that use either Google AdWords or Bing. The results are ordered by a variety of factors, including relevance (for which click-through rate, use of searched keywords in the ad, and relevance of the landing page are factors in Google) and bid amount (the ads require a maximum bid, which is then compared against other advertisers' bids).
time on site
the time spent by the user on the site. Google Analytics receives information only when each page is loaded, so if you view only one page it does not know how much time you spent on that page. More precisely, then, this metric tells you the average time between the loading of the first page and the loading of the last page, but does not take into account how long visitors spent on the last page loaded.
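The limitation described above is easy to see in a small calculation. In this sketch, each visit is a list of page-load timestamps in seconds. The function names and the data are hypothetical.

```python
def time_on_site(pageview_times):
    """Seconds between the first and last page load of one visit.

    Time spent on the last page is unknown, so a one-page visit measures 0.
    """
    if len(pageview_times) < 2:
        return 0
    return max(pageview_times) - min(pageview_times)


def average_time_on_site(visits):
    """Average measured time on site across all visits."""
    if not visits:
        return 0.0
    return sum(time_on_site(v) for v in visits) / len(visits)


# Page-load timestamps (seconds since the visit began) for three visits.
visits = [
    [0, 40, 100],  # three pages: measured as 100 seconds
    [0],           # one page: measured as 0, however long the user stayed
    [0, 30],       # two pages: measured as 30 seconds
]

average = average_time_on_site(visits)
print(round(average, 2))
```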
keywords in context (KWIC)
the user's keywords are typically shown in boldface when they appear in the search results (sometimes close synonyms are shown in boldface as well).
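A simple way to picture this is wrapping each query term in bold tags wherever it appears in a result snippet. The highlight function below is a hypothetical sketch that matches whole words regardless of case. It does not attempt the synonym matching mentioned above.

```python
import re


def highlight(snippet, query):
    """Wrap each whole-word occurrence of a query term in <b> tags."""
    terms = [re.escape(term) for term in query.split()]
    if not terms:
        return snippet
    pattern = re.compile(r"\b(" + "|".join(terms) + r")\b", re.IGNORECASE)
    return pattern.sub(r"<b>\1</b>", snippet)


result = highlight("Sweet German mustard from Bavaria", "german mustard")
print(result)
```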
Natural/organic/algorithmic results
these results are pulled from the search engine's primary indices of the Web and ranked in order of relevance and importance according to their complex algorithms.
proximity searches
uses the order of the search phrase to find related documents. ex: searching for "sweet german mustard" specifies only a precise proximity match. If the quotes are removed, the proximity of the search terms still matters to the search engine, but it will now show documents that don't exactly match the order of the search phrase, such as sweet mustard-german.
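The difference between a quoted and an unquoted query can be sketched with two checks: an exact phrase match, and the size of the smallest span of words that contains all the terms in any order. Both helpers and the sample documents are hypothetical. How engines actually score proximity is not stated in the source.

```python
def contains_phrase(text, phrase):
    """True if the words of `phrase` appear consecutively and in order."""
    words = text.lower().split()
    target = phrase.lower().split()
    n = len(target)
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))


def smallest_window(text, terms):
    """Length in words of the shortest span containing every term, or None."""
    words = text.lower().split()
    wanted = {term.lower() for term in terms}
    best = None
    for start in range(len(words)):
        if words[start] not in wanted:
            continue
        found = set()
        for end in range(start, len(words)):
            if words[end] in wanted:
                found.add(words[end])
            if found == wanted:
                length = end - start + 1
                if best is None or length < best:
                    best = length
                break
    return best


terms = ["sweet", "german", "mustard"]
docs = [
    "sweet german mustard recipe",
    "sweet mustard german style",
    "german beer and sweet pickles with mustard",
]

for doc in docs:
    print(contains_phrase(doc, "sweet german mustard"), smallest_window(doc, terms), doc)
```

Only the first document satisfies the quoted phrase. Without quotes, the second document is still a tight match (a window of three words), while the third spreads the terms across seven words.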
semantic connectivity
words or phrases that are commonly associated with one another. ex: the word aloha makes you think of Hawaii