18.2 GOOGLE

lowering power consumption and cooling requirements

Each server farm layout has also been carefully designed with an emphasis on _____________ ____________ __________ and ____________ ___________. Instead of using the big uninterruptible power supply (UPS) systems common in most data centers, *Google put smaller battery backups next to each server. These cost less; are more efficient, because they leak about 15 percent less energy than big units; and don't have heavy cooling costs.* Employees usually wear shorts inside the data center since the "cool aisle" in the front of machines is around 80°F. The hot aisles venting out the back, which are cooled via constantly circulating, heat-absorbing water coils, can get up to 120°F. That's hotter than most corporate data centers, but Google learned that its systems could take the heat. These practices allow Google to set the bar high for energy efficiency. The standard used to measure data center efficiency is PUE (power usage effectiveness). A PUE of 1.0 is a perfect score: it means all the power a facility draws is put to use. Everyone loses power; 2.0 (meaning half the power drawn is wasted) is considered a "reasonable number." Google regularly runs PUEs below 1.1, which is astonishingly efficient. Saving energy helps the firm meet its green goals (the firm is formally committed to being carbon neutral and offsetting its fossil fuel energy needs), but the data centers also help meet another "green" goal: massive cash savings. The firm's infrastructure chief claims that the savings through the firm's ultraefficient data center designs are vital to keeping costs low enough to keep services like Gmail free. Google also uses artificial intelligence to monitor data center performance. If it finds that one of its predicted outcomes doesn't match a current finding (e.g., the temperature is higher than what formulas suggest), this acts as a sort of data center equivalent of a car's "check engine light." AI will then suggest a course of action, like cleaning an air-filtering heat exchanger or checking other systems.
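The PUE figures on this card can be checked with a little arithmetic. The sketch below assumes the usual definition of PUE as total facility energy divided by the energy that reaches the IT equipment; the function names and the sample kWh values are illustrative, not taken from the card.

```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power usage effectiveness: total energy drawn / energy reaching IT gear."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


def overhead_share(pue_value: float) -> float:
    """Fraction of the power drawn that never reaches the IT equipment."""
    return 1 - 1 / pue_value


# Hypothetical facility: draws 200 kWh, of which 100 kWh reaches the servers.
print(pue(200, 100))                   # 2.0, the "reasonable number"
print(overhead_share(2.0))             # 0.5, i.e. half the power is lost
print(round(overhead_share(1.1), 3))   # 0.091, roughly 9 percent lost at PUE 1.1
```

Read this way, a PUE of 2.0 loses half the power drawn, while a PUE of 1.1 loses only about 9 percent.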

organic or natural search

Search engine results returned and ranked according to relevance.

query

Search.

cache

Refers to a temporary storage space used to speed computing tasks.

link fraud

Also called "spamdexing" or "link farming." The process of creating a series of bogus websites, all linking back to the pages one is trying to promote.

deep web

And a lot of content lies inside the "_____ _____," either behind corporate firewalls or inaccessible to those without a user account—think of private Facebook updates no one can see unless they're your friend—all of that is out of Google's reach.

query, natural search, order, PageRank, money, popularity, linking, placement

Before diving into how the firm makes money, let's first understand how Google's core service, search, works. Perform a search (or ______) on Google or another search engine, and the results you'll see are referred to by industry professionals as organic or ______________ __________. Search engines use different algorithms for determining the ________ of organic search results, but at Google the method is called __________ (a bit of a play on words: it ranks Web pages and was initially developed by Google cofounder Larry Page). Google does not accept _______ for placement of links in organic search results. Instead, PageRank results are a kind of ____________ contest. Web pages that have more pages ________ to them are ranked higher. While organic search results can't be bought, firms do pay for preferred ___________ in some Google products, including Google Shopping, Hotels, and Flight Search.

server farm

A massive network of computer servers running software to coordinate their collective use. Server farms provide the infrastructure backbone to SaaS and hardware cloud efforts, as well as many large-scale Internet services.

PageRank

Algorithm developed by Google cofounder Larry Page to rank websites.
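To make the "popularity contest" idea concrete, here is a minimal sketch of the simplified, published form of PageRank, computed by repeated passes over a tiny link graph. It is not Google's production ranking, which the cards note uses many other signals. The damping factor of 0.85, the iteration count, and the four-page toy Web are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank over a dict mapping page -> list of pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}          # start with equal scores

    for _ in range(iterations):
        # Every page gets a small baseline score regardless of links.
        new_rank = {page: (1 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its score evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its score to everyone.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page Web: three pages link to "c", and "c" links to "a".
toy_web = {"a": ["c"], "b": ["c"], "d": ["c"], "c": ["a"]}
scores = pagerank(toy_web)
print(max(scores, key=scores.get))  # "c", the page with the most inbound links
```

The same mechanics show why link farming is tempting: adding more bogus pages that link to "c" would push its score higher still.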

search engine indexing, caching, HTML code invisibly embedded

But what if you want the content on your website to remain off limits to __________ _________ _________ and ________? Organizations have created a set of standards to stop the spider crawl, and all commercial search engines have agreed to respect these standards. One way is to put a line of _______ ________ _________ ___________ in a Web page that tells all software robots to stop indexing a page, stop following links on the page, or stop offering old page archives in a cache. Users don't see this code, but commercial Web crawlers do. For those familiar with HTML code (the language used to describe a website), the command to stop Web crawlers from indexing a page, following links, and listing archives of cached pages looks like this: <META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW, NOARCHIVE">
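As a rough sketch of how a well-behaved crawler might read that tag, the example below uses Python's standard `html.parser` module to pull the directives out of a page. The class name, the helper function, and the sample page are hypothetical; real crawlers handle many more cases.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots" ...> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("name", "").lower() == "robots":
            for directive in attributes.get("content", "").split(","):
                directive = directive.strip().lower()
                if directive:
                    self.directives.add(directive)


def robots_directives(html: str) -> set:
    """Return the set of robots directives declared in the page."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page carrying the tag from the card above.
page = '<html><head><META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW, NOARCHIVE"></head></html>'
found = robots_directives(page)
print("noindex" in found)    # True: do not index this page
print("nofollow" in found)   # True: do not follow its links
print("noarchive" in found)  # True: do not offer a cached copy
```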

custom silicon, tensor processing unit

Google does make some custom components. Since it's such a big target for hackers, the firm designs its own custom hardware security chips that are inside its servers and peripherals. These chips allow Google to identify and authenticate its own infrastructure at the hardware level. Google has also designed several versions of its own _______ _______ for machine learning, something it calls the TPU for _________ ______________ _______ (TensorFlow is a type of Google-nurtured machine learning tool). These chips are akin to the GPUs that were mentioned in the Moore's Law chapter, and the firm will gladly sell cloud access to other firms that want to use Google smarts at scale. The first TPU outperformed standard processors by a factor of 30 to 80 times when measuring the calculating speed while taking into account energy cost—so much so that Google credits the TPU with saving it from having to open "dozens [more] data centers."

link fraud, spamdexing, link farming

Google is a bit vague about the specifics of precisely how PageRank has been refined, in part because many have tried to game the system. In addition to in-bound links, Google's organic search results also consider some two hundred other signals, and the firm's search quality team is relentlessly analyzing user behavior for clues on how to tweak the system to improve accuracy. The less scrupulous have tried creating a series of bogus websites, all linking back to the pages they're trying to promote (this is called _______ _________, also called "______________" or "_____ ________," and Google actively works to uncover and shut down such efforts; see the "Link Fraudsters" sidebar).

isn't the first page, never be found

Google will crawl frequently updated sites, like those run by news organizations, as often as several times an hour. Rarely updated, less popular sites might only be reindexed every few days. The method used to crawl the Web also means that if a website _________ _________ ________ on a public server, or isn't linked to from another public page, then it'll __________ ____ __________. In addition, each search engine also offers a page where you can submit your website for indexing.

Colocation facilities (colos)

Google's got a lot of data centers, but not all Google-served data comes to you straight from Google's own server farms. The firm also scatters racks of servers in scores of spots all over the world so that it can quickly get you copies of high-value rich media content, like trending YouTube videos, or fast services for businesses using Google's cloud computing services. These racks of Google content and code are tucked away, sometimes within data centers run by big telecom firms like Comcast or AT&T, or kept inside ________________ ____________ (colos), big warehouse-like facilities where several telecom companies come together to exchange traffic.

hardware

Google's server farms contain __________ that is custom built to contain just what Google needs and eliminate everything it doesn't (e.g., no graphics cards, since servers aren't attached to monitors, or enclosures, since all servers are rack-mounted). In most cases the firm uses the kind of Intel or AMD processors, low-end hard drives, and RAM chips that you'd find in a commercial PC. These components are housed in racks, slotted like very tight shelving. Each server is about 3.5 inches thick yet contains processors, RAM, and hard drives. Google buys so many components for its custom-built servers that it, not a PC manufacturer, is Intel's fifth largest customer.

deep Web

Internet content that can't be indexed by Google and other search engines.

BMW

JCPenney isn't the first firm busted. When Google discovered so-called black-hat SEO was being used to push _____ up in organic search rankings, Google made certain BMW sites virtually unfindable in its organic search results. JCPenney claims that they were the victim of rogue behavior by an SEO consultant (who was promptly fired) and that the retailer was otherwise unaware of the unethical behavior. But it is surprising that the retailer's internal team didn't see their unbelievably successful organic search results as a red flag that something was amiss, and this case highlights the types of things managers need to watch for in the digital age. JCPenney outsourced SEO, and the fraud uncovered in this story underscores the critical importance of vetting and regularly auditing the performance of partners throughout a firm's supply chain.

link farming, 34%

Link fraud undercuts the credibility of Google's core search product, so when the search giant discovers a firm engaged in ____ ______________, they drop the hammer. In this case Google both manually demoted JCPenney rankings and launched tweaks to its ranking algorithm. Within two hours JCPenney organic results plummeted, in some cases from first to seventy-first (the Times calls this the organic search equivalent of the "death penalty"). Getting a top spot in Google search results is a big deal. On average, 34% of clicks go to the top result, about twice the percentage that goes to number two. Google's punishment was administered despite the fact that JCPenney was also a large online ad customer, at times paying Google some $2.5 million a month for ads.

server farms

Sergey Brin and Larry Page started Google with just four scavenged computers. But in a decade, the infrastructure used to power the search sovereign has ballooned to the point where it is now the largest of its kind in the world. Google doesn't disclose the number of servers it uses, but by some estimates, it runs over 1.4 million servers in over a dozen so-called server farms worldwide. In 2018 Google spent over $25 billion on capital expenditures, most of it going to new _________ _____, the cabling connecting them to the rest of the Internet, and staff offices. That amount was more than double what it spent a year earlier. Building massive server farms to index the ever-growing Web is now the cost of admission for any firm wanting to compete in the search market. This is clearly no longer a game for two graduate students working out of a garage.

spiders, Web crawlers, software robots

Software that traverses available websites in an attempt to perform a given task. Search engines use spiders to discover documents for indexing and retrieval.

colocation facility

Sometimes called a "colo," or carrier hotel; provides a place where the gear from multiple firms can come together and where the peering of Internet traffic can take place. Equipment connecting in colos could be high-speed lines from ISPs, telecom lines from large private data centers, or even servers hosted in a colo to be closer to high-speed Internet connections.

link fraud

The Times reported that "someone paid to have thousands of links placed on hundreds of sites scattered around the Web, all of which lead directly to JCPenney.com." And there was little question it was blatant ______ ________. Phrases related to dresses and linking back to the retailer were coming from such nondress sites as nuclear.engineeringaddict.com, casino-focus.com, and bulgariapropertyportal.com. One SEO expert called the effort the most ambitious link farming attempt he'd ever seen.

fault-tolerance

The ability of a system to continue operation even if a component fails.

search engine optimization (SEO)

The process of improving a page's organic search rankings.

search engine optimization

The process of improving a page's organic search results is often referred to as ___________ ________ ______________ (SEO). SEO has become a critical function for many marketing organizations, since if a firm's pages aren't near the top of search results, customers may never discover its site.

barrier to entry, industry profitability

The size of this investment not only creates a ___________ ____ ____________, it also influences __________ _________, with market-leader Google enjoying huge economies of scale. Firms may spend the same amount to build server farms, but if Google has roughly two-thirds of this market while Microsoft's search draws just a fraction of this traffic, which do you think enjoys the better return on investment?
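The economies-of-scale question on this card can be worked through with a quick calculation. In the sketch below, every number except the roughly two-thirds share is a made-up placeholder: the investment figure, the total query volume, and the rival's 10 percent share are assumptions chosen only to show the shape of the comparison.

```python
def cost_per_query(farm_investment: float, market_share: float,
                   total_queries: float) -> float:
    """Spread a fixed server farm investment over the queries a firm handles."""
    if not 0 < market_share <= 1:
        raise ValueError("market_share must be between 0 and 1")
    return farm_investment / (market_share * total_queries)


# Hypothetical inputs: both firms spend the same on infrastructure.
investment = 1_000_000_000          # assumed spend per firm, in dollars
market_queries = 1_000_000_000_000  # assumed total searches in the market

leader = cost_per_query(investment, 2 / 3, market_queries)  # two-thirds share
rival = cost_per_query(investment, 0.10, market_queries)    # assumed 10% share

print(leader < rival)  # True: the same spend is spread over far more queries
```

With identical spending, the firm handling more queries pays less per query, which is the return-on-investment advantage the card points to.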

spiders, Web crawlers, software robots

To create these massive indexes, search firms use software to crawl the Web and uncover as much information as they can find. This software is referred to by several different names—__________(3)—but they all pretty much work the same way. The spiders ask each public computer network for a list of its public websites (for more on this see DNS in Chapter 16 "A Manager's Guide to the Internet and Telecommunications"). Then the spiders go through this list ("crawling" a site), following every available link until all pages are uncovered.
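The crawl described here can be sketched as a breadth-first walk over links. To keep the example self-contained, it uses an in-memory dictionary as a stand-in for the Web rather than fetching real pages; the page names and the `get_links` callback are hypothetical.

```python
from collections import deque


def crawl(start_pages, get_links):
    """Follow every available link from the start pages, visiting each page once."""
    seen = set(start_pages)
    queue = deque(start_pages)
    visited_order = []
    while queue:
        page = queue.popleft()
        visited_order.append(page)       # a real spider would index the page here
        for link in get_links(page):
            if link not in seen:         # skip pages already queued or visited
                seen.add(link)
                queue.append(link)
    return visited_order


# Toy Web: "orphan" links out, but no public page links to it.
toy_web = {
    "home": ["about", "news"],
    "about": ["home"],
    "news": ["story"],
    "story": [],
    "orphan": ["home"],
}

found = crawl(["home"], lambda page: toy_web.get(page, []))
print(found)               # ['home', 'about', 'news', 'story']
print("orphan" in found)   # False: unlinked pages are never discovered
```

The last line illustrates the point made on a related card: a page that no public page links to will not be found by following links alone.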

influential, mobile friendly

While Google doesn't divulge specifics on the weighting of inbound links from a given website, we do know that links from some websites carry more weight than others. For example, links from websites that Google deems "____________" have greater weight in PageRank calculations than links from run-of-the-mill sites. For searches performed on mobile devices, Web pages that meet Google's criteria for being "________ ___________" will be ranked higher than those that don't have an option for mobile devices (Google does offer testing tools to see if your pages are compliant). Additionally, different users may not see identical organic search results. Google defaults to a mix of rankings that includes individual user behavior and, for those users searching while logged into Google accounts, social connections (although displaying generic results remains an option).

cached

While search engines show you what they've found in their own copy of the Web's contents (which Google has __________ on its own servers), clicking a search result will direct you to the actual website, not the copy. Sometimes you'll click a result only to find that the website doesn't match what the search engine found. This is rare, but it happens if a website was updated before your search engine had a chance to reindex the changes.
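The mismatch between a search result and the live page can be shown with a small sketch. The `SearchIndex` class, its method names, and the sample URL are hypothetical stand-ins; the point is only that results come from a stored copy while a click goes to the live site.

```python
class SearchIndex:
    """Toy search index that keeps its own copy of each page it has crawled."""

    def __init__(self, live_web: dict):
        self.live_web = live_web   # stand-in for the real websites
        self.cached_copy = {}      # the search engine's stored copy

    def reindex(self, url: str) -> None:
        """Refresh the stored copy from the live site."""
        self.cached_copy[url] = self.live_web[url]

    def search_result(self, url: str) -> str:
        """What the engine shows, based on its stored copy."""
        return self.cached_copy[url]

    def click(self, url: str) -> str:
        """What the user sees after clicking through to the live site."""
        return self.live_web[url]


web = {"cafe.example/menu": "Soup of the day: tomato"}
index = SearchIndex(web)
index.reindex("cafe.example/menu")

web["cafe.example/menu"] = "Soup of the day: lentil"   # site updated after the crawl
print(index.search_result("cafe.example/menu"))        # Soup of the day: tomato
print(index.click("cafe.example/menu"))                # Soup of the day: lentil

index.reindex("cafe.example/menu")                     # next crawl fixes the mismatch
print(index.search_result("cafe.example/menu"))        # Soup of the day: lentil
```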

