Google in the Plex – Part 1: a technology…

In the Plex is an(other) amazing book about my favorite company. Google is the reason I wrote a book about start-ups: when I gave a PowerPoint presentation in 2006 gathering what I knew about the Mountain View start-up, some friends told me to write a more general book about start-ups, which I did in 2007. Hence this blog!

I have already read three books about Google, and this one is as good as the previous ones, maybe better. So I should thank Michele Catasta, who advised me to read it last June when I presented an updated version of my 2006 presentation. I certainly should have read this book, published in 2011, earlier… I have also posted many articles about the company; just check the Google tag. But I learnt many things from In the Plex, and that is what I want to focus on in this series of posts. I begin with Chapter 1, which covers Google’s technology.

Google was not the only one with the technology

Larry Page was not the only person in 1996 who realized that exploiting the link structure of the web would lead to a dramatically more powerful way to find information. In the summer of that year, a young computer scientist named Jon Kleinberg arrived in California to spend a yearlong postdoctoral fellowship at IBM’s research center in Almaden, on the southern edge of San Jose. With a new PhD from MIT, he had already accepted a tenure-track job in the CS department at Cornell University. […] Kleinberg began to play around with ways to analyze links. Since he didn’t have the assistance, the resources, the time, or the inclination, he didn’t attempt to index the entire web for his link analysis. […] all sorts of IBM vice presidents were trooping through Almaden to look at demos of this thing and trying to think about what they could do with it. “Ultimately, the answer was … not much”. […] Kleinberg kept up with Google. He turned down job feelers in 1999 and again in 2000. He was happy at Cornell. He’d win teaching awards and a MacArthur fellowship. He led the life in academia he’d set out to lead, and not becoming a billionaire didn’t seem to bother him. [Pages 24-26]

There was yet a third person with the idea, a Chinese engineer named Yanhong (Robin) Li. […] Li came to the United States in 1991 to get a master’s degree at SUNY Buffalo, and in 1994 took a job at IDD Information Services in Scotch Plains, New Jersey, a division of Dow Jones. […] He realized that the Science Citation Index phenomenon could be applied to the Internet. The hypertext link could be regarded as a citation! “When I returned home, I started to write this down and realized it was revolutionary,” he says. He devised a search approach that calculated relevance from both the frequency of links and the content of anchor text. He called his system RankDex. […] Robin Li quit and joined the West Coast search company called Infoseek. In 1999, Disney bought the company and soon thereafter Li returned to China. It was there in Beijing that he would later meet—and compete with—Larry Page and Sergey Brin. [Pages 26-27] (Robin Li is the founder of Baidu.)
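To make the idea of ranking by "the frequency of links and the content of anchor text" more concrete, here is a toy sketch in Python. It is only an illustration under my own assumptions: the function name `score_pages`, the tuple layout, and the example URLs are all hypothetical, and nothing here claims to reproduce what Li actually built.

```python
from collections import defaultdict


def score_pages(links, query):
    """Rank target pages by the number of inbound links whose anchor
    text shares at least one word with the query.

    `links` is a list of (source_url, target_url, anchor_text) tuples.
    This data layout is a hypothetical one chosen for illustration.
    """
    query_terms = set(query.lower().split())
    scores = defaultdict(int)
    for _source, target, anchor in links:
        anchor_terms = set(anchor.lower().split())
        # A link counts as a "citation" for the query only if its
        # anchor text mentions one of the query words.
        if query_terms & anchor_terms:
            scores[target] += 1
    # Highest score first; ties broken alphabetically by URL.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Made-up links between made-up pages.
example_links = [
    ("a.example", "b.example", "cheap flights"),
    ("c.example", "b.example", "book flights online"),
    ("a.example", "d.example", "flights to Paris"),
    ("c.example", "d.example", "hotel deals"),
]
ranking = score_pages(example_links, "flights")
print(ranking)
```

In this toy data, two links point to `b.example` with anchor text containing "flights", while only one such link points to `d.example`, so `b.example` comes first.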

The technology was ultimately the best but initially nobody saw the value

Excite would buy BackRub, and then Larry alone would go to work there. Excite’s adoption of BackRub technology, he claimed, would boost its traffic by 10 percent. Extrapolating that in terms of increased ad revenue, Excite would take in $130,000 more every day, for a total of $47 million in a year. Page envisioned his tenure at Excite lasting for seven months, long enough to help the company implement the search engine. Then he would leave, in time for the fall 1997 Stanford semester, resuming his progress toward a doctorate. Excite’s total outlay would be $1.6 million, including $300,000 to Stanford for the license, a $200,000 salary, a $400,000 bonus for implementing it within three months, and $700,000 in Excite stock […] “With my help,” wrote the not-quite-twenty-four-year-old student, “this technology will give Excite a substantial advantage and will propel it to a market leadership position.” Khosla made a tentative counteroffer of $750,000 total. But the deal never happened. [Page 29]
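The figures in this excerpt can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted above; the variable names are mine, and the 365-day year is my assumption about how the yearly total was extrapolated.

```python
# Figures quoted in the excerpt above.
extra_revenue_per_day = 130_000
days_per_year = 365  # assumption: a plain 365-day extrapolation

extra_revenue_per_year = extra_revenue_per_day * days_per_year

# The components of Excite's proposed outlay, as listed in the excerpt.
outlay = {
    "stanford_license": 300_000,
    "salary": 200_000,
    "implementation_bonus": 400_000,
    "excite_stock": 700_000,
}
total_outlay = sum(outlay.values())

print(f"Extra revenue per year: ${extra_revenue_per_year:,}")
print(f"Total outlay: ${total_outlay:,}")
```

The yearly figure comes out at $47,450,000, which matches the "$47 million" in the book once rounded, and the four components do add up to the $1.6 million total.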

In barely a year since Brin and Page had formed their company, they had gathered a group of top scientists totally committed to the vision of their young founders. These early employees would be part of team efforts that led to innovation after innovation that would broaden Google’s lead over its competitors and establish it as synonymous with search. […] It was at least a ten-day process with one of Google’s first crawl engineers, Harry Cheung (everyone called him Spider-Man), at his machines, monitoring progress of spiders as they spread out through the net and then, after the crawl, breaking down the web pages for the index and calculating the page rank, using Sergey’s complicated system of variables with a mathematical process using something called eigenvectors, while everybody waited for the two processes to converge. (“Math professors love us because Google has made eigenvectors relevant to every matrix algebra student in America,” says Marissa Mayer.) [Page 41]
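For readers wondering what eigenvectors have to do with ranking pages, here is a minimal sketch of the textbook power-iteration form of PageRank. It is not Google's production system, and certainly not "Sergey's complicated system of variables". The damping factor of 0.85, the tolerance, and the toy link graph are assumptions of mine for illustration. The vector the loop converges to is the principal eigenvector of the damped link matrix.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=1000):
    """Compute PageRank by power iteration.

    `links` maps each page to the list of pages it links to.
    The damping factor and tolerance are illustrative defaults.
    """
    pages = sorted(set(links) | {t for ts in links.values() for t in ts})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start from a uniform vector

    for _ in range(max_iter):
        # Rank held by pages without outgoing links is spread evenly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {
            p: (1.0 - damping) / n + damping * dangling / n for p in pages
        }
        # Each page passes its rank on, split among its outgoing links.
        for source, targets in links.items():
            if targets:
                share = damping * rank[source] / len(targets)
                for target in targets:
                    new_rank[target] += share
        # Stop once the vector no longer changes: the iteration has
        # converged to the principal eigenvector.
        change = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if change < tol:
            break
    return rank


# A made-up four-page web.
toy_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 4))
```

In this toy graph, page C collects links from A, B and D and ends up with the highest score, while D, which nobody links to, ends up with the lowest. On the real web the same loop runs over billions of pages, which helps explain why the early crawl-and-rank cycle described above took days.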

A technology but not a science… and maybe a dangerous one

In its first few years, Google had developed a number of specialized forms of search, known as verticals, for various corpuses—such as video, images, shopping catalogs, and locations (maps). Krishna Bharat had created one of those verticals called Google News, a virtual wire service with a front page determined not by editors but algorithms. Another vertical product, called Google Scholar, accessed academic journals. But to access those verticals, users had to choose the vertical. Page and Brin were pushing for a system where one search would find Everything. [Something called Universal Search]. [Page 58]

When the Universal Search team showed a prototype to Google’s top executives, everyone realized that taking on the project […] had been worth it. The results in that early attempt were all in the wrong order, but the reaction was visceral—you typed in a word, and all this stuff came out. It had just never happened before. “It definitely was one of the riskier things,” says Bailey. “It was hard, because it’s not just science—there are some judgment calls involved here. We are to some degree using our gut. I still get up in the morning and am astonished that this whole thing even works.” Google’s search now wasn’t just searching the web. It was searching everything. In his 1991 book, Mirror Worlds, Yale computer scientist David Gelernter sketched out a future where humans would interact, and transact, with modeled digital representations of the real world. […] But though Gelernter looked on the overall prospect of mirror worlds with enthusiasm, he worried as well. “I definitely feel ambivalent about mirror worlds. There are obvious risks of surveillance, but I think it poses deeper risks,” he said. His main concern was that mirror worlds would be steered by the geeky corporations who built them, as opposed to the public. “These risks should be confronted by society at large, not by techno-nerds,” he said. “I don’t trust them. They are not broad-minded and don’t know enough. They don’t know enough history, they don’t have enough. [Pages 59-60]

Google’s researchers would acknowledge that working with a learning system of this size put them into uncharted territory. The steady improvement of its learning system flirted with the consequences postulated by scientist and philosopher Raymond Kurzweil, who speculated about an impending “singularity” that would come when a massive computer system evolves its way to intelligence. Larry Page was an enthusiastic follower of Kurzweil and a key supporter of Kurzweil-inspired Singularity University, an educational enterprise that anticipates a day when humans will pass the consciousness baton to our inorganic progeny. [Page would hire Kurzweil in 2012] What does it mean to say that Google “knows” something? […] “That’s a very deep question,” says Spector. “Humans, really, are big bags of mostly water walking around with a lot of tubes and some neurons and all. But we’re knowledgeable. So now look at the Google cluster computing system. It’s a set of many heuristics, so it knows ‘vehicle’ is a synonym for ‘automobile,’ and it knows that in French it’s voiture, and it knows it in German and every language. It knows these things. And it knows many more things that it’s learned from what people type.” […] Spector promised that Google would learn much, much more in coming years. “Do these things rise to the level of knowledge?” he asks rhetorically. “My ten-year-olds believe it. They think Google knows a lot. If you asked anyone in their grade school class, I think the kids would say yes.” What did Spector, a scientist, think? “I’m afraid that it’s not a question that is amenable to a scientific answer,” he says. “I do think, however, loosely speaking, Google is knowledgeable. The question is, will we build a general-purpose intelligence which just sits there, looks around, then develops all those skills unto itself, no matter what they are, whether it’s medical diagnosis or …” Spector pauses. “That’s a long way off,” he says.
“That will probably not be done within my career at Google.” (Spector was fifty-five at the time of the conversation in early 2010.) “I think Larry would very much like to see that happen,” he adds. [Pages 66-67]

As a final comment: read the book. You may also want to have a look at my SlideShare presentation.
