An unofficial blog that watches Google's attempts to move your operating system online since 2005. Not affiliated with Google.

Send your tips to gostips@gmail.com.

April 30, 2008

Google/YouTube Priorities

YouTube is testing a new player and it looks great: the chrome is less prominent and almost invisible, two rarely-used buttons have been removed, the volume control takes less space, and you can finally use keyboard shortcuts (up/down arrows for volume, right arrow for fast forward, space to pause the video).


In other YouTube news, Eric Schmidt said in an interview for CNBC that the highest priority for YouTube this year is figuring out how to make money. "We believe the best products [for monetizing YouTube] are coming out this year. And they're new products. They're not announced. They're not just putting in-line ads in the things that people are trying."

Asked whether gaining a larger share of the ad market is Google's biggest priority, Eric Schmidt changed his mind. "Well, our number one priority is end-user--end-user happiness. Literally, are people happy with the results that they get using Google search? So it's literally search, and every day we bring out new improvements and indices that are--taxonomies that are understanding of language, more content, bigger--all of the things that make Google such a great search experience. That's our number-one priority, even more important, for example, than advertising."

It makes you wonder if the new minimalistic YouTube player intends to improve user experience or make room for the cutting-edge monetization features that will be announced later this year. "Well, our number one priority is end-user--end-user happiness. (...) I don't think we've quite figured out the perfect solution of how to make money [from YouTube], and we're working on that. That's our highest priority this year."

But maybe the two priorities converge at some point. "Google believes that advertising itself has value. The ads literally are valuable to consumers. Not just to the advertisers, but the consumers," said Eric Schmidt in the same interview. Maintaining the right balance between all these priorities is a difficult challenge for Google and one of the reasons why, a year and a half after the acquisition, YouTube still doesn't make significant money.

{ Thanks, Daniel. }

Show the Real Number of Search Results in Gmail

Update (June 2013): This no longer works. Try another trick.

When you search for something in Gmail and there are many search results, you'll see a vague message like "1 - 20 of thousands". In other cases, Gmail shows that there are about 80 results, when the real number is 78. But what if you want to find the actual number of results instead of an estimate? If you have the new version of Gmail, it's really easy to get a precise count thanks to permalinks.

(Small digression: I'm sure that Opera users will stop reading this post because Gmail 2.0 doesn't support their browser, but the User JS file from this page fixes the compatibility issues and you can use the tip below. )

Let's say you want to find the number of conversations that contain Google, but don't have the label .Comments. You will search for:
google -label:.Comments (a shorter query: google -l:.Comments)

The address bar should display this URL:
http://mail.google.com/mail/#search/google+-l%3A.Comments


If you go to the next pages of search results, you'll notice that Gmail appends /p2, /p3 etc. to the previous URL. To see the actual number of search results, force Gmail to display an arbitrary page beyond the last one. For example, since each page lists 20 results, you can display page 1000 if you estimate that the number of search results is smaller than 20*1000=20000. Just append /p1000 to the address and get something like:

http://mail.google.com/mail/#search/google+-l%3A.Comments/p1000

or more generally:
http://mail.google.com/mail/#search/YOURQUERY/p1000

Unfortunately, if you manually modify the address, Gmail no longer updates it when you go to a different section, so you need to refresh the page.
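The URL construction described above can be sketched in a few lines of Python. This is only an illustration of the pattern shown in the post: the function name and the constant are made up for the example, and the trick itself depends on Gmail's URL scheme from 2008, which later changed.

```python
from urllib.parse import quote_plus

# Base of the Gmail search permalink, as shown in the post.
GMAIL_SEARCH_BASE = "http://mail.google.com/mail/#search/"


def gmail_search_url(query, page=None):
    """Build a Gmail search permalink, optionally jumping to a given page.

    The function name is hypothetical; the URL pattern is the one from the post.
    """
    # quote_plus turns spaces into "+" and ":" into "%3A", matching the address bar.
    url = GMAIL_SEARCH_BASE + quote_plus(query)
    if page is not None:
        # Gmail appends /p2, /p3, ... for later pages of results.
        url += "/p{}".format(page)
    return url


print(gmail_search_url("google -l:.Comments", page=1000))
```

Any page number past the last real page should do, as long as 20 times that number exceeds the result count you expect.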

Google Analytics for Blogs


Two years ago, Google bought Measure Map, a very intuitive analytics tool for blogs designed by Jeffrey Veen. Since then, Google Analytics has borrowed some elements from Measure Map's interface. It seems that the transition is about to end, as Google Analytics will integrate Measure Map's functionality: blog stats.

Measure Map users received an email that instructed them how to access the new version of the software.
Convert your Measure Map account to Google Analytics

We're giving our earliest users of Measure Map an opportunity to use our new service, built on the powerful Google Analytics platform and continuing to use the interface you're familiar with.

1) Create an account at Google Analytics.

2) Install the Google Analytics tracking code on your blog.

3) Tell us the URL of the blog you're using. (Option: This is a Blogger blog. If so, we'll put a "Stats" tab on your Blogger dashboard.)

It's clear that the standalone Measure Map is history and all of its features will be added in a new Google Analytics section for blogs with information about comments, links from other blogs, popular posts and, hopefully, real-time stats. FeedBurner is also a good candidate for integrating with Google Analytics and providing the big picture of your blog's traffic.

More good news: Blogger users will finally have a stats panel that can be accessed directly from Blogger's interface. Hopefully, Google will also include options like showing the number of views for each post or adding a list of the most popular posts in the sidebar.


{ The first image is licensed as Creative Commons by Andy Zeigert. }

Update. Google sent this:
Recently at South by Southwest, we announced our new Google Analytics for Blogger reporting interface. This integration marks the transition of Measure Map users worldwide to a new Google Analytics interface specifically designed for Blogger users. We've rebuilt Measure Map as an integrated feature of both Google Analytics and Blogger, which will give bloggers daily stats such as the number of visitors they receive, how many posts get traffic, and new referring links. Users can also use Google Analytics, of course, to track their blogs. More information is available on our Help Center.

We'll be refining the new interface in the coming months before we release it to the general public, and we're excited to help bloggers further understand the impact they're having on their readers.

Update 2: Google Analytics Blog has a screenshot of the Blogger integration. Google says it "will support all blogging programs. Bloggers who don't use Blogger will see a new reporting interface that resembles Google Analytics."

iGoogle Artist Themes

"Now you can put the work of world-class artists and innovators on your personalized Google homepage." Google added a gallery of iGoogle themes dedicated to fashion designers (Oscar de la Renta), musicians (Coldplay, Beastie Boys), actors (Jackie Chan), sportsmen (Lance Armstrong), photographers (Yann Arthus-Bertrand), choreographers (Mark Morris), cartoonists (Robert Mankoff), illustrators (Camilla Engman), architects (Cameron Sinclair) and more.

It's a great way to discover interesting people or to decorate iGoogle with the work of people you love. And I'm sure many companies will start to create themes to maintain brand loyalty.

"As you may know, iGoogle has always provided you with great tools to access and arrange the content you want on your homepage. Just like a person's book or music collection is an extension of their personality, a user's iGoogle page is also a reflection of their loves and interests, both in terms of content, and now, visually," says Julian Sonego on Google Australia Blog.




In an unexpected move, Google promotes the gallery on the homepage using a graphic designed by Jeff Koons and an invitation to check the new feature: "What happens when great art mixes with your homepage?".

{ Thanks, Jeng, Jaime and Andrew. }

Update: If you can't decide which theme looks best on your iGoogle page, try the sampler theme. "With this theme, you'll get to try out a new featured theme every day and decide whether you want to keep it."

April 29, 2008

Google Combines Driving Directions with Street View


Google Maps finally made Street View imagery useful by integrating it with driving directions. If you try to find driving directions in one of the 42 US cities that have Street View images, Google will include these images for each intersection. "It's not always easy to find your way around an unfamiliar place. To help with this, we've been hard at work integrating Street View into the driving directions feature of Google Maps so that now you can preview your route before hitting the road," says Google Lat Long Blog.


This sample for Google Maps API shows an even cooler connection between Street View and driving directions.

FeedBurner Moves to Google Accounts


After Google bought FeedBurner in June 2007, we didn't hear too many new things about FeedBurner. A post from February detailed the benefits of the future Google integration: connecting with other Google services, better performance and new features. "Why not build new services and integrate at the same time? (...) Our perspective is that the time you lose trying to continuously merge an updated legacy codebase with a new rewrite causes you [to] be in a world of never actually getting the integration done because you're constantly working on merge problems."

It seems that the Googlized FeedBurner will be brought to life soon. FeedBurner Blog announces that "in the coming weeks, upon visiting www.feedburner.com, selected publishers will have the opportunity to sign in using their Google Account". That means you will be offered the option to choose a Google account as a new home for FeedBurner. The post mentions that the integration with other services will be added gradually, and it's easy to anticipate the relaunch of AdSense for feeds or a new tab in AdWords.

The most visible side-effect of the Google ownership is that the premium accounts and MyBrand are free, so you might save at least $96/year. In the past 11 months, FeedBurner doubled the number of users and now has "882,989 publishers who've burned 1,570,012 feeds".

{ Thanks, Hebbet. }

April 28, 2008

Google Video Categories

The latest Google Video redesign removed yet another useful feature from the interface: restricting search results to a certain category. The feature is still available, but you need to use the genre: operator in your searches. Here are some of the most popular genres:

genre:animation
genre:comedy
genre:documentary
genre:educational
genre:gaming
genre:movie_feature
genre:music
genre:sports
genre:tv_show

Examples of searches:
ufo genre:documentary
charlie chaplin genre:movie_feature
data mining genre:educational

You can also use the genre: operator to better describe your search when you want to subscribe to a Google Video feed or to get an email alert when new videos are uploaded.


At some point in Google Video's history, you could browse videos by genre directly from the homepage. That option is gone, but you can still use the genre: operator without any other keyword to see popular videos from a certain category, like TV shows.

Improving Google Image Search Using Implicit PageRank

Image search engines have limited usefulness because it's difficult to accurately describe images in words and because search engines completely ignore the images themselves, preferring to index anchor texts, file names or the text that surrounds images. "Search for apples, and they haven't actually somehow scanned the images itself to see if they contain pictures of apples," illustrates Danny Sullivan.

Image analysis hasn't yet produced algorithms that can process billions of images in a scalable way. "While progress has been made in automatic face detection in images, finding other objects such as mountains or tea pots, which are instantly recognizable to humans, has lagged," explains The New York Times.

An interesting paper [PDF] written by Yushi Jing and Google's Shumeet Baluja describes an algorithm similar to PageRank that uses the similarity between images as implicit votes. "We cast the image-ranking problem into the task of identifying authority nodes on an inferred visual similarity graph and propose an algorithm to analyze the visual link structure that can be created among a group of images. Through an iterative procedure based on the PageRank computation, a numerical weight is assigned to each image; this measures its relative importance to the other images being considered."

The paper, titled "PageRank for Product Image Search", assumes that people are more likely to go from an image to other similar images. "By treating images as web documents and their similarities as probabilistic visual hyperlinks, we estimate the likelihood of images visited by a user traversing through these visual-hyperlinks. Those with more estimated visits will be ranked higher than others." To determine the similarity between images, the paper suggests using different features depending on the type of images: local features, global features (color histogram, shape).
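To make the idea more concrete, here is a minimal sketch of a PageRank-style iteration over an image similarity graph. It is not the authors' implementation: the similarity values are invented, the damping factor of 0.85 and the fixed iteration count are assumptions borrowed from common PageRank practice, and the function name is hypothetical.

```python
def visual_rank(similarity, damping=0.85, iterations=50):
    """Rank images by treating pairwise similarity as weighted 'visual links'.

    similarity[i][j] is a non-negative similarity score between images i and j.
    Returns one score per image; higher means more 'central' in the group.
    """
    n = len(similarity)
    # Total similarity each image hands out, used to normalize its votes.
    col_sums = [sum(similarity[i][j] for i in range(n)) for j in range(n)]
    rank = [1.0 / n] * n
    for _ in range(iterations):
        new_rank = []
        for i in range(n):
            # Votes flowing into image i from every image j that resembles it.
            incoming = sum(
                similarity[i][j] / col_sums[j] * rank[j]
                for j in range(n)
                if col_sums[j] > 0
            )
            new_rank.append((1 - damping) / n + damping * incoming)
        rank = new_rank
    return rank


# Made-up scores: images 0-2 resemble each other, image 3 is an outlier.
scores = [
    [0.0, 0.9, 0.8, 0.1],
    [0.9, 0.0, 0.7, 0.1],
    [0.8, 0.7, 0.0, 0.1],
    [0.1, 0.1, 0.1, 0.0],
]
ranks = visual_rank(scores)
print(ranks)
```

In this toy example the outlier ends up with the lowest score, because few "visual hyperlinks" lead to it. An image with no similarity to any other image would need special handling that this sketch leaves out.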

The system was tested on the 2000 most popular queries from Google Image Search on July 23rd, 2007, by applying the algorithm to the top 1000 results produced by Google's search engine, and the results are promising: users found 83% fewer irrelevant images in the top 10 results, down from 2.83 results with the current Google search engine to 0.47.
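The 83% figure follows directly from the two numbers quoted above. A quick check in Python (the variable names are mine):

```python
# Irrelevant images in the top 10 results, as reported in the post.
irrelevant_before = 2.83  # current Google Image Search
irrelevant_after = 0.47   # with the similarity-based ranking

# Relative reduction, expressed as a percentage.
reduction = (irrelevant_before - irrelevant_after) / irrelevant_before * 100
print(round(reduction))
```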

For example, a search for [Monet paintings] returned some of his famous paintings, but also "Monet Painting in His Garden at Argenteuil" by Renoir.


It may seem that this algorithm lacks the human element used to compute PageRank (links are actually created by people), but the two authors disagree. "First, by making the approach query dependent (by selecting the initial set of images from search engine answers), human knowledge, in terms of linking relevant images to webpages, is directly introduced into the system, since the links on the pages are used by Google for their current ranking. Second, we implicitly rely on the intelligence of crowds: the image similarity graph is generated based on the common features between images. Those images that capture the common themes from many of the other images are those that will have higher rank."

For now, this is just a research paper and it's not very clear if Google will actually use it to improve its search engine, but image search is certainly an area that will evolve dramatically in the future and will change the way we perceive search engines. Just imagine taking a picture of a dog with your mobile phone, uploading it to a search engine and instantly finding web pages that include similar pictures and information about the breed.

In 2006, Google acquired Neven Vision, a company that specializes in image analysis, but the only new feature that could be connected to that acquisition is face detection in image search. Riya, another interesting company in this area, didn't manage to create a scalable system and decided to focus on a shopping search engine.

April 27, 2008

More Synergy Between Google's Communication Services

Google Talk has four flavors (Google Talk, Gmail Chat, Google Talk Gadget and Google Talk Labs Edition) and all of them have different features: you can transfer files only in Google Talk, chat with AIM contacts only in Gmail, get calendar notifications only in Google Talk Labs Edition and upload pictures from a webcam only in the gadget. It's quite confusing to switch between all these implementations of the same service. Apparently, the main reason behind the launch of Google Talk Labs Edition was to unify these versions in a common platform. Here's a recent post from Ollie, Google Talk Guide:
We certainly haven't forgotten about our client users and we've been listening to your comments (here, in the Google Talk Help Discussion Group, and on the feedback forms). We hear you loud and clear; you love the client and you want it to have all the great new features that have been added to Gmail Chat or the Google Talk Gadget. We know that it's important to be able to chat inside and outside of your browser and that it's important to have a full array of features at your fingertips in both places. In short, you want to be able to choose how to connect to the Google Talk service without having to make any major feature trade-offs. We're completely with you on this one -- we want that too!

Now, I suspect some of you are thinking: if you're with us on that, why aren't all features available on the client right now? Well, we've got a lot on our plate here at Google Talk and we're always negotiating what we can get done. At the moment, we're focusing our energy on developing platforms that will let us make Google Talk better for all our users, whether they want a web-based experience or a client experience. There is still much to done, but we're committed to continually improving the Google Talk user experience for everyone.

As Jeff suggested in the comments, all these delayed Google Talk updates could be caused by a future integration with GrandCentral. "Although you haven't heard as much from us in the past few months as you did before, we are working hard every day on the next great version of GrandCentral and a ton of cool new features," says a post from GrandCentral Blog.

{ Thank you, Ender. }

April 26, 2008

Google Docs Lives to Share the Words

Mike Riversdale wrote the best article I've read about Google Docs. "Google Docs ... so what - the ONE reason why you should care" doesn't talk only about Google Docs, it's also about Zoho, wikis or any other tool that lets you write, collaborate and share your documents. It outlines the major difference between Google Docs and office suites like Microsoft Office or OpenOffice: Google Docs is built for a connected environment.
Documents (PC-based I'm thinking) are fundamentally about "one person". The document you edit looks lovingly into your eyes proclaiming ever lasting love just for you. If someone else tries to muscle in on this close(d) relationship they will get told to go away, I am with someone else.

Of course the words inside the document want to be loved by all and to love all. They force the document to dump one person and love another in a serial monogamy type of way. The document that was only for you will quite easily tell you to go away as they are now in a one-on-one relationship with someone else...

This issue - words love all / documents love one at a time - is a fundamental issue that many have tried to solve using any number of clever means. We've had software attempting to mediate the differences - every electronic document management (EDMS) system you've battled against lives this category. We've had consultants claiming to solve it via changes in work practices - 'workflow" and the bottlenecks they employ.

The most common way employed by everyone ever is ... copy the document. The words love this - they can love more and more people, more words can join them as they spread around the network - you can put in your words, I can add my words, Stevens from Accounts can remove the words he doesn't want - the words are out there, they love to be free and are loving all.

But once set free they're bloody near impossible to reign back in, for a start where the frig are they - out there in the wilds of the electronic world running free is all well and good until some poor sod has to try and reign them in. (...)

Google Docs doesn't live in the 'document' world. Oh it has similar naming conventions, it uses all the jargon that we're used to and it pretends to be a document ... but it's not because it comes from the 'words' world view. It knows that the words you're gonna edit are, 99.9% of the time, going to want to be loved by many more than you. And being on the Web they know that the world of connected people at your fingertips is massive. Not only is there the list of attractive people in your contacts list but there is everyone with an internet connection!

Google Docs lives to share the words:

* knows that words want to be shared and that's why you've typed them.
* its world view knows/understands its connected environment
* its capabilities are built to use this environment

{ Text licensed as Creative Commons. }

Google Me (The Movie)


"Egosurfing (also called vanity searching, egosearching, egogoogling, autogoogling, self-googling, or simply Googling yourself) is the practice of searching for one's own given name, surname, full name, pseudonym, or screen name on a popular search engine, to see what results appear." (Wikipedia)

"It all started when I Googled my name," says Jim Killeen, described by Washington Post as a "failed Los Angeles actor". "At 38 he was unmarried, no children. The movie stardom for which he'd left Detroit had never materialized; he'd eventually launched a business providing chair massages in poker halls for a dollar a minute. It was surprisingly lucrative but (perhaps not surprisingly) unfulfilling."

In search of his own identity, Jim decided to meet other people who share his name and to find their stories. Google was the most accessible way to find other Jim Killeens from all over the world: a cop from New York, a priest from Ireland, an engineer, a swinger. Google Me (The Movie) is a 96-minute documentary that describes his cathartic journey. The full video is available for a limited time at YouTube.

Google Me is an invitation to rediscover yourself by listening to other people's stories. If you can't watch the whole movie, the first 10 minutes are very special.


{ Thank you, Steve Hemmerstoffer. }

April 25, 2008

New in Google Docs: Insert Videos, Edit CSS

There are so many updates at Google Docs that you'll need many hours to explore them and start to use them.

You can now access your browser's contextual menu by pressing Shift while right-clicking. This might be useful if you want to search the text from a document online or to use other features included in your browser.

If you don't want to convert a document to PDF and print the generated file, the option to print the document as a web page is back in the File menu. For simple documents, this should be a better option.

For better customization, Google Docs lets you define CSS styles for your documents: Edit > Edit CSS. Those who know CSS will find it faster to define styles and use them in the HTML code. The most important limitation is that you can't use images that are not hosted by Google Docs in your CSS rules. This page shows you how to add watermarks, repeating backgrounds, styled headers, image borders using CSS.


Presentations can now include videos, obviously only from YouTube, but at least you can find videos directly from Google Presently. "Videos can help you make a point, command the attention of your audience, or even add humor to your presentation," points out Google Docs Blog. Unfortunately, when you export your presentations as PPT, YouTube videos are replaced with still frames.

To write some text that might guide you while presenting, use the new speaker notes feature. "These notes will be visible to you and your viewers in presentation mode or when you print your slides."


Google Docs Blog also mentions that everyone who uses the English interface should be able to view and edit documents offline. "When we first announced offline access several weeks ago, it was limited to viewing and editing word processing documents. Now, we've added view-only offline access to spreadsheets and presentations as well."

Update at Google Product Search

The service formerly known as Froogle, Google Product Search, has received one of the most important updates since it was launched, back in 2002. For some queries like [cell phone] or [scanner], Google detects identical products that are available in multiple online stores and lets you compare prices, read reviews and technical specifications on a single page. Until now, Google Product Search linked directly to the online store's web page and didn't include product reviews or detailed information about a product, as you can still see if you search for [barney].



Other comparison shopping sites like Shopzilla, MSN Shopping and Yahoo Shopping already have this feature and are more established destinations for finding products online. It's interesting that even Froogle used to include price comparison for an individual product, but the feature was removed at some point. A mobile version of the site is still waiting for an update and Google Base needs more visibility.

Google Annoyances

While there are a lot of things to love about Google, some strange annoyances manage to balance the situation. Here are 11 annoyances that are in need of a fix:

1. Every time you go to www.google.com/analytics/, Google Analytics asks you to enter your password, even if you are already logged in. One workaround is to bookmark https://www.google.com/analytics/home/.

2. "New features!". Google's products are updated pretty frequently, but sometimes they show this message for months, even if the features are no longer new. Some pathological examples: Google Calendar and Picasa Web Albums.

3. The inconsistent navigation bar. There's no consistency here: some of the links send you to search results, other links send you to homepages. Some of the pages open in a new tab/window, other pages open in the same tab/window. The list of links is different, depending on the current service, and the ordering is not predictable.

4. Search results with tracking code. Because Google needs to track the search results you click on in order to add them to Web History, it replaces their addresses with redirects like: http://www.google.com/url?sa=t&ct=res&cd=1&url=... That means you can no longer right click on the link and copy the location. Some workarounds: disable Web History, log out or use this Greasemonkey script.

5. Google Updater. An annoying and intrusive way to install Google software, without providing an alternative for people who like the classic installer.

6. Set Google as my default search and notify me of changes. Every piece of Google software has the mission to make Google your default search option in Internet Explorer (it's already the default option in other browsers), but also to install a notifier that warns you when other software tries to change the default search engine. Usually, the option can be disabled, but Google's wording is vague.

7. Blogger comments. It's hard to create something worse than Blogger's comments: they open in a new page with a different layout, the first option is to log in with a Google account, there's no spam filtering etc.

8. Posting a message at Google Groups. It usually takes one minute for your post to appear on the site, but Google should show it instantly.

9. When you translate a web page, Google Translate shows the original text in a bubble. Google's JavaScript code interferes with other web pages' code and the result is usually terrible. Another downside is that you can't copy the text from a translated web page. One workaround would be to block the JavaScript file, but it keeps changing its address.

10. Google Video has the worst advanced search page. If you search for something and click on "advanced search", your query is lost. The page doesn't put the focus on the first input box and pressing Enter has no effect.

11. Click on a broken link for a Blogger blog and Google is glad to inform you that "the blog you were looking for was not found". Pretty bad for an error message that should've been helpful.
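For annoyance 4, the original address can usually be recovered from the redirect itself. The sketch below assumes the destination is carried, percent-encoded, in the url parameter, as the truncated example in item 4 suggests. The example.com address and the function name are made up for illustration.

```python
from urllib.parse import parse_qs, urlparse


def target_from_redirect(redirect_url):
    """Return the destination hidden in a google.com/url redirect, or None."""
    # parse_qs splits the query string and decodes percent-encoded values.
    params = parse_qs(urlparse(redirect_url).query)
    targets = params.get("url")
    return targets[0] if targets else None


# Hypothetical redirect; the real ones carry the clicked result in "url".
example = "http://www.google.com/url?sa=t&ct=res&cd=1&url=http%3A%2F%2Fexample.com%2Fpage"
print(target_from_redirect(example))
```

This is essentially what a user script would do before rewriting the links on the results page.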

Did you find other Google annoyances?

April 24, 2008

A Radio Interview with Marissa Mayer

(Hopefully fair-use thumbnail of a photo)
from Marissa Mayer's public album

KQED FM hosted one of the most interesting interviews with Marissa Mayer, Vice President for Search Products & User Experience at Google. Some tidbits:

- because of the limited environments where you are able to search and because of the small number of options to express your searches, you search less often than you should. For example, you can't find web pages that describe an idea and you can't speak to a search engine.

- the goal for Google Street View is to find what something looks like (e.g.: the door to a museum).

- Google could make $80-200 million/year by adding ads to Image Search, but people would use the product less.

- Google shows fewer ads to make them more relevant and more meaningful to users.

- Google builds products for a broad audience of users, so the products have to be simple and easy to use.

- the ad targeting in Gmail works by finding the most relevant words from a message and then listing ads that are related to those words.

- Larry Page and Sergey Brin read some studies showing that a technical workforce needs to be around 25% women to get a balanced environment, and they managed to maintain this proportion inside Google.

- Google does a small amount of outsourcing for testing and user interface design.

- the median age for Google's employees generally follows the average between Larry's age and Sergey's age.

- 80% of the calls to GOOG-411 return satisfactory results.

- there are more than a million books in Google Book Search and the average number of pages for a book is 300, so Book Search has an index similar in size to Google's index from 2000.

- no plans for building a desktop operating system.

- the public version of Google Health will be launched shortly.

The interview can be downloaded as an MP3 (24 MB) or listened to using the player below (52 minutes):

YouTube Suggest

The always surprising video sharing service acquired by Google in October 2006 is constantly improving its search features and borrows many tricks from its parent company. The latest enhancement is an auto-complete feature that shows query suggestions as you start typing characters in the search box. You'll notice the obvious similarity between this feature and Google Suggest, a project that is about to finally graduate from Google Labs. YouTube Suggest has its own list of queries obtained from YouTube users, so it should offer decent suggestions.

"By suggesting more refined searches up front, Google Suggest can make your searches more convenient and efficient by keeping you from having to reformulate your query. Google Suggest might offer suggestions that you will find novel or intriguing," explains Google in an interesting FAQ.


The feature is enabled by default, but you can disable it in the "Settings". For now, YouTube Suggest seems to be live only for international sites like YouTube UK and only if you search from the homepage, but it should be available at YouTube's main site in the near future.

To get the suggestions, YouTube uses this simple JSON call:

http://suggestqueries.google.com/complete/search?hl=en&ds=yt&json=t&jsonp=callbackfunction&q=QUERY

where ds=yt defines the search's scope (YouTube), while q=QUERY includes the characters typed by the user. A similar URL is used by Google News to suggest news sources in advanced search, so we can expect an API for query suggestions:

http://news.google.com/complete/search?hl=en&ds=ns&js=true&q=QUERY

This URL works as well:

http://suggestqueries.google.com/complete/search?hl=en&ds=ns&json=t&jsonp=callbackfunction&q=QUERY
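As a rough illustration, here is how the suggestion URLs above could be assembled in Python. The parameter names and values come from the URLs in this post. The function name and its defaults are my own. Since the post doesn't describe the response format, the sketch stops at building the address and doesn't try to parse a reply.

```python
from urllib.parse import urlencode

# Endpoint taken from the URLs quoted in the post.
SUGGEST_ENDPOINT = "http://suggestqueries.google.com/complete/search"


def suggest_url(query, scope="yt", callback="callbackfunction", language="en"):
    """Build a query-suggestion URL; scope is "yt" (YouTube) or "ns" (news sources)."""
    params = [
        ("hl", language),     # interface language
        ("ds", scope),        # the search's scope
        ("json", "t"),
        ("jsonp", callback),  # name of the JavaScript callback
        ("q", query),         # characters typed so far
    ]
    # urlencode keeps the order of the pairs and escapes the values.
    return SUGGEST_ENDPOINT + "?" + urlencode(params)


print(suggest_url("charlie chaplin"))
print(suggest_url("new york", scope="ns"))
```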


YouTube constantly experiments with new features and most of them are related to the way people navigate the site or discover new videos. A recent experiment added a search box below the list of related videos so that people can search and see the search results while watching a video. The only problem was that you couldn't add the results to the Quicklist in order to build dynamic playlists.

Update (May 16): YouTube Suggest is now live at youtube.com.

The Informational Distance Between Cities


Information Aesthetics points to an interesting visualization of the "informational" relation between cities. Two cities are "informationally" related if they are often mentioned together, so the visualization uses the number of Google results to approximate the distance:

Gdistance(w1,w2) = (#(w1)+#(w2)) / (#(w1+" and "+w2)+#(w2+" and "+w1)),
where #(w) is the number of Google search results for the query w enclosed in quotes.

This approximation could be improved by replacing "and" with "*", so that the words aren't necessarily separated by the conjunction "and". The Google distance is multiplied by the physical distance between cities to increase the connection between cities that are far away.

Among the cities that have a small "informational" distance: London and New York, Tokyo and Sydney, London and Singapore City.

Another way you can use the number of Google results is to calculate the mindshare of a word or name within a domain. If you divide the number of search results for [nokia mobile phone] by the number of results for [mobile phone] you can find Nokia's Googleshare within the mobile space.
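Both calculations are simple ratios of result counts. Here's a small Python sketch, assuming you already have the counts; the numbers below are invented, not real Google result counts, and the function names are mine.

```python
def gdistance(count_w1, count_w2, count_w1_and_w2, count_w2_and_w1):
    """Gdistance(w1,w2) = (#(w1)+#(w2)) / (#("w1 and w2")+#("w2 and w1"))."""
    joint = count_w1_and_w2 + count_w2_and_w1
    if joint == 0:
        # The two names are never mentioned together: treat them as infinitely far.
        return float("inf")
    return (count_w1 + count_w2) / joint


def googleshare(count_name_in_domain, count_domain):
    """Share of a domain's results that also mention the name."""
    if count_domain == 0:
        raise ValueError("the domain query returned no results")
    return count_name_in_domain / count_domain


# Invented counts, for illustration only.
print(gdistance(1000, 3000, 150, 50))   # (1000 + 3000) / (150 + 50) = 20.0
print(googleshare(250, 1000))           # 250 / 1000 = 0.25
```

A smaller Gdistance means the two cities are mentioned together more often relative to how often they are mentioned at all.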

April 23, 2008

Kai-Fu Lee on Cloud Computing


John Breslin highlights some interesting ideas from Kai-Fu Lee's keynote about cloud computing presented at the 17th International World Wide Web Conference (Kai-Fu Lee has been the president of Google China since July 2005). He mentions six properties of cloud computing from Google's perspective:

1. User centric. "If data is all stored in the Cloud - images, messages, whatever - once you're connected to the Cloud, any new PC or mobile device that can access your data becomes yours. Not only is the data yours, but you can share it with others."

2. Task centric. "The applications of the past - spreadsheets, e-mail, calendar - are becoming modules, and can be composed and laid out in a task-specific manner. (...) Google considers communication to be a task" and that's the reason why Gmail integrates a chat feature for instant communication.

3. Powerful. "Having lots of computers in the Cloud means that it can do things that your PC cannot do. For example, Google Search is faster than searching in Windows or Outlook or Word" because a Google query hits at least 1000 machines.

4. Accessible. Having your data in the cloud means you can instantly get more information from different repositories - Google's universal search is one example of simultaneous search. "Traditional web page search does IR / TF-IDF / page rank stuff pretty well on the Web at large, but if you want to do a specific type of search, for restaurants, images, etc., web search isn't necessarily the best option. It's difficult for most people to get to the right vertical search page in the first place, since they usually can't remember where to go. Universal search is basically a single search that will access all of these vertical searches."

5. Intelligent. "Data mining and massive data analysis are required to give some intelligence to the masses of data available (massive data storage + massive data analysis = Google Intelligence)."

6. Programmable. "For fault tolerance, Google uses GFS or distributed disk storage. Every piece of data is replicated three times. If one machine dies, a master redistributes the data to a new server. There are around 200 clusters (some with over 5 PB of disk space on 500 machines). The Big Table is used for distributed memory. The largest cells in the Big Table are 700 TB, spread over 2000 machines. MapReduce is the solution for new programming paradigms. It cuts a trillion records into a thousand parts on a thousand machines. Each machine will then load a billion records and will run the same program over these records, and then the results are recombined. While in 2005, there were some 72,000 jobs being run on MapReduce, in 2007, there were two million jobs (use seems to be increasing exponentially)." This recent video has more information about Google's infrastructure.
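The MapReduce idea from the last point (cut the records into parts, run the same program over each part, recombine the results) can be sketched on a single machine in a few lines of Python. This is only a toy word count to show the shape of the computation; it is not Google's implementation, and every name in it is made up.

```python
from collections import Counter
from functools import reduce


def split_into_parts(records, num_parts):
    """Cut the record list into num_parts roughly equal slices."""
    return [records[i::num_parts] for i in range(num_parts)]


def map_part(part):
    """The 'same program' run over each slice: count the words in its records."""
    counts = Counter()
    for record in part:
        counts.update(record.split())
    return counts


def word_count(records, num_parts=4):
    parts = split_into_parts(records, num_parts)
    # On a real cluster, each slice would be processed on its own machine.
    partials = [map_part(part) for part in parts]
    # Recombine the partial results into one answer.
    return reduce(lambda left, right: left + right, partials, Counter())


print(word_count(["a b", "b c", "a a"], num_parts=2))
```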

Kai-Fu Lee thinks that outsourcing IT to a "trusted shop" like Google is the key to making computers simple and safe to use. "Entrepreneurs should have new opportunities with this paradigm shift, being freed from monopoly-dominated markets as more cloud-based companies evolve that are powered by open technologies."

There's a shift from the computer to the user, from applications to tasks, from isolated data to data that can be accessed anywhere and shared with anyone.

"Cloud computing liberates the user from having to remember where the data is, enables the user to access information anywhere once created, and makes services fast and powerful through essentially infinite information and computing. People are using cloud services to find, share, create, and organize information. People are also using cloud services to shop, bank, communicate, socialize. By using cloud computing, these capabilities will be accessible not only on PCs but also telephones, automobiles, televisions, and appliances. (...) Google is committed to help bring about the era of cloud computing, which we believe will facilitate services that are convenient, easy-to-learn, people-centric, scalable, and device-ready," mentions Kai-Fu Lee in the abstract.

April 22, 2008

Google Search REST API

More than one year after Google discontinued the SOAP Search API, it finally got a proper replacement. The AJAX Search API can now be used from any Web application, not just in JavaScript. The other two Google AJAX APIs for feeds and translations were updated for non-AJAX use, as well.

"For Flash developers, and those developers that have a need to access the AJAX Search API from other Non-Javascript environments, the API exposes a simple RESTful interface. In all cases, the method supported is GET and the response format is a JSON encoded result set with embedded status codes."

"Using the APIs from your Flash or Server Side framework couldn't be simpler. If you know how to make an http request, and how to process a JSON response, you are in business," says Mark Lucovsky. Here's a simple example for web search:
http://ajax.googleapis.com/ajax/services/search/web?v=1.0&q=Earth%20Day
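As a rough illustration of "make an http request and process a JSON response", here's a Python sketch that builds the example URL and reads a result set. The field names in the sample body (responseData, responseStatus, titleNoFormatting, url) are assumptions for this example, and the body itself is invented; check the API documentation for the real schema.

```python
import json
from urllib.parse import quote, urlencode

SEARCH_ENDPOINT = "http://ajax.googleapis.com/ajax/services/search/web"


def build_search_url(query, version="1.0"):
    """Build the GET URL; quote_via=quote encodes spaces as %20, as in the example."""
    return SEARCH_ENDPOINT + "?" + urlencode({"v": version, "q": query}, quote_via=quote)


def extract_results(body):
    """Return (title, url) pairs from a JSON body, checking the embedded status code."""
    payload = json.loads(body)
    if payload.get("responseStatus") != 200:
        raise RuntimeError("search failed with status %r" % payload.get("responseStatus"))
    return [(r.get("titleNoFormatting"), r.get("url"))
            for r in payload["responseData"]["results"]]


# Invented response body; the field names are assumptions, not taken from the post.
sample_body = json.dumps({
    "responseData": {"results": [
        {"titleNoFormatting": "Earth Day", "url": "http://example.com/earth-day"},
    ]},
    "responseStatus": 200,
})

print(build_search_url("Earth Day"))
print(extract_results(sample_body))
```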

There are some differences between the old SOAP API and the REST one.

PROs:
- the new API doesn't require a key
- there's no limitation for the number of queries
- it's much easier to use
- you can use the REST API for web search, but also for image search, news search, video search, local search, blog search and book search.

CONs:
- you need to send "a valid and accurate http referer header"
- you can only get up to 8 results in a single call and you can't go beyond the first 32 results
- the terms of use are pretty restrictive: for example, you need to attribute the results to Google and you are not allowed to change the order of search results.
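The first two limitations are easy to picture in code. This Python sketch computes the result offsets a client would have to request (8 results per call, nothing past the first 32) and attaches a Referer header to each request. How the offset is passed to the API isn't covered here, and the referer URL is a placeholder.

```python
import urllib.request

RESULTS_PER_CALL = 8   # maximum results in a single call
MAX_RESULTS = 32       # the API doesn't go beyond the first 32 results


def page_offsets(wanted):
    """Return the starting offsets needed to collect up to `wanted` results."""
    capped = min(max(wanted, 0), MAX_RESULTS)
    return list(range(0, capped, RESULTS_PER_CALL))


def make_request(url, referer):
    """Build (but don't send) a request carrying the required Referer header."""
    return urllib.request.Request(url, headers={"Referer": referer})


print(page_offsets(20))    # three calls cover 20 results
print(page_offsets(100))   # capped at four calls
```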

It's interesting to notice that Yahoo's search APIs are more developer-friendly and, although they require an application ID and have some usage limitations (5,000 queries per IP per day), they offer more features and they are more flexible, by also including XML output. Another important difference is that Yahoo doesn't require "a valid and accurate http referer header".

Philipp Lenssen suggests that it's much easier to just screenscrape the results, but search engines could change their code or block your requests.

Update. Check this excellent interview with Mark Lucovsky, who mentions that the API has been available for almost two years, but it wasn't officially documented.

So When Do We Get Folders in Gmail?

- OK, I followed your advice and switched to Gmail. It's great, but when do we get folders in Gmail?

- Gmail already has something similar to folders: labels. The main difference between folders and labels is that you can add more labels to a message.

- Oh, I see, but I don't think it's very useful to add a message to more folders. I mean, labels.

- You could create labels to categorize your mail and some of the messages will certainly fit in more than one category.

- It doesn't work. I created a label for "Invitations" and I added the label to one of my messages, but it's still there in the inbox.

- That's because "inbox" is also a label and adding another label doesn't remove the other ones.

- So now I have to click on "Delete" to remove the message from my inbox, right?

- No, to remove a message from the inbox, click on "Archive".

- I thought "Archive" compresses my messages to save space.

- I'm sure that Google stores your email efficiently.

- Thanks for your help. Now I know how to use folders in Gmail. I select the message, click on "Archive" and then... Hey, wait a minute! My message has disappeared!

- You can still find it in "All Mail", one of the sections below Gmail's logo. "All Mail" includes all the messages, except those from the trash or flagged as spam.

- That's too complicated! So when do we get folders in Gmail?

April 21, 2008

Google's New Social Network: iGoogle


The Google-owned social network orkut, while extremely popular in Brazil and India, has failed to find similar success in the US. With the launch of OpenSocial, an API for writing social gadgets, it was clear that iGoogle would play an important role in Google's second attempt to socialize. After all, OpenSocial applications are iGoogle gadgets with a social component.

Following orkut's model, iGoogle opened a sandbox for developers who write OpenSocial gadgets. The sandbox is probably a test for the next iteration of iGoogle: the personalized homepage turned into a social network. "The integration of OpenSocial with gadgets gives you an opportunity to enhance your content for users by incorporating social features. For example, a books gadget could display what a user's friends are reading, allow users to request to borrow books from friends' libraries, and show users books that their friends recently rated. As users share content with their friends, your gadget will naturally build a broad audience for distributing content and driving traffic," explains the new developer site for iGoogle gadgets.

iGoogle has tens of millions of users, 50% of the users are from the US and it was one of the fastest growing Google products in 2006 and 2007. It's also the homepage for many Google users who want to personalize their experience by adding a theme and fresh information from the web. The new social component will not affect all the gadgets, so you'll still have gadgets for mail, weather or news, but some of the gadgets could share information with your friends. There's also a new canvas view that will show an expanded version of the gadgets, an integration with Google profiles and a newsfeed that shows your friends' recent activities.

Hopefully, the social component of iGoogle won't be too prominently promoted and people will be able to continue using the personalized homepage without dealing with friend invitations and viral gadgets. iGoogle will try to be the social connection between Google services, but this is a difficult mission for Google, a company that has never managed to build a successful social site.

Related:
iGoogle Gadget Maker
Google intends to integrate its social applications
Google to open up its social platform

April 20, 2008

Recent Searches To Influence Google's Results

Danny Sullivan reported earlier this month that Google will start to personalize search results based on the previous query. "For example, if someone were to search for [spain] and then [travel] after that, BOTH the ads and the organic results will be altered to take the previous query into account. To some degree, it will be as if the second query was for [spain travel]." According to some code from Google's sites, Google will use not just the previous query, but a list of recent queries.


Until now, Google personalized the results based on the search history only for users who were logged in and had enabled the Web History service. Google created a profile from your search history and used it to disambiguate your queries and slightly alter the rankings for pages that were likely to match your interests.

The new signal for personalizing results (recent searches) should work without having to log in and could influence the results in a different way. In many cases, people constantly refine their queries by adding or removing keywords, but Google and other search engines don't use all these refinements to improve the results in real time. By connecting the related searches from a session, Google will understand more about what you intend to find and should deliver better results.

While search history disambiguates general queries, the list of recent searches connects the failed attempts to find an answer for a complex query and creates a more detailed description of your intentions. Search history could be Google's long-term memory and the recent queries could build the working memory.
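To make the "working memory" idea concrete, here's a toy Python sketch that keeps a short list of recent queries and uses them to expand the next one, so that [travel] typed after [spain] is treated like [spain travel]. It's only an illustration of the concept; nothing is known from the post about how Google actually combines the queries, and the class is hypothetical.

```python
from collections import deque


class SessionContext:
    """Keeps the last few queries of a session (a toy 'working memory')."""

    def __init__(self, max_recent=3):
        self.recent = deque(maxlen=max_recent)

    def expand(self, query):
        """Prefix the query with terms from recent queries it doesn't already contain."""
        terms = query.split()
        context = []
        for past in self.recent:
            for term in past.split():
                if term not in terms and term not in context:
                    context.append(term)
        self.recent.append(query)
        return " ".join(context + terms)


session = SessionContext()
print(session.expand("spain"))    # first query, nothing to add
print(session.expand("travel"))   # expanded with the previous query
```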

Google search patterns (from "Searching for the mind of the searcher" [PDF], by Daniel Russell)

Google Phishing Warning

After flagging search results that distribute malware, Google will also show warnings for web pages used for phishing. Most of these pages are active for one or two days before they are taken down by hosting providers, but some of them could be indexed by search engines. While the latest versions of Internet Explorer, Firefox and Opera have anti-phishing protection, a new security layer is still useful.

"Warning - phishing (web forgery) suspected. The site you are trying to visit has been identified as a forgery, intended to trick you into disclosing financial, personal or other sensitive information," mentions the page displayed by Google instead of the search result.


Google also has a Safe Browsing API "that enables client applications to check URLs against Google's constantly updated blacklists of suspected phishing and malware pages." The API is used by Firefox and Google Desktop.

Search for Mapped Web Pages in Google Maps

Google Maps added the map view available at Google Experimental Search. Google extracts the most important locations from web pages and lets you see the search results on a map. To restrict your search to web pages, you need to click on "Show search options" and select "Mapped web pages" from the new drop-down. Google displays the most relevant web pages that include locations from your current map view, but you can change the location in your query using the operators near or in: for example, [Beethoven near Germany] or [Beethoven in Europe].

This is an entirely new way to search the web by changing the focus from general information to geographical information. You could use it to search for people, companies, organizations, events, traditional food or anything that could be connected to a location.

Web pages include a lot of useful information that isn't properly used by search engines: addresses, phone numbers, dates, opinions, characteristics, quotes, examples. All of these could be used to create connections between people and some important dates, between products and people's opinions about them, between concepts and examples. Web search engines could answer complex queries like "the general opinion about iPhone in the first week after its launch" by using the information available on the web and cleverly extracting attributes and connections.

April 18, 2008

Google WHOIS OneBox


Google shows a special OneBox when you search for "whois", followed by a domain name: for example, [whois google.com]. The OneBox shows the date when a certain domain was created and the date when it will expire. It seems that the only provider of information for this OneBox is Domain Tools.

Google launched a similar feature four years ago, but it was removed really fast because it scraped data from Network Solutions without permission. "Google quietly launched a service allowing visitors to look up data on domain name owners from public databases - collectively known as Whois - run by registrars worldwide. Although largely unpromoted, the service generated enough traffic to surpass Network Solutions' (NSI's) daily Whois use limits, which aim to stop spammers and other undesirables from harvesting information about its customers."

This is not the only direct answer displayed at the top of Google's search results: there are many OneBoxes that show maps, stock quotes, weather information, local time, books, definitions or facts.

{ via Matt Cutts }

Update: after 3 or 4 page views, DomainTools shows this message: "You have reached your daily lookup limit as an guest user. Please login or register". Maybe Google should partner with companies that have fewer restrictions.

Yet Another Google Video Redesign

Google Video redesigned its homepage and now only includes a list of "hot" videos. "The Google Video home page allows you to browse and play hot videos directly from the home page, making it easier for you to discover popular, interesting videos from across the web. The hot video list is compiled by looking at a variety of signals including videos that most shared, viewed and blogged about."

The video watch page has also been redesigned and it's more flexible: you can hide the right sidebar, minimize the list of related videos and use pagination to read a long list of comments.


Videos from third-party online video sites are still displayed in an annoying frame, but Google Video's bar has been moved to the left of the page to leave more room for the videos.


The search page has two new options for displaying results: a grid and a TV view that lets you watch videos while still seeing the list of results. An interesting new option in the advanced search page lets you search only closed captioned videos. Another good thing is that you can now watch videos inline for some new video sites: DailyMotion, Revver, Guba, Crackle, not just for YouTube and Google Video.


The updates make Google Video more user-friendly and easier to use, even if mixing a video search engine with a video hosting site makes the user experience confusing.

Watch Restricted YouTube Videos

I've noticed that an increasing number of YouTube videos are restricted to a limited number of countries, probably because the company that uploaded them doesn't have global distribution rights or because it wants to use different marketing strategies in other countries. Even if YouTube says that "this video is not available in your country", you can actually see it using a very simple trick: replace http://www.youtube.com/watch?v=VIDEOID with http://www.youtube.com/v/VIDEOID (VIDEOID is the 11-character video identifier).


Example of a video blocked outside the US (Madonna - 4 Minutes): http://www.youtube.com/watch?v=I9ciR9qR1dU
To see it, paste this in the address bar: http://www.youtube.com/v/I9ciR9qR1dU

Update (one day later): YouTube fixed the player and you can no longer bypass the country restrictions using this trick. Youtubegir has a nice proxy for YouTube videos.

The same trick works if you don't want to log in when you get this message: "This video or group may contain content that is inappropriate for some users, as flagged by YouTube's user community. To view this video or group, please verify you are 18 or older by logging in or signing up."


http://www.youtube.com/v/VIDEOID sends you to the player used by YouTube when you embed the video into a web page and this player doesn't perform country verifications and can't detect if you're logged in.
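The URL rewrite itself is just string manipulation. Here's a small Python sketch that turns a watch URL into the embedded-player URL; the function name is mine, and as the update above notes, the rewritten URL no longer gets around the country check.

```python
from urllib.parse import parse_qs, urlparse


def watch_to_embed_url(watch_url):
    """Turn .../watch?v=VIDEOID into .../v/VIDEOID."""
    parsed = urlparse(watch_url)
    video_ids = parse_qs(parsed.query).get("v")
    if parsed.path != "/watch" or not video_ids:
        raise ValueError("not a YouTube watch URL: " + watch_url)
    return "{}://{}/v/{}".format(parsed.scheme, parsed.netloc, video_ids[0])


print(watch_to_embed_url("http://www.youtube.com/watch?v=I9ciR9qR1dU"))
```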

April 17, 2008

Finding the Right Signals to Rank Search Results

Udi Manber, VP for search quality at Google, talks in an interesting interview about search, personalization and the influence of social networks on finding the information you need.
Search has always been about people. It's not an abstract thing. It's not a formula. It's about getting people what they need. The art of ranking is one of taking lots of signals and putting them together. Signals from your friends are better signals, stronger signals. On the other hand, many searches are long-tail kinds of searches. If you're looking for what movies to see tonight, your friend can probably give you the best information. If you're looking for the address of the business, the Web as a whole can give you better information. If you're looking for something obscure about anything, again the web can give you much better information. It depends on the type of search you do and how to take all those signals and put them together.

The ranking algorithm is a recipe that takes into account many factors: an absolute value for the importance of a page, the relevance of the web page to your query, the location connected to the web page, the recency, the percentage of people with similar interests that found the web page helpful, the density of information, etc. It's important to find the right signals, their importance and the contexts where you should apply them, but Google relies on engineers to adjust the recipe. At some point, Google will have to come up with an algorithm that automatically identifies potential signals.
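One very simplified way to picture such a recipe is a weighted sum of signals, where "adjusting the recipe" means changing the weights by hand. The Python sketch below is only that picture: the signal names, the weights and the pages are all invented, and Google's real ranking is certainly not this simple.

```python
def score(signals, weights):
    """Weighted sum of signal values; a signal missing from a page counts as 0."""
    return sum(weight * signals.get(name, 0.0) for name, weight in weights.items())


def rank(pages, weights):
    """Order pages from the highest score to the lowest."""
    return sorted(pages, key=lambda page: score(page["signals"], weights), reverse=True)


# Invented weights and pages, for illustration only.
weights = {"importance": 1.0, "relevance": 2.0}
pages = [
    {"name": "A", "signals": {"importance": 0.9, "relevance": 0.2}},
    {"name": "B", "signals": {"importance": 0.4, "relevance": 0.9}},
]

print([page["name"] for page in rank(pages, weights)])
```

Doubling the weight of "importance" instead of "relevance" would flip the order, which is the kind of manual tweak the post is talking about.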

"I found this surprising. Google manually comes up with tweaks to its search engine that only apply to a small percentage of queries, tests the tweaks, and then tosses them into the relevance rank? (...) Frankly, I thought Google was beyond this. Rather than piling hack upon hack, I thought Google's relevance rank was a giant, self-optimizing system, constantly learning and testing to determine automatically what works best," wrote Greg Linden in "The perils of tweaking Google by hand".

{ via Slashdot }

Google Maps Predicts Traffic Conditions

Google Maps can now predict traffic information for any day of the week and time of the day, based on past conditions. By default, if you click on the Traffic button in a supported area in the US, Google Maps shows real-time traffic information. "Comprehensive traffic data is available in over 30 major US metropolitan areas (including Los Angeles, New York, Chicago, and others), with partial coverage available in many more," according to Google Maps help center. There's also a traffic layer in Google Earth and Google Maps Mobile, but these applications don't yet include traffic prediction.
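A prediction "based on past conditions" could be as simple as averaging what was observed in the same weekday and hour slot. The Python sketch below does exactly that, with invented speed readings; Google hasn't said how its prediction works, so treat this as one possible reading of the idea.

```python
from collections import defaultdict
from statistics import mean


def build_profile(observations):
    """Average the past speeds observed in each (weekday, hour) slot."""
    buckets = defaultdict(list)
    for weekday, hour, speed in observations:
        buckets[(weekday, hour)].append(speed)
    return {slot: mean(speeds) for slot, speeds in buckets.items()}


def predict(profile, weekday, hour):
    """Return the typical speed for a slot, or None if there's no history for it."""
    return profile.get((weekday, hour))


# Invented readings for one road segment: (weekday, hour, speed in mph).
history = [("Mon", 8, 20), ("Mon", 8, 30), ("Mon", 14, 55), ("Sat", 8, 60)]
profile = build_profile(history)

print(predict(profile, "Mon", 8))
print(predict(profile, "Sun", 8))
```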


{ Thanks, Michael. }

Subscribe to Authenticated Feeds in Google Reader

Google Reader is one of the many online feed readers that don't support authenticated feeds. This special kind of feed requires a username and a password before displaying the content to protect sensitive information. An example of an authenticated feed is Gmail's feed for unread messages, but you'll also find password-protected feeds for internal bug reports, private email distribution lists, etc.
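For feeds protected with HTTP Basic authentication (an assumption; the post doesn't say which scheme a given feed uses), a client has to send the credentials with every request. Here's a Python sketch that builds such a request without sending it; the feed URL and credentials are placeholders.

```python
import base64
import urllib.request


def authenticated_feed_request(feed_url, username, password):
    """Build a request that carries HTTP Basic credentials in its header."""
    raw = "{}:{}".format(username, password).encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return urllib.request.Request(feed_url, headers={"Authorization": "Basic " + token})


# Placeholder URL and credentials, for illustration only.
request = authenticated_feed_request("https://example.com/private/feed.xml", "user", "secret")
print(request.get_header("Authorization"))
```

Base64 is an encoding, not encryption, which is one reason these credentials shouldn't be handed to a third party lightly.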

FreeMyFeed wants to solve this problem by creating feeds that don't require authentication. The site acts as a proxy between the original feed and your feed reader, while promising that your credentials are safe. "Usernames and passwords are never stored on the server. The usernames and passwords are only parsed to retrieve your RSS feed and then are discarded." Rob Wilkerson explains that the credentials "are encrypted using a rotating algorithm and included in the new URI."


It's not a good idea to enter your username and password in any other place than the site where you created them, but FreeMyFeed could be useful for feeds that are not tied to important accounts. Make sure you don't share any item from the generated feeds.

Google News Quote Finder

Google News can now detect quotes in news articles and attribute them to their authors. If you search for people like Eric Schmidt, Pope Benedict XVI, President Bush, Angela Merkel or Fernando Alonso, Google News will display a relevant quote and a link to other quotes from recent news articles.


You can search inside the quotes, sort them by date, restrict them to the last day or the last week. Unfortunately the links to quote listings aren't very descriptive (here's one example) and there's no option to find the quotes related to a news cluster. By default, Google sorts the quotes by relevance and gives more weight to the quotes that are used often.


{ via Google News Blog }

April 16, 2008

YouTube Search Enhancements

YouTube's search algorithms are getting smarter and borrow a lot of things from Google search: advanced operators, spelling corrections, related searches, query expansions. YouTube detects duplicate videos and shows the most popular copy in search results, followed by a link to the other videos. There's also an enhancement for videos that are split into two or more parts: YouTube displays a list of links to all of the episodes.

An Outdoor Campaign for Google Video

This very interesting outdoor campaign for Google Video Germany used a billboard imitating a real-life video player that captures life as it happens. The tagline is "any film you can imagine", a simple message that encourages people to search for videos and to upload their own videos. AdFreak has an interesting explanation for the unusual idea: the see-through billboards suggest "that online video presents life in all its glorious randomness".

It's not very clear if the campaign was launched before or after the YouTube acquisition, but the fact that the video embedded below is from YouTube tells a lot about Google Video's success.

April 15, 2008

Google Earth 4.3 Adds New Navigation and Street View

The latest version of Google Earth brings a lot of interface changes and new features. There's a redesigned and improved navigation control that lets you change the perspective much faster. Here's the description from Google Earth's help center:

"1. Click the north up button to reset the view so that north is at the top of the screen. Click and drag the ring to rotate your view.
2. Use the Look joystick to look around from a single vantage point, as if you were turning your head. Click an arrow to look in that direction or continue to press down on the mouse button to change your view. After clicking an arrow, move the mouse around on the joystick to change the direction of motion.
3. Use the Move joystick to move your position from one place to another. Click an arrow to look in that direction or continue to press down on the mouse button to change your view. After clicking an arrow, move the mouse around on the joystick to change the direction of motion.
4. Use the zoom slider to zoom in or out (+ to zoom in, - to zoom out) or click the icons at the end of the slider. As you move closer to the ground, Google Earth swoops (tilts) to change your viewing angle to be parallel to the Earth's surface. You can turn off this automatic tilt (Tools > Options > Navigation > Navigation controls; Mac: Google Earth > Preferences > Navigation > Navigation controls)."

You can now display the sun by enabling View > Sun or clicking on the sun button from the toolbar. To create time-lapse views of sunsets and sunrises, click on the "play" button and watch the changes.

For some of the imagery, you can see at the bottom of the window an approximation of the date when it was taken. The Street View images from Google Maps are now available in a new Google Earth layer, which is not enabled by default.

Google Earth includes many more models in the 3D buildings layer for cities like San Francisco, Boston, Orlando, Munich and Zurich. "Google has optimized the loading and performance of 3D buildings. When you first turn on the 3D Buildings layer near a city with models, you'll see simplistic versions of the buildings load up really fast, then they gradually get more solid and load more texture detail," explains the unofficial Google Earth Blog.


Google Earth 4.3 can be downloaded from earth.google.com. Windows users that don't want to install the application using Google Updater can try this direct link. You'll probably notice that the Windows setup is much smaller: the size has been reduced from 12.7 MB to 7.36 MB. Unfortunately, the new version seems to be less stable and it uses more resources, but it's still in beta.

For Google, Online Video = YouTube

Whenever a Google product adds a feature related to video, YouTube comes into play. Google Talk's gadget lets you watch YouTube videos, orkut lets you add videos from YouTube and Google Video, personalized maps can include videos from the same two Google-owned services, content producers that want to add their videos to Google News need to host them at YouTube and now local business owners can add videos to their Google Maps listings, but only if they are hosted at YouTube.

"In addition to using Google Maps to get local business details, read reviews, and check out photos, I can now also get a sneak peek with embedded videos. Local business owners can easily add YouTube videos along with other content such as business details, photos, and descriptions to their listings. To do so, simply upload your videos to YouTube and ensure that the 'embed' option is turned on," explains Google LatLong Blog.


Online video is more than YouTube and Google Video, but Google seems to ignore this. Even if YouTube's US market share is 73.18% (according to Hitwise), it's unreasonable to think that YouTube should aggregate all the videos that are available online. Google should encourage diversity and choices, instead of selecting the most convenient options.

Related:
Promoting your own services in search results: Google/YouTube vs Yahoo/Flickr

Google Updater, the New Installer for Google Software

Last year, I posted that Google intends to install all its applications through Google Updater, the central component of Google Pack. At that time, a small number of people were redirected to the integrated installer, but this behavior has now become a standard practice.

Because some of the files from Google Earth were corrupted, I had to uninstall it. When I went to Google Earth's download page, Google informed me that I have to install Google Earth with Google Updater.


Google Pack's help center gives some reasons why it's convenient to use the Updater, but most of them help Google promote other software. "The Google Updater makes the software installation process more convenient in several ways. First, it installs software easily with just a few clicks. Also, once the Google Updater is installed, you can choose to have a system tray icon notify you when new software becomes available. Finally, the Google Updater provides you with a central place from which you can download more Google software, as well as other software we think you'll enjoy." (my emphasis)

Probably the only reason why I use my computer is to install Google software and this updater finally helps me get things done. If I want to install Google Earth, it's obvious that I should be informed if Google launches other applications and I should be able to install them with a single click. Hopefully, in the next iterations of the Updater, the click will be eliminated and the new software will be installed automatically after analyzing my interests.


I installed Google Earth using the updater and the setup was launched in the background, with the default settings. Google Updater is installed as a system service that starts automatically, places an icon in the system tray and constantly pings Google to see if there are any updates for the Google software installed on your computer. By default, the application installs the updates automatically and can be uninstalled.


Google still offers the chance to install applications without the updater, but the page that points to the direct links is too difficult to find and has an inappropriate title. I'll repost the links here, for convenience.

Google Earth for Windows:
http://earth.google.com/tour/thanks-win4.html

Google Desktop for Windows:
http://desktop.google.com/index.html?rd=f

Google Toolbar 4 for IE:
http://toolbar.google.com/service/tbdl?hl=en&tbdata=T4

This practice is not Windows-only. Google's Mac software is installed only with the updater. "Google Updater is the installer for Google products on the Mac. You can use Google Updater to see which Google software you have installed and to see other Google applications you might be interested in. Google Updater helps keep your software up-to-date by installing updates when they become available. And you can use Google Updater to uninstall Google Software." Probably the most outrageous part from the Mac FAQ is the answer to the question: how do I uninstall Google Updater? "To uninstall Google Updater, you first have to uninstall other Google software on your computer. You can't uninstall Google Updater while you have Google software on your computer because we need it there to keep your software up-to-date."

Maybe Google should focus less on "we" and more on "you". Most Google software already has an option to auto-update and this could be easily added to the applications that don't have it. If the installers are too confusing, Google could simplify them and remove the unnecessary steps. I don't want to imagine what would happen if each application installed a system service for auto-update and used your network connection to constantly check for new updates.

Update: Apparently, I was lucky to install Google Earth in Firefox. If you use Internet Explorer, Google adds the options to install Google Toolbar and to set Google as the default search engine. Both options are enabled by default, so a standard Google Earth installation bundles Google Updater and Google Toolbar, and changes your default search engine in Internet Explorer. This is way too much.

April 13, 2008

Google Starts to Index the Invisible Web


The Google Webmaster Central Blog recently announced that Google has started to index web pages hidden behind web forms. "In the past few months we have been exploring some HTML forms to try to discover new web pages and URLs that we otherwise couldn't find and index for users who search on Google. Specifically, when we encounter a <FORM> element on a high-quality site, we might choose to do a small number of queries using the form. For text boxes, our computers automatically choose words from the site that has the form; for select menus, check boxes, and radio buttons on the form, we choose from among the values of the HTML. Having chosen the values for each input, we generate and then try to crawl URLs that correspond to a possible query a user may have made. If we ascertain that the web page resulting from our query is valid, interesting, and includes content not in our index, we may include it in our index much as we would include any other web page." For now, only a small number of websites will be affected by this change and Google will only fill forms that use GET to submit data and don't require personal information.
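
To make the idea more concrete, here is a rough Python sketch of how a crawler could turn a GET form into a handful of candidate URLs. It's only an illustration of the steps described in the quote, not Google's actual code: the names FormParser and candidate_urls, the example.com addresses and the limit of 10 URLs are all made up for this example, and it assumes a page with a single form.

```python
from html.parser import HTMLParser
from itertools import islice, product
from urllib.parse import urlencode, urljoin


class FormParser(HTMLParser):
    """Collects the fields of a form (assumption: one form per page)."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.method = None
        self.text_fields = []   # names of free-text inputs
        self.choices = {}       # select name -> list of option values
        self._select = None     # name of the select currently being parsed

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "form" and self.action is None:
            self.action = a.get("action") or ""
            self.method = (a.get("method") or "get").lower()
        elif tag == "input" and a.get("type", "text") == "text" and a.get("name"):
            self.text_fields.append(a["name"])
        elif tag == "select" and a.get("name"):
            self._select = a["name"]
            self.choices[self._select] = []
        elif tag == "option" and self._select and a.get("value"):
            self.choices[self._select].append(a["value"])

    def handle_endtag(self, tag):
        if tag == "select":
            self._select = None


def candidate_urls(page_url, html, site_words, limit=10):
    """Return up to `limit` URLs that a user could have reached via the form."""
    parser = FormParser()
    parser.feed(html)
    # Only GET forms are considered, as in the announcement.
    if parser.action is None or parser.method != "get":
        return []
    # Text boxes get words taken from the site; selects get their own values.
    fields = {name: list(site_words) for name in parser.text_fields}
    fields.update(parser.choices)
    names = list(fields)
    if not names:
        return []
    base = urljoin(page_url, parser.action)
    combos = product(*(fields[name] for name in names))
    return [base + "?" + urlencode(dict(zip(names, combo)))
            for combo in islice(combos, limit)]
```

Each generated URL would then be fetched and checked to see whether the resulting page is valid and brings something new to the index. That part is left out of the sketch.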

Many web pages are difficult to find because they're not indexed by search engines and they're only available if you know where to search and what to use as a query. All these web pages make up the Invisible Web, which was estimated to include 550 billion documents in 2001. "Traditional search engines create their indices by spidering or crawling surface Web pages. To be discovered, the page must be static and linked to other pages. Traditional search engines can not see or retrieve content in the deep Web -- those pages do not exist until they are created dynamically as the result of a specific search."

Anand Rajaraman found that the new feature is related to a low-profile Google acquisition from 2005:
"Between 1995 and 2005, Web search had become the dominant mechanism for finding information. Search engines, however, had a blind spot: the data behind HTML forms. (...) The key problems in indexing the Invisible Web are:

1. Determining which web forms are worth penetrating.
2. If we decide to crawl behind a form, how do we fill in values in the form to get at the data behind it? In the case of fields with checkboxes, radiobuttons, and drop-down menus, the solution is fairly straightforward. In the case of free-text inputs, the problem is quite challenging - we need to understand the semantics of the input box to guess possible valid inputs.

Transformic's technology addressed both problems (1) and (2). It was always clear to us that Google would be a great home for Transformic, and in 2005 Google acquired Transformic. (...) The Transformic team have been working hard for the past two years perfecting the technology and integrating it into the Google crawler."

It's not clear which high-quality sites Google uses for the new feature, but this list includes some good options. Along with Google Book Search, Google Scholar and Google News Archive, this is yet another way to bring valuable information to light.