An unofficial blog that watches Google's attempts to move your operating system online since 2005. Not affiliated with Google.

Send your tips to gostips@gmail.com.

April 30, 2007

April 2007 Recap: Towards a More Personalized Google

In April, Google launched a Mac version of Google Desktop and a directory assistance service. Google also made its biggest acquisition: an advertising company called DoubleClick that will help Google increase its footprint in display advertising.

Google also bought Tonic Systems, a small company focused on creating PowerPoint tools in Java, and announced the addition of a presentation web application to Google Docs. Google Spreadsheets finally added charts, although you can't import or export files that include charts yet, and Google Docs is closer to a wiki system.

The concept of personalization was pushed further by expanding the search history to all the visited web pages and by showing more recommendations as a benefit of the privacy trade-off. Google Personalized Homepage became iGoogle and added new features that could transform it into a social network.

Google Search started to include results from Google News, Open Office documents get the same treatment as Microsoft Office files, and there are new experimental layouts that might replace the current interface of Google.

Read more:
All the posts from April

iGoogle Gadget Maker

As reported by Google Blogoscoped, Google Personalized Homepage will be rebranded as iGoogle and will let you build your own gadgets using wizards. The gadgets are very simple and are more like containers for things that matter to you: photos, videos, events.

"Once the gadget is created, you can invite other people to view & use the gadget, and make it publicly available for other people to view & use it." So whatever you choose to add to a gadget will be visible to the people you invited.

The wizards let you enter the settings for seven new gadget templates:

1. Photo album - add up to 7 photos that can be rotated.

2. GoogleGram - enter seven greeting messages.

3. Daily Me - type what you are doing.

4. Countdown - count the days until a special event.

5. Simple list - you can use it as a to-do list or a shopping list.

6. YouTube videos - up to 10 videos.

7. Freeform gadget - add an image and some text.

The gadgets are a way of staying in touch with your friends: if one of your Gmail contacts creates gadgets, you'll be able to see them in a new section of iGoogle called "My Community".

Strange Suggestions in Google Blog Search

Google's blog search engine has a bug that shows weird suggestions for your query on the seventh page of results. While you can't see the weird "did you mean" for every query, I got the message for two different queries.

One of the suggestions was:
"Elena Dementieva" (naked| scandal| divorce| separated| cancel| lawsuit| police| injured)

As you probably know, | is an alternative way of entering the boolean operator OR, which returns all the results that contain at least one of the keywords separated by the operator. But this suggestion is most likely a bug because it is completely unrelated to the query and very few people use advanced operators.

Visualize AIM Conversations in Google Earth

AIM, AOL's instant messenger, is more open every day. The AIM host team (whatever that means) created two KML files that let you visualize interesting real-time information about AIM users in Google Earth.

One of the visualizations displays all the conversations started in the last minute as connections between two or more cities. The other one shows the most popular AIM users. By looking at the map, you'll notice that Germany is the only European country where AIM is really popular.

April 29, 2007

Share Big Files Fast

You'd think that all these communication tools that ask you to move your files online would make it easy to move big files too. I recently had to send an archive of around 100 MB to a group of people. If I had wanted to send the file to a single person, an instant messenger like Google Talk or Yahoo Messenger would've been a good option. I initially thought that BitTorrent would be a good idea, but it only works well for a big group that connects at the same time (though Pando is a nice client based on BitTorrent).

So file hosting had to work. Unfortunately there are many options and most of them are bad: they require registration, they have traffic limitations or they're slow. Here are some of the best file hosting solutions I've found:

mihd.net - they promise to let you upload files up to 2 GB (they lowered the limit to 200 MB temporarily), the site has a lot of text ads, but the download speed is very good (at least 100 KB/s) and there aren't hourly download limitations. The files are deleted depending on their file size: for 100 MB, they're deleted 45 days after the last download.

DivShare lets you upload files smaller than 200 MB, but they're never deleted. You also get some basic stats and previews for images, MP3s and videos.

MediaFire has a very nice interface and lets you upload files up to 100 MB without any limitation. The download speed is almost as good as for mihd.net and they even scan your files for viruses.


I ended up using mihd, although the other two sites are also very good. If you know something better that doesn't require registration or software and has very few or no limitations, I'm all ears.

Collecting Imagery for Google Earth

Mark Aubin, co-founder of Keyhole (the product known these days as Google Earth), provides some details about the process of collecting satellite imagery. Google has many data providers and the imagery is updated at least every three years.
Most people are surprised to learn that we have more than one source for our imagery. We collect it via airplane and satellite, but also just about any way you can imagine getting a camera above the Earth's surface: hot air balloons, model airplanes – even kites. The traditional aerial survey involves mounting a special gyroscopic, stabilized camera in the belly of an airplane and flying it at an elevation of between 15,000 feet and 30,000 feet, depending on the resolution of imagery you're interested in. As the plane takes a predefined route over the desired area, it forms a series of parallel lines with about 40 percent overlap between lines and 60 percent overlap in the direction of flight. This overlap of images is what provides us with enough detail to remove distortions caused by the varying shape of the Earth's surface.

The next step is processing the imagery. We scan the film using scanners capable of over 1800 DPI (dots per inch) or 14 microns. Then we take the digital imagery through a series of stages such as color balancing and warping to produce the final mosaic for the entire area.

We update the imagery as quickly as we can collect and process it, then add layers of information – things like country and state borders and the names of roads, schools, and parks - to make it more useful. This information comes from multiple sources: commercial providers, local government agencies, public domain collections, private individuals, national and even international governments. Right now, Google Earth has hundreds of terabytes of geographic data, and it's growing larger every day.
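The "1800 DPI or 14 microns" figure in the quote can be checked with a little arithmetic. This is only a sketch of the unit conversion (the function name is made up for illustration); it relies on nothing more than 1 inch being 25.4 mm.

```python
MICRONS_PER_INCH = 25_400  # 1 inch = 25.4 mm = 25,400 microns


def dpi_to_microns(dpi: float) -> float:
    """Return the width of a single scanned dot, in microns."""
    if dpi <= 0:
        raise ValueError("dpi must be positive")
    return MICRONS_PER_INCH / dpi


# A scanner working at 1800 dots per inch resolves dots of about 14 microns.
print(round(dpi_to_microns(1800), 1))  # 14.1
```

So the two numbers in the quote describe the same scanner resolution from two angles: dots per inch and the size of each dot.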

Use Google Desktop's Gadgets Outside the Sidebar


Google Desktop's sidebar is nice, but it takes up a significant part of your screen. Of course, you can disable the "always on top" option, but then you have to minimize all the applications (Windows+M) to see the sidebar.

A better option is to undock the gadgets from the sidebar and move them to the desktop. To do that, right-click on each gadget and select "Undock from sidebar" or drag them to the desktop. Then minimize the sidebar by clicking on the small button at the top of the sidebar.

You'll see a new search box in the taskbar, an option to reopen the sidebar, and a gadgets menu that lets you bring back a certain gadget or all of them.


An easier way to see all the gadgets when they're covered by other applications is to press Shift twice. Press the same combination again and they'll disappear.

April 28, 2007

Remove Software Preinstalled with New PCs

It's really hard to buy a computer preinstalled with Windows that doesn't have all kinds of bundled software. From anti-virus software like Norton AntiVirus and music players to DVD burners and links to affiliated services, these programs occupy a lot of space and memory, and you may not even be aware of their existence.

Even Google has a three-year agreement with Dell to install Google Toolbar and Google Desktop on new PCs. "Google has reportedly sealed an anticipated deal with Dell to pre-install some of its software on PCs before shipment, partially closing an advantage Microsoft has long held on users automatically defaulting to its products," Forbes wrote last year.

If you don't like to let others choose for you and want an easy way to uninstall the software that comes with your new computer, PC Decrapifier could be a solution. It's like Google Pack in reverse: uninstall everything in just a few clicks.
So, you're the proud owner of a new PC. You anxiously open the box, dumping out the contents, casting the instructions aside. You feverishly push your old PC off the desk and get the new one set up. On the floor lies a pile of plastic wrap and twist ties. Your brand spanking new PC boots up only to greet you with a plethora of pop up advertisements pestering you to pay for anti-virus software or sign up for a music service. Your desktop is littered with website links for 'special offers.' The system tray is already full of programs that continuously use your internet connection to make sure that you're 'up to date.' (...) All of this stuff is placed on your new PC because the big companies like Dell, HP and others sell advertising space on your PC to put more money in their pockets at the expense of your time and frustration.

April 27, 2007

Google News Integrates with Google Finance

Google found a way to promote the not very popular Google Finance: link to it every time it may be useful to find out more about a company. After adding a Plus Box in the search results, Google News shows the tickers for companies mentioned in news articles.

In a recent Q&A, the product manager of Google Finance said that "Google Finance is most different from others in its approach to search. We've tried to make searching for financial information as easy as possible so you can search by public or private company by name or ticker, mutual funds, etf's, even by product or management name. We do other things differently, too, but I think search is the biggest differentiator."

Google Finance seems more like an add-on for search results than a destination for financial information.

Video Ads on YouTube

Red Herring announces that in a few months YouTube will include video ads. Most likely, you'll see ads only for the premium content, and the revenue will be shared with the content owners. This program should synchronize with "Claim your content", a feature that will automatically identify copyrighted material, assuming that content owners share some information about their videos.

It's not yet clear if anyone who uploads videos to YouTube can choose to have ads and to split revenue with Google, but one thing is sure: the format of the ads. "We're looking at executions like a very quick little intro preceding a video, then the video, then a commercial execution on the backside of the content", said YouTube's Suzie Reider.

While many people will say that those who visit YouTube don't like ads and they'll move to other video sites, Google's main priority is to create a model that works well for advertisers, but doesn't disrupt the user experience.

Google Desktop's Profiles

Google Desktop added so many independent features that the application should include some profiles you can select when you install the application.

1. Desktop search
If you don't want the sidebar, gadgets and other distractions, just right-click on the system tray icon and select "Deskbar", "Floating deskbar", or "None", depending on the position of the search box.

2. Sidebar
If you only want the sidebar, but not the desktop search functionality, go to settings and check "Disable indexing of new items".

3. Application launcher
Google Desktop indexes the shortcuts from the Start Menu and the services from Control Panel. To use Google Desktop only as a searchable Start Menu, disable all the options from "Search types". You should also select "Launch programs/files by default" in the "Quick find" section. Then press Ctrl twice and enter the first letters of the program in the quick search box.

4. Web history search
Google Desktop indexes all the web pages you visit with Internet Explorer and Firefox. By itself, this is a very useful feature, so if you only want an enhanced web history, unselect everything except "web history" in the "search types" section of the preferences page.

Another way to disable indexing, the sidebar or both is to go to Start Menu and click on "Uninstall Google Desktop" (or press Ctrl+Ctrl and type "Uninstall Google Desktop"). This works only if you have the latest version (Google Desktop 5), which is now out of beta and available in 29 languages.

April 26, 2007

Ranking Web Pages Based on Their History

A new Google patent describes some scores that could be used for ranking search results. These scores use information about a document, from the moment when Google first finds it to the present. The history of a web page could help Google determine if the content is fresh, still useful or outdated.

"Search engine may use the inception date of a document for scoring of the document. For example, it may be assumed that a document with a fairly recent inception date will not have a significant number of links from other documents (i.e., back links). For existing link-based scoring techniques that score based on the number of links to/from a document, this recent document may be scored lower than an older document that has a larger number of links (e.g., back links)."

"For some queries, documents with content that has not recently changed may be more favorable than documents with content that has recently changed. As a result, it may be beneficial to adjust the score of a document based on the difference from the average date-of-change of the result set. In other words, search engine may determine a date when the content of each of the documents in a result set last changed, determine the average date of change for the documents, and modify the scores of the documents (either positively or negatively) based on a difference between the documents' date-of-change and the average date-of-change. "

"Documents for which there is an increase in the rate of change might be scored higher than those documents for which there is a steady rate of change, even if that rate of change is relatively high. The amount of change may also be a factor in this scoring."

"Using this date as a reference, search engine may then monitor the time-varying behavior of links to the document, such as when links appear or disappear, the rate at which links appear or disappear over time, how many links appear or disappear during a given time period, whether there is trend toward appearance of new links versus disappearance of existing links to the document, etc. (...) By analyzing the change in the number or rate of increase/decrease of back links to a document (or page) over time, search engine may derive a valuable signal of how fresh the document is."

If a page still gets links one year after it was created, Google might assume it's still useful. If a page is constantly updated (like Wikipedia pages), the content could be more relevant to the reader. These are some simple rules that could remove outdated pages from the top results.
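The date-of-change adjustment described in the second quote can be sketched in a few lines. This is not Google's algorithm, just one reading of the patent text: the function name, the `weight` value and the `prefer_fresh` switch are assumptions made up for the illustration.

```python
from datetime import date


def adjust_scores(results, weight=0.01, prefer_fresh=True):
    """Shift each score by how far the page's last change is from the average.

    results: list of (url, base_score, last_changed) tuples, where
    last_changed is a datetime.date. weight and prefer_fresh are
    hypothetical tuning knobs, not values taken from the patent.
    """
    if not results:
        return []
    ordinals = [changed.toordinal() for _, _, changed in results]
    average = sum(ordinals) / len(ordinals)  # average date-of-change, in days
    sign = 1 if prefer_fresh else -1
    adjusted = []
    for url, score, changed in results:
        delta_days = changed.toordinal() - average
        adjusted.append((url, score + sign * weight * delta_days))
    # Highest adjusted score first, like a ranked result list.
    return sorted(adjusted, key=lambda item: item[1], reverse=True)


pages = [
    ("old-page", 1.0, date(2007, 1, 1)),
    ("new-page", 1.0, date(2007, 4, 1)),
]
print([url for url, _ in adjust_scores(pages)])
print([url for url, _ in adjust_scores(pages, prefer_fresh=False)])
```

With `prefer_fresh=True` the recently changed page moves up; for queries where stable content is better, flipping the switch rewards the page that hasn't changed in a while, which matches the "either positively or negatively" wording in the quote.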

{ via Russel Shaw. }

Fresh News Become Standard Search Results

As promised last week, Google's web search results integrate results from Google News. The standard news OneBox that was displayed at the top of the page for queries related to recent news was replaced with a list of links to news articles, displayed anywhere in the top results.

The important news stories will rank higher than the less important ones, and they become standard search results. This is the first step in Google's big plan of integrating every type of content into one index: a universal search engine that mixes web pages, images, videos, books, scholarly papers, news articles and more.

It will still take some time until the change propagates to all of Google's data centers, so you may not see results from Google News displayed as standard results yet (try searching for news-related queries like Bush, Iraq).

Creative Commons-licensed by rustybrick. More screenshots.

Homeless Internet Citizens

So you open the door, you step inside and you discover that your house is empty: no furniture, no books, no family, no pets. Your home became empty and nobody bothered to explain why.

That's what happened to Google Personalized Homepage for some users today. Says Michael M.:
Today, I logged on at college and all was fine on my homepage. When I got home, I turned on my laptop and my homepage had reverted to the default, with all of the default gadgets and the default theme. I tried to re-add my gadgets, but it keeps going back to the default style. I've tried clearing my cookies/history and signing in/out.

The personalized homepage is the page I visit most on the internet, it tracks all my news and weather and lets me keep track of my schedule and chat to my friends on Google Talk.

My homepage looks the same, but there's a big thread at Google Groups with people who lost their homepages. Google's answer is so endearing:
We're now in frantic-chase-down-this-bug mode here at the Googleplex, and I hope to have more info for you soon. For now, we're not entirely sure of this, but it's possible that changing your homepage theme might cause the problem. SO, if you still have your homepage intact, please avoid changing your theme until further notice. The big question I know you'll all want answered is whether you'll get your homepage back once we sort things out... and the really honest answer is that I hope so, but I just don't know yet.


Update: A day later, everything was back to normal in Google land.

{ Thank you, Thomas, Redmar, Yasar, Michael, Joe. }

April 25, 2007

Blogger Widgets

Kirk, my favorite Blogger tweaker, wrote a very simple yet elegant piece of code that sends you to a random post from a Blogger blog. It uses Blogger's JSON API and it's available as a widget if you upgraded to layouts or as JavaScript code that's easy to paste in your template. I placed a link that uses this code in the right sidebar and it's addictive. Because it can send you to any post from the entire archive, it's a cool way to (re)discover a blog.
I wouldn't have thought many people would think something like this would be useful, but a recent post on Techcrunch goes ape over Wordpress having a 'random post' widget now. I figured if Wordpress has one, New Blogger should as well.

Kirk wrote even more complicated widgets for Blogger, like Picasa Web slideshow, archive calendar and the spectacular tag cloud. If only someone could build an automatic way to upgrade the classic templates to Blogger's new layouts.

Update: Blogger's API has difficulties, so you won't see a "random post" link in the sidebar. The error returned is: "Too many instances of callback".
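The idea behind a "random post" link is simple enough to sketch. This is not Kirk's code (his widget is JavaScript); it's a small Python illustration that assumes the blog's feed has already been downloaded and parsed as JSON, and that it follows the GData-style layout where each entry has a list of links and the one with rel="alternate" points to the post. The sample feed and its example.blogspot.com URLs are made up.

```python
import random


def post_urls(feed_json):
    """Collect the permalink of every entry in a parsed Blogger-style feed."""
    urls = []
    for entry in feed_json.get("feed", {}).get("entry", []):
        for link in entry.get("link", []):
            # Assumption: the rel="alternate" link is the post's public URL.
            if link.get("rel") == "alternate" and "href" in link:
                urls.append(link["href"])
                break
    return urls


def random_post(feed_json, rng=random):
    """Return the URL of a randomly chosen post, or None for an empty feed."""
    urls = post_urls(feed_json)
    return rng.choice(urls) if urls else None


# Hypothetical feed with two posts.
sample_feed = {
    "feed": {
        "entry": [
            {"link": [{"rel": "self", "href": "http://example.blogspot.com/feeds/1"},
                      {"rel": "alternate", "href": "http://example.blogspot.com/2007/04/first.html"}]},
            {"link": [{"rel": "alternate", "href": "http://example.blogspot.com/2007/04/second.html"}]},
        ]
    }
}
print(random_post(sample_feed))
```

A real widget would also have to fetch the feed and deal with errors like the one mentioned in the update.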

April 24, 2007

Google Search Recognizes Open Office Documents

In addition to HTML files, Google indexes other file types such as PDFs, Microsoft Office files, Shockwave Flash files and more. Google offers you the option to read the HTML (or text) version of the cached file, in case you don't have an application that opens the file.

Google added OpenDocument format to the list of supported documents. OpenDocument is supported by important office suites like OpenOffice.org, Star Office, KOffice, but also by Google Docs & Spreadsheets. OpenDocument can be used to store many kinds of files, including documents (.odt), spreadsheets (.ods) and presentations (.odp).


Google's filetype: operator allows you to restrict the search results to files that have a specific extension. This table shows the number of files from Google's index for Open Office and Microsoft Office:

File types       Open Office    Microsoft Office
documents        88,200         42,300,000
spreadsheets     19,700         15,600,000
presentations    42,300         13,800,000
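Counts like the ones in the table come from queries that use the filetype: operator. Here's a rough sketch of how such a query could be turned into a search URL; the helper name is invented, and the google.com/search?q= address is assumed to be the usual search URL.

```python
from urllib.parse import urlencode


def filetype_query(extension, terms=""):
    """Build a Google search URL restricted to files with the given extension."""
    query = f"{terms} filetype:{extension}".strip()
    # urlencode escapes the colon and turns spaces into plus signs.
    return "http://www.google.com/search?" + urlencode({"q": query})


for ext in ("odt", "ods", "odp"):
    print(filetype_query(ext))
```

You can add keywords too: `filetype_query("ods", "budget")` restricts a search for "budget" to OpenDocument spreadsheets.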

Google Transit Expands to Japan

Google's experimental trip planner is now available for Japan. Unlike Google Maps, which shows driving directions, Google Transit is useful if you want to use public transportation, but the number of places where you can use it is really small (10 cities in the US and now Japan).

Google Transit is also available for mobile phones.

April 23, 2007

Gadgets and Personal Data

User-generated gadgets made personalized homepages a better place because users can choose from a wider variety of content and even create their own gadget if they have programming skills. But just because you see a gadget in Google's directory or elsewhere doesn't mean you have to trust it and hand over personal information or credentials.

Jason wrote a popular gadget that showed your MySpace alerts. Of course, the gadget required you to enter your MySpace username and password (ideally, MySpace should have an API for authentication and data).

"A few months back, a flaw was discovered whereby usernames were being passed in clear-text as a querystring parameter when using the gadget. As a result of Google's mechanism that caches web-content, a list of usernames on a phishing watchlist website was cached in Google's search index, thus making them publicly accessible. Once Google was alerted of the issue, they contacted me immediately. Google took action and removed the cached content from their search index, and I took numerous steps to strengthen the security of the gadget - and to mitigate any future risks. Google even went as far as to work with the operators of the phishing watchlist to remove my name and IP addresses from the suspected phishers list."

But people thought that the flaw was intentional and accused him of phishing. "Due to a common misconception that the Gadget was actually being used to facilitate phishing activity, I have decided to remove it permanently. This is a particularly difficult decision because of the large number of users and the popularity of this gadget."

Google shows a warning every time you add third-party gadgets, so you should be careful when you add gadgets from unknown sources. You should be even more careful when you enter personal data.

Try Google's Next Design Before It Goes Live

Google's latest design test for the search results pages and homepage is clever and has a good chance of replacing the current one. The big change is that Google adds a navigation menu for its services at the top of the page. The list of services includes Gmail, Calendar, Google Docs (the last two are hidden under "more"). Under the search box, you'll see links to other specialized search engines that provide useful results for that search. For "Bush", there are many services listed, including news, news archives and blog search, but for most queries you'll see few services listed (for "Google", you'll see only news search; for "flowers", image search; for "c++", code search, blog search and groups).


Courtesy of Webbsnack, here's how you can test this new design.

Copy this code:

javascript:document.cookie= "PREF=ID=fddb01133a87d314:LD=en:CR=2:TM=1177334998:LM=1177334998:GM=1:S=OOg0FEVzpPplxe9J; path=/; domain=.google.com";void(0);

go to google.com/ncr, paste it in the address bar, press enter, then search for something clever. I absolutely love the new design, but if you don't like it, clear your Google cookie and it'll go away.
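If you're curious what the bookmarklet actually sets, the PREF cookie is just a list of KEY=value pairs separated by colons. The sketch below splits it apart. Google doesn't document these fields, so treating TM as a Unix timestamp is a guess (it does decode to the day this post was written), and the meaning of GM, CR and the other fields is unknown.

```python
from datetime import datetime, timezone


def parse_pref(cookie_string):
    """Split a 'PREF=K1=v1:K2=v2; path=/; ...' cookie string into a dict."""
    name_value = cookie_string.split(";", 1)[0].strip()
    name, _, value = name_value.partition("=")
    if name != "PREF":
        raise ValueError("not a PREF cookie")
    fields = {}
    for part in value.split(":"):
        key, _, val = part.partition("=")
        fields[key] = val
    return fields


cookie = ("PREF=ID=fddb01133a87d314:LD=en:CR=2:TM=1177334998:"
          "LM=1177334998:GM=1:S=OOg0FEVzpPplxe9J; path=/; domain=.google.com")
fields = parse_pref(cookie)
print(fields)

# Assumption: TM is seconds since 1970 (UTC).
print(datetime.fromtimestamp(int(fields["TM"]), tz=timezone.utc).date())
```

The only visible difference from an ordinary PREF cookie is the extra fields like GM=1, which is presumably what switches on the experimental layout.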


Small observation: the number of results should be at the top of the page. I sometimes search for things on Google just to see the number of results (this is my primitive grammar checker).

Update: There's an alternative version for this new layout.


The code for the second design (via Blogoscoped):

javascript:document.cookie= "PREF=ID=9d04e374b01fd77c:TM=1177187296:LM=1178229339:DV=AA:GM=1:IG=3:S=Iiqx5SsQA0p79zvy; path=/; domain=.google.com";void(0);

Google Ads Used as Malware Warnings


TorrentFreak found a clever way to warn the potential users of three BitTorrent clients that they include malware.

"Well, we started to run Google Adwords campaigns on the Bitroll, Torrent101 and Torrentq websites. In just four days the ads were shown 20,000 times, the clickthrough rate was impressive, and the ads probably prevented more than one thousand people installing these malware-laden BitTorrent clients."

But the authors of those BitTorrent clients didn't lose money because a lot of people were advised to try other software: the ad-warnings paid off.

{ via Digg. }

April 22, 2007

Gmail Attachments

While Gmail is a great web mail service, the way it treats attachments might confuse some people.

In Gmail, you can't send executable files (.exe, .cmd) or ZIP archives that contain executables. To bypass this limitation, you need to rename the file and change its extension (don't forget to mention this to the person that receives your mail).

Although Google says you can't send attachments larger than 10 MB, Gmail is quite forgiving and lets you send files up to 13-14 MB, so you don't have to worry about size. [Update (May 2007): Gmail has increased the maximum attachment size to 20 MB. ] If you need to send bigger attachments or you send your files to a lot of people, consider uploading them to a file hosting site (like DivShare, mihd or QuickSharing) and including the URL in the mail. For documents that require collaboration and reviews, Google Docs is a good solution, while Picasa Web and Google Video are better options if you need to share photos and videos.

It's a good idea to select the files you want to attach before writing your email, because Gmail starts to upload them immediately, saving you precious time. If you want to be reminded to attach a file if you talk about attachments in your email, this Greasemonkey script is fairly good. To upload the files using drag and drop, install this Firefox extension.

Now that you've sent your message, you may wonder how to retrieve it in the future. To search for emails that contain attachments, use: has:attachment. If you know some words from the title of an attachment or its extension, add them: has:attachment filename:pdf or has:attachment filename:author filename:review. Unfortunately, the only searchable attachments are text files, so you may want to upload a plain text version of your documents if you need to search their content later.
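The search operators above combine in a predictable way, so the queries are easy to generate. A tiny sketch (the helper name is invented, and it only builds the text you'd type in Gmail's search box):

```python
def attachment_query(*filename_terms):
    """Build a Gmail search for messages with attachments.

    Each term becomes a filename: operator, matching either a word
    from the attachment's name or its extension.
    """
    parts = ["has:attachment"]
    parts.extend(f"filename:{term}" for term in filename_terms)
    return " ".join(parts)


print(attachment_query())                    # has:attachment
print(attachment_query("pdf"))               # has:attachment filename:pdf
print(attachment_query("author", "review"))  # has:attachment filename:author filename:review
```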

Google offers you the option to view online a lot of file types: Microsoft Word, Excel, PowerPoint files, PDF, RTF and even edit Word and Excel files using Google Docs. This is a simple way to convert all these file types to HTML. You can also listen to MP3 files directly from Gmail.

While Gmail offers plenty of space (almost 3 GB), it's not a very good idea to use it for storing files. There are tools that make it easy to upload files to Gmail (the most well-known are the Firefox extension Gspace and the Windows application Gmail Drive), but Gmail was not created for this purpose, so they're just clever hacks. If you upload too many files, Google could even lock your account for 24 hours.

Suggestions for Google Services

Most Google services have feedback forms where you can suggest new features or improvements, but some of them even list frequent suggestions and let you vote for your favorites. The lists also give you hints about the future updates.

April 21, 2007

Customize Google Adds Infinite Scrolling for Google Search

Customize Google is a Firefox extension that adds or removes some features in Google's services: it adds links to competing search engines, removes ads and click tracking, and rewrites links to point directly to images in Google Image Search.

Inspired by the infinite scrolling Greasemonkey script presented in a previous post, Customize Google includes a similar feature, which is not enabled by default, so you'll have to check "Stream search results pages" in the options. The infinite scrolling means that you don't have to click on "Next page" because the next search results are loaded in the background as you scroll down. This feature removes the related searches from the bottom of the page and some OneBox results, and it doesn't work correctly when you hit Back after clicking on a result, but it's still cool to enable it when you're in exploration mode.

Eric Schmidt Interviewed by John Battelle

John Battelle interviewed Google's CEO, Eric Schmidt, at Web 2.0 Expo four days ago. Eric Schmidt answered questions about Microsoft, the DoubleClick deal, users' privacy, YouTube and the future of Google. This was the place where Google announced they'll add a presentation tool to Google Docs.

April 20, 2007

Google Page Creator's Sitemap and Feed

If you have a site hosted by Google Page Creator, be prepared for a surprise. Google automatically builds a sitemap file that lists each and every file from your site created or only hosted by Page Creator.

The address of the sitemap file is http://sitename.googlepages.com/sitemap.xml, so anyone who knows the URL of your site (or just your Gmail address) can find all the files from your site. Some sitemaps are even included in Google search.

That means you shouldn't use Page Creator to upload personal files because even if you don't link to them, their addresses are easy to find in the sitemap. But there's also a bright side: you don't have to build a sitemap for your site to submit it to Google Webmaster Central.

Also, each site has an RSS feed that contains only the pages created with the online editor. The feed is available at: http://sitename.googlepages.com/rss.xml.
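To see how little effort it takes to list a site's files from its sitemap, here's a short sketch. It assumes the sitemap is a standard urlset document where every address sits in a loc element; the namespace is ignored on purpose because I haven't checked which sitemap schema version Page Creator uses. The sample XML and its file names are made up.

```python
import xml.etree.ElementTree as ET


def sitemap_urls(xml_text):
    """Return every URL listed in <loc> elements of a sitemap document."""
    root = ET.fromstring(xml_text)
    urls = []
    for element in root.iter():
        # Tags look like '{namespace}loc'; compare only the local name.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "loc" and element.text:
            urls.append(element.text.strip())
    return urls


# Hypothetical sitemap for http://sitename.googlepages.com/
sample = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>http://sitename.googlepages.com/home</loc></url>"
    "<url><loc>http://sitename.googlepages.com/private-notes.doc</loc></url>"
    "</urlset>"
)
for url in sitemap_urls(sample):
    print(url)
```

Anything that shows up in this list, including files you never linked to, is one guess away from being downloaded.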


{ Via A consuming experience. }

Google Search to Incorporate News Results

Search Engine Land reports that Google will start to integrate results from Google News with the standard web results. Googlebot is not able to crawl news sites as fast as Google news bot, so if you search for recent events, it's likely you won't find fresh news in the top results. That's why Google used a news OneBox that displayed the top 3 results from Google News for queries that are associated with recent events.


Starting next week, the news OneBox will be removed and the news pages will become standard web results. "This allows us to rank news according to relevance in search results rather than at top of the page", said Google's Marissa Mayer.

"Mayer said that the changes are a result of new technology Google has developed to dig deeper into news and find truly relevant stories, rather than simply displaying up to three headlines in the OneBox format, which were displayed based on keyword triggers rather than a deeper analysis of news content.

News results will appear anywhere in a search result page, and links to different sources will be clustered together, similar to how stories are grouped in Google News. Thumbnail images related to the news will also appear next to these results."

This change will make Google's results even fresher and it's a big step towards a universal search that integrates content from different specialized search engines and provides a single ranking.

What Has Google Done in Search Lately?

Google is constantly accused of not improving its search engine (some even say Google doesn't care about search anymore). So how is Google search better than two years ago?

Google's index updates faster
While two years ago only very popular sites were updated every day, now a lot of sites are updated every 1-2 days. A search for [Google Marratech] returned 8/10 results about Google's latest deal less than one day after the Google Blog mentioned it. Yahoo and Live Search returned no results about the deal on the first page of results.

Closer to the natural language
Google no longer tries to find the exact keywords in the documents. It uses stemming so you can find documents that include "flower" even if you search for "flowers", it includes synonyms for the keywords and expands abbreviations. If you search for [Aretha Franklin birthdate], most results' snippets have "birthday" in bold.

Categorization
Google Co-op aims to label high-quality web pages, so users can refine their queries. While there are only 6 general topics, users can label any web page for their own custom search engines.

Personalization
If you have an account, Google can record your queries and the visited web pages to improve the quality of the search results. This way, Google can do a better job at disambiguating queries and putting your query into a larger perspective.

More specialized search engines
Book search, blog search, patent search, music search, news archive search and more. All these search engines allow you to focus on different kinds of information and get the best results from their limited scope.

Unified search engine
Google brings more information from its specialized search engines using OneBox results and PlusBoxes. Google also wants to include results from other search engines directly in Google search: the first stop will be news results. Other plans include results from book search. The goal is to have a universal search engine that includes all the useful information crawled by Google.

Outside the SERP
Google experiments with interesting ways of integrating search results into content that adds value to a site. AJAX Search API lets you build web applications that take advantage of search results.

Mozilla Thunderbird, Integrated with Gmail

The latest version of Thunderbird, Mozilla's email client, lets you add a Gmail account without entering any detail other than your email address. The account wizard's poor wording might mislead people into thinking that Gmail is not a mail service, and it also forgets to tell you how to enable POP3 in Gmail.

Among other new features, Thunderbird 2.0 adds message tagging, saved searches and more descriptive alerts that include email snippets.


Even if you like Gmail's web interface, a mail client is still useful for backing up your email and for reading your messages offline. If you use multiple email clients, your Gmail messages are available only to the first client that requests them. To change that, replace name@gmail.com with recent:name@gmail.com in your client's settings.

April 19, 2007

Google Web History

Last month I wrote a post titled Web history, the next step in Google's personalization (I quote myself):
Google's plans for using personalization to improve search results could face some difficulties. Google already uses your queries, the results you click on, your bookmarks, but this isn't enough to build a comprehensive profile. People don't search too many times and, most often, they click on the top search results.

So I think the next step in Google's efforts to tailor the search results to your preferences is to expand the search history into something more complex: the web history. Browsing web pages is an important part of your online activity and there are already applications like Google Desktop that monitor and index the visited web pages.

Google Web History is a reality starting today. This replaces the previous search history service that was limited only to queries and search results. If you want to add the web pages you visit, you need to have Google Toolbar with the PageRank feature activated and to enable web history here. It's just the regular toolbar, but you'll have to explicitly allow Google to use the PageRank feature to record all the visited web pages and associate them with your Google Account.

The listing mixes visited web pages with Google searches, as you can see in the screenshot below:


"You know that great web site you saw online and now can't find? From now on, you can. With Web History, you can view and search across the full text of the pages you've visited, including Google searches, web pages, images, videos and news stories."

Besides keeping track of all the web pages you visit and making them searchable online, Google Web History is used to improve the personalized search results.

"Web History uses the information from your web history or other information you provide us to improve your Google search experience, such as improving the quality of your search results and providing recommendations."

Google says they encrypt all the data and you're the only one who can access it (they even ask for your password multiple times during a session). Web history is a feature implemented in most modern browsers, but the storage is limited and the history is usually deleted after a small number of days. Google's new feature lets you store your entire web history online (and that sounds pretty scary).

Of course, you can pause the service at any time and even delete the entire web history, but the big question is: do you trust Google enough to send it all your online activity?

Related:
Personalized Google
MyLifeBits - Making your life searchable

Changes in the Names of Google's Products

Google's shopping site has been rebranded from the catchy-but-not-very-clear Froogle to the simple-yet-boring Google Product Search. "You may be familiar with our product Froogle (a pun on "frugal"). Froogle offers a lot of great functionality and has helped many users find things to buy over the years, but the name caused confusion for some because it doesn't clearly describe what the product does," laments Marissa Mayer on Google's weblog.

"The ill-named Froogle was a problem from the start. 'I don't think we understood the complications with rolling out another brand,' Marissa Mayer, Google's vice president of search product and user experience, said in an interview with CNET News.com. 'While it was a cute and clever name, it had issues around copyright and trademark, as well as internationalization. The pun (to "frugal") isn't obvious.'" That's what Google realizes five years after the product's launch.

Google has a reputation for launching products with long and unattractive names that have the advantage of being very descriptive (Google Blog Search tells you more about the product than Technorati, but it's also less memorable). Exceptions to the rule include Gmail, Froogle, orkut, AdWords and AdSense, as well as acquired products like Blogger, Picasa and YouTube, which kept their original names.

Another change that should happen pretty soon is replacing "Google Personalized Homepage" with "iGoogle", the name behind the URL google.com/ig. With the addition of presentations and wikis, "Google Docs & Spreadsheets" should choose a better name than "Google Docs, Spreadsheets, Presentations and Wikis" (what about Google Docs?). Also, "Picasa Web Albums" is a long and strange name for a photo sharing service and could easily become "Google Photos".

Google has a very strong brand and should include that brand in the names of their products, but that doesn't mean there's no room for creativity or consideration for people who actually have to remember or type those names.

(On a related note, maybe Google Operating System is too long as well. But, hey, I'm not a Google property.)

April 18, 2007

Your Search History Does the Magic

Google's foray into search personalization stumbled into the great concept of recommendation. By correlating your search history, bookmarks and other data obtained from Google services with other users' data, Google recommends web pages that might interest you. The first fruit of their work was a Google gadget that initially showed recommended searches, pages and gadgets. Then Google started to show recommended videos and news.

Now the recommended web pages are available in a feed, but also from a URL that sends you to a random recommended page: Google Recommendation.

If you have Google Toolbar, add this custom button instead. Google's Sep Kamvar describes the magical button: "Click on the dice, and we'll take you to a site that may be interesting to you based on your past searches. If you want another, just click the dice again and we'll show you a new one. We'll give you up to 50 new sites per day that might be of interest."


Google split the all-in-one recommendation gadget into six smaller gadgets for: searches, web pages, news, videos, groups and other gadgets. To get them in your personalized homepage, go to this page and add the six gadgets one by one (alternatively, create a new tab titled "Recommendations"). Here's what I see (click to enlarge):


While this is no StumbleUpon, you may discover a lot of interesting web pages related to your queries you wouldn't have found otherwise.

Yahoo PayPal Checkout

It's so sad to live in a divided world where companies try to be better than their rivals by imitating everything the rival does. Yahoo's lack of creativity is proven by the latest partnership with PayPal to counter Google Checkout, which gained some traction thanks to Google's aggressive promotions.

"With the Yahoo! PayPal Checkout Program, a blue shopping cart icon appears next to your ad in Yahoo! search results. This can help your ad stand out, and let customers know you offer PayPal Express Checkout, from the brand known for security."

PayPal offers free processing until the end of the year, like Google. But there's more:

"The ability to quickly locate PayPal merchants will save you some time because the PayPal checkout system remembers all of your personal information, providing you (and me) with the convenience of a single username and password, as well as a consolidated look at your transaction history so you can view all of your purchases and track each items' shipping progress."

This seems pretty familiar, doesn't it? Here's a fragment from a Google post written after Google Checkout's launch, in June last year:

"One cool feature of Google Checkout is that you can buy from stores with a single Google login – no more entering the same info each time you buy, and no more having to remember different usernames and passwords for each store. To help you find places to shop, you'll see a little icon on the Google.com ads of stores offering Google Checkout. It's an easy way to identify fast, secure places to shop when you search. And after you've placed your order, Google Checkout provides a purchase history where you can track your orders and shipping information in one place."


This is only about ads: Google and Yahoo share many advertisers, and some merchants use both PayPal and Google Checkout. Google forgot about users' choices in order to promote Google Checkout, while Yahoo always follows in Google's footsteps and replicates every new feature or product.

Google Life Search (The Chinese Google Base)

In 2004, BBSpot wrote a funny post about Google Life Search, a service that "uses a stream of magnetically targeted electrons to index a user's memory. (...) We think of this as the photographic memory you never had. Simply type in what you are looking for and Google Life Search will quickly locate that item. For example if I enter 'car keys' Life Search responds with the result 'In your pocket', and there they are right where it said!"

Three years later, Google will launch a service called Google Life Search, though only in China. At least that's what the latest message included in Google's translation program suggests.

"Label for the tab on the homepage or search results page that leads to Google Life Search (google.cn only)."

So this is a search engine targeted to China's population. It can't search for alien life or for the origin of life, but it could be a search engine for interesting things to do or for lifestyle information. What do you think "Google Life Search" could be?

Update (May 19): The service is now live at Google China and it's only a nice interface for Google Base. You can use it to search for structured information from housing, recipes, products and more.

April 17, 2007

Google Spreadsheets Adds Charts

Google Spreadsheets finally added charts. This feature had been in development for many months, and its absence was one of the biggest gaps in Google's spreadsheet application.

You can create several types of charts (column, bar, line, pie, scatter) and add labels and a legend. Just select the columns you want to plot and click on the new chart icon. After inserting the chart in the spreadsheet, you can save it as a PNG image or edit it.

The charts are rendered as SVG in Firefox/Opera and VML in Internet Explorer, so they don't require plug-ins. As usual, Opera is not officially supported, so you'll find things that don't work as expected.



You can also annotate cells and search Google for the text from a cell.

Google Buys PowerPoint Solutions

Google bought Tonic Systems, a company specialized in Java solutions for PowerPoint.

"Tonic Systems is a San Francisco-based company that provides Java presentation automation products and solutions for document management (...). Features of their products included text extraction for indexing documents, presentation creation capabilities and document conversion tools."

Their most important products are:
  • TonicPoint Builder - Java library to programmatically read, create and manipulate presentations.

  • TonicPoint Transformer - Java library to convert presentations into images (e.g. PNG, BMP and JPEG), PDF documents, Macromedia Flash (SWF) and Scalable Vector Graphics (SVG).

  • TonicPoint Filter - Java library to extract text from presentations with full contextual data. The text extraction library orders and organizes the text into its proper position relative to the other information. The output retains the vital information such as which text is on which slide, which are the master slides, which notes belong to which slide, and more.

  • TonicPoint Viewer - Free PowerPoint viewer application for Windows, Mac, and Unix.

This straightforward presentation should be enough to realize that Google bought this company to integrate a PowerPoint-like tool in Google Docs. The official Google blog announces that the presentation tool will be launched this summer.

The question is why Google decided to buy yet another company when they've already developed Google Presently, which seemed to be an extension of Writely, but was probably not powerful enough. It's interesting to see Google buying a lot of companies to accelerate the development of already existing products. This trend shows an impatient Google that wants to build everything, but relies more and more on external resources.

"TonicPoint Viewer, a standalone Java application that allows you to open and view PowerPoint presentations on any platform. The Viewer supports the standard PowerPoint file format used by PowerPoint 97, 2000, XP, 2003, etc. The Viewer uses TonicPoint Transformer technology to display sharp, crisp images of your slides."


Update: Here's an online application created by TonicPoint that will be used for Google's PowerPoint (the link doesn't work anymore, but I've got some screenshots).



How to Disable Google Personalized Search

In February, Google Personalized Search got out of beta and was enabled by default in every Google account. To personalize your search results, Google uses several sources, the most important one being search history. You can pause or even delete the search history, but you may find it useful for future reference.

If you don't want personalized search results, Google recommends logging out, but this may not be the best option if you use other Google services at the same time (for example, you edit a document in Google Docs).

The "deus ex machina" in this story is the pws parameter that can be added to the address of a search results page to control the personalization.

This URL corresponds to a search for [Google blog] without personalization:
http://www.google.com/search?q=google+blog&pws=0

Try it when you're logged in and compare it to:
http://www.google.com/search?q=google+blog
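
If you'd rather script the comparison than edit the address bar by hand, here's a minimal Python sketch that sets pws=0 on a search URL. The function name is my own invention; the only thing taken from Google is the pws parameter itself, and the rest is plain URL handling from the standard library.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def without_personalization(url):
    """Return the search URL with pws=0 set, replacing any existing pws value."""
    parts = urlsplit(url)
    # Keep every query parameter except a previous pws setting.
    params = [(key, value)
              for key, value in parse_qsl(parts.query, keep_blank_values=True)
              if key != "pws"]
    params.append(("pws", "0"))
    return urlunsplit(parts._replace(query=urlencode(params)))


print(without_personalization("http://www.google.com/search?q=google+blog"))
```

Feeding it the regular search address should give you back the address with &pws=0 at the end.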

How to temporarily disable Google Personalized Search? Add &pws=0 in the address bar at the end of a Google search URL. Or drag this bookmarklet to your bookmark toolbar:

Google Apps Demo

Rajen Sheth, who manages Google's enterprise products, shows in a 17-minute video how to use Google Apps in a company or organization. The video focuses on the integration between Gmail and other Google products: email is seen as a starting point for collaboration. Gmail improves productivity by allowing you to transform attachments into collaboratively-edited documents or to add events from an email to a shared calendar.

Mac Theme for Google Reader

Hicksdesign wrote a custom theme for Google Reader, influenced by the clean design of Mac applications. The theme is actually a user stylesheet that includes new icons and leaves more room for the content. It should work in Firefox, Opera, Safari, Camino, Omniweb (from the instructions, in Opera and Omniweb setting up the theme is a one-step process).

There's also a similar theme for Bloglines, the elder brother of Google Reader.

April 16, 2007

Dodgeball Founders Leave Google


You may remember that Google bought a small mobile social network called dodgeball. You don't remember, right? Well, don't be upset: Google also forgot about it. And the two founders decided to leave the "Mountain View-based Internet behemoth". For good.
It's no real secret that Google wasn't supporting dodgeball the way we expected. The whole experience was incredibly frustrating for us - especially as we couldn't convince them that dodgeball was worth engineering resources, leaving us to watch as other startups got to innovate in the mobile + social space. (...) It wasn't worth being that frustrated all the time - it was making us both crazy.

So now dodgeball looks almost the same as it did two years ago, while the mobile social space has evolved and services like Twitter or dada have grown a lot lately. The obvious question is: why did Google buy dodgeball?

{ Via Digital Markets. }

Google to Sell Ads on Clear Channel's Radio Stations

Google made a deal with Clear Channel, the largest radio station group owner in the US, to sell audio ads. "The deal will run for several years, and will give Google access to just under 5 percent of Clear Channel's commercial time. That will include 30-second spots on all of Clear Channel's 675 stations during all programs and all times of the day, executives at both companies said in interviews yesterday," reports The New York Times.

Google acquired dMarc Broadcasting, a radio ads platform, last year and integrated it into Google AdWords. They already sell ads on more than 800 radio stations, including XM satellite radio, but this is the first major test for Google audio ads.

Google wants to create a platform that allows advertisers to create and sell ads online and offline from the same place. And that includes TV ads, newspaper ads, display ads - everything managed from the same Google AdWords, using similar metrics, concepts and targets. In a recent interview from Wired, Eric Schmidt said one way to look at Google is "as an advertising system".

April 15, 2007

Visualizing Human Feelings


We Feel Fine is the name of the project that gathers texts expressing human emotions from blogs. "Every few minutes, the system searches the world's newly posted blog entries for occurrences of the phrases "I feel" and "I am feeling". When it finds such a phrase, it records the full sentence, up to the period, and identifies the "feeling" expressed in that sentence (e.g. sad, happy, depressed, etc.)."
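
To get a feel for the extraction step described in that quote, here's a minimal Python sketch. It is not the project's actual code: the function name and the short list of feelings are hypothetical, and it works on a plain string instead of crawling blogs. It keeps every sentence (up to the period) that contains "I feel" or "I am feeling" and tags it with the first known feeling it finds.

```python
import re

# Hypothetical, tiny list of feelings; the real project uses its own, much larger list.
FEELINGS = {"sad", "happy", "depressed", "better", "guilty", "sick"}

TRIGGER = re.compile(r"\bI (?:feel|am feeling)\b")


def extract_feelings(text):
    """Return (sentence, feeling) pairs for sentences that contain a trigger phrase."""
    results = []
    # A "sentence" here is simply everything up to the next period.
    for sentence in re.findall(r"[^.]+\.", text):
        sentence = sentence.strip()
        if TRIGGER.search(sentence):
            tokens = re.findall(r"[a-z]+", sentence.lower())
            feeling = next((word for word in tokens if word in FEELINGS), None)
            results.append((sentence, feeling))
    return results


post = "Went out today. I feel happy about the trip. Tomorrow I am feeling a bit sad."
for sentence, feeling in extract_feelings(post):
    print(feeling, "<-", sentence)
```

A sentence that matches the phrase but mentions no feeling from the list is still recorded, with None as its feeling.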

We Feel Fine aggregated a database of millions of human feelings that can be explored by restricting the view to several parameters like the age, gender, or the geographical location of the post author. In one of the views (called "Madness"), each feeling is represented by a colorful particle that moves around the screen.

"The Madness movement, with its network of many tiny colorful particles, was designed to echo the human world. Seen from afar, Madness presents a massive number of individual particles, each colored and sized uniquely, each flying wildly around the screen, proclaiming its own individuality. At this level, Madness presents a bird's eye view of humanity – like standing atop a skyscraper and peering down at the street. People bustle to and fro, darting in and out of shops, hailing taxis, falling in love, laughing, handling personal crises. From the skyscraper, the people below are like ants – their words cannot be heard, their facial features cannot be seen, and the notion of individuality is hard to recognize. At this level, each particle seems insignificant."

There's also a view that displays the most common feelings. Right now, they are: "I feel..." better, bad, good, right, guilty, sick, (the) same.

But the most interesting views are "Mobs" and "Metrics", which show the most representative feelings for a population (for example, men aged 20-29 from the UK) or the most representative population for a feeling. The system shows that twice as many women as men feel happy at the moment.

April 14, 2007

Why There's No DoubleClick Ad on Google.com

John Battelle tells Google's story in "The Search: How Google and Its Rivals Rewrote the Rules of Business and Transformed Our Culture". An interesting episode happens in 1999, when Google still tried to find a business model and when DoubleClick's banners didn't seem the right way to make money.

Near the end of 1999, Google Inc. had thirty-nine employees, most of whom were engineers of one stripe or another. Omid Kordestani, Google's newly hired sales chief, was still plowing the fields for enterprise deals, but they were few and far between. With more than $500,000 (and growing) going out the door each month and less than $20 million in the bank, you didn't need a Stanford PhD to do the math: the company needed a business model that worked.

There was always the fallback of simply running banners on Google's prodigious traffic: one deal with DoubleClick, an ad network that specialized in serving graphical banners, would probably net the company millions of dollars. But that felt like a sellout: DoubleClick's ads were often gaudy and irrelevant. They represented everything Page and Brin felt was wrong with the Internet. "They didn't want to turn the Web site into the online version of Forty-second Street," recalls investor and director Michael Moritz.

Instead, the young executive team decided to try a more focused approach—it would sell text-only ads to sponsors targeting particular keywords. When you searched for "Ford cars," for example, an ad would appear at the top of the results for Ford Motor Company. These first advertisements were sold on a cost per thousand (CPM) model. (…)

Turns out the ads worked well enough, but they didn't scale. Revenue was limited by Kordestani's ability to sell, and despite his talents, it was difficult to book enough orders to create a healthy business. "It didn't generate much money," Brin recalls, referring to the program as a "hand-patched life preserver." DoubleClick, he adds, was the ocean liner Google would swim to should the life preserver fail.

Infinite Scrolling in Google Search

If you hate clicking on "next" in Google search, but you don't want to set a higher number of results in the preferences because the page loads slowly, this Greasemonkey script might be for you (requires Greasemonkey for Firefox). It loads the next page of results as you scroll down so it gives the illusion of "infinite scrolling".

This is a Japanese script as it was created by two people from Japan. One downside of the script is that it opens search results in a new window/tab, but removing that bit of code causes weird effects.

If you want a native "infinite scrolling" in Google, try SearchMash and keep pressing the space bar to automatically fetch the next results page. Microsoft's image search is also a good implementation of the concept and probably the first major search engine that used "infinite scrolling" (at first, Windows Live Search used it for web search results as well, but the feature was removed).

Call Google 411 Wherever You Are

Last week, Google launched a directory assistance service that uses voice recognition to automatically answer queries about US local businesses. The service is free, but if you're not in the US or Canada, it's not very easy to see how well it works.

Fortunately, Yahoo Messenger lets you call toll-free US numbers that start with 800, 888, 877, or 866. This is a fairly recent feature that was added in Yahoo Messenger 8.1. All you need to do is call 8004664411 and follow the directions.

Skype also offers free calls to toll-free phone numbers, but Google's number is always busy.

If you discover some interesting bugs or funny answers, record the conversation using software like SoliCall or Audacity*, upload it to Odeo and post a link in the comments.

* Most sound cards have a "record mix" option that captures the output signal from the wave channel AND the input signal from the microphone channel. In Windows XP, go to Start/Run, type sndvol32, toggle the "Recording" option in Preferences and select "Stereo Mix" (make sure you select it in the main dialog as well). Then you can record the sound using Audacity or another audio editor.

April 13, 2007

Google Pays $3.1 Billion for DoubleClick


I don't know if there's a single ad blocker or cookie filtering program that doesn't include doubleclick.net in its blacklist. For me, DoubleClick is associated with ugly animated banners and tracking cookies. But as of today, DoubleClick is a part of Google's empire and will help it expand in the display ads area, where Google has failed to attract many advertisers.

"Web advertising leader Google Inc. said on Friday it has agreed to acquire DoubleClick Inc., a top online advertising network, for $3.1 billion, beating out other major Internet players with its bid."

The major Internet player outbid by Google was Microsoft, which probably explains the huge price Google paid for its largest and most ungoogly acquisition ever.

"Acquiring DoubleClick expands Google's business far beyond algorithm-driven ad auctions into a relationship-based business with Web publishers and advertisers. (...) DoubleClick's exchange is different from the ad auctions that Google uses on its networks because the exchange is open to any Web publisher or ad network — not just the sites in Google’s network," notes New York Times.

But what is DoubleClick anyway?
DoubleClick is a provider of internet ad serving software. Its clients include agencies, marketers (Universal McCann Interactive, AKQA etc.) and publishers who service customers such as Microsoft, General Motors, Coca-Cola, Motorola, L'Oreal, Palm, Visa USA, Nike, Carlsberg and many more. (...)

DoubleClick was founded in 1995 as Internet Advertising Network by Kevin O'Connor and Dwight Merriman. DoubleClick was initially engaged in the online media business, meaning it helped web sites sell advertising to marketers. In 1997 the company began offering the online ad serving and management technology they had developed to other publishers as the DART services. During the internet downturn, DoubleClick divested its media business, and today DoubleClick is purely involved in ad management from the technology end — uploading ads and reporting on their performance. (...)

DoubleClick is sometimes linked with the controversy over spyware because browser cookies are set to track users as they travel from website to website and record what commercial advertisements they view and select while browsing. However, the company maintains that it is important to understand the difference between DoubleClick's ad serving tags and the spyware/adware companies.

Update. In Google's press release, Sergey Brin says:
"It has been our vision to make Internet advertising better - less intrusive, more effective, and more useful. Together with DoubleClick, Google will make the Internet more efficient for end users, advertisers, and publishers." And what about the lack of intrusiveness?

Switch from Google Maps to Google Earth

Oftentimes I find something interesting in Google Maps and want to see it in more detail in Google Earth. You could try to repeat the search in Google Earth, but that's not the best idea.

To move to the same location, click on "Link to this page", go to the address bar and copy the value of the ll parameter from the Google Maps address. Then type the value in Google Earth's search box. Here's an example:

http://maps.google.com/maps?f=q&q=Paris,+France&layer=&om=1&z=14&ll=52.046737,-0.198269&spn=0.024229,0.080338

If you want to save some clicks, after clicking "Link to this page", add "&output=kml" at the end of the URL and hit Enter. A dialog will ask you to open or save a KML file. Choose to open the file in Google Earth.
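
Both tricks are easy to script. Here's a minimal Python sketch, assuming the link has the same shape as the example above; the function names are mine, and the only things taken from Google Maps are the ll parameter and the output=kml suffix.

```python
from urllib.parse import parse_qs, urlsplit


def maps_center(url):
    """Return (latitude, longitude) read from the ll parameter of a Google Maps link."""
    query = parse_qs(urlsplit(url).query)
    latitude, longitude = query["ll"][0].split(",")
    return float(latitude), float(longitude)


def as_kml_url(url):
    """Append output=kml so the link returns a KML file that Google Earth can open."""
    return url + "&output=kml"


link = ("http://maps.google.com/maps?f=q&q=Paris,+France&layer=&om=1"
        "&z=14&ll=52.046737,-0.198269&spn=0.024229,0.080338")
print(maps_center(link))
print(as_kml_url(link))
```

The first function gives you the pair of coordinates to paste into Google Earth's search box; the second one builds the address that triggers the KML download.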

There's even a bookmarklet that automates the process. Bookmark this link or drag it to your bookmarks toolbar. Next time you want to switch to Google Earth, just click on the bookmarklet.

April 12, 2007

A Simplified Version of Google's Spell Checker

Peter Norvig (from Google) explains in a detailed article how to write, in 20 lines of Python code, a spell checker almost as good as the one Google uses to show the famous "did you mean" corrections. Well, at least for one-word corrections.
We will read a text file, holmes.txt (that I happened to have on my laptop) which is a collection of Sherlock Holmes stories (from Project Gutenberg) consisting of about 100,000 words. We then extract the individual words from the file (using the function words, which converts everything to lowercase, so that "the" and "The" will be the same). Next we train a probability model, which is a fancy way of saying we count how many times each word occurs. (...)

Now let's look at the problem of enumerating the possible corrections c of a given word w. It is common to talk of the edit distance between two words: the number of edits it would take to turn one into the other. An edit can be a deletion (remove one letter), a transposition (swap adjacent letters), an alteration (change one letter to another) or an insertion (add a letter). (...)

The literature on spelling correction claims that 80 to 95% of spelling errors are an edit distance of 1 from the target.

A simple way to define the error model was to say that "all known words of edit distance 1 are infinitely more probable than known words of edit distance 2, and infinitely less probable than a known word of edit distance 0". From all the candidates for the correction, you can choose the most frequent word.

In Peter Norvig's tests, this simple algorithm returned correct answers in more than 80% of the cases. Of course, Google has more data than the holmes.txt file (it crawls the web, right?) and has access to a huge list of queries and refinements that could improve the algorithm, but this is an example of a simple yet powerful program.
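
To make the description more concrete, here's a rough Python sketch of the approach. It is not Norvig's original code: the function names, the one-sentence sample text (standing in for holmes.txt) and the way the candidates are ranked follow my reading of the article, so treat them as assumptions.

```python
import re
from collections import Counter

ALPHABET = "abcdefghijklmnopqrstuvwxyz"

# Stand-in for holmes.txt: a real model needs far more text than this.
SAMPLE = "the quick brown fox jumps over the lazy dog and the spelling of the word is correct"


def words(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z]+", text.lower())


def edits1(word):
    """All strings one deletion, transposition, alteration or insertion away."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in ALPHABET]
    inserts = [a + c + b for a, b in splits for c in ALPHABET]
    return set(deletes + transposes + replaces + inserts)


def known(candidates, counts):
    """Keep only the candidates that appear in the training text."""
    return {w for w in candidates if w in counts}


def correct(word, counts):
    """Prefer a known word, then distance 1, then distance 2; pick the most frequent."""
    candidates = (
        known([word], counts)
        or known(edits1(word), counts)
        or known({e2 for e1 in edits1(word) for e2 in edits1(e1)}, counts)
        or {word}
    )
    return max(candidates, key=lambda w: counts[w])


COUNTS = Counter(words(SAMPLE))  # "training" is just counting word occurrences
for typo in ("speling", "teh", "wrd"):
    print(typo, "->", correct(typo, COUNTS))
```

The chain of "or" operators is what encodes the error model: candidates at edit distance 2 are only considered when nothing closer is known.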

3D Buildings in Google's Street Maps

Google Maps adds a new dimension to buildings in the street maps of more than 30 important cities in the US (like New York or Boston) and Japan. Frank Taylor from the unofficial Google Earth Blog argues that they're not yet 3D. "You can't rotate or tilt your view, so it's 2D. But, the buildings have shaded 3D-like projections from an angle so you get an idea of height and shape of the buildings."


This is, in fact, Google Earth's building layer. To see it, go to the layers sidebar and enable "3D buildings".


Google Earth's layers contain a lot of useful information that enhances the satellite imagery: road names, airports, parks, populated places, pictures and more. The cool thing is that they reside on Google's servers, so you don't need to update Google Earth to see new information. Google Maps could become much more useful if it included these overlays.

April 11, 2007

The Sad Story of Darfur in Google Earth

"Girl with traumatized baby sister. The baby has not made a sound since the day their parents were slaughtered and the village burned."

When you hear about sad stories from far away, they rarely touch you. It's hard to be moved by the suffering of someone who doesn't have much in common with you.

The United States Holocaust Memorial Museum presents in a Google Earth layer the consequences of the conflict in Darfur, a region of Sudan.

The BBC tells the story:
Sudan's government and the pro-government Arab militias are accused of war crimes against the region's black African population, although the UN has stopped short of calling it genocide. (...)

The conflict began in the arid and impoverished region early in 2003 after a rebel group began attacking government targets, saying the region was being neglected by Khartoum. The rebels say the government is oppressing black Africans in favour of Arabs. (...)

[The government] admits mobilising "self-defence militias" following rebel attacks but denies any links to the Janjaweed, accused of trying to "cleanse" black Africans from large swathes of territory. Refugees from Darfur say that following air raids by government aircraft, the Janjaweed ride into villages on horses and camels, slaughtering men, raping women and stealing whatever they can find.

"I was living with my family in Tawila and going to school when one day the Janjaweed entered the town and attacked the school. We tried to leave the school but we heard noises of bombing in the town and started running in all directions. All the girls were scared. The Janjaweed entered the school and caught some girls and raped them in the class rooms. I was raped by four men inside the school. When they left they told us they would take care of all of us black people and clean Darfur for good."

The refugees, their destroyed villages and a disturbing story - in a Google Earth layer (requires Google Earth, obviously).

Opera 9.20 - More Homepages at Your Fingertips

The latest version of Opera (a free browser from Norway) brings the speed dial from your phone to your browser. You can configure a start page with nine boxes where you can add frequently visited sites. The page shows up every time you open a new tab, and the sites added to the start page can also be opened by simply pressing Ctrl-[number from 1 to 9].

Opera shows real-time thumbnails of the selected pages and lets you reload them at a custom interval so it's a cool way to monitor changes.

There's also a link to Opera developer tools, a list of bookmarklets that add some of the best features from Firefox's DOM Inspector and the most popular extension for developers: Web Developer Toolbar. You can inspect, edit or remove DOM nodes; view, edit or disable stylesheets; view HTTP headers and cookies. It looks pretty impressive for a JavaScript bookmarklet.

Opera finally becomes a normal browser: if you enter a query in the address bar (something that's not a URL or a single word), Opera performs a search.

Scan a File Using the Top Antivirus Software

If you get a file from a site you don't know very well (like a game or a screensaver), the first thing you should do is scan it using an antivirus. The problem is that your antivirus might not be very good or might not include the signature of the trojan included in the file you've just downloaded. So a good idea is to have a second opinion, but you can't install more than one antivirus (unless you disable the real-time protection).

VirusTotal is a site where you can upload a file smaller than 10 MB and have it scanned by a large number of antivirus engines (the current number is 31), including Kaspersky, BitDefender, F-Secure and Panda. The file is not scanned instantly: you'll have to wait a short time (usually around one minute), depending on the site's load. You'll get a report like:


If you see conflicting responses, look at the most trustworthy engines (some tests) and at the number of engines that report a virus. In the situation depicted in the screenshot, I can safely assume that the file is clean.
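
The rule of thumb above can be sketched in a few lines of Python. This is only an illustration: the report is typed in by hand, the verdicts are invented, "OtherEngine" is a placeholder name, and the choice of which engines count as trusted is an assumption you would make yourself. It doesn't use any VirusTotal API.

```python
def summarize(report, trusted=()):
    """Tally a scan report.

    `report` maps an engine name to the detection name it returned,
    or to None when the engine found nothing.
    """
    flagged = {engine: name for engine, name in report.items() if name}
    return {
        "engines": len(report),
        "detections": len(flagged),
        # Detections coming from the engines you trust the most.
        "trusted_detections": sorted(e for e in flagged if e in trusted),
    }

# Hypothetical report, invented for this example.
report = {
    "Kaspersky": None,
    "BitDefender": None,
    "F-Secure": None,
    "Panda": None,
    "OtherEngine": "Suspicious.Generic",
}
summary = summarize(report, trusted=("Kaspersky", "BitDefender", "F-Secure"))
print(summary)
```

A single detection from an engine outside your trusted list, with every other engine reporting nothing, is the kind of result that usually points to a false positive.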

The service is available by email too: send a mail to scan@virustotal.com with the subject SCAN. If you use Gmail, you'll have to rename executable files (for example, from setup.exe to setup.ex1) to be able to send them.

A similar service is Jotti's malware scan, which has a higher file size limit (15 MB) but uses fewer antivirus engines.

{ Thank you, Google! }

April 10, 2007

Open-Source OCR Software, Sponsored by Google

Google sponsors the development of open-source OCR software at the IUPR research group. "OCRopus is a state-of-the-art document analysis and OCR system, featuring pluggable layout analysis, pluggable character recognition, statistical natural language modeling, and multi-lingual capabilities."

"The goal of the project is to advance the state of the art in optical character recognition and related technologies, and to deliver a high quality OCR system suitable for document conversions, electronic libraries, vision impaired users, historical document analysis, and general desktop use," explains Thomas Breuel, who leads the project.

The software is partly based on Tesseract, the best open-source OCR engine currently available. While the project is expected to be released at the end of next year and will be used for Google's book scanning project, the team has some interesting applications in mind:

* a web service interface
* PDF, camera, and screen OCR
* integration with desktop search tools: Beagle, Spotlight, Google Desktop

The most popular OCR programs are ABBYY FineReader, Omnipage, Readiris and Presto OCR, but they're pretty expensive (starting at $100). A decent way to perform OCR on a document is Microsoft Office Document Imaging, included in Microsoft Office XP/2007. Microsoft Office OneNote 2007 also lets you OCR imported images. A free online alternative is Scanr, a site that lets you digitize documents by sending a mobile phone photo by email.