Wednesday, April 13, 2005

Amazon.com just came out with a fascinating new feature: "Statistically Improbable Phrases."

Amazon.com's Statistically Improbable Phrases, or "SIPs", show you the interesting, distinctive, or unlikely phrases that occur in the text of books in Search Inside the Book. Our computers scan the text of all books in the Search Inside program. If they find a phrase that occurs a large number of times in a particular book relative to how many times it occurs across all Search Inside books, that phrase is a SIP in that book.
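To make the idea concrete for myself, here is a rough Python sketch of the kind of comparison that description suggests. It is only my guess at the general shape, not Amazon's actual method: the function names (`ngrams`, `improbable_phrases`), the choice of two-word phrases, the `min_count` cutoff, and the simple ratio score are all assumptions of mine.

```python
from collections import Counter


def ngrams(text, n=2):
    """Split text into lowercase words and return every run of n words."""
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def improbable_phrases(books, title, n=2, top=3, min_count=2):
    """Rank phrases in one book by how over-represented they are
    compared with the whole collection (hypothetical scoring)."""
    # Count phrases separately for each book, then pool them.
    per_book = {t: Counter(ngrams(txt, n)) for t, txt in books.items()}
    corpus = Counter()
    for counts in per_book.values():
        corpus.update(counts)

    book_total = sum(per_book[title].values())
    corpus_total = sum(corpus.values())

    scored = []
    for phrase, count in per_book[title].items():
        if count < min_count:  # ignore phrases that barely occur
            continue
        book_rate = count / book_total
        corpus_rate = corpus[phrase] / corpus_total
        scored.append((book_rate / corpus_rate, phrase))

    # Highest ratio first; ties broken alphabetically.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [phrase for _, phrase in scored[:top]]


books = {
    "foxes": "red fox red fox jumps the end",
    "endings": "the end of the end the end",
}
print(improbable_phrases(books, "foxes"))
```

A real system would presumably need proper tokenization, smoothing for rare phrases, and a far larger collection, but the core comparison is just "rate in this book" against "rate everywhere."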

I'm really fascinated by this kind of "smart searching," as one might call it. Being able to spot patterns or hiccups in data would be extremely useful in my work, and it raises a lot of intellectual questions for me too.

Anyone know of similar projects and/or research elsewhere? I'd be curious to hear about them!
