July 18, 2007

Biznology Blog by Mike Moran

What Semantic Search Isn't

You may have heard the term "semantic search," but do you really know what it is? Some people have very big ideas of how computers will understand the meaning of text, but today's semantic search falls far short of that. Regardless, what's possible today is still very useful.

To understand how hard it is for computers to really understand the meaning of text, let's not look at understanding entire documents or even paragraphs. Let's not even look at sentences. No, let's start with something extremely simple: noun phrases.

Here's a simple noun phrase: bath soap. It has a simple meaning, too: soap used in the bath. Let's look at a different phrase now: wood soap. It means soap used to clean wood. And one more: glycerine soap, meaning soap made of glycerine.

That's three noun phrases about soap, and the modifying noun relates to "soap" in a different way each time. It's not easy for software to interpret them correctly, as you might imagine. I don't think you'll see software that can correctly interpret most noun phrases for quite a while.
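To make the difficulty concrete, here is a minimal Python sketch. The RELATIONS table and the interpret function are hypothetical names invented for this illustration, not part of any real system. The sketch shows that the relation between the two nouns has to be supplied by hand for each pair, because nothing in the words themselves reveals it.

```python
# Hypothetical hand-built table (an assumption for illustration only).
# Each pair has the same grammatical shape, noun + "soap", yet the
# relation differs and cannot be read off the words themselves.
RELATIONS = {
    ("bath", "soap"): "used in the",
    ("wood", "soap"): "used to clean",
    ("glycerine", "soap"): "made of",
}


def interpret(phrase):
    """Return a paraphrase of a two-word noun phrase, or None if unknown."""
    modifier, head = phrase.lower().split()
    relation = RELATIONS.get((modifier, head))
    if relation is None:
        # The surface form gives no hint about which relation applies.
        return None
    return f"{head} {relation} {modifier}"


for phrase in ["bath soap", "wood soap", "glycerine soap", "olive soap"]:
    print(phrase, "->", interpret(phrase))
```

Any phrase missing from the table, such as "olive soap" here, comes back as None. A lookup table like this cannot scale to the open-ended set of noun phrases people actually write, which is the heart of the problem.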

So what kind of semantic search is possible?

Today's keyword search can be vastly improved using mere part-of-speech analysis. Consider a law enforcement officer looking for a report on someone driving a Neon car. If it is an old Neon, it was a Dodge Neon. Newer models are Chrysler Neons. It's likely that the police reports that should be found mention only "Neon" and contain neither the word "Dodge" nor the word "Chrysler." So how do you do a keyword search? Searching for "neon" alone finds neon signs, neon lights, and other spurious results. Searching for "neon car" likely finds nothing, because the reports rarely bother to say "car."

Enter semantic search. With a semantic search facility, looking for "neon car" causes the system to look for occurrences of the word "neon" that denote cars. Simply knowing that the "neon" being sought is a noun eliminates almost all spurious results ("neon" is a modifier in the phrase "neon lights"). A bit more smarts, such as looking for forms of the word "drive" in the same sentence, improves the results even more.
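The idea can be sketched in a few lines of Python. This is a toy under stated assumptions, not how any real search product works: a small word list (MODIFIED_NOUNS) stands in for a genuine part-of-speech tagger, the sample reports are invented, and every function name is hypothetical.

```python
import re

# Toy stand-in for a part-of-speech tagger (an assumption): "neon" directly
# followed by one of these nouns is treated as a modifier, not a noun.
MODIFIED_NOUNS = {"light", "lights", "sign", "signs", "lamp", "lamps"}
DRIVE_FORMS = {"drive", "drives", "driving", "drove", "driven"}

# Invented sample sentences, for illustration only.
REPORTS = [
    "Suspect was seen driving a red Neon northbound on Main Street.",
    "The neon sign above the bar was vandalized overnight.",
    "Witness parked her Neon behind the store.",
    "Neon lights were reported flickering at the theater.",
]


def tokenize(sentence):
    """Lowercase the sentence and split it into alphabetic words."""
    return re.findall(r"[a-z]+", sentence.lower())


def keyword_match(sentence, keyword="neon"):
    """Plain keyword search: true if the word appears at all."""
    return keyword in tokenize(sentence)


def neon_is_noun(sentence):
    """True if some occurrence of "neon" is not modifying a following noun."""
    tokens = tokenize(sentence)
    for i, token in enumerate(tokens):
        if token == "neon":
            following = tokens[i + 1] if i + 1 < len(tokens) else ""
            if following not in MODIFIED_NOUNS:
                return True
    return False


def semantic_score(sentence):
    """0 = spurious, 1 = "neon" used as a noun, 2 = noun plus a form of "drive"."""
    if not neon_is_noun(sentence):
        return 0
    has_drive = any(token in DRIVE_FORMS for token in tokenize(sentence))
    return 2 if has_drive else 1


for report in REPORTS:
    print(keyword_match(report), semantic_score(report), report)
```

Plain keyword matching accepts all four sentences. The noun check drops the sign and the lights, and the "drive" check ranks the first report above the parked car. A real system would replace the word list with a trained tagger, but the filtering logic has the same shape.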

So even though semantic search is a big idea, practical implementations exist today to improve search results. Does your search facility have the smarts that semantic search provides?

Posted by MikeMoran at July 18, 2007 2:06 AM

Comments

Powerset's blog (blog.powerset.com) also discussed, a few days back, how they handle noun phrases and work out what they mean.

Posted by: S at July 18, 2007 11:36 PM

I just read the post (http://blog.powerset.com/2007/6/26/noun-noun-compound-is-like-a-chocolate-box) and it is very interesting. It certainly seems like it's worth a try.

Posted by: Mike Moran at July 21, 2007 10:36 PM

not bad

Posted by: Meg at March 25, 2008 5:39 AM

Here's a new one. Some functionality is still a little rough; however, we solved all the issues with semantic search. We just need the hardware to expand now and encompass the whole web :)

http://www.cklingo.com/

Posted by: Cory J. Geesaman at July 22, 2008 4:02 PM
