Webapps and the Yahoo BOSS API

Introduction

A few weeks ago Yahoo released a new search API known as BOSS, short for "Build your Own Search Service". Although I usually prefer Google's APIs, I decided to take a closer look at how BOSS works. After a few hours of testing I definitely have to admit that the API has great potential. Unlike most other solutions, it allows you to mash up the results with other services and format them as you wish.

Necessary preparations

To use Yahoo web services you have to register with Yahoo and get an App key. Yahoo requires this key to give you access to its services.

Depending on which language you want to use, you will have to put the following lines of code before each listing in this article.

Listing 0 (PHP5)

    <?php

Listing 0 (Ruby)

    require 'cgi'
    require 'net/http'
    require 'rexml/document'
    require 'uri'

    cgi = CGI.new
    puts cgi.header

For PHP5 it is just the usual opening tag (we will make use of SimpleXML), while Ruby needs a few more lines. The require lines load the libraries necessary to create the CGI object, fetch data via HTTP, parse XML and encode URIs. The last two lines create the CGI object and send out the HTTP header (Ruby on Rails does this automatically, so they are not needed there).

Doing queries

First query

Now let's send our first query.

Listing 1 (PHP5)

    $query = "geekmonkey";
    $appid = 'YOUR APP KEY';
    $url = "http://boss.yahooapis.com/ysearch/web/v1/".
           "{$query}?appid={$appid}&format=xml";

    // fetch the XML response and parse it with SimpleXML
    $xml = file_get_contents($url);
    $doc = simplexml_load_string($xml);

    // print every result as a linked title followed by its abstract
    foreach ($doc->resultset_web->result as $result) {
        echo "<h2><a href='{$result->url}'>{$result->title}</a></h2>";
        echo "<p>{$result->abstract}</p><br />";
    }

PHP: After fetching the XML file from Yahoo's BOSS server by handing over the application ID and the search string, we use SimpleXML to parse the result and iterate over it. If your provider does not allow you to open URLs directly via fopen or file_get_contents, which is a shortcut for fopen/fread/fclose (allow_url_fopen = Off in php.ini), you can try the following lines in place of the file_get_contents call:

Alternative Listing 1 (PHP5)

    $cc = curl_init($url);
    curl_setopt($cc, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($cc, CURLOPT_HEADER, 0);
    $xml = curl_exec($cc);
    curl_close($cc);

This, of course, requires the cURL extension to be installed.

Listing 1 (Ruby)

    # Web search for "geekmonkey"
    query = "geekmonkey"
    appid = 'YOUR APP KEY'
    url = 'http://boss.yahooapis.com/ysearch/web/v1/' + query +
          '?appid=' + appid + '&format=xml'

    # get the XML data as a string
    xml = Net::HTTP.get_response(URI.parse(url)).body

    # extract the result fields
    doc = REXML::Document.new(xml)
    titles = doc.elements.to_a("//title")
    links = doc.elements.to_a("//url")
    abstract = doc.elements.to_a("//abstract")

    # print all results
    titles.each_with_index do |title, idx|
      print "<h2><a href='#{links[idx].text}'>#{title.text}</a></h2>\n"
      print "<p>#{abstract[idx].text}</p><br />\n"
    end

Ruby: Similar to the PHP version we first set some variables. By using Net::HTTP we fetch the XML file containing our search results. We parse the document with REXML and filter out the values we want to use.
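One weak spot of collecting titles, links and abstracts in three separate arrays is that they get out of step as soon as one result lacks a field. As a minimal sketch, the fields can be read per result element instead. The helper name parse_results and the sample XML are made up for illustration; the sample only mimics the element names used in the listings above and is not a real BOSS response.

```ruby
require 'rexml/document'

# Turn an XML response into an array of hashes, one hash per <result>.
# A missing field ends up as nil instead of shifting the other fields.
def parse_results(xml)
  doc = REXML::Document.new(xml)
  results = []
  doc.elements.each('//result') do |result|
    title    = result.elements['title']
    url      = result.elements['url']
    abstract = result.elements['abstract']
    results << {
      title:    title ? title.text : nil,
      url:      url ? url.text : nil,
      abstract: abstract ? abstract.text : nil
    }
  end
  results
end

# Hypothetical sample data, trimmed to the fields used above.
SAMPLE_XML = <<~XML
  <ysearchresponse>
    <resultset_web>
      <result>
        <abstract>First abstract</abstract>
        <title>First title</title>
        <url>http://example.com/1</url>
      </result>
      <result>
        <title>Second title</title>
        <url>http://example.com/2</url>
      </result>
    </resultset_web>
  </ysearchresponse>
XML

parse_results(SAMPLE_XML).each do |r|
  puts "#{r[:title]} (#{r[:url]})"
end
```

The second sample result has no abstract on purpose: with per-result parsing it simply yields nil for that field.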

Improving

As you have noticed, listing 1 performs a fixed query. Let's change this by adding an input field.

Listing 2 (PHP5)

    <form action="search.php" method="get">
      <label for="search">Search:</label>
      <input type="text" name="search" id="search" />
    </form>

PHP: As the action of the form implies, Listing 1 has to be moved into a file named "search.php". The above HTML can be used on every page of your website. To make the whole thing work, replace the first line of Listing 1 with the following:

Listing 3 (PHP5)

    $query = urlencode($_GET['search']);

We have to urlencode the search string because we submit it as part of a URL.

Listing 2 (Ruby)
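The Ruby counterpart is not shown here, so what follows is only a sketch of how it might look, not the original listing. It assumes the same form field name ("search") and the web search URL from Listing 1; the helper name build_search_url is made up. The CGI part only runs when a web server has set REQUEST_METHOD, so the file can also be loaded from the command line.

```ruby
require 'uri'

BOSS_WEB_URL = 'http://boss.yahooapis.com/ysearch/web/v1/'

# Build the request URL for a user-supplied search string.
def build_search_url(query, appid)
  # encode_www_form_component turns a space into "+"; the query sits in the
  # URL path here, so the space is written as "%20" instead.
  encoded = URI.encode_www_form_component(query).gsub('+', '%20')
  "#{BOSS_WEB_URL}#{encoded}?appid=#{URI.encode_www_form_component(appid)}&format=xml"
end

# Only touch CGI when the script is actually run by a web server.
if ENV['REQUEST_METHOD']
  require 'cgi'
  cgi = CGI.new
  puts cgi.header
  # cgi['search'] is an empty string when the field was not submitted.
  url = build_search_url(cgi['search'], 'YOUR_APP_KEY')
  # From here on, fetch and print the results as in Listing 1, using url.
end
```

Keeping the URL building in its own method makes it easy to check the encoding without sending a request.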

News and Image Search

Now that we have created a small form, we can extend it to do more specific tasks. The BOSS API also includes news and image search, which let you browse breaking news as well as Yahoo's huge image database.

Listing 3 (PHP5)

Listing 3 (Ruby)
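The listings for this step are missing, so here is a hedged Ruby sketch rather than the original code. It assumes that news and image search use the same URL layout as web search, with "news" or "images" in place of "web"; please check this against the BOSS documentation. Because I do not want to guess the field names of each result type, the parser simply copies whatever child elements a result contains. The names boss_url and result_rows and the sample XML are made up for illustration.

```ruby
require 'rexml/document'
require 'uri'

# Assumption: these three services share one URL layout.
SERVICES = %w[web news images].freeze

# Build the request URL for one of the search services.
def boss_url(service, query, appid)
  raise ArgumentError, "unknown service: #{service}" unless SERVICES.include?(service)
  encoded = URI.encode_www_form_component(query).gsub('+', '%20')
  "http://boss.yahooapis.com/ysearch/#{service}/v1/#{encoded}" \
    "?appid=#{URI.encode_www_form_component(appid)}&format=xml"
end

# Copy every child element of each <result> into a hash, so the same code
# works whatever fields a given search type returns.
def result_rows(xml)
  rows = []
  REXML::Document.new(xml).elements.each('//result') do |result|
    row = {}
    result.elements.each { |child| row[child.name] = child.text }
    rows << row
  end
  rows
end

# Hypothetical sample data, not a real response.
SAMPLE_NEWS = '<ysearchresponse><resultset_news><result>' \
              '<title>Headline</title><url>http://example.com/n</url>' \
              '</result></resultset_news></ysearchresponse>'

puts boss_url('news', 'geekmonkey', 'YOUR_APP_KEY')
result_rows(SAMPLE_NEWS).each { |row| puts row['title'] }
```

A select box or a set of radio buttons in the search form could then pick the service name that is passed to boss_url.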

Getting spelling suggestions

A user might have made a typo in the search query. Building your own spelling suggester is a lot of work and would require a huge dictionary. So why not make use of the spelling suggester that is part of BOSS? :)

Listing 4 (PHP5)

Listing 4 (Ruby)
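These listings are missing as well, so the following Ruby code is only a sketch under stated assumptions: I assume the spelling service is reached like the other services (with "spelling" in the URL) and that its XML answer carries the corrected query in a suggestion element. Both the element name and the sample XML are assumptions to verify against the BOSS documentation, and the helper name spelling_suggestion is made up.

```ruby
require 'rexml/document'

# Return the suggested query, or nil when there is nothing useful to show
# (no suggestion at all, or one that equals what the user typed).
def spelling_suggestion(xml, original_query)
  node = REXML::Document.new(xml).elements['//suggestion']
  text = node ? node.text.to_s.strip : ''
  return nil if text.empty? || text.casecmp?(original_query)
  text
end

# Hypothetical sample data, not a real response.
SAMPLE_SPELL = '<ysearchresponse><resultset_spell><result>' \
               '<suggestion>geek monkey</suggestion>' \
               '</result></resultset_spell></ysearchresponse>'

suggestion = spelling_suggestion(SAMPLE_SPELL, 'geek monky')
puts "Did you mean: #{suggestion}?" if suggestion
```

Returning nil for an unchanged query keeps the "Did you mean" line from showing up when the user typed everything correctly.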

More features?

Of course there are many more features one can imagine. As already mentioned, you are free to mash up the data retrieved from Yahoo with other search results. Combining image and web search might also be a good idea. Just let your creativity run free!

Recommended books on this topic

Understanding Search Engines: Mathematical Modeling and Text Retrieval (Software, Environments, Tools), Second Edition by Michael W. Berry and Murray Browne
