In-Depth
Use Google APIs in .NET
Learn everything you need to start using .NET Web services to access Google Web APIs in your application.
Service-oriented architecture (SOA) is more than a buzzword; it is also an effective design pattern, which you can use to solve many development problems. At the heart of SOA are Web services, which are modular pieces of software used over the Internet that communicate using standardized XML. Google, which needs no introduction, has released Web APIs that allow developers to access some of Google's robust functionality using Web services. The APIs, at the time of this article, are in beta 1, but still allow you to use Simple Object Access Protocol (SOAP) and Web Services Description Language (WSDL) to query more than 8 billion Web pages.
I'll not only show you how to harness the search power of Google, but I'll also discuss some of the basics behind Web services in Visual Studio .NET. I'll talk about some of Google's advanced features you can use to create more robust Web applications, including site-restricted searching, date-restricted searching, and document filtering.
You must follow a number of steps prior to using the Google Web APIs. First, you must have an Internet connection, as well as a development language capable of interfacing with Web services. For this article, I'll be using C# and Visual Studio .NET, but Web services are flexible by design, so you could also use Java, Perl, C++, or any other language you desire. Second, you need to download the Google Web API and create an account with Google (see Additional Resources). You must have an account because you need a Google-generated license key to access Google's Web APIs.
Because the Google Web API is currently in beta 1, it comes with some limitations. For instance, each query returns at most 10 results, and you're limited to 1,000 queries per day. You should also review the licensing requirements prior to any development.
After you sign up and receive a license key from Google, open Visual Studio .NET and create a new ASP.NET Web project. Before getting into the guts of Web services and the Google API, first create a new user control that will be the main GUI for the example. Creating the GUI first lets you focus on the core logic of the application; this method is more efficient than jumping between design view and the code. After you create a new ASCX user control, add an HTML table that contains five labels (Basic Search, Site Restricted, File Filter, From Date, and To Date), five textboxes (txtBasic, txtSiteRes, txtFilter, txtFromDate, and txtToDate), and a single search button (btnSearch). Next, drag and drop a data grid called dgResults from your toolbox, then add two other labels above the grid called lblResults and lblError (see Figure 1). In my example, I added some style to the labels and the data grid, and I added a JavaScript popup calendar selector for the date fields. However, function is always more important than form, so the look and feel is really up to you.
Now that the GUI is out of the way, you can delve into the core functionality of this control. But before you begin writing code, I'll quickly discuss some of the fundamentals behind Web services and their implementation in the .NET Framework. Web services are basically reusable programs exposed to the Internet that use a standard messaging structure. Web services generally use SOAP for their messaging structure. SOAP is a standard XML-based protocol that defines the use of XML and HTTP for accessing Web services in a platform-independent fashion. Unless you have had your head buried in the sand for the last few years, you've probably heard all about Web services. What you might not know, however, is that the Web services paradigm is fairly unique because each Web service (when built correctly) is self-describing. That is to say, you can gain a working understanding of its functionality and its interfaces by examining the Web service metadata (aka, WSDL).
Implement the Google Web API
First, add a Web reference to Google's API by right-clicking on the References folder and selecting Add Web Reference. Add "http://api.google.com/GoogleSearch.wsdl" to the URL field. Now you can reference the Google Web API just like any other managed component: through the "using" statement, instantiation, and so on. Unlike your standard .NET reference, you can change the name of a Web reference simply by changing the folder name in the Properties window. A Web reference is nothing more than an auto-generated proxy class, which you can find in your project's directory under the Web References folder. Be aware that when you change the Folder Name property, you also change the way you reference the API in your code (i.e., the "using" statement).
The default reference name for the Google Web API is "com.google.api" and that's how it's set up in this example. You'll find Reference.cs, GoogleSearch.wsdl, and Reference.map in the Web References folder. The .NET IDE essentially abstracts all of the Web services interfaces and proxy code for you when you add a reference in this manner. Many of us geeks, though, like to see what's happening under the covers, so here's a look at a piece of the proxy code being generated:
[System.Web.Services.Protocols.SoapRpcMethodAttribute(
    "urn:GoogleSearchAction",
    RequestNamespace="urn:GoogleSearch",
    ResponseNamespace="urn:GoogleSearch")]
[return: System.Xml.Serialization.SoapElementAttribute("return")]
public GoogleSearchResult doGoogleSearch(string key,
    string q, int start, int maxResults, bool filter,
    string restrict, bool safeSearch, string lr,
    string ie, string oe)
{
    // Invoke sends the SOAP request and returns the deserialized response
    object[] results = this.Invoke("doGoogleSearch",
        new object[] {key, q, start, maxResults, filter,
            restrict, safeSearch, lr, ie, oe});
    return ((GoogleSearchResult)(results[0]));
}
As you can see, the proxy code is nothing more than a wrapper of the information found in the WSDL file, but it allows you to interact with a Web service as if it were a managed object on your local machine. For the most part, the proxy code generated by the .NET IDE is identical to the code generated by the WSDL.exe tool that ships with the .NET Framework (see Additional Resources for more information).
Once you have a Web reference to the Google API, you can continue by simply instantiating a GoogleSearchService object that has three main methods: doGoogleSearch, doGetCachedPage, and doSpellingSuggestion. .NET also generates asynchronous wrapper methods, a "Begin" and "End" pair (for example, BegindoGoogleSearch and EnddoGoogleSearch), for each of the method definitions found in the WSDL. It would be impossible to cover the entire Google API in this article, so I'll focus on the doGoogleSearch method and how you can use it in your applications. Let's look at how to do a basic search:
// Create a Google Search object
GoogleSearchService gss = new GoogleSearchService();
// Invoke the search method
GoogleSearchResult results =
    gss.doGoogleSearch("Your license key",
        "Your query goes here", 0, 10, false,
        string.Empty, false, string.Empty,
        string.Empty, string.Empty);
The doGoogleSearch method returns a GoogleSearchResult object, which you use as a container for the results being returned from Google. Hanging off the GoogleSearchResult object is the resultElements property, an array of ResultElement objects that you use to access the return data. Each ResultElement exposes these members: cachedSize, directoryCategory, directoryTitle, hostName, relatedInformationPresent, snippet, summary, title, and URL.
As with most applications, error-handling code doesn't write itself; you must add a try/catch block to trap for exceptions. With Web services, you must trap for a particular type of exception called System.Web.Services.Protocols.SoapException, because an error that occurs within a Web service is raised as a SOAP exception for greater interoperability. Rather than put all the code in the Button event, I've encapsulated the control's functionality in these methods: BuildQuery, which builds Google's query string; FormatResults, which makes the results pretty; ShowErrorMessage, which displays errors; and GoogleHandler, which is the main interface called from the button event (see Listing 1). When you put all the code together and do a basic search for the term "MSMQ," you can see that you get back 38,100 results, and you can display the first 10 of these results in your data grid (see Figure 2).
Advanced Queries: The Important Stuff
Now that you can do a basic search against Google, you're probably wondering how it helps you in your everyday Web development. Well, the Google API has a lot more functionality than just basic searching. You can do site-restricted searching, document-type filtering, and date-range searching, to name just a few features (see Table 1 for some specifics on Google's query features). The Google Web API also gives you the ability to narrow results to a particular country and/or a specific language. You can accomplish this easily by passing different string values for the restrict and lr parameters of the doGoogleSearch method:
// Narrow results to Germany and
// change the language to German
GoogleSearchResult results =
    gss.doGoogleSearch("Your License Key",
        "Your Query", 0, 10, false, "countryDE",
        false, "lang_de", string.Empty,
        string.Empty);
Now take a closer look at some of these advanced queries and how you can implement them in your applications. The only field in the example GUI that is truly required is txtBasic, but let's say you want to restrict searching to a particular site, for instance, www.ftponline.com. You can accomplish this by simply concatenating " site:www.ftponline.com" to the end of your basic search. The BuildQuery method does this for you by building the query string that you pass into the doGoogleSearch method (see Listing 1).
For example, say you're doing some research on MSMQ, and you want to narrow your search to www.ftponline.com. Enter "MSMQ" into txtBasic, then enter www.ftponline.com in txtSiteRes. After executing the search, you'll notice that only 71 results are returned (down from 38,100). You can extend this search functionality to any Web site, such as your blog or your company's Web site (please consult the licensing agreement prior to production use).
You format document filtering and date-range restrictions in the same manner as the site-restriction query. For example, you can search for only certain document types by concatenating " filetype:pdf" to the end of the basic search, which limits the results to PDFs. The date-range restriction, although formatted the same way, has one twist: you must input the date range as Julian dates. You calculate a Julian date by determining the number of days since January 1, 4713 BC (download the code to see how to calculate the Julian date). After generating the Julian "from" and "to" dates, you format the query string the same way by concatenating " daterange:<start_date>-<end_date>" to the end of your basic query.
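If you don't have the download handy, here is a minimal sketch of one way to do the Julian conversion. It is not the code from the article's download; the JulianDate class and its FromDate and DateRangeOperator methods are hypothetical names chosen for illustration. The sketch uses the standard integer formula for converting a Gregorian calendar date to a Julian day number, and assumes whole-day precision is all the daterange operator needs.

```csharp
using System;

// Hypothetical helper (not part of the Google API or the article's download).
public static class JulianDate
{
    // Returns the Julian day number for a Gregorian calendar date,
    // using the standard integer-arithmetic conversion.
    public static int FromDate(DateTime date)
    {
        int a = (14 - date.Month) / 12;      // 1 for January and February, else 0
        int y = date.Year + 4800 - a;        // years counted from March, -4800
        int m = date.Month + 12 * a - 3;     // March = 0 ... February = 11
        return date.Day + (153 * m + 2) / 5 + 365 * y
            + y / 4 - y / 100 + y / 400 - 32045;
    }

    // Builds the " daterange:<start_date>-<end_date>" suffix for a query.
    public static string DateRangeOperator(DateTime fromDate, DateTime toDate)
    {
        return " daterange:" + FromDate(fromDate) + "-" + FromDate(toDate);
    }
}
```

For example, JulianDate.FromDate(new DateTime(2000, 1, 1)) returns 2451545, so a search restricted to January 2005 would append " daterange:2453372-2453402" to the basic query.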
You can further extend the functionality of the Google API by combining any of the specialty queries. For example, you can restrict searching to a Web site, filter for a specific document type, and limit the date range, all in the same query string. This article has only scratched the surface of the Google Web API. There are many other queries you can use in your applications, including spell checking, cache searching, and more.
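To make the combination concrete, here is a rough sketch of a query builder that appends each optional restriction only when a value is supplied. It is an illustration, not the BuildQuery method from Listing 1: the QueryBuilder class name, the parameter names, and the choice to pass the date range as precomputed Julian day numbers are all assumptions.

```csharp
using System;
using System.Text;

// Hypothetical sketch; the real BuildQuery in Listing 1 may differ.
public static class QueryBuilder
{
    // Combines the basic search with optional site, file-type, and
    // date-range restrictions into a single Google query string.
    public static string BuildQuery(string basic, string site,
        string fileType, int? fromJulian, int? toJulian)
    {
        if (string.IsNullOrWhiteSpace(basic))
            throw new ArgumentException("A basic search term is required.", "basic");

        StringBuilder query = new StringBuilder(basic.Trim());

        // Each operator is appended only when its field was filled in.
        if (!string.IsNullOrWhiteSpace(site))
            query.Append(" site:").Append(site.Trim());
        if (!string.IsNullOrWhiteSpace(fileType))
            query.Append(" filetype:").Append(fileType.Trim());
        // The date range is applied only when both ends are present.
        if (fromJulian.HasValue && toJulian.HasValue)
            query.Append(" daterange:").Append(fromJulian.Value)
                 .Append('-').Append(toJulian.Value);

        return query.ToString();
    }
}
```

Calling QueryBuilder.BuildQuery("MSMQ", "www.ftponline.com", "pdf", 2453372, 2453402) yields "MSMQ site:www.ftponline.com filetype:pdf daterange:2453372-2453402", which you would pass as the q argument to doGoogleSearch.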
About the Author
Vijay Mehta works in enterprise architecture for a Fortune 500 company, where he uses VS.NET to design, develop, and architect enterprise solutions. Reach him at vijay@mehtasolutions.com.