Problem/Scenario
It seems that you can never win when it comes to buying something. No matter how
much you pay for an item, someone always tells you that they got the same thing for
less. The media is filled with stories of how you can use the Internet to find better
deals on everything from books and cars to real estate and groceries. However, you
quickly realize that finding a better deal can take a long time, and time is a cost in and of
itself. This created a market for Web sites that let you compare the prices of many
separate vendors. The more vendors such a site supports, the more choices you have.
The problem, however, is that collecting information from so many disparate sources is
hard. Assuming that a particular vendor has an online site, one way of interfacing with the
vendor’s information is to request a page from its Web site, receive the generated HTML,
carefully parse it, and extract the information you want (item name, price, and so on).
You would then repeat this process for every vendor you want to include. This process is
fairly tedious, and you have to become familiar with each vendor’s system before you can
parse the HTML that it outputs.
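To make the tedium concrete, here is a minimal Python sketch of the parse-and-extract step for a single vendor. The page markup, the class names "item" and "price", and the names PriceScraper and scrape are all assumptions invented for illustration; a real vendor's page would look different and would need its own parsing rules. The page is an inline string here so that the sketch runs without a network connection.

```python
from html.parser import HTMLParser

# Hypothetical markup for one vendor; a real vendor's page would differ.
SAMPLE_PAGE = """
<html><body>
<table>
  <tr><td class="item">XML Handbook</td><td class="price">$39.99</td></tr>
  <tr><td class="item">Java Primer</td><td class="price">$24.50</td></tr>
</table>
</body></html>
"""


class PriceScraper(HTMLParser):
    """Collects (item, price) pairs from cells marked class="item" or "price"."""

    def __init__(self):
        super().__init__()
        self.rows = []        # finished (item, price) tuples
        self._field = None    # which cell we are inside, if any
        self._current = {}    # text gathered for the row in progress

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            cls = dict(attrs).get("class")
            if cls in ("item", "price"):
                self._field = cls

    def handle_data(self, data):
        # Text may arrive in pieces, so append rather than overwrite.
        if self._field:
            self._current[self._field] = self._current.get(self._field, "") + data

    def handle_endtag(self, tag):
        if tag == "td":
            self._field = None
        elif tag == "tr":
            if "item" in self._current and "price" in self._current:
                name = self._current["item"].strip()
                price = float(self._current["price"].strip().lstrip("$"))
                self.rows.append((name, price))
            self._current = {}


def scrape(page):
    """Return the (item, price) pairs found in one vendor's HTML page."""
    parser = PriceScraper()
    parser.feed(page)
    parser.close()
    return parser.rows
```

Every rule in this scraper is tied to one vendor's layout. If that vendor renames a class or restructures its table, the scraper silently finds nothing, and each additional vendor needs a scraper of its own.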
Assuming that you’ve somehow collected all this information, you then need to
consolidate it and display it to your customer, who wants to see the comparative prices
before making a purchase. Again, assuming you’ve stored your results in a format that
you can use (encoded with your own logic and set of tricks, such as separating data
values with whitespace or special characters), you then need to generate an HTML page
at your server and serve it back to your client. This entire process, though doable, is both
irritating and messy. Figure 2.13 illustrates the non-XML approach.
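The consolidation and page-generation steps might look something like the following Python sketch. The storage format (one record per line, fields separated by "|"), the vendor names, and the functions load_records and render_comparison are hypothetical, chosen only to illustrate the kind of home-grown encoding described above.

```python
from html import escape

# Hypothetical home-grown storage format: vendor|item|price, one record per line.
STORED = """\
BooksRUs|XML Handbook|39.99
PageTurner|XML Handbook|35.00
BooksRUs|Java Primer|24.50
"""


def load_records(text):
    """Decode the ad hoc format into (vendor, item, price) tuples."""
    records = []
    for line in text.splitlines():
        if not line.strip():
            continue
        vendor, item, price = line.split("|")
        records.append((vendor, item, float(price)))
    return records


def render_comparison(records):
    """Build an HTML table, grouped by item, with the cheapest vendor first."""
    rows = sorted(records, key=lambda r: (r[1], r[2]))
    body = "\n".join(
        f"<tr><td>{escape(item)}</td><td>{escape(vendor)}</td>"
        f"<td>{price:.2f}</td></tr>"
        for vendor, item, price in rows
    )
    header = "<tr><th>Item</th><th>Vendor</th><th>Price</th></tr>"
    return f"<table>\n{header}\n{body}\n</table>"
```

Calling render_comparison(load_records(STORED)) yields the table markup to serve to the client. The fragility is easy to see: an item name that happens to contain "|" breaks load_records, and only the code that wrote the file knows what each field means.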