
Developing Applications Over Multiple Web Service Layers

One of the more interesting challenges in developing Earlymiser has turned out to be how to normalize performance across multiple web service layers. When querying multiple web service providers, we have found widely varying degrees of performance and reliability.


Earlymiser.com is a classic web mashup, but it is built on far more web services than most. A typical mashup is built on a single service, with additional functionality layered on top; if a developer is feeling particularly ambitious, he might add data from an external source to create more value. Earlymiser.com is a bit more complex. We hit four separate web services (Shopping.com, Amazon, Ebay, and Yahoo Shopping), and each one has different response times and different query limits.
The problem becomes clear if you look at it this way: if each web service takes 200 milliseconds to respond, the slowdowns add up quickly. We don't hit the services serially, but multiply that response time by several thousand queries per second and the scale of the problem is obvious. You also need to account for the different performance levels of each individual web service. Not all web services are created equal; some providers have more experience and are more reliable than others.
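To make "don't hit the services serially" concrete, here is a minimal sketch in Python (not Earlymiser's actual code; the provider names and the query_provider stub are placeholders). The four lookups are fanned out in parallel and the page takes whatever has arrived by a deadline, so total latency tracks the slowest single provider instead of the sum of all four.

    from concurrent.futures import ThreadPoolExecutor, TimeoutError, as_completed

    PROVIDERS = ["shopping.com", "amazon", "ebay", "yahoo-shopping"]

    def query_provider(name, keyword):
        # Placeholder: a real version would call the provider's HTTP API here.
        return []

    def search_all(keyword, timeout_seconds=1.0):
        results = {p: [] for p in PROVIDERS}          # default to "no results" per provider
        pool = ThreadPoolExecutor(max_workers=len(PROVIDERS))
        futures = {pool.submit(query_provider, p, keyword): p for p in PROVIDERS}
        try:
            for future in as_completed(futures, timeout=timeout_seconds):
                provider = futures[future]
                try:
                    results[provider] = future.result()
                except Exception:
                    pass                              # a failing provider contributes nothing
        except TimeoutError:
            pass                                      # providers that miss the deadline stay empty
        finally:
            pool.shutdown(wait=False)                 # don't block the request on stragglers
        return results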
How can you provide a unified user experience when your data sources can be unreliable and subject to error? There are a lot of ways to mitigate this issue. Here are a few we tried at Early Miser.
Cache on the presentation side.
Most of our content is updated roughly every four hours. When a specific product is presented to a user, that page is cached for four hours; we chose a four-hour limit since only two providers update results that frequently. Caching the page presentation has significantly improved the application's performance. We use JPCache, and it has allowed us to handle between 50,000 and 70,000 queries a day on a standard shared hosting account.
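We do the presentation caching with JPCache, but the idea translates to any stack. Purely as an illustration, here is a rough Python sketch of a four-hour page cache keyed by product ID (the cache directory, TTL constant, and render callback are hypothetical):

    import os
    import time

    CACHE_DIR = "/tmp/page_cache"        # illustrative location
    CACHE_TTL = 4 * 60 * 60              # four hours, matching the fastest provider refresh

    def cached_page(product_id, render):
        """Return the rendered page for product_id, rebuilding it at most every CACHE_TTL seconds."""
        os.makedirs(CACHE_DIR, exist_ok=True)
        path = os.path.join(CACHE_DIR, "%s.html" % product_id)
        if os.path.exists(path) and time.time() - os.path.getmtime(path) < CACHE_TTL:
            with open(path) as f:
                return f.read()          # cache hit: no web service calls at all
        html = render(product_id)        # cache miss: only now do we pay for the upstream queries
        with open(path, "w") as f:
            f.write(html)
        return html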
Cache seldom-updated data longer.
Amazon has explicitly recognized that some data just doesn't change, so they allow you to cache certain data for longer periods. In the case of Early Miser we also pull data that simply isn't time-sensitive. The director of The Godfather isn't going to change, so you shouldn't need to hit the web service to present that data. Spend some time segmenting the data, cache (or store locally in a MySQL database) the content that doesn't change very often, and adjust how long you cache each type of data accordingly.
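One way to make that segmentation explicit is a per-class lifetime table. The sketch below is illustrative only; the class names and TTL values are made up for the example, not our production settings.

    import time

    # Per-class lifetimes; the "static" class covers facts like a film's director.
    TTL_SECONDS = {
        "price": 4 * 60 * 60,            # volatile: refresh every page-cache cycle
        "availability": 4 * 60 * 60,
        "description": 7 * 24 * 60 * 60, # rarely changes: keep for a week
        "static": None,                  # effectively never changes: store locally (e.g. MySQL) once
    }

    def is_fresh(kind, fetched_at):
        """True if locally stored data of this class is still usable without a web service call."""
        ttl = TTL_SECONDS[kind]
        return ttl is None or (time.time() - fetched_at) < ttl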
Monitor your queries to each one of your vendors.
Each one of our web service providers allots us a different number of queries per day. You need to track your total queries to each web service where limits apply, and you also need to see which queries fail with each provider. When we first started with Early Miser we self-certified with Ebay, which got us 10,000 queries a day. I am in the process of completing the Affiliate certification, which will raise our query limit to 1.5 million queries per day. It's routine that by 7:00 AM Mountain time (6:00 AM Pacific) I have already hit my query limit with Ebay. By monitoring the application's usage of a vendor's web service, I can gather some insight into how to improve the queries and the conversion.
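The bookkeeping itself doesn't have to be elaborate. Here is a sketch of the kind of per-vendor counting involved; the 10,000-per-day figure matches the Ebay limit mentioned above, but the other limits and the class itself are illustrative, not our production code.

    import datetime
    from collections import defaultdict

    # Illustrative caps; None means no hard daily limit applies (or we haven't hit one yet).
    DAILY_LIMITS = {"ebay": 10000, "shopping.com": None, "amazon": None, "yahoo-shopping": None}

    class QueryMonitor:
        """Tracks successes and failures per vendor per day so limits are visible before they bite."""

        def __init__(self):
            self.counts = defaultdict(lambda: {"ok": 0, "failed": 0})

        def record(self, vendor, succeeded):
            key = (vendor, datetime.date.today())
            self.counts[key]["ok" if succeeded else "failed"] += 1

        def remaining(self, vendor):
            limit = DAILY_LIMITS.get(vendor)
            if limit is None:
                return None                      # no known cap for this vendor
            used = sum(self.counts[(vendor, datetime.date.today())].values())
            return max(limit - used, 0)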
Don’t use SOAP!
When hitting multiple web service vendors, it may seem easier to develop the application against SOAP if they offer it. This is a mistake. SOAP is the slowest of the XML services, and the overhead it brings just slows the application down.
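For contrast, here is roughly what a plain REST-style request looks like in Python (the endpoint and parameters are placeholders, not any real vendor's API): a single GET with query parameters and a small XML response, with no SOAP envelope to build or parse.

    from urllib.parse import urlencode
    from urllib.request import urlopen
    from xml.etree import ElementTree

    def rest_search(keyword):
        # Placeholder endpoint and parameters, not a real vendor's API: the point is that a
        # plain GET with a small XML payload skips the SOAP envelope and its parsing overhead.
        url = "https://api.example.com/search?" + urlencode({"keywords": keyword, "format": "xml"})
        with urlopen(url, timeout=2) as response:
            tree = ElementTree.parse(response)
        return [item.findtext("title") for item in tree.iter("item")]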
These are just a few of the things you can do to improve performance on a multi-web-service mashup. I would love to hear some additional ideas.
