I've been quite enjoying writing some code for Google App Engine this week. Just a small project for querying a variety of book retailer websites for information.
I'm really impressed with it so far. Being able to simply import things like memcache and use them is excellent. It works against a local cache during development, and then when I deploy my app it seamlessly uses the real memcache service on the server side.
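Just to show how little ceremony is involved, here's roughly the entire API surface I needed (the key and value are made up for illustration):

from google.appengine.api import memcache

memcache.set("greeting", "hello world", time=60)  # expires after a minute
value = memcache.get("greeting")  # returns None on a miss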
I'm also liking that the urlfetch infrastructure built into App Engine supports asynchronous downloading of web pages, so I can fire off 5+ API queries simultaneously and then collect the results, instead of doing them in sequence.
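Stripped of everything else, the basic async pattern looks roughly like this (the URLs are placeholders):

from google.appengine.api import urlfetch

urls = ["http://example.com/api/1", "http://example.com/api/2"]

# Start all the fetches; make_fetch_call returns immediately.
rpcs = []
for url in urls:
    rpc = urlfetch.create_rpc()
    urlfetch.make_fetch_call(rpc, url)
    rpcs.append(rpc)

# Collect the results; each get_result blocks only until its
# own fetch has completed.
for rpc in rpcs:
    result = rpc.get_result()
    # use result.status_code, result.content, etc.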
Here's a random bit of code I threw together that caches the results of API queries in memcache, so that a user doing reloads won't cause more API requests than are strictly necessary:
import json
import logging

from google.appengine.api import urlfetch
from google.appengine.api import memcache

CACHE_TIMEOUT = 3600 # 1 hour

class APIError(Exception):
    pass

def cache_result(rpc, url):
    # Runs as the RPC's completion callback; caches successful responses.
    result = rpc.get_result()
    if result.status_code == 200:
        data = result.content
        memcache.add(url, data, CACHE_TIMEOUT)

def apiquery(url):
    data = memcache.get(url)
    if data is not None:
        # Cache hit: no fetch needed, but return a callable anyway so
        # hits and misses look the same to the caller.
        decoded_data = json.loads(data)
        return lambda: decoded_data
    logging.debug("Cache miss for %r", url)
    rpc = urlfetch.create_rpc()
    rpc.callback = lambda: cache_result(rpc, url)
    urlfetch.make_fetch_call(rpc, url, method=urlfetch.GET)
    def answer():
        # Blocks until the fetch finishes (which also fires the callback).
        result = rpc.get_result()
        if result.status_code == 200:
            data = result.content
            return json.loads(data)
        raise APIError("Broken!")
    return answer
With the above code I can now do something like:
queries = [apiquery(url) for url in generate_api_queries()]

for query in queries:
    result = query()
    ... # do stuff with result
The nice property is that every fetch is already in flight by the time the first query() is called, so the total wait is roughly that of the slowest request rather than the sum of all of them. I don't really like the variable names I've used, but it links together enough interesting bits and pieces that I thought I'd throw it on my blog.