[three]Bean
Python + WSGI + protovis @ BarcampRoc Fall 2010
Oct 24, 2010 | categories: python, toscawidgets, turbogears
I gave a presentation at BarcampRoc yesterday on some of the new widgets I've been working on. I started with a two-minute PowerPoint and then jumped right into a live hack session. The audience chimed in to tweak pieces of code, make the visualizations more interesting, and catch my clumsy syntax errors. Few people in the room had Python experience and none had experience with WSGI or protovis, but it felt like everyone walked out with a good sense of how to carry out the same work again.
Here's the rundown.
Setting up Turbogears
Before we do anything, you'll need Turbogears 2.1 and virtualenv set up on your system. If you're running an rpm-based distro, the following should cut it:
[rjbpop@grossman ~ []]# sudo yum install python-tg-devtools python-virtualenv
Start with a fresh Turbogears 2.1 quickstarted app. TG provides a wonderland of middleware to make writing robust web-apps easier and so will ask you some questions when quickstarting. We won't need any repoze authentication stuff and we'll use genshi templates.
[rjbpop@grossman ~ []]# mkdir devel
[rjbpop@grossman ~ []]# cd devel
[rjbpop@grossman devel []]# paster quickstart
Enter project name: barcamp.roc.fall.2010
Enter package name [barcamprocfall2010]: melissa_is_a_babe
Would you prefer mako templates? (yes/[no]): no
Do you need authentication and authorization in this project? ([yes]/no): no
paster should barf some output while setting up the app. We have some dependencies we need to install and since we don't want to risk messing up our system python environment, we'll use the handy virtualenv tool.
[rjbpop@grossman devel []]# cd barcamp.roc.fall.2010
[rjbpop@grossman barcamp.roc.fall.2010 []]# virtualenv venv-test
New python executable in venv-test/bin/python
Installing setuptools............done.
[rjbpop@grossman barcamp.roc.fall.2010 []]# source venv-test/bin/activate
(venv-test)[rjbpop@grossman barcamp.roc.fall.2010 []]# which paster
/usr/bin/paster
(venv-test)[rjbpop@grossman barcamp.roc.fall.2010 []]# which pip
~/devel/barcamp.roc.fall.2010/venv-test/bin/pip
(venv-test)[rjbpop@grossman barcamp.roc.fall.2010 []]# pip install --upgrade PasteScript tw2.protovis.custom
pip should barf some output here too. To verify that an instance of paster local to our virtual environment was installed, check:
(venv-test)[rjbpop@grossman barcamp.roc.fall.2010 []]# which paster
~/devel/barcamp.roc.fall.2010/venv-test/bin/paster
Good stuff. Now, we're going to be using toscawidgets2 (package: tw2) not toscawidgets1 which turbogears2.1 has enabled by default. Let's change that.
Add the last two lines shown here (the use_toscawidgets and use_toscawidgets2 settings) to the bottom of melissa_is_a_babe/config/app_cfg.py so that it looks like:
#Configure the base SQLALchemy Setup
base_config.use_sqlalchemy = True
base_config.model = melissa_is_a_babe.model
base_config.DBSession = melissa_is_a_babe.model.DBSession
base_config.use_toscawidgets = False
base_config.use_toscawidgets2 = True
The WSGI stack of our Turbogears2 app is now using the toscawidgets2 middleware layer. Great! There's one more thing to do before we can launch and check out our app: Turbogears2.1 has a single spurious reference to toscawidgets1. Let's remove it. Edit melissa_is_a_babe/lib/base.py and remove the tw.api import (from tw.api import WidgetBunch) from the following lines.
from tg import TGController, tmpl_context
from tg.render import render
from pylons.i18n import _, ungettext, N_
from tw.api import WidgetBunch
import melissa_is_a_babe.model as model

__all__ = ['BaseController']
Word. We should be set to spin up the jumpdrives.
(venv-test)[rjbpop@grossman barcamp.roc.fall.2010 []]# paster serve --reload development.ini
Starting subprocess with file monitor
Starting server in PID 2250.
serving on http://127.0.0.1:8080
God willing, if you point your browser at http://localhost:8080/ you should see the default quickstarted Turbogears index page.
Fiddling with Turbogears
Time to shave away stuff we don't need. Open up melissa_is_a_babe/templates/index.html (which is a genshi template) and scoop out its guts so it looks like:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
                      "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:py="http://genshi.edgewall.org/"
      xmlns:xi="http://www.w3.org/2001/XInclude">

  <xi:include href="master.html" />

<head>
  <meta content="text/html; charset=UTF-8" http-equiv="content-type" py:replace="''"/>
  <title>Welcome to TurboGears 2.0, standing on the shoulders of giants, since 2007</title>
</head>

<body>
</body>
</html>
Since we told paster to launch our app with the --reload option, it should automatically notice we changed something and be ready for us. Return to your browser, reload the page and behold:
Not very exciting now, is it?
Our first tw2.protovis chart
Open up melissa_is_a_babe/controllers/root.py. There you'll find a class called RootController with a number of methods. When you visit a particular location, say, http://localhost:8080/foobar, your TG2.1 BaseController will redirect the flow of control to your RootController's foobar method (if it has one).
We don't need any of the methods other than index so you can go ahead and remove them.
Inside the index method, we're going to configure an instance of tw2.protovis.custom.BubbleChart and return it to the WSGI context.
At the top of root.py add the appropriate import statements to bring in the random module and the BubbleChart class from the tw2.protovis.custom module. As for the index method, we'll need to do three things: 1) initialize some bogus data to visualize, 2) fill in the parameters for BubbleChart, and 3) return our configured widget from the controller method.
As for our bogus data, the p_data parameter requires a list of python dictionaries (we'll create 40 of them), each with four keys: a 'name' that provides the alt-text for the bubbles, a 'value' that provides the size of each, a 'text' that provides the label, and a 'group' which is the basis on which to colorize them.
Mangle melissa_is_a_babe/controllers/root.py to look like this:
# -*- coding: utf-8 -*-
"""Main Controller"""

from tg import expose, flash, require, url, request, redirect
from pylons.i18n import ugettext as _, lazy_ugettext as l_

from melissa_is_a_babe.lib.base import BaseController
from melissa_is_a_babe.model import DBSession, metadata
from melissa_is_a_babe.controllers.error import ErrorController

__all__ = ['RootController']

from tw2.protovis.custom import BubbleChart
import random

class RootController(BaseController):
    """
    The root controller for the barcamp.roc.fall.2010 application.

    All the other controllers and WSGI applications should be mounted on this
    controller. For example::

        panel = ControlPanelController()
        another_app = AnotherWSGIApplication()

    Keep in mind that WSGI applications shouldn't be mounted directly: They
    must be wrapped around with :class:`tg.controllers.WSGIAppController`.

    """

    error = ErrorController()

    @expose('melissa_is_a_babe.templates.index')
    def index(self):
        """Handle the front-page."""
        data = [
            {
                'name' : random.random(),
                'value' : random.random(),
                'text' : random.random(),
                'group' : random.random(),
            } for i in range(40)
        ]
        chart = BubbleChart(
            id='a-chart-for-my-friends',
            p_width=750,
            p_height=750,
            p_data=data
        )
        return dict(page='index', widget=chart)
Now that our bubble chart of random data is available to our templates, we'll need to make mention of it in melissa_is_a_babe/templates/index.html. Just add the following single line inside the body tag.
<body>
  ${widget.display()}
</body>
Again, paster should have noticed we changed something and restarted itself. Revisit http://localhost:8080 and you should see something like:
Querying Google's Ajax Search API
Cool? Cool. Well, sort of. Random numbers are unfortunately not as interesting as we might like them to be so let's do something a little less entropic. Google has a sweet ajax search API we can use. Add the following to the top of melissa_is_a_babe/controllers/root.py:
import itertools
import urllib
import simplejson
import math

base_url = "http://ajax.googleapis.com/ajax/services/search/web?v=1.0"

def make_entry(combo):
    phrase = '"%s"' % " ".join(combo)
    print "Querying for %s." % phrase
    query = urllib.urlencode({ 'q': phrase })
    url = "%s&%s" % (base_url, query)
    results = urllib.urlopen(url)
    json = simplejson.loads(results.read())
    if 'estimatedResultCount' in json['responseData']['cursor']:
        count = int(json['responseData']['cursor']['estimatedResultCount'])
    else:
        count = len(json['responseData']['results'])
    value = math.log(count+1.000000001)
    return {
        'name' : "%s : %i " % (phrase, count),
        'value' : value,
        'text' : phrase[:10],
        'group' : len(combo),
    }
The above function will, given a list of words (combo), perform a google search on that exact phrase and return a python dictionary ready for BubbleChart. The dictionary's 'value' is the logarithm of the search's estimatedResultCount (or, when google doesn't report one, of the number of results actually returned). We use the urllib module to prepare and make our query and the simplejson module to convert the json (javascript object notation) that google hands us into a python dictionary.
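If you want to poke at the shape of that dictionary without hitting the network, here's a minimal sketch that pulls the dict-building step out on its own. Note that build_entry is a hypothetical helper of mine for illustration, not something from the talk's code: it takes an already-known hit count instead of querying for it. The tiny offset added before taking the logarithm is copied from make_entry and keeps the bubble size strictly positive even for zero hits.

```python
import math


def build_entry(combo, count):
    """Build a BubbleChart-style dict from a word combo and a hit count.

    Hypothetical helper for illustration: it mirrors the tail end of
    make_entry but takes the count as an argument instead of querying.
    """
    phrase = '"%s"' % " ".join(combo)
    # Log-scale the count so huge result counts don't dwarf small ones;
    # the offset keeps the value above zero when count == 0.
    value = math.log(count + 1.000000001)
    return {
        'name': "%s : %i " % (phrase, count),
        'value': value,
        'text': phrase[:10],
        'group': len(combo),
    }


entry = build_entry(("quick", "brown"), 1000)
```

The 'group' is just the number of words in the phrase, so one-word, two-word and three-word phrases each get their own color.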
We'll need to make use of the make_entry function in our controller.
Keyword arguments passed to our app via the URL's query string arrive as keyword arguments to controller methods. Let's add one to the index method called sentence.
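To make the query-string-to-keyword-argument idea concrete, here's a rough sketch using nothing but the standard library. It is not how Turbogears actually dispatches (the real machinery is more involved); call_with_query and the stand-in index function are names I made up for illustration. It's written against Python 3's urllib.parse; on Python 2 the same functions live in the urlparse module.

```python
from urllib.parse import parse_qsl, urlsplit


def index(sentence="word to your moms"):
    """Stand-in for the controller method; just splits its argument."""
    return sentence.split()


def call_with_query(func, url):
    """Decode a URL's query string and pass it along as keyword arguments.

    Illustration only: the framework does this for you.
    """
    query = urlsplit(url).query
    kwargs = dict(parse_qsl(query))  # '+' and %XX escapes are decoded here
    return func(**kwargs)


words = call_with_query(
    index, "http://localhost:8080/index?sentence=the+quick+brown+fox")
default_words = call_with_query(index, "http://localhost:8080/index")
```

When the URL carries no query string, the keyword argument is simply absent and the method's default value kicks in.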
Furthermore, we'll take that sentence, break it up into its constituent words and, making use of the itertools module, prepare a list of every unique combination of words in that sentence.
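Here's a small standalone sketch of that combination step so you can see what comes out of it. The function name word_combinations is my own; the controller code does the same thing inline.

```python
import itertools


def word_combinations(sentence):
    """Return every combination of 1..n words, preserving word order."""
    words = sentence.split()
    combos = []
    for size in range(1, len(words) + 1):
        combos.extend(itertools.combinations(words, size))
    return combos


combos = word_combinations("word to your moms")
```

A sentence of n words produces 2**n - 1 combinations, so the four-word default gives 15 phrases while a nine-word sentence already means 511 queries. Also note that itertools.combinations works on positions, so a sentence that repeats a word will produce some duplicate phrases.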
With our make_entry function and such a list of word combinations, it's trivial to map our list to a list of BubbleChart-ready dict objects.
Modify melissa_is_a_babe/controllers/root.py as follows:
    @expose('melissa_is_a_babe.templates.index')
    def index(self, sentence="word to your moms"):
        """Handle the front-page."""
        words = str(sentence).split()
        combos = []
        for i in range(len(words)):
            combos += list(itertools.combinations(words, i+1))
        data = map(make_entry, combos)
        chart = BubbleChart(
            id='a-chart-for-my-friends',
            p_width=750,
            p_height=750,
            p_data=data
        )
        return dict(page='index', widget=chart)
Reload the page http://localhost:8080/ and you should get something like:
Since we made sentence a keyword argument to our index method, we should also be able to visit http://localhost:8080/index?sentence=the+quick+brown+fox+jumps+over+the+lazy+dog and see a pretty dope chart.
Speeding things up with multiprocessing
Wow.. that was cool but it took forever to make all those queries. What exactly is it that's taking so much time? It seems to be all the handshaking and waiting for a response before making the next request. We can likely speed this up greatly by making use of python's multiprocessing module. Just below your definition of the make_entry function, add the following lines:
    return {
        'name' : "%s : %i " % (phrase, count),
        'value' : value,
        'text' : phrase[:10],
        'group' : len(combo),
    }

from multiprocessing import Pool
pool = Pool(processes=150)

class RootController(BaseController):
The Pool class has a map method that performs the same operation as the builtin map, but distributes the workload among all available processes. Inside the index method, make use of the process pool when mapping make_entry onto the list of word combinations. Change the line that builds data so that it calls pool.map:
        for i in range(len(words)):
            combos += list(itertools.combinations(words, i+1))
        data = pool.map(make_entry, combos)
        chart = BubbleChart(
            id='a-chart-for-my-friends',
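If you'd like to convince yourself that pool.map really is a drop-in for map, here's a minimal sketch you can run outside of Turbogears. Two things in it are my own substitutions, not what the talk used: slow_lookup is a made-up stand-in for make_entry that just sleeps, and the pool comes from multiprocessing.dummy, the thread-backed twin of multiprocessing.Pool with the same map API, so the sketch runs anywhere without pickling or platform worries.

```python
import time
from multiprocessing.dummy import Pool  # thread-backed, same map() API


def slow_lookup(n):
    """Stand-in for make_entry: pretend to wait on a remote server."""
    time.sleep(0.05)
    return n * n


inputs = list(range(8))

pool = Pool(processes=8)
try:
    # Like the builtin map, results come back in input order.
    parallel = pool.map(slow_lookup, inputs)
finally:
    pool.close()
    pool.join()

serial = list(map(slow_lookup, inputs))
```

Both lists come out identical and in the same order; the only difference is that the pooled version overlaps the waiting. Since the work here is mostly sitting around waiting on the network, threads would probably do the job as well as processes, though I haven't measured that for this app.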
Repoint your browser at http://localhost:8080/index?sentence=the+quick+brown+fox+jumps+over+the+lazy+dog and it should fly!
Sorting our data gives different results
One last thing: BubbleChart places the bubbles in the order they are given and tries to pack each subsequent entry as tightly as it can. We can achieve differently flavored charts by reordering our list of dicts. In melissa_is_a_babe/controllers/root.py try adding the data.sort line right after the call to pool.map:
        for i in range(len(words)):
            combos += list(itertools.combinations(words, i+1))
        data = pool.map(make_entry, combos)
        data.sort(lambda x, y: cmp(y['value'], x['value']))
        chart = BubbleChart(
            id='a-chart-for-my-friends',
Which gives us the following:
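A side note on the sort call itself: passing a cmp-style comparison function to sort only works on Python 2. Here's a small sketch of the same largest-first ordering written with key and reverse, which works on Python 2 and 3 alike. The three sample dicts are made-up data, trimmed down to the keys that matter for the sort.

```python
data = [
    {'name': 'a', 'value': 0.5},
    {'name': 'b', 'value': 2.0},
    {'name': 'c', 'value': 1.25},
]

# Largest value first, same ordering as the cmp-based sort.
data.sort(key=lambda entry: entry['value'], reverse=True)
names = [entry['name'] for entry in data]
```

Swapping the key for entry['group'] or entry['text'] is an easy way to try out other orderings.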
Resources
You can find the full source code for this tutorial on my github account
Live demonstrations of all my widgets are available at http://tw2-demos.threebean.org/
Cheers!