[three]Bean

Python + WSGI + protovis @ BarcampRoc Fall 2010

Oct 24, 2010 | categories: python, toscawidgets, turbogears

I gave a presentation at BarcampRoc yesterday on some of the new widgets I've been working on. I started with a two-minute PowerPoint and then jumped right into a live hack session. The audience chimed in to tweak pieces of code, make the visualizations more interesting, and catch my clumsy syntax errors. Few people in the room had Python experience and none had experience with WSGI or protovis, but it felt like everyone walked out with a good sense of how to carry out the same work again.

Here's the rundown.

Setting up Turbogears

Before we do anything, you'll need Turbogears 2.1 and virtualenv set up on your system. If you're running an rpm-based distro, the following should cut it:

[rjbpop@grossman ~ []]# sudo yum install python-tg-devtools python-virtualenv

Start with a fresh Turbogears 2.1 quickstarted app. TG provides a wonderland of middleware to make writing robust web-apps easier and so will ask you some questions when quickstarting. We won't need any repoze authentication stuff and we'll use genshi templates.

[rjbpop@grossman ~ []]# mkdir devel

[rjbpop@grossman ~ []]# cd devel

[rjbpop@grossman devel []]# paster quickstart
Enter project name: barcamp.roc.fall.2010
Enter package name [barcamprocfall2010]: melissa_is_a_babe
Would you prefer mako templates? (yes/[no]): no
Do you need authentication and authorization in this project? ([yes]/no): no

paster should barf some output while setting up the app. We have some dependencies we need to install and since we don't want to risk messing up our system python environment, we'll use the handy virtualenv tool.

[rjbpop@grossman devel []]# cd barcamp.roc.fall.2010

[rjbpop@grossman barcamp.roc.fall.2010 []]# virtualenv venv-test
New python executable in venv-test/bin/python
Installing setuptools............done.

[rjbpop@grossman barcamp.roc.fall.2010 []]# source venv-test/bin/activate

(venv-test)[rjbpop@grossman barcamp.roc.fall.2010 []]# which paster
/usr/bin/paster

(venv-test)[rjbpop@grossman barcamp.roc.fall.2010 []]# which pip
~/devel/barcamp.roc.fall.2010/venv-test/bin/pip

(venv-test)[rjbpop@grossman barcamp.roc.fall.2010 []]# pip install --upgrade PasteScript tw2.protovis.custom

pip should barf some output here too. To verify that an instance of paster local to our virtual environment was installed, check:

(venv-test)[rjbpop@grossman barcamp.roc.fall.2010 []]# which paster
~/devel/barcamp.roc.fall.2010/venv-test/bin/paster

Good stuff. Now, we're going to be using toscawidgets2 (package: tw2) not toscawidgets1 which turbogears2.1 has enabled by default. Let's change that.

Add the last two lines shown below (the use_toscawidgets and use_toscawidgets2 settings) to the bottom of melissa_is_a_babe/config/app_cfg.py so that it looks like:

#Configure the base SQLALchemy Setup
base_config.use_sqlalchemy = True
base_config.model = melissa_is_a_babe.model
base_config.DBSession = melissa_is_a_babe.model.DBSession

base_config.use_toscawidgets = False
base_config.use_toscawidgets2 = True

The WSGI stack of our Turbogears2 app is now using the toscawidgets2 middleware layer. Great! There's one more thing to do before we can launch and check out our app: Turbogears2.1 has a single spurious reference to toscawidgets1. Let's remove it. Edit melissa_is_a_babe/lib/base.py and remove the line that imports WidgetBunch from tw.api (the fourth line in the excerpt below).

from tg import TGController, tmpl_context
from tg.render import render
from pylons.i18n import _, ungettext, N_
from tw.api import WidgetBunch
import melissa_is_a_babe.model as model

__all__ = ['BaseController']

Word. We should be set to spin up the jumpdrives.

(venv-test)[rjbpop@grossman barcamp.roc.fall.2010 []]# paster serve --reload development.ini 
Starting subprocess with file monitor
Starting server in PID 2250.
serving on http://127.0.0.1:8080

God willing, if you point your browser at http://localhost:8080/ you should see the default quickstarted Turbogears index page.

Default quickstarted Turbogears2.1 index page

Fiddling with Turbogears

Time to shave away stuff we don't need. Open up melissa_is_a_babe/templates/index.html (which is a genshi template) and scoop out its guts so it looks like:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" 
                      "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:py="http://genshi.edgewall.org/"
      xmlns:xi="http://www.w3.org/2001/XInclude">

  <xi:include href="master.html" />

<head>
  <meta content="text/html; charset=UTF-8" http-equiv="content-type" py:replace="''"/>
  <title>Welcome to TurboGears 2.0, standing on the 
  shoulders of giants, since 2007</title>
</head>

<body>

</body>
</html>

Since we told paster to launch our app with the --reload option, it should automatically notice we changed something and be ready for us. Return to your browser, reload the page and behold:

The tg2.1 index page with its guts scooped out

Not very exciting now, is it?

Our first tw2.protovis chart

Open up melissa_is_a_babe/controllers/root.py. There you'll find a class called RootController with a number of methods. When you visit a particular location, say, http://localhost:8080/foobar, your TG2.1 BaseController will redirect the flow of control to your RootController's foobar method (if it has one).
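
If that URL-to-method mapping feels like magic, here's a rough sketch of the idea in plain Python. This is not how Turbogears actually implements dispatch; the dispatch function and the tiny stand-in controller are made up purely for illustration.

```python
class RootController(object):
    """Stand-in controller: each public method answers one URL segment."""

    def index(self):
        return "front page"

    def foobar(self):
        return "foobar page"


def dispatch(controller, path):
    """Map a URL path such as '/foobar' onto a controller method (hypothetical helper)."""
    # Take the first path segment; an empty path falls back to 'index'.
    segment = path.strip("/").split("/")[0] or "index"
    # Refuse private names so '/_secret' can never reach an internal method.
    if segment.startswith("_"):
        return "404 Not Found"
    method = getattr(controller, segment, None)
    if not callable(method):
        return "404 Not Found"
    return method()


root = RootController()
print(dispatch(root, "/foobar"))
print(dispatch(root, "/"))
print(dispatch(root, "/missing"))
```

The real framework does a lot more (nested controllers, templates, argument handling), but the "look up a method by name" core is the part that matters for this tutorial.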

We don't need any of the methods other than index so you can go ahead and remove them.

Inside the index method, we're going to configure an instance of tw2.protovis.custom.BubbleChart and return it to the WSGI context. At the top of root.py, add the appropriate import statements to bring in the random module and the BubbleChart class from the tw2.protovis.custom module. As for the index method, we'll need to do three things: 1) initialize some bogus data to visualize, 2) fill in the parameters for BubbleChart, and 3) return our configured widget from the controller method.

As for our bogus data, the p_data parameter requires a list of python dictionaries (we'll create 40 of them), each with four keys: a 'name' that provides the alt-text for the bubbles, a 'value' that provides the size of each, a 'text' that provides the label, and a 'group' which is the basis on which to colorize them.
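
To get a feel for that shape before wiring it into the controller, here's a small standalone sketch that builds the list and checks the keys. It doesn't touch tw2 at all; make_bogus_entry and REQUIRED_KEYS are hypothetical names I'm using just for this example, and the values chosen for each key are arbitrary.

```python
import random

# The four keys described above.
REQUIRED_KEYS = {"name", "value", "text", "group"}


def make_bogus_entry(index):
    """Build one bubble description (hypothetical helper for illustration)."""
    return {
        "name": "bubble %d" % index,   # alt-text for the bubble
        "value": random.random(),      # drives the bubble's size
        "text": str(index),            # label for the bubble
        "group": index % 4,            # basis for colorizing
    }


data = [make_bogus_entry(i) for i in range(40)]

# Sanity-check the shape before handing the list to a chart.
assert len(data) == 40
assert all(set(entry) == REQUIRED_KEYS for entry in data)
print(data[0])
```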

Mangle melissa_is_a_babe/controllers/root.py to look like this:

# -*- coding: utf-8 -*-
"""Main Controller"""

from tg import expose, flash, require, url, request, redirect
from pylons.i18n import ugettext as _, lazy_ugettext as l_

from melissa_is_a_babe.lib.base import BaseController
from melissa_is_a_babe.model import DBSession, metadata
from melissa_is_a_babe.controllers.error import ErrorController

__all__ = ['RootController']

from tw2.protovis.custom import BubbleChart
import random

class RootController(BaseController):
    """
    The root controller for the barcamp.roc.fall.2010 application.

    All the other controllers and WSGI applications should be mounted on this
    controller. For example::

        panel = ControlPanelController()
        another_app = AnotherWSGIApplication()

    Keep in mind that WSGI applications shouldn't be mounted directly: They
    must be wrapped around with :class:`tg.controllers.WSGIAppController`.

    """

    error = ErrorController()

    @expose('melissa_is_a_babe.templates.index')
    def index(self):
        """Handle the front-page."""
        data = [
            {
                'name' : random.random(),
                'value' : random.random(),
                'text' : random.random(),
                'group' : random.random(),
            } for i in range(40) ]

        chart = BubbleChart(
            id='a-chart-for-my-friends',
            p_width=750,
            p_height=750,
            p_data=data
        )

        return dict(page='index', widget=chart)

Now that our bubble chart of random data is available to our templates, we'll need to make mention of it in melissa_is_a_babe/templates/index.html. Just add the following single line inside the body tag.

<body>
${widget.display()}
</body>

Again, paster should have noticed we changed something and restarted itself. Revisit http://localhost:8080 and you should see something like:

tw2.protovis.custom.BubbleChart with random data

Querying Google's Ajax Search API

Cool? Cool. Well, sort of. Random numbers are unfortunately not as interesting as we might like them to be, so let's do something a little less entropic. Google has a sweet AJAX search API we can use. Add the following to the top of melissa_is_a_babe/controllers/root.py:

import itertools
import math
import urllib
import simplejson

base_url = "http://ajax.googleapis.com/ajax/services/search/web?v=1.0"

def make_entry(combo):
    phrase = '"%s"' % " ".join(combo)
    print "Querying for %s." % phrase
    query = urllib.urlencode({ 'q': phrase })
    url = "%s&%s" % (base_url, query)
    results = urllib.urlopen(url)
    json = simplejson.loads(results.read())

    if 'estimatedResultCount' in json['responseData']['cursor']:
        count = int(json['responseData']['cursor']['estimatedResultCount'])
    else:
        count = len(json['responseData']['results'])

    value = math.log(count+1.000000001)

    return {
        'name' : "%s : %i " % (phrase, count),
        'value' : value,
        'text' : phrase[:10],
        'group' : len(combo),
    }

The above function will, given a list of words (combo), perform a Google search on that exact phrase and return a Python dictionary ready for BubbleChart. The dictionary's 'value' is the logarithm of the search's estimatedResultCount (or of the number of results returned, if Google gives no estimate). We use the urllib module to prepare and make our query and the simplejson module to convert the JSON (JavaScript Object Notation) that Google hands us into a Python dictionary.
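
If you want to poke at the URL-building and response-parsing pieces without hitting the network, here's a rough offline sketch. It's written for Python 3 (so urlencode lives in urllib.parse and the standard json module stands in for simplejson). The SAMPLE_RESPONSE is something I made up that only mirrors the keys make_entry reads; it is not a real response from Google. build_url and count_from_response are hypothetical helper names.

```python
import json
import math
from urllib.parse import urlencode

BASE_URL = "http://ajax.googleapis.com/ajax/services/search/web?v=1.0"

# Made-up response for illustration; only the keys make_entry reads are included.
SAMPLE_RESPONSE = """
{"responseData": {"results": [{}, {}, {}, {}],
                  "cursor": {"estimatedResultCount": "12300"}}}
"""


def build_url(combo):
    """Quote the words as an exact phrase and append them to the base URL."""
    phrase = '"%s"' % " ".join(combo)
    return "%s&%s" % (BASE_URL, urlencode({"q": phrase}))


def count_from_response(body):
    """Pull the hit count out of a response body, falling back to len(results)."""
    response_data = json.loads(body)["responseData"]
    cursor = response_data["cursor"]
    if "estimatedResultCount" in cursor:
        return int(cursor["estimatedResultCount"])
    return len(response_data["results"])


url = build_url(("word", "to"))
count = count_from_response(SAMPLE_RESPONSE)
# The log compresses huge counts; the small offset avoids log(0) for zero hits.
value = math.log(count + 1.000000001)
print(url)
print(count, round(value, 2))
```

Note the plain & joining the base URL and the query: base_url already carries ?v=1.0, so the search phrase goes in as a second parameter.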

We'll need to make use of the make_entry function in our controller.

Keyword arguments passed to our app via the URL's query string arrive as keyword arguments to controller methods. Let's add one to the index method called sentence.

Furthermore, we'll take that sentence, break it up into its constituent words and, making use of the itertools module, prepare a list of every unique combination of words in that sentence.

With our make_entry function and such a list of word combinations, it's trivial to map our list to a list of BubbleChart-ready dict objects.
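
Here's the combination step on its own, as a minimal sketch you can run in a Python shell. all_combinations is a hypothetical name for this example; the logic is the same loop over itertools.combinations that goes into the controller.

```python
import itertools


def all_combinations(sentence):
    """Return every combination of 1..n words, preserving word order."""
    words = sentence.split()
    combos = []
    for size in range(1, len(words) + 1):
        combos.extend(itertools.combinations(words, size))
    return combos


combos = all_combinations("word to your moms")
print(len(combos))   # 15, i.e. 2**4 - 1 non-empty subsets of four words
print(combos[:5])
```

The count doubles with every extra word: a nine-word sentence produces 2**9 - 1 = 511 combinations, and each one becomes its own search query. Also note that itertools.combinations works on positions, so a sentence that repeats a word will yield some repeated phrases.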

Modify melissa_is_a_babe/controllers/root.py as follows:

    @expose('melissa_is_a_babe.templates.index')
    def index(self, sentence="word to your moms"):
        """Handle the front-page."""
        words = str(sentence).split()
        combos = []
        for i in range(len(words)):
            combos += list(itertools.combinations(words, i+1))

        data = map(make_entry, combos)

        chart = BubbleChart(
            id='a-chart-for-my-friends',
            p_width=750,
            p_height=750,
            p_data=data
        )

        return dict(page='index', widget=chart)

Reload the page http://localhost:8080/ and you should get something like:

BubbleChart of google results of every combination of

Since we made sentence a keyword argument to our index method, we should also be able to visit http://localhost:8080/index?sentence=the+quick+brown+fox+jumps+over+the+lazy+dog and see a pretty dope chart.

Speeding things up with multiprocessing

Wow... that was cool, but it took forever to make all those queries. What exactly is it that's taking so much time? It seems to be all the handshaking and waiting for a response before making the next request. We can likely speed this up greatly by making use of Python's multiprocessing module. Just below your definition of the make_entry function, add the two lines that import Pool and create the pool:

    return {
        'name' : "%s : %i " % (phrase, count),
        'value' : value,
        'text' : phrase[:10],
        'group' : len(combo),
    }

from multiprocessing import Pool
pool = Pool(processes=150)

class RootController(BaseController):

The Pool class has a map method that performs the same operation as the builtin map, but distributes the workload among all available processes. Inside the index method, make use of the process pool when mapping make_entry onto the list of word combinations. Change the line that builds data as follows:

        for i in range(len(words)):
            combos += list(itertools.combinations(words, i+1))

        data = pool.map(make_entry, combos)

        chart = BubbleChart(
            id='a-chart-for-my-friends',

Repoint your browser at http://localhost:8080/index?sentence=the+quick+brown+fox+jumps+over+the+lazy+dog and it should fly!
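
To see the effect without a web app or a network connection, here's a small standalone sketch. Two things in it are my own substitutions, not what the tutorial's controller uses: slow_lookup is a made-up stand-in for make_entry that just sleeps, and I'm using ThreadPool from multiprocessing.pool rather than Pool. ThreadPool exposes the same map method but runs workers as threads, which is plenty when the time is spent waiting on a remote server and keeps the example easy to run anywhere.

```python
import time
from multiprocessing.pool import ThreadPool


def slow_lookup(combo):
    """Stand-in for make_entry: sleeps to imitate waiting on a remote server."""
    time.sleep(0.05)
    return {"text": " ".join(combo), "group": len(combo)}


combos = [("word",), ("to",), ("your",), ("moms",), ("word", "to")]

# One request after another: each call waits for the previous one to finish.
start = time.time()
serial = list(map(slow_lookup, combos))
serial_seconds = time.time() - start

# Same work spread over five workers: the waits overlap.
pool = ThreadPool(processes=5)
start = time.time()
parallel = pool.map(slow_lookup, combos)   # same results, in the same order
parallel_seconds = time.time() - start
pool.close()
pool.join()

assert parallel == serial
print("serial:   %.2fs" % serial_seconds)
print("parallel: %.2fs" % parallel_seconds)
```

The point to take away is that pool.map hands back results in the same order as the input, so it drops in wherever the builtin map was used.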

Sorting our data gives different results

One last thing: BubbleChart places the bubbles in the order they are given and tries to pack each subsequent entry as tightly as it can. We can achieve differently flavored charts by reordering our list of dicts. In melissa_is_a_babe/controllers/root.py, try adding the data.sort line shown below, which orders the entries from largest value to smallest:

        for i in range(len(words)):
            combos += list(itertools.combinations(words, i+1))

        data = pool.map(make_entry, combos)
        data.sort(key=lambda entry: entry['value'], reverse=True)

        chart = BubbleChart(
            id='a-chart-for-my-friends',

Which gives us the following:

Resources

You can find the full source code for this tutorial on my GitHub account.

Live demonstrations of all my widgets are available at http://tw2-demos.threebean.org/

Cheers!
