[three]Bean

Open Science Grid Braindump

Mar 10, 2011 | categories: science, grid

I spent Monday through Thursday of this week at the Open Science Grid (OSG) all-hands meeting in Boston, MA. Work sent me to get me up to speed on what's going on.

The OSG is cool. It is a national, distributed computing grid for data-intensive research that is really bent on an open approach to high-throughput computing. On average, it serves 1.2 million CPU-hours of computation per day, which works out to roughly 50,000 cores kept busy around the clock.

Unlike Blue Waters (which you can't get onto) and unlike the TeraGrid (which requires that you apply for allocations and wait for them), the OSG makes it simple to run as much research code as you need to. The whole project appears to be driven largely by the Compact Muon Solenoid (CMS) and A Toroidal LHC ApparatuS (ATLAS) experiments at the Large Hadron Collider (LHC). There are a ton of different disciplines computing on the OSG, but CMS and ATLAS dominate the usage.

The OSG is really complex and the conference was total acronym overload, but here are the two coolest things:

1. XRootD-- Think 'the bittorrent of file systems'. XRootD is a real necessity for the CMS and ATLAS projects that need to move tremendous amounts of data very quickly.

Before XRootD, jobs submitted to the OSG had to be directed only to sites where the data was already known to exist, or users had to copy the data there themselves (with globus-url-copy or something equivalent).

With XRootD, a running job simply asks for a file, and the local redirector asks its peers, who ask their peers, until all the instances that have the file respond and simultaneously feed the data to the requester. It scales linearly!
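To make that lookup a bit more concrete, here is a toy sketch in Python. It is not XRootD's actual protocol or API; the `Site` class, the `locate` and `fetch` functions, and the file name are all made up for illustration. It only assumes the behavior described above: a request fans out from the local site to its peers and their peers, and every site holding the file then serves a share of the data.

```python
from collections import deque


class Site:
    """Hypothetical storage site: holds some files and knows some peers."""

    def __init__(self, name, files=None):
        self.name = name
        self.files = dict(files or {})  # filename -> bytes
        self.peers = []


def locate(start, filename):
    """Ask outward from `start`, peer by peer, and return every site with the file."""
    seen = {start.name}
    pending = deque([start])
    holders = []
    while pending:
        site = pending.popleft()
        if filename in site.files:
            holders.append(site)
        for peer in site.peers:
            if peer.name not in seen:  # never ask the same site twice
                seen.add(peer.name)
                pending.append(peer)
    return holders


def fetch(start, filename, chunk_size=4):
    """Pull the file in chunks, rotating across all holders (assumed strategy)."""
    holders = locate(start, filename)
    if not holders:
        raise FileNotFoundError(filename)
    size = len(holders[0].files[filename])
    parts = []
    for i, offset in enumerate(range(0, size, chunk_size)):
        source = holders[i % len(holders)]
        parts.append(source.files[filename][offset:offset + chunk_size])
    return b"".join(parts), [h.name for h in holders]


if __name__ == "__main__":
    data = b"0123456789"
    local, a, b = Site("local"), Site("a", {"events.root": data}), Site("b")
    c = Site("c", {"events.root": data})
    local.peers, a.peers, b.peers = [a, b], [local], [c]
    print(fetch(local, "events.root"))
```

The job never needs to know where the file lives; it only talks to its local site, and the search takes care of the rest. A real deployment fetches from the holders in parallel, which this sequential loop only imitates.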

2. Cloud Computing(!)-- A couple of different projects are under way to introduce 'dynamic, on-demand cloud infrastructures'. The team at Clemson is building a piece of software called Kestrel that submits a number of 'pilot' jobs to Condor; each pilot then spins up its own VM and phones home for more work. Argonne National Laboratory and the NERSC Magellan team have had some luck running super-physics software in a different but similar setup.
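The pilot-job idea is easy to sketch in a few lines of Python. This is not Kestrel's or Condor's actual interface; the `pilot` and `run_pilots` functions are hypothetical, threads stand in for the VMs, and an in-memory queue stands in for the server the pilots phone home to. The point is just the pattern: the scheduler starts generic workers, and the workers pull the real tasks themselves.

```python
import queue
import threading


def pilot(pilot_id, work, results, lock):
    """Stand-in for one pilot: once 'booted', keep phoning home for tasks."""
    while True:
        try:
            task = work.get_nowait()  # ask home base for more work
        except queue.Empty:
            return  # nothing left, so the pilot shuts itself down
        outcome = task * task  # placeholder for the real computation
        with lock:
            results.append((pilot_id, task, outcome))


def run_pilots(tasks, n_pilots=3):
    """Queue up the tasks, launch the pilots, and wait for them to drain the queue."""
    work = queue.Queue()
    for task in tasks:
        work.put(task)
    results, lock = [], threading.Lock()
    pilots = [
        threading.Thread(target=pilot, args=(i, work, results, lock))
        for i in range(n_pilots)
    ]
    for p in pilots:
        p.start()
    for p in pilots:
        p.join()
    return results


if __name__ == "__main__":
    for pilot_id, task, outcome in sorted(run_pilots(range(10)), key=lambda r: r[1]):
        print(f"pilot {pilot_id} ran task {task} -> {outcome}")
```

Which pilot ends up with which task varies from run to run, and that is the appeal: the batch system only has to place the pilots, and the work distributes itself among whichever ones come up.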

Heading home, I've got a lot of new ideas and things to try and play with. Connecting our resources to the OSG seems a priority, but first we'll need to write a GRAM plugin for the Globus Toolkit to add SLURM support (acronym overload, right?).
