Apologies in advance for the long email. The thought is that we might get more bang for our buck by buying more racks and fewer disks, and getting other people to invest to fully populate the clusters. Though discussions at the I2 meeting fueled it, the idea mostly came from my reflections on a conversation I had with Boyd Knosp (U. of Iowa) at SC. He said that if he did the Capricorn thing for his campus, he'd probably start by buying the rack with only 3 or 4 slots filled. Then, as people on campus came to him with storage needs, he'd let them plunk down their cash for units to populate the rest of the rack. (Alan may have described similar ideas to me too.) The whole thing could be managed as a single, shared pool, but people who bought in would have a prioritized claim on amounts of space/time according to the number of units they bought. Or something like that. When I talked to Les Finken (Boyd's colleague) at the I2 meeting, he confirmed that they had used this kind of strategy successfully in the past. They think that using LN technology would give them a lot more flexibility in the paying/sharing schemes they could propose.

So, what I was thinking is that we might offer the following deal to some additional sites that we had not planned to provision with REDDnet-funded Capricorn clusters:
We'll buy you a Capricorn rack, provision it with a modest "starter" amount of storage (e.g., 10 TB), and put it at your location to be used as an IBP depot cluster. In return, you agree to provide the machine room space, connectivity, and maybe a small slice of an FTE, and to try to populate the rest of the bays with funding you get from your own application communities.
At $1K per TB for storage, this could look like a pretty sweet deal. So the idea is that, by buying many more racks and somewhat less storage with the REDDnet money, we could end up with a REDDnet that has a much bigger footprint and, in the end, substantially more storage than we could get if we bought the REDDnet units fully populated.
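To make the tradeoff concrete, here is a back-of-the-envelope sketch of the two strategies. Only the $1K/TB storage price and the 10 TB "starter" provisioning come from the proposal above; the budget, rack cost, full-rack capacity, and partner contribution are hypothetical placeholders chosen purely for illustration.

```python
# Back-of-the-envelope comparison of "few full racks" vs. "many starter racks".
# Only COST_PER_TB ($1K/TB) and STARTER_TB (10 TB) come from the proposal;
# every other figure below is a hypothetical placeholder.

BUDGET = 200_000       # hypothetical REDDnet hardware budget ($)
RACK_COST = 10_000     # hypothetical cost of one empty Capricorn rack ($)
COST_PER_TB = 1_000    # $1K per TB, as quoted above
STARTER_TB = 10        # modest "starter" provisioning per rack (TB)

# Strategy A: a few fully populated racks, all REDDnet-funded.
FULL_TB = 40           # hypothetical capacity of a fully populated rack (TB)
full_rack_cost = RACK_COST + FULL_TB * COST_PER_TB
a_racks = BUDGET // full_rack_cost
a_storage = a_racks * FULL_TB

# Strategy B: many starter racks; partners fund storage for the empty bays.
starter_rack_cost = RACK_COST + STARTER_TB * COST_PER_TB
b_racks = BUDGET // starter_rack_cost
PARTNER_TB = 20        # hypothetical partner-funded storage per site (TB)
b_storage = b_racks * (STARTER_TB + PARTNER_TB)

print(f"A: {a_racks} sites, {a_storage} TB (all REDDnet-funded)")
print(f"B: {b_racks} sites, {b_storage} TB "
      f"({b_racks * PARTNER_TB} TB partner-funded)")
```

With these placeholder numbers, strategy B yields both more sites and more total storage, but the conclusion obviously depends on how much of the empty capacity partners actually fund.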
I agree that, as compared to "buy as much storage as you can", this is a somewhat non-standard approach and I'm sure there are reasons to think that it might not work. But here are some of the considerations that make it attractive to me:
1. I can think of at least 4 locations right now that would be interested in the deal. I list them for illustrative purposes; there are various approaches we could use to select the sites for these "depot cluster starters" (DCSs):
   1. Texas Tech -- Not only is Alan Sill an OSG/CMS guy, TT also hosts one of PR's TexasView core collaborators. They might be willing to provide extra funds to populate the depot.
   2. Iowa -- Again, they have both OSG/CMS people there (http://news-releases.uiowa.edu/2006/october/100206osg-grant.html) and AmericaView people as well. Having LN-savvy people on site is a very good thing.
   3. UAB -- A very LN-savvy person there (John-Paul) and one of Phoebe's collaborators.
   4. Paul Avery wants to put depots at 4 locations on FLR. We could follow this approach to give him more racks but less storage and get him going now.
2. The availability of DCS units could enable our REDDnet application communities to widen the participation of their collaborators and, by attracting more investment, make it easier to build up the pool of storage available to that application community. It could also be very good politics with their application homeys.
3. I've always had a question about how other, not-yet-REDDnet application communities could or would be brought in. Deploying DCSs seems like a natural way to do this. For example, Les Finken is very interested in doing something with DV (and especially HD) and bringing it to the I2-DVI folks. And I think, with some reason, that they'd be happy just to buy a few more boxes for empty slots to enable this effort. In general, wherever we have a DCS and the REDDnet LN software infrastructure deployed, enabling a new application is just a question of buying more boxes and filling the slots.
4. Continuing on the "bigger REDDnet footprint" theme, I think that getting DCSs on more campuses with local application groups involved will really help to propagate our paradigm in the academic community generally.
It seems that everywhere I go, the IT guys tell me that they're studying the issue of storage and data on their campuses. They have real problems and no good answers. Wider deployment of DCSs would put our paradigm in front of these folks in a very tangible way. The network effects and "word of mouth" could be very potent.
I've gone on way, way too long here, and gotten somewhat off track. So again, the essence of the idea is that rather than buying a few big nodes for a few locations, we could buy more, smaller (but expandable) nodes for more locations, and leverage the resources of our collaborators to end up with even more storage deployed and more sites/people involved. As a baseline, it would give the application PIs -- Paul, PR, John, Phoebe -- another way of engaging with their main collaborators. I would at least think that this strategy could work very well in the CMS/OSG community, and that Paul S. might be able to convince Harvey, Paul A., and Sean to spend some of their own funding (they are already spending money for storage, aren't they?) to fully populate the nodes at their sites if it meant that more sites in the CMS community would be doing the whole REDDnet/Ultralight thing.
This is just a suggestion, so all comments and criticisms are welcome. Thanks.