CMS

Goals

  • L-Store plugin for ROOT, so that rootio can read (and perhaps write) L-Store files; see the sketch after this list.
  • CMS then uses REDDnet as temporary storage for user analysis (Tier 3 and below)
  • Other CMS applications are possible; begin with the above.
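
A minimal sketch of what the plugin would enable on the user side. The lstore:// scheme, host, path, and the "Events" tree name (the usual CMSSW tree) are assumptions for illustration, not the finalized API:

  #include "TFile.h"
  #include "TTree.h"
  #include <cstdio>

  // Sketch only: the lstore:// scheme, host, and path are hypothetical
  // placeholders for whatever the L-Store plugin finally supports.
  void read_lstore_file()
  {
     // TFile::Open() dispatches on the URL protocol via ROOT's plugin
     // manager, so a registered L-Store handler makes this line "just work".
     TFile *f = TFile::Open("lstore://depot.example.org/cms/events.root");
     if (!f || f->IsZombie()) { printf("open failed\n"); return; }
     TTree *t = (TTree *) f->Get("Events");   // assumed CMSSW tree name
     if (t) printf("tree holds %lld entries\n", t->GetEntries());
     f->Close();
  }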

Benchmarks

  • IBP --> CMSSW streaming tests (see the timing sketch after the tables below)
    • CMSSW_1_2_0
    • input: 1.5 GB ROOT file, 100 events
SINGLE CMSSW JOB, 100 EVENTS, RUNNING ON VAMPIRE, 2 GHz CPU

Data Source      | Stripes | Ping1 | Ping2 | Ping3 | iperf1 | iperf2 | file download | CMSSW 100 events
local disk       | N/A     | N/A   | N/A   | N/A   | N/A    | N/A    |               |
local gpfs       | N/A     | N/A   | N/A   | N/A   | N/A    | N/A    |               |
local ibp depots | 10      |       |       |       |        |        |               |
local ibp depots | 1       |       |       |       |        |        |               |
local ibp depots | 5       |       |       |       |        |        |               |

SINGLE CMSSW JOB, 100 EVENTS, RUNNING AT CALTECH, 2.4 GHz CPU

Data Source           | Ping1 | Ping2 | Ping3 | iperf1 | iperf2 | file download | CMSSW 100 events
local disk            | N/A   | N/A   | N/A   | N/A    | N/A    |               |
vandy reddnet depots  |       |       |       |        |        |               |
remote reddnet depots |       |       |       |        |        |               |
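
Once numbers start coming in, the "file download" and "CMSSW 100 events" columns are wall-clock timings. A rough ROOT timing sketch for the 100-event column, again assuming the hypothetical lstore:// plugin and an "Events" tree:

  #include "TFile.h"
  #include "TTree.h"
  #include "TStopwatch.h"
  #include <cstdio>

  // Rough wall-clock timing for streaming the first 100 events of a
  // remote file. The URL and tree name are placeholder assumptions.
  void time_100_events(const char *url = "lstore://depot.example.org/cms/test.root")
  {
     TStopwatch sw;
     sw.Start();
     TFile *f = TFile::Open(url);
     if (!f || f->IsZombie()) return;
     TTree *t = (TTree *) f->Get("Events");
     if (!t) { f->Close(); return; }
     Long64_t n = t->GetEntries();
     if (n > 100) n = 100;
     for (Long64_t i = 0; i < n; ++i)
        t->GetEntry(i);                 // forces the event's branches to be read
     sw.Stop();
     printf("%lld events in %.2f s wall time\n", n, sw.RealTime());
     f->Close();
  }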

Current Work in Progress

  • Figure out how to get necessary code included in CMSSW
    • Talk to Bill Tannenbaum, Philippe Canal, ...
    • include L-Store code in the CMSSW distribution so it is built correctly on each platform for use with the rest of the CMS software.
    • that way users have no software to download themselves, no configuration scripts to change, etc.
    • how to test and validate before checking in?
    • how to check code in?
  • Figure out all issues needed to integrate with CMS data distribution model
    • PhEDEx, TFile, DBS/DLS, ...
  • Switch the ROOT plugin to use the L-Store version of libxio

Demos

Demos at March 2007 OSG Grid Meeting (UC San Diego)

Vladimir's or Dmitri's analysis can be used for all of the demos below.

Interactive Root

  • First upload a file to depots spread across the WAN; use LORSView to show where and how the stripes are placed.
  • Then read it back in ROOT to show that it works.
  • Mainly an introduction to the issues.

100 Node ACCRE Run

  • each node reads its own file from the WAN set of depots.
  • show speed versus local copies of the file (data-tethered analysis).

100 CPU Grid Job

  • similar to ACCRE Run, each job reads its own file from WAN depots.
  • jobs are distributed across Open Science Grid sites
  • demonstrates complete lack of data tether.

To Do To Get Ready

  • Run all of the above many times before actual demo!
  • Get LORSView upload working
  • Figure out how to submit 100 CPU Grid Job.
  • Want to run all 100 ACCRE jobs simultaneously? Need to work with ACCRE on that...

Get Rick Cavanaugh to run his analysis

  • needs most of the pieces required for the "Summer 2007 demo", though maybe not all fully in place.
  • he runs it himself.
  • work with him so he understands full functionality possible.
  • work with him to develop ideas for better implementing Summer 2007 demo
    • what docs are needed
    • best approach to getting users using it
    • etc.

Summer 2007 Demo

  • A "scratch space" demo for CMS users.
  • Use deployed REDDnet resources, which should become available in June 2007
  • Load REDDnet with "hot" data files, convince a few users to try them out
  • Must have L-Store code fully integrated with CMS software

General Testing

verify ROOT/L works

  • package up the plugin for the CMS L-Store test community (see the registration sketch after this list)
  • gain experience via benchmarking
  • finalize API (add write access?)
  • check the plugin in to ROOT CVS
    • it will take a while for this ROOT addition to propagate into CMSSW
  • explore CMSSW procedures for check-in of LORS and/or L-Store
    • it will take months for L-Store to be available for check-in
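
For reference, a sketch of how the plugin might be registered with ROOT's plugin manager. TPluginManager::AddHandler() is standard ROOT API, but the TLStoreFile class name, library name, URL scheme, and constructor prototype are all placeholders for whatever the plugin finally exports:

  #include "TROOT.h"
  #include "TPluginManager.h"

  // Registers a hypothetical TLStoreFile handler so that TFile::Open()
  // dispatches "lstore:" URLs to it. AddHandler() is standard ROOT API;
  // the class name, library name, and constructor prototype here are
  // placeholder assumptions.
  void register_lstore_plugin()
  {
     gROOT->GetPluginManager()->AddHandler(
        "TFile",                                   // base class being extended
        "^lstore:",                                // URL scheme to match
        "TLStoreFile",                             // hypothetical handler class
        "LStoreRoot",                              // hypothetical library name
        "TLStoreFile(const char*, Option_t*, const char*, Int_t)");
  }

The same registration could also ship as a Plugin.TFile line in system.rootrc, which is how ROOT distributes handlers such as rfio: and dcap:.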

increasing levels of stress tests (validate and benchmark)

Do various combinations of the following:

  • single jobs vs simultaneous jobs
  • many jobs: at one cluster vs across the grid
  • simultaneous jobs hitting same ibp depot accessing one or many files
  • simultaneous jobs hitting same file at one depot or striped across many depots

Also need to benchmark and profile various types of jobs:

  • I/O intensive skims
  • CPU intensive jobs
  • show benchmarks/demonstrate which jobs work well with L-Store and which jobs won't work well (if any). Have thorough benchmarks for the worst-case scenario.
  • gather numbers to discuss impact on bandwidth as L-Store usage explodes.
  • will people feel more free to do unnecessary computations?

assemble interactive analysis demos:

  • host variety of interesting datasets
    • need to identify these datasets
  • make a wiki
    • with instructions
    • links to necessary data catalogs
      • L-Store
      • DBS/DLS
  • gather visually interesting ROOT macros (see the sketch after this list)
    • event/detector displays
    • histograms/results
  • any FW-lite tools (even development versions) to try
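
As a seed for that collection, a minimal macro sketch that draws a histogram straight from a remote L-Store-hosted file. The URL, tree name, and "muons_pt" leaf are placeholder assumptions:

  #include "TFile.h"
  #include "TTree.h"
  #include "TCanvas.h"

  // Draws a simple kinematic distribution straight from a remote
  // L-Store-hosted file. The URL, tree name, and "muons_pt" leaf are
  // placeholders for whatever the hosted datasets actually provide.
  void demo_muon_pt()
  {
     TFile *f = TFile::Open("lstore://depot.example.org/cms/demo.root");
     if (!f || f->IsZombie()) return;
     TTree *t = (TTree *) f->Get("Events");
     if (!t) return;
     TCanvas *c = new TCanvas("c", "L-Store demo");
     t->Draw("muons_pt >> hpt(50, 0, 100)");  // fill and display a pT histogram
     c->Update();
  }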

assist user-analysis batch production:

  • identify and host a wide variety of datasets
    • calibration datasets
    • various backgrounds, pileup
    • variety of signal samples
  • populate catalogs to find datasets
  • web tools to assist this
    • how to find datasets
    • how to upload results
    • how to register results in catalogs
    • how to coordinate with L-Store and DBS/DLS

provide info on joining CMS/L (via L-Store? via REDDnet? UltraLight?)

  • how to add an ibp depot


Long Term Needs

  • L-Store version of libxio
  • Stable, production REDDnet depot deployment
    • including request tracker support!
  • L-Store software fully integrated with CMS software, and being distributed.
    • this means the source code for the L-Store version of libxio is needed, so it can be checked into the CMS distribution

SRM interface

  • in principle, this is important for CMS usage.
  • Need to get the new support person on board with this and up to speed.