Feb 29, 2008


Bi-Monthly Collaboration Meeting

Coordinates

  • Feb 29, 2008 -- 11:00ET/10:00CT/8:00PT
  • Call 510-665-5437
  • Meeting ID is 7333

Attending

  • Santi, Alan, Larry, Bobby (Vanderbilt ACCRE)
  • Paul (Vanderbilt)
  • PR and Diane (SFASU)
  • John Cobb (ORNL)
  • Hunter (Nevoa)
  • Terry and Chris (UTK)

Agenda

Status Report on Production REDDnet Deployment

New keys are on their way out to Caltech and UFL, and UMich's will be sent out today. We are still waiting to hear from UCSD. All sites are willing to help with re-imaging. As the depots are re-imaged (as soon as next week), they can be brought online.

Once all existing depots are online, Nagios will be back up.

Each system has a separate partition that will hopefully allow us to install rescue software to help recover lost nodes. This project still needs work to complete, and it probably won't be looked at until the end of March.

AmericaView Status

They are routing over I1 and not I2, which is one particular problem affecting download speeds. Investigations are under way at SFASU.

There is a timeout issue with the metadata server, but the timeouts appear to be at Vanderbilt: traceroute seems to time out there. This could be Vanderbilt doing "traffic shaping".

They will keep us posted.

Everything is uploaded to the default LUN. It's there and coming down fine (modulo the network issues).

They will consider augmenting to SFASU for a persistent copy and then augmenting to default, which will spread the data out to everyone for use.

LODN Status

New depot code was required. It should now work with both the warmer and LODN code, but it exposed another issue with storecore, so the new depot code will not be distributed until that is resolved. Nevoa is looking into this; once it is working, ACCRE will roll out the depot code to the rest of REDDnet.

UTK will try to install a local depot so they can do testing. Alan has to package it up a bit better first.

Chris has also fixed a local problem at UTK (not with the LODN code but a campus problem) that was causing trouble.

Everything (LODN and warmer) now works fine on the test depots at Vanderbilt, so we are optimistic it will work fine on the production depots.

TeraGrid Status/ORNL

David Giles has joined ORNL as a systems person, and John has him half time. There is now some hope of getting the UTK depots up.

Around the end of March or early April, Bobby and others from ACCRE should stop by ORNL to meet with David and John to discuss getting the ORNL REDDnet depots up.

Library of Congress Status

To find out how things were going, Terry contacted Andy Boyko at the LC. Here is what Andy said:

"Hi, Terry. I'm putting together the summary email now, but the summary summary is: transfer rate never exceeded 40-50Mbps, which is lower than I hoped for but should still be fine; a bigger concern is that the transfer ended up retrieving only 53439 out of the 56712 files named in the manifest, without any indication of a failure. Hoping we can talk about the best approach to using the client on our call later today..."

Santi and Alan will call Andy at LC before the LC call this afternoon to discuss technical details.

For example, we need to find out whether the problem with missing files occurred on the upload or the download.
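One way to narrow this down on the download side is to check the retrieved files against the manifest and list exactly which names are absent. The sketch below is only an illustration: it assumes a plain-text manifest with one relative file path per line, which may not match the format LC actually uses, and the function names (read_manifest, find_missing, summarize) are hypothetical helpers, not part of any REDDnet or LODN client.

```python
from pathlib import Path


def read_manifest(path):
    """Read one relative file path per line, skipping blank lines.

    Assumption: the manifest is a plain-text list of paths.
    """
    with open(path, encoding="utf-8") as fh:
        return [line.strip() for line in fh if line.strip()]


def find_missing(manifest_names, root):
    """Return the manifest entries that have no matching file under root."""
    root = Path(root)
    return [name for name in manifest_names if not (root / name).is_file()]


def summarize(expected, missing):
    """Format a one-line report of how many files are present."""
    retrieved = expected - missing
    pct = 100.0 * missing / expected if expected else 0.0
    return f"{retrieved} of {expected} files present, {missing} missing ({pct:.1f}%)"
```

With the counts Andy reported, 56712 - 53439 = 3273 files are unaccounted for, and summarize(56712, 3273) reports that as about 5.8% of the manifest. Running the same comparison against a listing of what is stored in REDDnet would indicate whether those files were lost on upload or on download.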

And network speed can be worked on.

OSG Status

Dan was not present and could not report, but he is making progress. For example, his gridftp server now works completely: you can upload or download a file via gridftp into or out of REDDnet, and the files appear on the LODN website.

CMS Status

Dan was not present.