Main Page: Difference between revisions

From ReddNet
 
(25 intermediate revisions by 3 users not shown)
__NOTOC__


[[Image:reddnetmap.gif|right|550px]]
== {{Template:REDDnet}}:  Enabling Data Intensive Science in the Wide Area ==


== REDDnet Science (Research Projects Using REDDnet) ==
[[Image:reddnetmap.gif|right|500px]]{{Template:REDDnet}} (Research and Education Data Depot network) is an NSF-funded infrastructure project designed to provide a large distributed storage facility for data intensive collaboration among the nation's researchers and educators in a wide variety of application areas. Its mission is to provide "working storage" to help manage the logistics of moving and staging large amounts of data in the wide area network, for example among collaborating researchers who are trying to move data from one collaborator (person or institution) to another, or who want to share large data sets for limited periods of time (ranging from a few hours to a few months) while they work on them. REDDnet is not designed or intended to replace reliable archival or long-term personal storage; users must make separate arrangements to ensure that the data they share via REDDnet's "best effort" storage is also preserved independently with stronger guarantees.
 
One example comes from the [http://cms.cern.ch/ CMS] collaboration, a high energy physics experiment that will be taking data soon at the Large Hadron Collider (LHC) at [http://public.web.cern.ch/ CERN].  Groups of researchers, distributed across the country and the world, will want to use data products derived from the raw data produced by collisions in the LHC for a variety of tasks, from calibrating the detector to searching for new physics.  They will want the newest data products available for anywhere from a month to a few months, after which these can be archived to make way for the next batch of data. Although all the data will be stored long term at CERN and [http://www.fnal.gov/ Fermilab], the researchers would benefit greatly if this data could be made more readily available for processing on their distributed computing infrastructure, especially on the [http://www.opensciencegrid.org/ Open Science Grid]. REDDnet is the kind of resource needed to deal with the data logistics of this application.
 
Another example, from the [http://www.americaview.org/ AmericaView] project, might occur in the aftermath of an earthquake in California or a hurricane on the Gulf Coast, when researchers across the country will want access to the geospatial image data from satellites covering the affected region.  For a few months after the event, this data could be uploaded to {{Template:REDDnet}} and made available to this community with much higher levels of performance and availability.
 
Initially, {{Template:REDDnet}} will deploy more than 700 terabytes of distributed storage with an emphasis on scalability, speed, and fault tolerance; as of Spring 2008, roughly 160 TB are deployed. For example, at the Supercomputing 2006 Conference in Tampa, Florida, {{Template:REDDnet}} demonstrated sustained transfers at a rate of 10 gigabits per second between Caltech and the convention floor. These transfers were limited by the bandwidth of the network connection. At the same conference, {{Template:REDDnet}} demonstrated fault tolerance by striping data across thirty depots and then successfully reading the data even after turning off nine of these depots.
 
== Research Projects Using {{Template:REDDnet}} ==


* [http://www.americaview.org/ AmericaView] - Satellite remote sensing data and technologies in support of applied research, K-16 education, workforce development, and technology transfer.
* [http://www.vanderbilt.edu/americas/English/pagemanager.php?page=Merin.php Retinopathy] - Diabetic Eye Disease Screening in Peru and Bolivia


== REDDnet Documentation ==
<br />
 
=== Online Documentation ===
 
* Documentation for all aspects of REDDnet needs to be completed and added or linked to this wiki.  Each item below gives the document to be developed, the core institution and person assigned to the task, and the deadline, separated by slashes.
** How to get started with L-Store/VU-person/March 31
** How to get started with LoDN/UTK-person/March 31
** L-Store/VU-person/March 31
** LoDN/VU-person/March 31
** IBP/VU-Alan/March 31
** Standard IO/UTK-person/March 31
** Data and Directory Services/VU-person/March 31
 
=== REDDnet RT Helpdesk ===
 
* Request Tracker for REDDnet to be set up by April 30th by VU/Mat.
 
=== Past Documentation ===
 
* [http://events.internet2.edu/2007/spring-mm/sessionDetails.cfm?session=3160&event=267 Network Storage Virtualization for Data Intensive Collaboration] Track Session at the [http://events.internet2.edu/2007/spring-mm/ Spring 2007 Internet2 Meeting] in Arlington, VA
** [http://mimir.accre.vanderbilt.edu/cgi-bin/public/DocDB/DisplayMeeting?conferenceid=8 Agenda and Talks]
 
* [http://mimir.accre.vanderbilt.edu/cgi-bin/public/DocDB/ShowDocument?docid=86 Three slide summary of L-Store, IBP, and REDDnet]
 
* [http://mimir.accre.vanderbilt.edu/cgi-bin/public/DocDB/ShowDocument?docid=73 REDDnet NSF MRI Proposal]
 
* [[L-Store Usage Instructions]] - setup, uploading, downloading, and several other L-Store options explained (includes [[LoRS Instructions]])
 
* [http://mimir.accre.vanderbilt.edu/cgi-bin/public/DocDB/ShowDocument?docid=84 L-Store Presentation at the University of Sao Paulo, July, 2006]
 
* [http://mimir.accre.vanderbilt.edu/cgi-bin/public/DocDB/ShowDocument?docid=82 L-Store Presentation at LBNL, Sept. 9, 2006]
 
*[[Protocol Standardization Efforts]] and development ideas
 
*'''The Vanderbilt/ACCRE Booth at [[SC06]] will highlight REDDnet technology'''
 
*'''Sign up for the [http://lists.accre.vanderbilt.edu/cgi-bin/mailman/listinfo/reddnet REDDnet Mailing List]'''
 
== Logistical Networking Software Development ==
*[[Protocol Standardization Efforts]] and development ideas
 
== Component Technologies and Partners ==
 
* [http://www.lstore.org/pwiki/pmwiki.php L-Store], the Logistical Storage project at ACCRE (Vanderbilt)
 
* [http://loci.cs.utk.edu/ LoCI], the Logistical Networking and Internetworking Laboratory at the University of Tennessee
 
* the [http://www.ultralight.org/ UltraLight] Project, an Ultrascale Information System for Data Intensive Research
 
* the Vanderbilt [http://www.vanderbilt.edu/americas/ Center for the Americas]
 
== REDDnet@Work ==
* [[REDDnet@I2: REDDnet Activities meeting, 21April08]] at the [http://events.internet2.edu/2008/spring-mm/ Spring 2008 Internet2 Member Meeting], Washington, DC.
* [[REDDnet at Work Page]] -- Organization, [[REDDnet Meetings and Minutes Page|Meeting Notes]], Work Plans, Events, etc.
* [http://mgmt.reddnet.org:8080/storcore/jsp/depotsPrintView.jsp?external=true REDDnet Depot Status Page]
*[[REDDnet Tools and Applications Meeting Spring 2007]]
*[[REDDnet Tools and Applications Meeting 2006]] December 4, 8:00am-5:00pm, Hyatt Regency McCormick Place, Chicago, IL. In coordination with the [http://events.internet2.edu/2006/fall-mm/index.html Fall 2006 Internet2 Member Meeting]


== Collaborators ==


=== Core Institutions ===
{{Template:REDDnet_Collaborators}}
 
<table width="600px" border=0 cellspacing="0" cellpadding="0">
<tr><td>
[[Image:vubw.jpg|center|Vanderbilt]]
</td><td>
[[Image:utorange.gif|70px|center|Tennessee]]
</td><td>
[[Image:SFA.gif|70px|center|Stephen F. Austin]]
</td><td>
[[Image:nevoa.png|60px|center|nevoa]]
</td><td>
[[Image:NCstate.gif|50px|center|N. C. State]]
</td><td>
[[Image:udel.gif|55px|center|Delaware]]
</td></tr>
<tr><td align="center">
Vanderbilt
</td><td align="center">
Tennessee
</td><td align="center">
S. F. Austin
</td><td align="center">
Nevoa Networks
</td><td align="center">
N. C. State
</td><td align="center">
Delaware
</td></tr>
 
</table><BR>
 
=== Collaborating Host Institutions ===
 
{| align="center"
|-
|[[Image:usp.gif|90px|center|USP]]
|[[Image:uerj.jpg|70px|center|UERJ]]
|[[Image:michigan.jpg|60px|center|Michigan]]
|[[Image:fermilab.gif|55px|center|Florida]]
|[[Image:fnal.gif|55px|center|Fermilab]]
|[[Image:citlogo.gif|55px|center|Caltech]]
|[[Image:AMPATH.gif|55px|center|AMPATH]]
|[[Image:FIU.gif|55px|center|FIU]]
|[[Image:Loc_small.png|55px|center|LOC]]
|[[Image:Ornl_small.png|55px|center|ORNL]]
|[[Image:Sdsc_small.png|55px|center|SDSC]]
|[[Image:Stanford_small.png|55px|center|Stanford]]
|[[Image:Ucsb_small.png|55px|center|UCSB]]
|-
|align="center"| S&atilde;o Paulo
|align="center"| Rio de Janeiro
|align="center"| Michigan
|align="center"| Florida
|align="center"| Fermilab
|align="center"| Caltech
|align="center"| AMPATH
|align="center"| FIU
|align="center"| Library of Congress
|align="center"| ORNL
|align="center"| SDSC
|align="center"| Stanford
|align="center"| UCSB
|}<br />
 
=== Survey for Collaborators/Application Community ===
 
*  The survey linked below was developed to understand the needs of the application community, so that realistic expectations can be established among the Core Institutions, the Collaborators, and other members of the application community.
 
*  Goal:  Have survey completed by all members of the application community by May 30
** [http://www.reddnet.org/REDDnet_Survey.doc click here for survey]


== Support ==


[[Image:NSF.gif|50px]]  <B>This work is supported by NSF Grant PHY-0619847 and by the Vanderbilt [http://www.vanderbilt.edu/americas/ Center for the Americas]</B>

Latest revision as of 22:04, 22 April 2009


REDDnet: Enabling Data Intensive Science in the Wide Area


REDDnet (Research and Education Data Depot network) is an NSF-funded infrastructure project designed to provide a large distributed storage facility for data intensive collaboration among the nation's researchers and educators in a wide variety of application areas. Its mission is to provide "working storage" to help manage the logistics of moving and staging large amounts of data in the wide area network, for example among collaborating researchers who are trying to move data from one collaborator (person or institution) to another, or who want to share large data sets for limited periods of time (ranging from a few hours to a few months) while they work on them. REDDnet is not designed or intended to replace reliable archival or long-term personal storage; users must make separate arrangements to ensure that the data they share via REDDnet's "best effort" storage is also preserved independently with stronger guarantees.

One example comes from the CMS collaboration, a high energy physics experiment that will be taking data soon at the Large Hadron Collider (LHC) at CERN. Groups of researchers, distributed across the country and the world, will want to use data products derived from the raw data produced by collisions in the LHC for a variety of tasks, from calibrating the detector to searching for new physics. They will want the newest data products available for anywhere from a month to a few months, after which these can be archived to make way for the next batch of data. Although all the data will be stored long term at CERN and Fermilab, the researchers would benefit greatly if this data could be made more readily available for processing on their distributed computing infrastructure, especially on the Open Science Grid. REDDnet is the kind of resource needed to deal with the data logistics of this application.

Another example, from the AmericaView project, might occur in the aftermath of an earthquake in California or a hurricane on the Gulf Coast, when researchers across the country will want access to the geospatial image data from satellites covering the affected region. For a few months after the event, this data could be uploaded to REDDnet and made available to this community with much higher levels of performance and availability.

Initially, REDDnet will deploy more than 700 terabytes of distributed storage with an emphasis on scalability, speed, and fault tolerance; as of Spring 2008, roughly 160 TB are deployed. For example, at the Supercomputing 2006 Conference in Tampa, Florida, REDDnet demonstrated sustained transfers at a rate of 10 gigabits per second between Caltech and the convention floor. These transfers were limited by the bandwidth of the network connection. At the same conference, REDDnet demonstrated fault tolerance by striping data across thirty depots and then successfully reading the data even after turning off nine of these depots.

Research Projects Using REDDnet

  • AmericaView - Satellite remote sensing data and technologies in support of applied research, K-16 education, workforce development, and technology transfer.
  • CMS - Elementary Particle Physics at the CERN Large Hadron Collider.
  • Structural Biology - Image reconstruction of large macromolecular assemblies through a collaborative effort of Vanderbilt and Lawrence Berkeley National Laboratory researchers.
  • Retinopathy - Diabetic Eye Disease Screening in Peru and Bolivia


Collaborators

Core Institutions


  • Vanderbilt
  • Tennessee
  • Stephen F. Austin
  • ORNL
  • Nevoa Networks
  • N. C. State
  • Delaware


Collaborating Host Institutions


  • USP (São Paulo)
  • UERJ (Rio de Janeiro)
  • Michigan
  • Florida
  • Fermilab
  • Caltech
  • AMPATH
  • FIU
  • Library of Congress
  • SDSC
  • Stanford
  • UCSB

Support

This work is supported by NSF Grant PHY-0619847 and by the Vanderbilt Center for the Americas.