- 1 Test Documentation
- 2 Proposed Tests
- 3 Current Tests
- 4 Test Archive
- For each test:
- description of what you are testing
- estimated duration and expected results
- description of your test method (what you did)
- Stress test on SFASU
- What happens when depots start to get full? (Harold and Dan)
- depots should do low-level reclamation/resource recovery - Dan and Alan will do this test later this week
TSI simulation data using LoDN tools May '08 (Blondin/Sellers)
Depot Fill-up Test on Michigan-cap (Gonzales)
We've been meaning to test what depots do when they are full and users continue to request storage. Since Chris's tests with Dr. Blondin apparently got these depots close to full, I figured it was a natural time to see what happens when they are overfilled. The current plan is to use lodn_cp to send 40 GB files to all 4 depots and have the warmer replicate a copy to each. When the time comes to remove our data, this will also give us a good test of LoDN's best-effort deletion on the REDDnet depots.
Tests completed today (5/23/08) with the following results.
- Depots gave correct error codes after all available storage was occupied.
- LoDN delete freed up all the storage gobbled up by the test once the upload had been stopped.
All in all, both the depot and LoDN performed as hoped. The test did not exercise storage reclamation, which will have to be covered by a later test. Also, I think that repeating this test over a couple of fill and delete cycles would be a good way of coaxing out any storage leaks that may be occurring. Without objection, I will go through a few cycles of this over the weekend and post the results here early next week.
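One possible way to look for storage leaks across repeated fill and delete cycles is to record a depot's free space after every delete and compare it with a baseline taken before the first fill. The sketch below assumes those free-space readings have already been collected by some other means. The function name `find_leaks`, the tolerance, and the sample numbers are hypothetical and are not measurements from these tests.

```python
def find_leaks(baseline_bytes, free_after_delete, tolerance_bytes=0):
    """Return (cycle, lost_bytes) for each cycle whose free space after the
    delete falls short of the baseline by more than the tolerance."""
    leaks = []
    for cycle, free in enumerate(free_after_delete, start=1):
        lost = baseline_bytes - free
        if lost > tolerance_bytes:
            leaks.append((cycle, lost))
    return leaks


# Hypothetical readings for one depot, in bytes (illustrative only).
GB = 10**9
baseline = 500 * GB
readings = [500 * GB, 498 * GB, 496 * GB]

leaks = find_leaks(baseline, readings, tolerance_bytes=1 * GB)
for cycle, lost in leaks:
    print(f"cycle {cycle}: {lost / GB:.1f} GB not returned")
```

A shortfall that grows from cycle to cycle would point to a leak, while a constant offset is more likely ordinary bookkeeping overhead.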
Ongoing Usage Tests - Spring '08 (Sheldon)
These tests are continuous, although they can be stopped (just ask me).
Results are updated hourly.
Description of Tests
Currently all of these tests run on a handful of systems in the Vanderbilt Physics department. Our connection out is limited to 1 Gb/s, which is one potential bottleneck on the tests I can run (the other is the 622 Mb/s bandwidth of Vanderbilt's external connection).
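To put these two limits in perspective, the following rough calculation gives a lower bound on the time to move one 1.1 GByte file over each link. It is only a sketch: it assumes 1 GByte = 10^9 bytes, ignores protocol overhead and competing traffic, and the function name `min_transfer_seconds` is made up for illustration.

```python
def min_transfer_seconds(file_bytes, link_bits_per_second):
    """Lower bound on transfer time, ignoring overhead and contention."""
    return file_bytes * 8 / link_bits_per_second


FILE_BYTES = 1.1e9  # 1.1 GByte, assuming 1 GByte = 10**9 bytes

for label, rate in (("1 Gb/s", 1e9), ("622 Mb/s", 622e6)):
    seconds = min_transfer_seconds(FILE_BYTES, rate)
    print(f"{label}: at least {seconds:.1f} s per file, "
          f"at most {3600 / seconds:.0f} files per hour")
```

Under these assumptions a single file needs at least about 8.8 s on the 1 Gb/s link and about 14.1 s on the 622 Mb/s link, so measured rates well below these ceilings would suggest a bottleneck elsewhere.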
- Downloads of a large 1.1 GByte file
- I have loaded several 1.1 GByte files into the full REDDnet deployment of depots (40 depots or so).
- My tests continuously download one of these files and compare the checksum of the downloaded file to that of the original to check for bit rot (among other things).
- As soon as one download completes, another is started.
- A cron job is used to keep this process going.
- Several of these jobs are run on different systems in the physics cluster on the 9th floor of Stevenson Center.
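The checksum comparison in the download test could look roughly like the sketch below. It covers only the verification step, not the download itself. The choice of SHA-256 and the function names are assumptions, since the checksum algorithm the tests use is not stated here.

```python
import hashlib


def file_checksum(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash a file in chunks so a 1.1 GByte file never has to fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def download_is_intact(downloaded_path, expected_checksum, algorithm="sha256"):
    """True when the downloaded copy matches the checksum of the original."""
    return file_checksum(downloaded_path, algorithm) == expected_checksum
```

The checksum of each seeded file only needs to be computed once and stored; every later download is then checked against that stored value.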
- Uploads/downloads of a large 1.1 GByte file
- I initially download one of the several files that I seeded REDDnet with.
- Once I verify that the download checksum is equal to that of the original file, I upload it back to REDDnet, giving it a new name so it doesn't overwrite the original.
- After each upload/download cycle, the file's checksum is compared to the original.
- This process is kept going by a cron job.
- I count and plot the number of times files are successfully uploaded and downloaded, and add the number of downloads from the previous test, to get the total number of 1.1 GByte files moved each hour. This is plotted on the results webpage. Also shown are the number of times a move error occurs, and the average upload and download times for the 1.1 GByte files.
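The hourly numbers described above (files moved, move errors, and average upload and download times) could be derived from a log of individual transfers along these lines. This is a sketch only: the record layout `(timestamp, direction, seconds, ok)` and the function names are assumptions, not the format the actual test scripts use.

```python
from collections import defaultdict
from datetime import datetime


def mean_or_none(values):
    """Average of a list, or None when there is nothing to average."""
    return sum(values) / len(values) if values else None


def hourly_summary(records):
    """Group transfer records by the hour in which they finished.

    Each record is (timestamp, direction, seconds, ok), where direction is
    "upload" or "download". This layout is assumed for illustration.
    """
    hours = defaultdict(
        lambda: {"moved": 0, "errors": 0, "upload": [], "download": []}
    )
    for stamp, direction, seconds, ok in records:
        bucket = hours[stamp.replace(minute=0, second=0, microsecond=0)]
        if ok:
            bucket["moved"] += 1
            bucket[direction].append(seconds)
        else:
            bucket["errors"] += 1

    return {
        hour: {
            "moved": bucket["moved"],
            "errors": bucket["errors"],
            "avg_upload_s": mean_or_none(bucket["upload"]),
            "avg_download_s": mean_or_none(bucket["download"]),
        }
        for hour, bucket in sorted(hours.items())
    }
```

Failed transfers are counted as errors but left out of the timing averages, so a run of timeouts does not distort the average transfer time.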
- Recursive uploads/downloads of a directory.
- I use a recursive download to download a directory that was originally seeded on the REDDnet depots. I compare this directory to the original. If it is OK, I upload to a new temporary directory, download it again, and compare again.
- The upload/download cycle is repeated several times, and is kept going by a cron job.
- Statistics for these movements are shown on my results webpage.
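For the recursive test, comparing a downloaded directory with the original might be done by checksumming every file in both trees, roughly as sketched below. SHA-256 and the function names are assumptions. Empty directories are ignored, because only files are hashed.

```python
import hashlib
import os


def tree_checksums(root):
    """Map each file's path relative to root to its SHA-256 digest."""
    checksums = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            digest = hashlib.sha256()
            with open(full, "rb") as handle:
                for chunk in iter(lambda: handle.read(1 << 20), b""):
                    digest.update(chunk)
            checksums[os.path.relpath(full, root)] = digest.hexdigest()
    return checksums


def compare_trees(original, downloaded):
    """Return (missing, extra, changed) relative paths; all empty means OK."""
    a = tree_checksums(original)
    b = tree_checksums(downloaded)
    missing = sorted(set(a) - set(b))   # in the original only
    extra = sorted(set(b) - set(a))     # in the download only
    changed = sorted(p for p in set(a) & set(b) if a[p] != b[p])
    return missing, extra, changed
```

Reporting missing, extra, and changed files separately makes it easier to tell an incomplete transfer apart from corrupted content.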
- provide an ongoing "heartbeat" for REDDnet (is it working?)
- tests for bit rot in depots (which would result in regular failures).
- provide some feedback regarding performance (which would result in changes in download/upload times).
- provide some feedback on failure rates...
- Results are updated hourly.