Mass Storage System - Gyrfalcon
Solving the world's energy needs can involve working with massive amounts of data. Getting that data to the HPC systems can take considerable effort. Once the data is on the system, you'll want to keep it close by, particularly when you need to analyze it over and over again. To help solve this mass data handling problem, NREL's Scientific Computing facility has installed a mass storage system where scientists can easily store and retrieve data critical to their work.
Components of NREL's mass storage environment
The mass storage system is designed to keep the most frequently used data quickly accessible while economically storing data that is accessed less often. It does this by keeping the freshest working data on high-performance disks, while software algorithms quietly move older data to economical tape storage with the help of a large robotic tape library and a series of high-performance tape drives.
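The disk-to-tape tiering idea described above can be illustrated with a minimal sketch. It is not the site's actual migration algorithm: the `FileRecord` type, the `migrate_cold_files` function, and the 30-day idle threshold are all hypothetical, chosen only to show how an age-based policy might decide which files leave disk.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400


@dataclass
class FileRecord:
    """Hypothetical catalog entry; not a real storage-manager structure."""
    path: str
    last_access: float  # POSIX timestamp of the most recent access
    tier: str = "disk"  # either "disk" or "tape"


def migrate_cold_files(files, now, max_idle_days=30):
    """Mark disk-resident files idle longer than max_idle_days as tape-resident.

    The 30-day default is an assumed value for illustration only.
    Returns the paths of the files that were moved.
    """
    cutoff = now - max_idle_days * SECONDS_PER_DAY
    moved = []
    for record in files:
        if record.tier == "disk" and record.last_access < cutoff:
            record.tier = "tape"
            moved.append(record.path)
    return moved


# Example catalog: one file untouched for 90 days, one used 2 days ago.
now = 100 * SECONDS_PER_DAY
catalog = [
    FileRecord("run_001.h5", last_access=now - 90 * SECONDS_PER_DAY),
    FileRecord("run_002.h5", last_access=now - 2 * SECONDS_PER_DAY),
]
moved = migrate_cold_files(catalog, now)
print(moved)
```

In this sketch only the file that has been idle for 90 days is moved; the recently used file stays on disk. A production hierarchical storage manager typically weighs more than idle time, such as file size and available disk capacity.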
The mass storage system is built around four main components: high-speed near-line disk storage, the tape library, high-speed network interconnectivity, and the management hardware and software that direct the whole process.
The high-performance disk storage portion of the mass storage system is based on a cluster of Oracle 7420 ZFS disk storage appliances. Each appliance combines 7,200 RPM and 15K RPM SAS-2 drives with high-performance SSD storage.
The base tape library is an Oracle StorageTek SL3000 tape management enclosure. It has 760 physical tape slots, which leaves room for growth beyond the 400 tapes initially loaded. The robot feeds tapes into its six tape drives, which work in parallel to handle all tape read and write requests.
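As a quick worked example of the growth headroom, the figures above can be turned into a free-slot count and a utilization percentage. This assumes that 760 is the total number of slots in the enclosure and that each loaded tape occupies exactly one slot; the variable names are illustrative.

```python
TOTAL_SLOTS = 760   # physical tape slots (assumed to be the enclosure total)
LOADED_TAPES = 400  # tapes initially loaded

free_slots = TOTAL_SLOTS - LOADED_TAPES
utilization = LOADED_TAPES / TOTAL_SLOTS

print(f"{free_slots} free slots, {utilization:.1%} of slots in use")
# prints: 360 free slots, 52.6% of slots in use
```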
The disk storage appliances and the tape library are interconnected by bonded Fibre Channel links through a set of Brocade 5100 SAN switches. Clients (such as the Peregrine HPC system, standalone servers, and the Windows HPC system) access the mass storage system over multiple 10 Gigabit Ethernet connections.
Management and client access
All of those connections, and the paths used to reach the data, need to be managed with some pretty sophisticated gear and software. NREL uses SAM-QFS to orchestrate the data management processes. SAM-QFS is an integrated hierarchical storage manager (HSM) and storage area network (SAN) file system that runs on a number of control and client servers. In concert with Oracle StorageTek's Automated Cartridge System Library Software (ACSLS), SAM-QFS handles the workflow and manages the policies that efficiently and economically store our scientific data.