This summer, I will be doing my internship at ACCRE (Advanced Computing Center for Research & Education).

I will be working on L-Store, a new high-performance distributed file system (DFS) for data archiving. My primary project will be to replace its socket-based networking layer with a distributed communication middleware layer based on CORBA. My research goals are to explore new ideas for application-level fault tolerance, data placement and caching, high-bandwidth data transfer optimizations, and deployment and configuration in systems like this one.

This work will help the DOC group diversify into the area of Cluster and Grid computing and storage area networks (SAN).

My mentors here will be Dr. Alan Tackett and Larry Dawson.

The following is a brief summary of the technologies being used and researched:

REDDnet: Researchers at ACCRE have developed a network of data storage clusters at nine universities and institutions across the Americas, including Vanderbilt and Fermilab. It’s called the Research and Education Data Depot Network, or REDDnet, and it works as a Grid computing system, with each institution acting as a separate storage drive. When a file is saved to the system, it’s split into many smaller pieces that are distributed among the institutions. Later, when a scientist retrieves that file, the pieces travel back in parallel from the different locations and are reassembled. Because the small chunks are transferred concurrently, they load into the researcher’s local system much faster than one large file would from a single location, greatly speeding access.
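To make the split-and-reassemble idea concrete, here is a minimal sketch in Python. It is only an illustration and not REDDnet's actual code: the depots are plain in-memory dictionaries, and the names `split_into_blocks`, `store`, `retrieve`, and `BLOCK_SIZE` are hypothetical. Blocks are placed round-robin and fetched concurrently, then joined in their original order.

```python
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 4  # bytes per block; tiny so the example is easy to follow


def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Cut the file contents into fixed-size blocks (the last may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def store(data: bytes, depots: list[dict]) -> list[tuple[int, int]]:
    """Place block i on depot (i mod N); return one (depot, key) entry per block."""
    layout = []
    for index, block in enumerate(split_into_blocks(data)):
        depot_id = index % len(depots)
        depots[depot_id][index] = block  # stand-in for a real network upload
        layout.append((depot_id, index))
    return layout


def retrieve(layout: list[tuple[int, int]], depots: list[dict]) -> bytes:
    """Fetch all blocks concurrently, then join them in their original order."""
    def fetch(location: tuple[int, int]) -> bytes:
        depot_id, key = location
        return depots[depot_id][key]  # stand-in for a real network download

    with ThreadPoolExecutor(max_workers=len(depots)) as pool:
        blocks = list(pool.map(fetch, layout))  # map() preserves input order
    return b"".join(blocks)


depots = [{} for _ in range(3)]  # three in-memory stand-ins for institutions
original = b"research data set"
layout = store(original, depots)
restored = retrieve(layout, depots)
```

In a real deployment the speedup comes from the transfers overlapping on the network; the thread pool here only mimics that shape.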

L-Store: L-Store implements a complete virtual file system using Logistical Networking (LN) as the underlying abstraction of distributed storage and the Chord Distributed Hash Table (DHT) [Dabek2001], developed for peer-to-peer systems, as a scalable mechanism for managing metadata. L-Store is designed to provide: virtually unlimited scalability in both raw storage and associated file system metadata; a decentralized management system; security; role-based authentication and authorization; policy-based data management; fault-tolerant metadata support; user-controlled replication and RAID-like striping of data on a file and directory level; scalable performance in both raw data movement and metadata queries; a virtual file system interface in both a web and command-line form; and support for the concept of geographical locations for data migration to facilitate quicker access.
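As a rough illustration of how a Chord-style DHT can spread metadata over several servers, the sketch below hashes both server names and file paths onto one identifier ring and assigns each path to its successor on that ring. This is a simplification and not L-Store's implementation: real Chord resolves lookups through finger tables in O(log N) hops, while this sketch keeps the whole ring in one sorted list. The class `MetadataRing`, the server names, and the 16-bit ring size are all assumptions made for the example.

```python
import hashlib
from bisect import bisect_left

RING_BITS = 16  # small identifier space for readability; assumption for this sketch


def ring_id(name: str) -> int:
    """Hash a name onto the identifier ring [0, 2**RING_BITS)."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** RING_BITS)


class MetadataRing:
    """Hypothetical ring of metadata servers with successor-based lookup."""

    def __init__(self, servers: list[str]) -> None:
        self.nodes = sorted((ring_id(s), s) for s in servers)

    def add(self, server: str) -> None:
        """Join a server; only keys between it and its predecessor move to it."""
        self.nodes = sorted(self.nodes + [(ring_id(server), server)])

    def remove(self, server: str) -> None:
        """Drop a server; its keys fall through to the next server on the ring."""
        self.nodes = [node for node in self.nodes if node[1] != server]

    def successor(self, key: str) -> str:
        """Return the first server whose ID is at or after the key's ID."""
        ids = [node_id for node_id, _ in self.nodes]
        position = bisect_left(ids, ring_id(key))
        return self.nodes[position % len(self.nodes)][1]  # wrap around the ring


ring = MetadataRing(["meta-a", "meta-b", "meta-c"])
path = "/home/user/data.bin"
owner = ring.successor(path)
```

The useful property is that adding or removing one server only reassigns the keys adjacent to it on the ring, which is what makes changing the set of metadata servers at run time practical.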

  • Provides a file system interface to (globally) distributed storage devices (“depots”).
  • Parallelism for high performance and reliability.
  • Uses IBP (Internet Backplane Protocol, from UTenn) for data transfer & storage service.
    • Generic, high-performance, wide-area-capable storage virtualization service
    • Write: break file into blocks, upload blocks simultaneously to multiple depots (reverse for reads)
  • L-Store uses a Chord-based DHT implementation to provide metadata scalability and reliability
    • Multiple metadata servers increase performance and fault tolerance
    • Real-time addition/deletion of metadata server nodes allowed
  • L-Store supports Weaver erasure encoding of stored files (similar to RAID) for reliability and fault tolerance.
    • Can recover files even if multiple depots fail.
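
To show why redundant encoding lets a file survive a depot failure, here is a minimal sketch using plain XOR parity, the idea behind RAID-style parity. It is deliberately not Weaver encoding: a single parity block can repair only one lost block, whereas the encoding L-Store uses is meant to survive multiple depot failures. The function names `xor_blocks`, `encode`, and `recover` are hypothetical, and all blocks are assumed to have equal length.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode(data_blocks: list[bytes]) -> list[bytes]:
    """Append one parity block; each entry would live on a different depot."""
    return data_blocks + [xor_blocks(data_blocks)]


def recover(stored: list) -> list[bytes]:
    """Rebuild at most one missing block (marked None) and return the data blocks."""
    missing = [i for i, block in enumerate(stored) if block is None]
    if len(missing) > 1:
        raise ValueError("single parity can repair only one lost block")
    repaired = list(stored)
    if missing:
        survivors = [block for block in stored if block is not None]
        # XOR of all surviving blocks (data + parity) equals the lost block.
        repaired[missing[0]] = xor_blocks(survivors)
    return repaired[:-1]  # drop the parity block


data_blocks = [b"AAAA", b"BBBB", b"CCCC"]
stored = encode(data_blocks)
stored[1] = None  # simulate the depot holding block 1 going offline
recovered = recover(stored)
```

Codes that tolerate several simultaneous failures store more than one redundancy block per stripe, at the cost of extra storage and more work on each write.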