Data safety and retention are key responsibilities of Research Computing. We address them in several ways that cover the needs of desktops/laptops, labs, and core facilities.
Data protection for desktops and laptops is described in the section entitled Desktop/laptop backups.
Research Computing’s infrastructure for lab data storage is a fleet of ZFS-based storage servers in a series named RC-STOR#. These are typically 250-terabyte servers that are configured in redundant pairs: a primary and a mirror. Each server is subdivided into a set of eight logical devices providing robust data protection, so if a hard drive should fail, data integrity will not be affected.
On the primary server, the file system creates snapshots, which are images of the storage at a point in time, similar to Time Machine and Windows backups. The servers are configured to create snapshots four times an hour, hourly, daily, weekly, and monthly. This provides a number of data restore options that a user can choose from if a current working file has been deleted, corrupted, or otherwise compromised.
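To illustrate how the snapshot tiers translate into restore options, the following minimal Python sketch computes, for a given moment when a file was damaged, the most recent snapshot time in each tier. The tier names come from the description above; the alignment rules (snapshots on the quarter hour, on the hour, at midnight, on Monday, and on the first of the month) and the function name `latest_snapshots` are assumptions made for illustration and may not match the actual server configuration.

```python
from datetime import datetime, timedelta


def latest_snapshots(incident: datetime) -> dict:
    """Return, per tier, the latest snapshot time at or before `incident`.

    The alignment of each tier is hypothetical: the service description
    lists the tiers but not the exact time at which each snapshot is taken.
    """
    minute = incident.replace(second=0, microsecond=0)
    # Assumed: quarter-hourly snapshots fire at :00, :15, :30, and :45.
    quarter_hourly = minute.replace(minute=minute.minute - minute.minute % 15)
    # Assumed: hourly snapshots fire at the top of the hour.
    hourly = minute.replace(minute=0)
    # Assumed: daily snapshots fire at midnight.
    daily = hourly.replace(hour=0)
    # Assumed: weekly snapshots fire at midnight on Monday.
    weekly = daily - timedelta(days=daily.weekday())
    # Assumed: monthly snapshots fire at midnight on the first of the month.
    monthly = daily.replace(day=1)
    return {
        "quarter_hourly": quarter_hourly,
        "hourly": hourly,
        "daily": daily,
        "weekly": weekly,
        "monthly": monthly,
    }


if __name__ == "__main__":
    # Example: a file was corrupted at 10:37:12 on 14 March 2024.
    for tier, taken_at in latest_snapshots(datetime(2024, 3, 14, 10, 37, 12)).items():
        print(f"{tier:>15}: {taken_at.isoformat()}")
```

Under these assumptions, the finer tiers give a restore point very close to the moment of damage, while the coarser tiers reach further back for problems that went unnoticed for days or weeks.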
In addition to the snapshots, each of the primary servers is copied to an equivalent ZFS server, the mirror. Each primary server is synchronized to its mirror daily. The mirroring server also maintains snapshots, so restoration is still possible during a catastrophic event on the primary server.
For future lab data storage, we are working toward a system where all servers will be archived for permanent retention of data.
To request network storage space, please fill out our form.
Center and core storage
To support DFCI’s research centers and cores, we have a large, petabyte-scale storage array supporting broad collaboration and data distribution. The system, named BIGZFS, is configured for high availability and maintains a backup scheme similar to that of lab storage.
For large projects and new instrumentation, please fill out our form to receive expanded storage capacity.
For new centers or cores coming online, please contact one of our directors to discuss the scope and impact of your services.
All of Research Computing’s storage systems are integrated to provide instant delivery of data with no duplication of large data sets. Data are conveniently stored, organized, and accessible so researchers do not need to download data or receive external media for data delivery.
If you represent a center or core and would like to participate in this data management system, please contact one of our directors. Please note that this option is only available for distribution to DFCI labs.
Collaborative storage solutions
For sharing and collaborating with internal or external research groups, we support a product called Pydio. Pydio is open-source software that turns any file system into a sharing platform. It is an alternative to Dropbox and other cloud storage options, with more control, safety, and privacy.
Pydio advantages for DFCI labs, centers, and cores:
- Free of charge
- Leverage your existing network storage resources
- Currently no limit on storage sizes
- Manage your own access
- Share files with external collaborators
If you currently use Dropbox, we encourage you to try out Pydio. To request that Pydio be configured for external file sharing, please fill out a storage request.