New Hampshire EPSCoR Research Infrastructure Improvement (RII)
Data Management Plan - Version 0.2 (Revised March 2014)
The following management plan has been developed for the New Hampshire NSF EPSCoR Research Infrastructure Improvement (RII) grant EPS-1101245. It will be reviewed annually by the Cyberinfrastructure (CI) Team in consultation with the RII Scientific Leadership Team (SLT). Recommended revisions will be submitted to the RII Management Team for final approval.
As beneficiaries of and contributors to the EPSCoR Program, we recognize that we are a team of scientists and administrators with a collective obligation to maximize the utility of the granted resources through 1) active communication between project leaders, teams, and the cyberinfrastructure (CI) team; 2) collaboration to efficiently allocate and share resources (e.g., personnel, equipment, and software); and 3) recognition that raw data and subsequent data products generated within EPSCoR projects are a common good, and that priority should be given to providing mechanisms for access and sharing with the research community and the public as rapidly and openly as possible, according to the policies outlined below.
For the purpose of this document we have defined “research data” based on the definition proposed by the National Academy of Sciences, National Academy of Engineering, and Institute of Medicine (2009) in “Ensuring the Integrity, Accessibility, and Stewardship of Research Data in the Digital Age.” They define research data as “information used in scientific, engineering, and medical research as inputs to generate research conclusions.” It includes raw data (observations), processed data (models and simulations), published data, and archived data, gathered for research purposes as well as for other purposes and then subsequently used to support research.
UNH Research Computing and Instrumentation (RCI) will provide Information Technology (IT) infrastructure support for the scientific research. Located in Morse Hall, UNH RCI has 12 full-time staff members who will provide system and network administration, database administration, security, and analytical/scientific support to the research effort. Additional IT management and technology support will be provided by the UNH Information Technology data center, located at 1 Leavitt Lane.
Automated backups will be performed using the Legato Networker software, writing to a robotic tape library unit with four AIT-3 drives located in the Morse Hall data center. Monthly full backups and daily incremental backups will be retained for 90 days. For systems requiring offsite storage, BackupPC software will be used for disk-to-disk backups that are mirrored to the UNH IT data center and, across the University System of New Hampshire wide area network, to Keene State College and Plymouth State University.
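The retention rule above can be illustrated with a short sketch. This is not the Legato Networker or BackupPC configuration, which the plan does not describe; it only restates the 90-day retention window in code. The function names are hypothetical, and scheduling the monthly full backup on the first day of the month is an assumption made for illustration.

```python
from datetime import date, timedelta

# From the plan: full and incremental backups are retained for 90 days.
RETENTION_DAYS = 90


def backup_type(day: date) -> str:
    """Return the kind of backup taken on a given day.

    Assumption (not stated in the plan): the monthly full backup runs
    on the 1st of the month; every other day gets an incremental.
    """
    return "full" if day.day == 1 else "incremental"


def is_expired(backup_day: date, today: date) -> bool:
    """A backup expires once it is more than RETENTION_DAYS old."""
    return (today - backup_day) > timedelta(days=RETENTION_DAYS)
```

For example, under these assumptions a backup taken on 1 January 2014 would still be retained on 1 April 2014 (exactly 90 days later) and would become eligible for removal the following day.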
The project’s data archiving, display, and visualization deliverables will be served from dedicated equipment, including multiple servers and data storage technologies. A robust data management system will be implemented, based on open-source solutions, to provide access to the data sets that drive environmental analyses, assessments, and models, and that cover multiple spatial scales, temporal scales, thematic content, and formats. The existing Earth Systems Science (ESS) data archive at the UNH Institute for the Study of Earth, Oceans, and Space, coupled with the NH GRANIT statewide GIS Clearinghouse, will serve as the framework for the system. These data archive and distribution systems currently house over 22 TB of remotely sensed data and vector spatial data, and rely on a hierarchical directory structure on RAID level 5 disks for fast, efficient, and safe data storage and access among all users. They will be enhanced to accommodate new data streams generated by project researchers and partners.
It is the responsibility of individual researchers to contribute their raw data in a timely manner as explicitly described below. The institutional leads from the Scientific Leadership Team (SLT) are responsible for monitoring and following up with individual researchers who need to provide the data. The State EPSCoR Director has the responsibility for enforcement. Any barriers will be clarified and documented by the institutional lead and approved or denied by the EPSCoR Director. If the EPSCoR Director is unable to resolve the issue, the EPSCoR Director will communicate the problem to the appropriate Vice President of Research at the institution where the researcher resides.
New Hampshire EPSCoR adopts the following policy for the timeliness of data sharing:
Raw, unprocessed data: Researchers will catalog and deposit initial data sets, defined as raw data sets that have minimal quality control flags assigned, and corresponding metadata in a tiered-access repository approved by the SLT no later than six months from the time of acquisition or collection. The repository will comprise three tiers: the general user tier, the EPSCoR research team tier, and the administrative tier. The initial data will be available for download to the EPSCoR research team and administrative tiers only. The PI designated as the prime contact for each data set will be notified via email when requests for raw data are submitted.
Derived, processed data products: All derived and processed data products and corresponding metadata will be deposited into the repository no later than two years from the time of their creation. Researchers may deposit their processed data at any point before the two-year limit and opt to restrict access to EPSCoR researchers. After the two-year period, the data will be available to all tiers within the repository.
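The deposit deadlines and tier rules in the two policies above can be summarized in a minimal sketch. It is an illustration only, not a description of the repository software. All names are hypothetical. Two points are assumptions rather than statements from the plan: that the administrative tier can always download restricted products, and that "after the two-year period" means strictly later than the two-year anniversary of creation.

```python
import calendar
from datetime import date
from enum import Enum


class Tier(Enum):
    GENERAL = "general user"
    RESEARCH_TEAM = "EPSCoR research team"
    ADMINISTRATIVE = "administrative"


def add_months(d: date, months: int) -> date:
    """Shift a date by whole months, clamping to the last day of the month."""
    index = d.month - 1 + months
    year, month = d.year + index // 12, index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def deposit_deadline(kind: str, start: date) -> date:
    """Raw data: six months from acquisition. Derived data: two years from creation."""
    months = {"raw": 6, "derived": 24}[kind]
    return add_months(start, months)


def can_download(kind: str, tier: Tier, created: date, today: date,
                 restricted: bool = True) -> bool:
    """Decide whether a user in `tier` may download a data set.

    Assumption: the research team and administrative tiers always have access.
    """
    if tier in (Tier.RESEARCH_TEAM, Tier.ADMINISTRATIVE):
        return True
    if kind == "raw":
        # Raw data are never offered to the general user tier.
        return False
    # Derived products open to the general tier once the two-year period
    # has passed, or earlier if the researcher did not restrict access.
    return (not restricted) or today > add_months(created, 24)
```

Under this sketch, raw data collected on 31 August 2013 would be due in the repository by 28 February 2014, and a restricted derived product created on 1 January 2012 would become visible to general users after 1 January 2014.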
Citations for data sets will be provided by the prime contact for each data set, and will be made available for all downloaded data sets.
All data, both raw and derived, will be annotated and cataloged with appropriate metadata. New Hampshire EPSCoR will adopt metadata standards that are compliant with one or more of the recognized national (or international) standards, such as ISO 19115-2, FGDC, Dublin Core, etc. Upon request, the CI Team will provide necessary documentation, coordination, and training for effective use of these metadata standards.
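As a small illustration of what a standards-based record might look like, the sketch below builds a simple Dublin Core record using only the Python standard library. Dublin Core is chosen here because it is the simplest of the standards named above; the plan does not specify which standard or record layout the repository will use. The wrapper element name `metadata`, the function name, and all field values are hypothetical placeholders.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# The fifteen elements defined by the Dublin Core element set.
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def dublin_core_record(fields: dict) -> str:
    """Serialize a mapping of Dublin Core element names to an XML string."""
    root = ET.Element("metadata")  # hypothetical wrapper element
    for name, value in fields.items():
        if name not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {name}")
        ET.SubElement(root, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


# Placeholder values for illustration only; not a real data set.
record = dublin_core_record({
    "title": "Example stream temperature observations",
    "creator": "Example Researcher",
    "date": "2014-03-01",
    "type": "Dataset",
    "rights": "Restricted to the EPSCoR research team tier",
})
```

Richer standards such as ISO 19115-2 or FGDC carry far more structure (spatial extent, lineage, instrument details), so a record like this would be a starting point for discovery rather than a complete description.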
Researchers have a right to expect 1) clear procedures for cataloging and depositing their data; 2) unrestricted access to edit and modify their data or records about their data (i.e., metadata); 3) the availability of tools for cataloging and depositing their data or, in the absence of tools, staff who can assist them with data curation; 4) a secure and well-maintained repository for their data; 5) adequate mechanisms for data discovery and access by others; 6) first rights, as creators of data, to analyze and publish those data; and 7) credit for their data contributions.
After data have been deposited and cataloged in an approved repository, users of the data should appropriately acknowledge the National Science Foundation, New Hampshire EPSCoR, and the individual investigators responsible for the data set. Any use of data provided by New Hampshire EPSCoR must acknowledge New Hampshire EPSCoR and the funding source(s) that contributed to the collection of the data. Any restrictions on the usage of the data shall be clearly stated and obvious to the potential data user. Until systems are integrated, individual repositories may set their own citation policies. These policies should be easily discoverable and honored by data users.
Protection of Personal Financial Information: The Federal Trade Commission’s Safeguards Rule and the Financial Services Modernization Act of 1999, also known as the Gramm-Leach-Bliley Act (GLBA), require institutions of higher education to implement administrative, technical, and physical safeguards for certain types of nonpublic personal financial information. Therefore, all deposited research data shall be kept separate from any data with nonpublic personal, financial, or confidential content.
Personal Medical Information: All New Hampshire EPSCoR researchers must comply with all applicable Federal and State regulations concerning the privacy and security of personal medical information, including but not limited to the Health Insurance Portability and Accountability Act (HIPAA) and the Health Information Technology for Economic and Clinical Health Act (HITECH).
Finally, we are sensitive to the potential effect that these policy statements may have throughout academia. We are particularly aware of potential deleterious consequences to junior academic research faculty if the data they collect are not adequately cited and if institutional consideration is not given to these contributions with respect to tenure and promotion. To promote the new paradigm for rapid and open access to research data being advocated by NSF, the need for additional institutional policies is being assessed by the Senior Administrations, and the Vice Presidents of Research (VPRs) are evaluating the consistency of policies across all research institutions as part of the State Science and Technology Plan (currently in progress).