Let’s Start at the Very Beginning, a Very Good Place to Start
This project began with the Next Generation Repository Planning Project, launched in December 2014 as a collaboration between IT and Libraries. That project’s primary goal was to create a roadmap for central repository services at NYU. To do so, it studied the national landscape to see how other institutions are addressing the needs of researchers and librarians across the research data lifecycle, and it surveyed the latest technological and organizational approaches being employed and explored. This planning work was completed in July 2015, and it informs our approach to the Digital Repository Services for Research Project.
Why Repository Services? Why Now?
Providing support services for NYU research is a key responsibility for the Libraries and IT. High-quality infrastructure services enable scholars to focus on the research work itself. In scientific and humanities research, there are growing needs for storing, moving, finding, sharing, and accessing digital assets. The research process moves through a series of sequentially related stages or phases – outlined in the chart below – in which information is produced, processed, and shared. In recent years, the volume of that information has grown rapidly.
Figure created by Vicky Steeves, New York University
The lines between circles represent the transitions that occur in research as work is finished and passed to the next stage. (Note: The term “data” here is used generally for any digital content that serves as raw material for research.) It is critical for the institution to support each stage of the lifecycle and to facilitate the transitions between stages for maximum use and impact.
NYU’s commitment to growth as a research institution, coupled with the growing data needs in the research process, drives the need to deliver strong services and support for storing and managing digital content within and between each stage of the research data lifecycle. By 2014, several internal and external changes had occurred that warranted taking a closer look at the existing support environment for research data within IT and the Libraries to assess the current state and evaluate how to improve it. These changes included:
- Data sharing requirements for grant funding
Increasing numbers of granting agencies now require investigators to share the primary data, samples, physical collections, and other supporting materials created or gathered in the course of funded research work.
- Data explosion, need for safe backup
With the huge increase in the amount of data used in research, large file storage and backup is a critical need.
- Moving and sharing files
As data files become larger and more widely used, researchers are spending more time managing files (moving and sharing), often at the expense of focusing on their subject work. Improved, integrated tools are needed to make these tasks easier and more efficient.
- Upgrades needed for existing repository services
Existing NYU repository services (e.g. Faculty Digital Archive and Spatial Data Repository) provide valuable functionality for preserving, sharing, and citing digital assets and for making them discoverable. However, the technology design of the current services cannot accommodate the expanding requirements for preserving and sharing digital content.
- Growth in Libraries’ digital collecting
The Libraries’ digital collection is growing at a rapid rate. Meanwhile, scholars’ expectations for gaining access to digital materials quickly and easily are increasing.
Rather than addressing these issues individually, Libraries and IT teamed up to take a holistic view of the research data lifecycle. The aim was to consider interconnected services and environments that would benefit everyone involved in these facets of research, including librarians, professional support staff, technical staff, and the researchers themselves.
Sounds Good, But How Do We Create That?
This project proposes the creation of an integrated model for digital repository services. In addition to creating a scalable foundation, the integrated model builds on existing strengths in our working relationship between Libraries and IT and offers the most robust service model. The core of this project is building a shared portfolio of four research-related storage service bands that are distinct from a technical perspective but will satisfy the needs of all dimensions of the research data lifecycle seamlessly for users. One or more of these four bands may encompass multiple end-user services, but they will be unified internally, i.e. each band will share an overarching design and will use a shared foundational infrastructure where possible.
Additionally, the four service bands will be planned so they are interconnected from an architectural and support perspective. The end-user services within the four bands will be used by library curatorial staff in building collections as well as by researchers throughout all stages of their research process.
It is these services that will collectively be referred to as the Digital Repository Services for Research, or DRSR.
The four service bands will provide:
- Temporary storage for data analysis requiring very fast access to data, including large files
- A storage environment designed for ongoing activities
- A feature-rich publication environment with preservation and curation options available
- An environment for working with high-security restricted data
Where the Rubber Meets the Road: Working Groups
To develop these four service bands, we created four working groups. These groups will collaborate on determining the portfolio of end-user services that rely on each of Service Bands 1-4, and define the underlying infrastructure model for the bands. They will also deliver the following:
- Architectural Working Group: determine the design of each of the four service bands plus the overarching design and interconnectedness.
- Technical Working Group: make software and technology recommendations for fulfilling the Architectural Working Group’s design recommendations.
- Functional Validation & Prioritization Working Group: represent user needs to development teams, elaborate on functional specs, finalize requirements, recommend priorities, prepare for transition to live service.
- Policy Working Group: identify policy issues and needs around each of the services. (Note: this work may feed additional requirements to the Functional Validation & Prioritization Working Group.)
For more information on all the teams associated with the DRSR project, see Meet the Teams.