Monthly Archives: November 2010

What we’ve been up to

This has been a busy year for us and we have taken a multi-pronged approach to meet our programmatic goals. 


1. Digital library platforms

One of the first tasks assigned to our new architect and curator was an assessment of our current digital library platforms (contentDM, Olive, ETD system, and DPubs). This analysis highlighted gaps in our application suite and also enabled the architect and curator to quickly gain an understanding of what that suite looked like. Following the assessment, recommendations were made to decommission one of these applications (DPubs), migrate another to a new platform (ETDs), and improve the functionality of the two remaining (contentDM and Olive).


The results of the platform review reinforced our determination to develop a new architecture and platform to support needs not currently being met – foremost among these the deposit of scholarly content, research data, and electronic business records from the University Archive. We initiated the Curation Architecture Prototype Service (CAPS) project this month with a projected four-month period for the prototype phase. The platform is based on a service-oriented architecture model, and entails the development of “microservices” – atomistic services that each support a single function such as “ingest,” “store,” “replicate,” or “annotate.” The curation microservices approach was developed by the California Digital Library, national leaders in the digital library domain, and is gaining adoption at a steady pace.
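For readers who think in code, here is a minimal sketch of what "atomistic" means in practice. It is purely illustrative: the function names mirror the four examples above, but the signatures, the DigitalObject class, and the in-memory dictionaries standing in for storage are all assumptions made for this post, not the CAPS design or the California Digital Library's specifications.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class DigitalObject:
    """Hypothetical record: a deposited item plus what the services add to it."""
    identifier: str
    content: bytes
    checksum: str = ""
    annotations: list = field(default_factory=list)


def ingest(identifier, content):
    """Accept a deposit and record a SHA-256 checksum for later fixity checks."""
    return DigitalObject(identifier, content, hashlib.sha256(content).hexdigest())


def store(obj, storage):
    """Write the object's bytes to one storage location (a dict stands in for disk)."""
    storage[obj.identifier] = obj.content
    return obj


def replicate(obj, source, replicas):
    """Copy the stored bytes to each replica and verify every copy's checksum."""
    for replica in replicas:
        replica[obj.identifier] = source[obj.identifier]
        copied = hashlib.sha256(replica[obj.identifier]).hexdigest()
        if copied != obj.checksum:
            raise ValueError("replica failed fixity check")
    return obj


def annotate(obj, note):
    """Attach a free-text note to the object."""
    obj.annotations.append(note)
    return obj


# Each service does one job; a workflow is just a chain of them.
primary, mirror = {}, {}
record = ingest("example-0001", b"sample deposit")
record = store(record, primary)
record = replicate(record, primary, [mirror])
record = annotate(record, "sample note")
```

The point of the sketch is the shape rather than the details: because each service is small and independent, any one of them can be replaced or extended without touching the others.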


2. Research Data

The UL and ITS are currently creating a Data Curation Services Working Group to develop data management services to support our research enterprise and ensure compliance with federal mandates. This effort will initially target pilot projects in three areas: NSF data management requirements; curation services consultation; and research data demonstration projects. We envision using the CAPS platform for the last of these.


3. National engagement

In addition to our work at home, we have taken deliberate steps to develop strategic partnerships in our domain at the national level. This fall, with the California Digital Library, we co-sponsored a two-day workshop at UC Berkeley on microservice development. We had participants from over 30 institutions, and received invitations to repeat the workshop at other conferences. We will be co-leading the same workshop at the upcoming International Digital Curation Conference in Chicago in December. These events have successfully put Penn State’s name on the map for curation architecture and technology development, and the relationships formed will be very useful as we move forward with the implementation of curation services.


4. Governance

Two new groups have been formed within the UL to deal respectively with governance (the Content Stewardship Council) and operations (the Digital Operations Team) of our projects. A third group, which will have an advisory and consultative role, is in the process of being formed – more to follow on that.

And there’s been a lot more happening but I’ll leave it to other team members to take it from here (you know who you are).


Welcome to the Content Stewardship Blog!

Welcome to the blog for the University Libraries and ITS’ joint Content Stewardship program.

In April 2009, I blogged about this program on the DLT blog — eContent Stewardship Program: Part 1 Setting the Scene (note to self, don’t say Part 1 if it’s not followed by Part II, III, etc.). I never did get back to that story and my excuse is a good one: we’ve been extraordinarily busy with the groundwork it’s taken to get the program going. However, all that work is paying dividends now and here’s the update.

A bit of background first: In our current strategic plans, ITS and the University Libraries (UL) have committed to joint development of an institutional Content Stewardship program. (Originally titled “Cyberinfrastructure, e-Content and Data Stewardship Program,” the program name is now abbreviated to “Content Stewardship.”) The goals of the program are to provide “a cohesive suite of access, security, discovery, preservation, curation, repository, archival, and storage services for born digital data” to meet existing and emerging needs in research, scholarly communications, archives, and digital library collections. There were several reasons for our wanting to take a programmatic and comprehensive approach:


  • The first of these was the state of our existing digital library ecosystem: a series of stovepipe applications, each dedicated to unique needs – an application for ETDs, another for scholarly publication, another for digitized newspapers, for example – making them difficult to search across and resource-intensive to develop and maintain. These applications had typically gone through a lot of customization, adding to the support burden, and their capacity to meet new requirements was limited.

  • To a large degree, our existing applications support discovery and access but do not address digital preservation needs – the management of the digital object over time. The storage model for our digital library collections has also not included digital preservation requirements, such as support for mitigation of format obsolescence, replication, and tiered storage strategies. Managing digital assets across their entire life span is thus a key goal of our program.

  • The University Libraries had chosen not to implement an institutional repository because of the failure of the “build it and they will come” model at many institutions. The downside of lacking repository infrastructure, however, is a limited capacity to flexibly manage new content that falls outside the boundaries of the currently deployed digital library applications.

  • Penn State has no existing services to formally manage or curate research data.

Getting from where we were to where we need to be has taken a very considerable amount of groundwork. From the outset we knew we needed an overall technical architecture and roadmap, and we did not have that expertise in-house. We therefore set out to hire a Digital Library Architect, and Mike Giarlo joined us at the beginning of 2010. The UL made a complementary hire on the user and content side, their first Digital Collections Curator, Patricia Hswe, and she also started in January. 

Apart from these strategic hires, on the DLT side we reorganized, retooled existing staff, and made new hires, particularly in system administration, storage, and development. Meanwhile we rolled out service management to improve our operations and worked on service consolidation at the same time, handing off services such as email, calendar, desktop imaging, and printing. We also performed a major overhaul of our infrastructure, and greatly expanded our disaster recovery (DR) and business continuity capabilities.

And in our next installment I’ll update you on what we’ve been up to in the last year. Yes, back to installments but only because I want to save you from a really long post.