In my last post a couple of weeks back, I outlined a vision for Penn State publishing and curation services (PCS) in order to frame further discussions about the policies that will guide those services. Now I’d like to discuss several policies that are implicit in that vision, as well as their implications for how we will offer these services.
1) Everything we do must help researchers and students achieve their scholarly aims.
Every decision we make needs to be tested against this one. Anyone care to argue with that? Moving right along, then…
2) Penn State users are the primary contributors of content we publish and curate.
There will be some exceptions: co-authors or collaborators may be faculty from other universities; a journal will include articles from authors from various schools. But because we receive our primary funding from Penn State, we will need a contributor or a sponsor of some sort to be directly affiliated with the University.
3) Non-Penn State consumers of the content are as important as Penn State users.
This is really important, and, because it’s also somewhat contrary to the primary model of service for an academic library, we are apt to forget it. Consumers and contributors will interact with content differently and have different needs. Our researchers’ primary audience for their work may not be their colleagues at Penn State; it is probably their colleagues around the world. There are implications here for how we direct our efforts to enhance discovery. Most people will not come to the Penn State Libraries website, or the Cat, or LionSearch, to find the material Penn State researchers trust us with.
The repository is not a box in which we punch some holes for people to reach into. We have to leave the box open, and even strew its contents across the web for others to find. This means most of our material is available open access. However, the requirements of the contributor may well trump those of our external researchers, such as when there are embargo requirements on a dataset or a publication.
4) The services we provide will be for stuff that is primarily scholarly.
What does this exclude? Hardly anything.
If the researcher/contributor can articulate a reason that the material is of some scholarly import, why would we not agree to make it accessible? Again, this is very different from how libraries have traditionally made decisions about what stuff we care for, e.g. what we collect. But we have done that in the context of other institutions, such as publishers, making decisions for us by choosing what to publish. Rarely have we had to evaluate the work of our faculty face-to-face. Our collection development policies broadly support the curricular and research needs of the University. If material is produced through the Penn State curriculum or research activities, would it not fit? (By the way, I have heretical ideas that talk of “building a collection” may not make sense in the context of publishing services.)
There are some materials that will be better served by developing additional services. For example, electronic records management will use much of the same infrastructure as publishing and curation services, but the mission is not the same. Having already had pretty extensive discussions about what those services will look like, we know they are very different from what we are proposing now.
Some matters, such as resource limits and legal or policy restrictions, may limit what we can handle or how we do so. But the presumption is in favor of the researcher who wants or needs help in sharing and managing their stuff.
5) Most of the stuff we handle will be inactive, if not at a stage of completion.
We’ve had some internal discussion among the stakeholders about research needs for collaborative workspaces that will allow for teams to more easily share data and tools. Something like a combination of wikis, GoogleDocs, and DropBox, but with a lot more storage, the ability to execute code, much better object management, and much more security. There are platforms that can help some communities, and some of our colleagues are piloting similar services through their own publishing services in conjunction with others.
We need those services, but that’s not what we are developing now. Within the Research Life Cycle, we’re now primarily aiming at the dissemination and discovery stages. I think that the vision statement in my last post is big enough to accommodate more support during the active phases of research. But let’s get some basics in place first.
6) The scholarly stuff we handle will persist.
That is my digital preservation policy. We are a library, and people expect us to keep stuff. And if the material is to have any value to researchers, it needs to be citeable and continually accessible. Above all else, researchers value libraries because of our reputation for preservation and stewardship. We should do nothing to call it into question. (How often do libraries promote their weeding projects?)
So what exactly must persist? Only the bits? A file or group of files? The relationships between those files? The code that prepared derivative data from raw data? The whole thing, exactly as it is and was? We worry a lot over format obsolescence, but answering those questions may depend more upon what the content represents, the expectations of the contributor as well as the user community, the value of the content for others, and several other less technological factors. Tim Pyatt pointed out in one recent conversation that depending upon the nature of the material, we may define different tiers of curatorial attention. Official university scholarship, such as ETDs that are required for the credential, probably get the highest attention, while less valuable (or perhaps poorly documented) materials would be more lightly touched over time.
We definitely have to educate our clients. I am not saying that we shouldn’t develop guidance for researchers on best practices to make content durable, or that we wouldn’t develop tools to help them prepare materials for our care. Certainly it would be irresponsible for us to claim materials that we can’t adequately care for. However, while we may not be able to guarantee the readability of a particularly idiosyncratic pile of data, that in and of itself shouldn’t be a reason to reject it. The museums of the world still hold thousands of cuneiform tablets for which no Rosetta Stone has been found.
7) We must respect and work within a framework of other policies and laws.
These include copyright law, conditions imposed by funders, or university policies on intellectual property. Our services will help researchers navigate those policies and laws to achieve their scholarly aims. This means that we will at times have to interpret contracts (Does the publisher allow pre-prints to be shared?), law (Is that within the bounds of fair use?), and policy (Does NSF require this stuff be open access, or just require that it not be destroyed?). So yes, we’ll develop our own policies and we will frequently consult other experts. All librarians need to develop better understandings of the legal and policy regimes we live in.
Some of the above may seem obvious, but they deserve to be called out because they need to direct our focus for the next year and more. Penn State is late to develop coherent services around a commonly defined infrastructure (in other words, we ain’t got no IR, y’all). But that means we can learn (borrow, steal) from colleagues at other institutions to help us answer the next round of questions about guidelines, policies, or practices that will help our researchers to achieve their scholarly aims.
I’ve only tried to provide a framework for future discussions among our developers, our scholarly communications staff, and our public services stakeholders. It might be interesting to expand on any one or all of the above seven items in full blog posts. I invite my colleagues to do so, and I invite any reader to chime in with other implicit policies that derive from the original vision.