Scaling Up a Collaborative Consortial Institutional Repository

Together, we are developing an affordable, open-source, and collaborative institutional repository solution based on the Hyku software.


  • PALNI and PALCI Partner to Remove Barriers to Hyku Adoption with IMLS Grant Award

    The Institute of Museum and Library Services (IMLS) has awarded $248,050 to the Private Academic Library Network of Indiana (PALNI) in partnership with The Partnership for Academic Library Collaboration & Innovation (PALCI) for Hyku for Consortia: Removing Barriers to Adoption as part of the National Leadership Grants for Libraries Program. IMLS received 172 applications requesting more than $47 million in funding and selected 39 applicants to receive awards during this grant cycle. With this award, the partners will increase the flexibility, accessibility, and usability of Hyku, the multi-tenant repository platform system.  

    Repositories are a critical piece of library infrastructure, enabling access to many types of digital materials created by an institution’s students, faculty, staff, and researchers. Libraries, cultural heritage institutions, and other organizations also use repositories to provide access to digitized special collections.

    In the face of continued budgetary pressures, libraries are seeking cost-saving approaches to their work. Those unable to deploy Institutional Repository (IR) services on their own due to costs or other constraints are increasingly looking to consortia to serve this role. This project specifically seeks to develop Hyku to support the repository needs of library groups by increasing affordability and flexibility. 

    PALNI Executive Director Kirsten Leonard notes, “This grant will provide the foundational support for PALNI and PALCI to remove remaining barriers to more widespread deployment of the repository software. Together with input from our new consortia project participants from VIVA and LOUIS, we will create business modeling and a toolkit to support other consortia to provide this service for their members, potentially reaching thousands of libraries.” 

    This project will extend work completed under the previous PALNI/PALCI IMLS grant, which resulted in the establishment of Hyku Commons, a production-level, low-cost, multi-tenant repository service shared by the supported institutions of PALNI and PALCI.  This new round of funding will further improve Hyku by directly addressing needs articulated by stakeholders in a scalable, multi-tenant environment.

    The project will kick off with a user study and gap assessment to further define existing barriers and software requirements needed to support the adoption of the service. PALNI and PALCI will employ Notch8, an open-source software development firm and long-time contributor to the Hyku project, to deliver enhancements and changes prioritized in the early phases. Rob Kaufman, Notch8’s Founding Partner and current Product Owner for Hyku, sees this as an extraordinary opportunity to increase visibility and adoption of Hyku. “Hyku for Consortia has been one of the key projects in the community, expanding the functionality of Hyku in ways that really matter to the users. Notch8 is excited to continue this partnership into this game-changing new phase.”  

    The project will also expand its partnership to include consortial partners LOUIS and VIVA, who will pilot the service and offer feedback critical to ensuring widespread adoption. A Consortial Institutional Repository Toolkit will provide guidelines, documentation, and other materials to support the development of similar collaborative repository services in other consortia.

    Jill Morris, Executive Director of PALCI, is excited at the opportunity to drive strategic innovation of community-owned infrastructure. “This project builds on the strengths of consortia and stretches our relationships to leverage our respective strengths. We are thrilled to continue our partnership with PALNI to explore new solutions, business models, and collaborative approaches to building and sustaining our library infrastructure.”

    Anne Osterman, Director of VIVA, said, “We are delighted to be piloting and supporting this important project as it develops scalable options for groups of libraries. The creation of a truly community-led, open, sustainable, and multi-tenant repository service meets needs long articulated by academic libraries and the consortia that serve them.” 

    Teri Oaks Gallaway, Executive Director of LOUIS, expressed her interest in the grant, “One of our strategic goals as a consortium is to explore opportunities with other libraries, consortia, and vendors for the development of an open-source library services platform. This project is a perfect example of how we can pool our collective knowledge and resources to improve upon and expand the reach of a needed tool like Hyku. We are excited to be a part of this opportunity with our partners and colleagues and look forward to supporting the development of this project.”

    “As pillars of our communities, libraries and museums bring people together by providing important programs, services, and collections. These institutions are trusted spaces where people can learn, explore and grow,” said IMLS Director Crosby Kemper. “IMLS is proud to support their initiatives through our grants as they educate and enhance their communities.”

    Updates for the project will be made available at https://www.hykuforconsortia.org/.


    About the Institute of Museum and Library Services:

    The Institute of Museum and Library Services is the primary source of federal support for the nation’s libraries and museums. We advance, support, and empower America’s museums, libraries, and related organizations through grantmaking, research, and policy development. Our vision is a nation where museums and libraries work together to transform the lives of individuals and communities. To learn more, visit www.imls.gov and follow us on Facebook and Twitter.

    About the Private Academic Library Network of Indiana, Inc. (PALNI): 

    PALNI is a non-profit organization supporting collaboration for library and information services to the libraries of its twenty-three supported institutions. Over time, the library deans and directors who sit on the PALNI board have adjusted the organization’s strategic direction as the internet and information services landscape has changed. PALNI has expanded beyond providing a resource management system to sharing expertise in many areas, including strategic planning, reference, information fluency, outreach, data management, and configuration, and has identified greater collaboration in acquisitions as a key goal. www.palni.edu

    About The Partnership for Academic Library Collaboration & Innovation (PALCI):

    The PALCI organization was originally founded as the “Pennsylvania Academic Library Consortium, Inc.,” and was formed in 1996 as a grassroots federation of 35 academic libraries in the Commonwealth of Pennsylvania. Today, PALCI is known as the Partnership for Academic Library Collaboration & Innovation, with membership consisting of 74 academic and research libraries in Pennsylvania, New Jersey, West Virginia, and New York. PALCI’s mission is to enable cost-effective and sustainable access to information resources and services for academic libraries in Pennsylvania and surrounding states. PALCI Members serve over 800,000 students, faculty, and staff at member institutions, through a variety of programs, including the highly-regarded EZBorrow resource sharing service. PALCI also serves as the home for the Affordable Learning PA program, creating a community of practice for open textbooks and related educational resources. http://palci.org

    About LOUIS: The Louisiana Library Network

    LOUIS is a consortium of public and private college and university libraries in the state of Louisiana. This partnership was formed in 1992 by the library deans and directors at these institutions, in order to create a cost-effective collaboration among the institutions for the procurement of library technology and resources. We are currently forty-seven members strong.

    About VIVA

    VIVA is the academic library consortium serving 71 nonprofit higher education institutions in Virginia, including 39 state assisted colleges and universities, 31 independent private, nonprofit institutions, and The Library of Virginia. VIVA’s mission is to provide, in an equitable, cooperative, and cost‐effective manner, enhanced access to library and information resources for Virginia’s academic libraries serving the nonprofit higher education community. 

    About Notch8:

    Founded in San Diego, CA in 2007 by Rob Kaufman, Notch8 is a Ruby on Rails-based web consultancy with additional expertise in React and React Native mobile applications. Today we are a team of 18 developers and technical experts located across three time zones. Since 2016, we have been active with digital repository solutions, primarily through our involvement with the Samvera Community. We are Samvera Partners and both in and out of the Samvera framework, we have contributed to more than 20 projects in the digital repository space. 

  • Happy 2021! Hyku for Consortia Project Update for the New Year

    (Feature photo by Olya Kobruseva from Pexels)

    We thought we’d kick off the new year with a project update. 

    At the end of 2020, as we looked to the end of “Phase 2” of our work (improving features for multi-tenant administration), we took some time to review our project goals with Notch8.  In this review, we laid out the deliverables promised in our IMLS grant and determined a deliberately scoped path to complete these goals by Spring 2021.  We both took stock of our progress to date and identified areas to further define our goals for the sake of efficient progress.  This was done with hopeful anticipation of an additional round of grant funding for Phase 3, in which we plan to remove identified barriers to adopting Hyku, both in and outside the consortial community.  

    As a refresher, here’s an overview of our grant goals and deliverables:

    Collaborative Workflow Support

    Development is currently underway for collaborative workflow support.  Updated scoping for this work now includes:

    • An admin can create a new group in a tenant
    • An admin can then assign roles to that group 
    • Users who are added to a group will receive the permissions from that group. 
    • On the user management tab, a user’s groups, as well as any individual permissions granted to the user, will be displayed
    • Tenant level roles on the User Matrix will be created 
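
    The group behavior scoped above can be pictured with a small data model. This is only a sketch under our own assumptions: the class names, role names, and the `effective_roles` helper are hypothetical and are not taken from the Hyku codebase. It shows the one idea that matters here, which is that a user's permissions are the union of individually granted roles and roles inherited from groups.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Group:
    """A named group within a single tenant, carrying a set of roles."""
    name: str
    roles: set[str] = field(default_factory=set)


@dataclass
class User:
    """A tenant user with individually granted roles and group memberships."""
    email: str
    individual_roles: set[str] = field(default_factory=set)
    groups: list[Group] = field(default_factory=list)

    def effective_roles(self) -> set[str]:
        # Union of the user's own roles and every role inherited from groups.
        roles = set(self.individual_roles)
        for group in self.groups:
            roles |= group.roles
        return roles


# An admin creates a group in the tenant and assigns a role to it.
archives_staff = Group(name="Archives Staff")
archives_staff.roles.add("editor")

# A user added to the group receives the group's permissions.
user = User(email="archivist@example.edu", individual_roles={"depositor"})
user.groups.append(archives_staff)

# What a user management view might list for this user.
print([group.name for group in user.groups])
print(sorted(user.effective_roles()))
```

    Keeping individual roles and group roles separate, and merging them only when asked, is what would let a management screen show both the user's groups and any permissions granted directly.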

    In a future project phase, we hope to add the Multi-tenant Manager and Multi-tenant Editor roles.  We also want to create a groups and permissions area on the consortia admin page that will provide one workflow for adding group permissions across multiple tenants.

    Worktypes

    Development is also underway for worktypes.  The OER worktype is complete (specs here), the specifications for the ETD worktype are complete, and the “shell” worktype (a copy of the generic worktype) has already been created.  Work now underway and nearly completed includes:

    • Metadata customization for the ETD worktype
    • Fields will be configured according to the ETD worktype specifications
    • Once the fields are configured Bulkrax mappings will be set up for importing and exporting the ETD worktype.

    We’d love to further explore easy creation of worktypes in the future, as well as greater flexibility for controlled vocabularies.

    Themed Templates

    We’ve done a lot of work gathering specifications and mocking up wireframes representing the themes (IR, cultural heritage, and neutral) we’d like to implement as part of Hyku.  Scoped work in this area for the remainder of this project phase is as follows:

    • A Theme tab will be added under the Appearance page. 
    • On this tab, a user will be able to select a home page theme, a search results page theme, and a work display page theme. This will allow for greater flexibility for repository managers and extend the core offering to a wider range of use cases. 
    • The theme pages will respond to the colors, logos, and feature flippers set in the app.
    • The following Pages will be built as themes (referencing preliminary mockups):
      • 3 Home Page options (Cultural Repository, Institutional Repository, Neutral)
      • Search Pages with Gallery, Masonry, and Slideshow layouts
      • Image-based and text-based show Pages

    This is an exciting new development path, and we can’t wait to see how it turns out!  In the future we may make some changes to how the template elements function, and possibly add options to make the theming as flexible and customizable as possible.

    For the remaining deliverables (DOI minting, cross tenant searching, and multi-tenant shared works), we’ll continue to gather requirements from our user communities and explore work being completed in complementary projects. As always, we look to integrate our work with the larger Hyku Roadmap, contributing our improvements back to the Hyku base code and avoiding duplicative development efforts whenever possible.  

    We’ll continue to post updates on our project here, and please feel free to contact us with any questions.

  • Hyku for Consortia Roadshow

    We’ve been on the virtual road the last two months talking about Hyku for Consortia at two conferences: the USETDA 2020 Conference and the annual Samvera Connect conference.

    While COVID-19 has made travel impossible, the bright side is that we have recordings of presentations that can be shared with all. Below is the video from our presentation at the USETDA 2020 conference held September 23-24: Providing Flexibility and Affordability for ETD with HYKU

    https://vimeo.com/460283980

    In October we participated in three events at the annual Samvera Connect conference: a poster, a panel, and a presentation.

    Both the panel and presentation were given on the same day (10/28/2020), and you can find video from both on Samvera’s YouTube channel.

  • Bulking Up, Part 2: Bulk Upload in Hyku Commons

    This post is the second of a two-part look at bulk upload in Hyku.  The first examined the background of bulk operations and why they are difficult to do well. This post focuses specifically on the application of a bulk import solution in Hyku Commons.  The need for bulk upload in this project is similar to that identified by the larger Hyku user community.  We too need an “easy in and easy out” data solution for our repository users.  (Photo above by Pexels on Pixabay)

    In PALNI’s 2018 white paper, we identified several valued repository attributes, and have since adopted them as the shared vision for the Hyku for Consortia project.  One of these values speaks directly to the need for bulk upload solutions: “The collaborative institutional repository should be a system which is interoperable and allows free-flow of data. Easy import and export of metadata and objects are possible.” The use cases for bulk ingest are numerous.  Migrations from another platform, repurposing data from external sources such as finding aids, and user preference rank high on the list of reasons to import works and their metadata in bulk rather than piecemeal.

    To further illustrate the need for bulk importing, you will find a list of workflow examples in the Hyku for Consortia project documentation.  These examples provide hypothetical consortial profiles and repository scenarios based on real-life stories contributed by our Product Management Team.  From these scenarios:

    • Scenario 1, “Midwest Library Consortium”: Tenant-only Editor is in the archives department and has a digitized archives collection to add to the repository. He creates a new collection and uses one of the pre-populated admin set choices. He then bulk uploads the content and saves it but does not publish it.
    • Scenario 3, “Wealthy Alumni College”: Tenant-only Editor begins uploading student works in bulk into the repository with draft metadata, licenses, and embargoes. 
    • Scenario 5, “Sunnydale Community College”: Student staff member is made Tenant Editor. She uploads minutes in batches with a spreadsheet of basic metadata. The collection is not yet published.

    These scenarios helped us to envision all the ways that Hyku might be used for various IR users and content, and to define our collaborative workflows and user roles.  Also, without us realizing it at the time, they very much highlighted how essential bulk import is to this work. In three out of five of these examples, we envisioned works being uploaded in bulk by Tenant Editors, who might be an archivist/librarian, grad school staff member, or even a trusted student.  These users have metadata in an existing external format, and rekeying hundreds or thousands of metadata values would be a waste of their time.

    Shifting away from the hypothetical to the actual, now that we are using Hyku Commons for real-world pilot repositories, the need for bulk import functionality is even more apparent.  For example, one of our partner institutions moved content from Digital Commons to CONTENTdm as a stop-gap when they lost access to the platform due to cost. Now they want to move that content into Hyku. 

    Using the Bridge2Hyku project’s CDM Bridge tool, export was a breeze.  We were able to extract all the files and metadata from CONTENTdm in a way that Hyku would understand.  But how to get the described works into Hyku?  The native Hyku batch import did not provide a solution, since it applied identical metadata to each item.  The records we wanted to bulk upload have complete, individual descriptions. We soon learned that this kind of desired bulk import was a much more complicated task, and reached out to Notch8 to find a solution.  

    With Notch8’s help, we investigated HyBridge (the import counterpart to CDM Bridge’s export), Cdm_Migrator, and Bulkrax as potential bulk import solutions for Hyku Commons.  We selected Bulkrax for our project because it seemed to work best for our multi-tenant environment and was easiest to configure within our setup.

    According to the Samvera Labs webpage, “Bulkrax is a batteries included importer for Samvera applications. It currently includes support for OAI-PMH (DC and Qualified DC), XML, Bagit, and CSV out of the box. It is also designed to be extensible, allowing you to easily add new importers into your application or to include them with other gems. Bulkrax provides a full admin interface including creating, editing, scheduling and reviewing imports.”

    Check out this poster from Samvera Connect 2019 for more information about Bulkrax.

    Bulkrax poster by Kevin Kochanski, used with permission

    After working with Notch8 to install and update Bulkrax in Hyku Commons, we viewed developer-supplied walkthrough videos (like this one) and wiki documentation to get a better understanding of how to use the CSV importer.  It is now possible to bulk upload to Hyku Commons with Bulkrax by importing a zipped folder containing a folder of files and a properly formatted CSV file.  The CSV contains one row for each object’s descriptive metadata.  Additionally, the first four fields are administrative fields, which govern how the importer imports the files. 

    Administrative Fields

    • item – Lists the name and extension of the item being imported, such as file.jpg. 
    • source_identifier – Establishes a persistent identifier for the object being imported. 
    • model – Identifies the worktype the work will be created as. 
    • collection – Determines what collection(s) the work will be added to. 
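
    To make the package layout more concrete, here is a minimal sketch of assembling such a zip with Python's standard library. The four administrative column names come from the list above; everything else is an assumption for illustration, including the descriptive columns (`title`, `creator`), the `GenericWork` model value, the `metadata.csv` and `files/` names, and the `build_import_zip` helper. Please check the Bulkrax wiki for the layout your installation actually expects.

```python
import csv
import io
import zipfile

# Administrative columns as described in this post, followed by
# illustrative descriptive columns (assumed, not prescribed by Bulkrax).
FIELDNAMES = ["item", "source_identifier", "model", "collection", "title", "creator"]

# One row per object, each with its own individual description.
rows = [
    {"item": "minutes_1999.pdf", "source_identifier": "minutes-1999",
     "model": "GenericWork", "collection": "board-minutes",
     "title": "Board Minutes, 1999", "creator": "Board of Trustees"},
    {"item": "minutes_2000.pdf", "source_identifier": "minutes-2000",
     "model": "GenericWork", "collection": "board-minutes",
     "title": "Board Minutes, 2000", "creator": "Board of Trustees"},
]

# Placeholder file contents standing in for the real digitized files.
files = {"minutes_1999.pdf": b"%PDF-placeholder", "minutes_2000.pdf": b"%PDF-placeholder"}


def build_import_zip(rows, files):
    """Return the bytes of a zip holding one CSV plus a folder of files."""
    csv_buffer = io.StringIO(newline="")
    writer = csv.DictWriter(csv_buffer, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)

    zip_buffer = io.BytesIO()
    with zipfile.ZipFile(zip_buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("metadata.csv", csv_buffer.getvalue())
        for name, content in files.items():
            archive.writestr(f"files/{name}", content)
    return zip_buffer.getvalue()


package = build_import_zip(rows, files)
with zipfile.ZipFile(io.BytesIO(package)) as archive:
    print(archive.namelist())
```

    Because every row carries its own descriptive values, this kind of spreadsheet avoids the limitation we hit with the native batch import, which applied identical metadata to each item.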

    One of our challenges is the lack of step-by-step documentation for these processes.  It’s a complex process and a tad finicky, so a very detailed guide would be helpful.  Another is the need for separate parsers, and the intervention of a developer to create them, for custom worktypes.  For our bulk upload to work for our OER worktype, for example, separate work had to be done to add the parser and to allow the relationships between items mentioned in the last post.  Lastly, we reported a few oddities along the way, and these were added to the Bulkrax project board so that they can receive feedback from the community.

    In considering bulk capabilities for our project, the next step is to look towards bulk export with Bulkrax. This functionality currently exists in limited capacity, but it is in further development for wider usability.  In keeping with the “easy in, easy out” theme, there are many use cases in which we’d desire the ability to export metadata as well as files from the Hyku Commons tenants. Stay tuned for additional developments on this process!

  • Bulking Up

    This post is the first of a two-part look at bulk upload and data remediation in Hyku. Part one is going to take a look at the background of bulk operations and why they are difficult to do well. Part two will talk about our specific work to try and address some of these needs in the Hyku for Consortia project. (Photo above by Ryoji Iwata on Unsplash)

    Bulk operations in Hyku have a long history. In the initial user survey, conducted way back in 2015, one of the main findings was that Hyku needed to support the “easy in and out” of metadata. Metadata migration/remediation/transformation has always been a major activity in libraries. Think back to what an enormous task retrospective conversion of card catalogs to MARC was. Any library system containing metadata has to be able to manage that data at a large scale. 

    The design team for Hyku knew that bulk operations would be a key element to allowing potential users to commit to migrating out of their current tools. Hyku entered a market with a number of existing repositories. This new solution might have been able to solve many of the community’s frustrations with those tools, but only if there was an easy way to migrate to it. The initial requirements and personas therefore both reflected the need for tools to upload and transform metadata from one system to another. Mockups reflected the need to migrate data as well as remediate it.  

    Summary from an early Hyku design document

    This work was then reflected in GitHub issues during the project development (see: https://github.com/samvera/hyku/issues?q=is%3Aissue+is%3Aopen+bulk), but other more basic needs for repository development (you need a repository to migrate data to, after all) took a higher priority. So a new grant project called Bridge2Hyku picked up where development left off and explored the issue of migration in more depth (https://bridge2hyku.github.io/). Our colleagues at the Bridge2Hyku project did great work analyzing not only how to upload data and objects to Hyku, but also how to get them out of some of the major repository systems currently in use.

    All of this work, then… but why are metadata migration and bulk creation/upload so difficult? 

    The nature of structured data is what makes it so powerful: you can index and search it, you can compare like to like, you can organize and sort. In short, it makes order out of chaos. And as humans, that’s what we naturally do: recognize patterns. But, also like humans, we might all see the world slightly differently. So different metadata schemas and repository systems can have their own ways of seeing the world. Some are quite simple and allow for the same basic type of description of everything. Others are quite granular, allowing for more nuanced description of subtle details that can be important and powerful. So any process to migrate or convert from one system to another typically relies on a lot of human intelligence to see the patterns and make the connections.

    Photo by Luke Chesser on Unsplash

    But human capacity is only so much. How do you analyze thousands of records? Analytical tools like OpenRefine can be helpful. So can guidelines for general rules on the major categories of migration as shown in crosswalks from other projects. But, as these examples perhaps show, these tools are not simple and not necessarily easy to pick up and learn. So any migration process is either going to require a lot of manual intellectual effort, or the creation of new tools to help with this business of organizing and translating.

    The quirks of particular systems can also provide barriers. You may come up with a great crosswalk that works for one system, but doesn’t capture the nuance of another. Within Hyku for example, all works are sorted into worktypes. These types define the metadata schema used, the relationships between objects that can be created, and in some cases, the way that the object itself is presented and handled within the repository. 

    Data from other systems that don’t use this type of organization then requires an extra step to define the worktype the data should be migrated to. The system that data is coming from can also prove a barrier. Some systems are opaque, making it hard to know exactly how data is stored. Others make it difficult to export data. Many systems can provide an XML feed of records through a tool using the OAI-PMH protocol, but these are then just records, not objects themselves. Others might use a newer protocol like ResourceSync for export, but may be incompatible with systems still relying on OAI-PMH.
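
    As a rough illustration of that extra step, the sketch below applies a field crosswalk and then picks a worktype for each record. All of it is hypothetical: the source field names, the target field names, the worktype labels (`Etd`, `Oer`, `GenericWork`), and the rules in `choose_worktype` are stand-ins, and a real migration would need rules agreed on by people who know the collection.

```python
# Hypothetical crosswalk from a source system's field names to target fields.
CROSSWALK = {
    "Title": "title",
    "Author": "creator",
    "Date": "date_created",
    "Subject": "subject",
}


def choose_worktype(record):
    """Pick a target worktype using a simple, illustrative rule."""
    doc_type = record.get("Type", "").lower()
    if "thesis" in doc_type or "dissertation" in doc_type:
        return "Etd"
    if "textbook" in doc_type:
        return "Oer"
    return "GenericWork"


def crosswalk_record(record):
    """Return the migrated record plus any source fields left unmapped."""
    migrated = {
        target: record[source]
        for source, target in CROSSWALK.items()
        if record.get(source)
    }
    migrated["model"] = choose_worktype(record)
    # Fields with no crosswalk entry are reported rather than silently dropped.
    unmapped = sorted(k for k in record if k not in CROSSWALK and k != "Type")
    return migrated, unmapped


source_record = {
    "Title": "A Study of Prairie Grasses",
    "Author": "Lee, J.",
    "Type": "Doctoral Dissertation",
    "Dept": "Biology",
}
migrated, unmapped = crosswalk_record(source_record)
print(migrated)
print(unmapped)
```

    Reporting the unmapped fields is the part that still calls for human judgment: someone has to decide whether a field like the department belongs in the target schema at all.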

    Finally, issues can come from the very nature of materials themselves. A particular challenge we’ve had with migration relates to the inter-relationships between objects. As I’ve talked about before, and will likely write about on this blog in the future, one of the key needs we found to assist in the uptake of Open Educational Resources (OER) is the availability of related teaching tools or ancillary materials. A freely available textbook is great, but if there are also related quizzes, videos, or lecture slides, an educator has all they need to make the switch. In order to make these materials visible in an OER repository, we need to have the ability to define lots of different types of relationship like “translation of”, “part of”, or “replaced by” (for new editions). 

    Creating these relationships may be easy when materials are being uploaded as they are created on an ad-hoc basis. But migrating them to a new environment presents a new challenge: how do you create a relationship between materials that may be next in the queue to be created? There isn’t a simple solution. For us, it’s meant creating some new code to handle the creation of relationships as a second step in the data migration process. The point of this example isn’t necessarily the solution we found to this problem, but the acknowledgment that many other types of materials may present their own unique needs. While uniformity and standardization are good, it’s the balance between standardization and diversity that makes a repository useful.
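
    The general idea of handling relationships as a second step can be sketched in a few lines. This is not the code written for our project; it is a simplified illustration with an in-memory dictionary standing in for the repository, and the row layout (`source_identifier`, `title`, `related`) and the function name are assumptions.

```python
def import_with_relationships(rows):
    """Create all works first, then link them in a second pass."""
    repository = {}  # maps source_identifier -> work record

    # Pass 1: create every work with no relationships yet, so the order
    # of rows in the import no longer matters.
    for row in rows:
        repository[row["source_identifier"]] = {
            "title": row["title"],
            "relationships": [],
        }

    # Pass 2: every target that is part of this import now exists,
    # so relationships can be resolved by identifier.
    unresolved = []
    for row in rows:
        for relation, target_id in row.get("related", []):
            if target_id in repository:
                repository[row["source_identifier"]]["relationships"].append(
                    (relation, target_id)
                )
            else:
                unresolved.append((row["source_identifier"], relation, target_id))
    return repository, unresolved


rows = [
    # The quiz refers to a textbook that appears later in the queue.
    {"source_identifier": "quiz-1", "title": "Chapter 1 Quiz",
     "related": [("part of", "textbook-2")]},
    {"source_identifier": "textbook-2", "title": "Open Biology, 2nd ed.",
     "related": []},
    {"source_identifier": "textbook-1", "title": "Open Biology, 1st ed.",
     "related": [("replaced by", "textbook-2"), ("translation of", "missing-99")]},
]

repository, unresolved = import_with_relationships(rows)
print(repository["quiz-1"]["relationships"])
print(unresolved)
```

    Collecting unresolved links instead of failing on them gives whoever runs the migration a list to review afterward, which fits the amount of human checking these migrations tend to need.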

    So bulk operations in repositories are a hard nut to crack. There are similarities in any migration or conversion, but there are also a lot of specific challenges to every situation. In our next post, we’ll talk about the development of bulk upload functionality for Hyku Commons and how we addressed challenges in our own work.