Drupal in Libraryland, Part 1: Challenges

Nina McHale

The last weekend of June, 26,000 librarians descended upon Chicago to attend the annual meeting of the American Library Association. As always at “Annual,” as it's affectionately known, part of the draw and the excitement was the presence of 693 library vendors assembled on the exhibit hall floor at McCormick Place. Book publishers of all stripes make up a good portion of the exhibitors, but also out in force are companies that provide electronic content and software platforms for library services. Not a single one, however, provides a web product for libraries that offers a totally seamless web experience for library customers (as in, the people who use libraries, not the librarians and library staff themselves).

Why is this? Well...it's complicated. First, the notoriously proprietary legacy systems at the heart of most library technology ecosystems make it difficult to provide a single interface to all library content, resources, and services. Second, a lack of web development skills among staff, along with limited access to robust local web development environments, is often a barrier to modern and innovative practices. Finally, it's not that there isn't anyone out there trying to create a better library user experience; in fact, the use of Drupal in libraries has grown significantly in recent years. However, the lack of coordination in open source development generally is a problem that is passed down to the library-specific Drupal community as well. These three problems stand in the way of solutions that could provide a flood of relief for libraries wanting to provide better web experiences to their clients.

So, let's start at the beginning by tracing the evolution of the typical library technology ecosystem. In the 1960s, Henriette Avram, a computer analyst at the Library of Congress, developed the Machine Readable Cataloging (aka MARC) data standard. MARC facilitated the sharing of information about books in computer files, which may not sound like much now, but at the time, it was nothing short of revolutionary. Catalogers, the librarians who create the local inventories of books and other library items, could share, and eventually download, MARC records to prevent duplication of work. Since 1973, MARC has been the international metadata standard in libraries. The 1970s and 1980s saw the advent of the Integrated Library System (ILS), the type of computer system that, relying on the MARC record, to this day supports the behind-the-scenes activities of ordering, organizing, and tracking library materials. Currently, there are only a dozen or so ILS vendors, but according to Marshall Breeding, in 2012, they did 770 million dollars' worth of business worldwide.

In the mid-1990s, ILS vendors began adding a public interface, or OPAC (online public access catalog), which library customers could use to search for, and place holds on, books and other library materials. However, almost twenty years later, few OPACs provide a web experience equivalent in ease of use to modern ecommerce sites. The problem was compounded a few years later by the entrance of electronic content into the library market: other vendors began providing licenses to access magazine and journal content online. These licenses account for the lion's share of many library budgets, particularly in academic libraries. A typical library web site around the year 2000 was a simple static site with links to the OPAC and a handful of these subscription magazine and journal “databases,” as they are commonly called by librarians. A few extra HTML pages containing location, hours, contact, and policy information rounded out the typical library site.

This was all well and good until the subscription database market exploded. Linking to one or two external interfaces wasn't a problem, but it's now common for libraries to provide access to hundreds of them, each one with its own user-unfriendly search interface. The desire then arose for a means of providing a single search interface; library customers wondered, “Why can't the library's web site be more like Google?” This resulted in a product that vendors called “federated searching,” an interface that attempted to integrate library catalogs (regardless of the brand of ILS in use by a customer library) and subscription database content. However, lack of adherence to metadata standards made simple things, like searching for publications in a range of dates, impossible. Many librarians disliked how the research process was “dumbed down.” Also, no product was initially able to integrate ALL of the subscription databases into a single search. The fact that federated search products took tens of thousands of dollars out of library budgets, while providing no actual content, won them no favor either.

From the ashes of federated search tools rose the “Next Generation Catalog.” Fed on the promises of Web 2.0, the NextGen catalog sought to refrost the ILS cake and make it look and act more like Amazon.com. It added book jacket covers (previously available as additions to the basic ILS through yet another library vendor) and the ability to write book reviews, tag items, and give them a star or other rating. In most libraries, these features have never been heavily used. The price tag for NextGen catalogs is also in the tens of thousands of dollars. The current iteration of these metasearch products is often called a “discovery layer service.” There are a handful of open source tools available that attempt to make sense of the silos of physical and electronic library content as well, but fear, uncertainty, and doubt, accompanied by the lack of technical knowledge and skill, as well as the desire for the 24/7 technical support provided by proprietary products, often win out. Additionally, platforms like Blogger and WordPress were adopted by librarians for blogging, which paved the way for broader adoption of library-wide CMSs.

All of these developments concerned managing and providing access to the actual STUFF that libraries are traditionally made of: information about books and other physical materials, and electronic copies of magazine and journal articles. None of these systems, however, provided an easy means of offering up local information; in addition to any of the services described above, a web site was still needed to provide that local information (locations, hours, promotion of services and events) and to tie all of the silos together. Lack of web developers on staff in libraries has led to another curious development in the library software and web platform market: a proliferation of mini-CMSs that each provide a single function or service. For example, LibGuides (by SpringShare) is a library-specific content management system that allows librarians to organize resources around a particular topic into “guides” for library users. Evanced's Events Management is a platform that provides registration and room booking for library events, whether storytimes or author readings in public libraries or research instruction in academic libraries. These mini-CMSs are usually reasonably priced (via annual subscription) and well-supported, but coordinated open source development could solve these issues easily, without adding more silos to the library user experience. As it stands, library customers can easily move through half a dozen separate web domains to accomplish what should be simple tasks.

Looking back at all of the pieces, one sees a pattern emerge: library vendor products all arose based on specific needs, and at no point has the ecosystem been effectively and comprehensively re-evaluated with the end user in mind. The specific need to be filled is almost always based on the internal need of a librarian or staff member, rather than that of an end user. It comes as no surprise to see that Drupal module development for libraries has centered on integration with third-party products. There are multiple modules for integrating content from those products, ranging from OPACs (complicated by the different dominant brands in use) to “mini-CMS” products like LibGuides and Evanced. This all strikes me as backwards, however; why pay for, and then reverse engineer, proprietary products when these resources could be used more efficiently to develop open source components that accomplish the same purpose? Why pay for the proprietary product in the first place?

This “spec-it-out-and-buy-it” mentality is in part the result of understaffing in technology areas, particularly web development. In recent years, many libraries have faced shrinking budgets, staff layoffs, and even closures, which puts modern web development practices low on the list of necessities. Also, available web development environments are all too often tightly controlled by parent institutions: city or county IT units for public libraries, and campus IT departments for college libraries. Local development of open source web tools often feels out of reach.

Finally, lack of coordination among open source developers is a problem in the open source community in general, so it’s not surprising that it persists in the library Drupal development community. There are a number of venues for sharing and networking for library-based Drupal developers, including the drupal.org “Libraries” group; the drupallib web site; the American Library Association’s Drupal Interest Group; and the drupal4lib mailing list. None of these, however, currently serves as an official leadership venue to encourage communication or collaboration.

My new colleagues have reassured me that there’s nothing unconquerable about all of this; Aten specializes in web development for non-profit organizations in a way that releases chunks of Drupal code back to drupal.org so that other organizations similar to our clients can make use of what we’ve made. Other non-profits face similar challenges, and in Part 2, we’ll take a look at some possible solutions to the challenges outlined here for libraries.

Photo credit: Virginia Alexander