From a brief talk given 8 November at the SPARC 2010 Digital Repositories Forum:
Hello, I’m Jay Datema, associate director at the Bern Dibner Library, Polytechnic Institute of NYU. I’m honored to be included in this year’s Innovation Fair at the SPARC conference. I have two minutes, so I’ll keep it short.
My poster is entitled “Full Circle Research: Occam’s Razor for Collection.” As many of you know, Occam’s Razor is a principle taken from the philosopher William of Ockham, who posited that “when several theories model the available facts adequately, the simplest theory is to be preferred.” This principle dates back to the 1300s, so it’s had some time to prove itself. Institutional repositories, on the other hand, are just a decade old.
Simply stated, my poster shows that research is a process that starts with an analysis of publications, which of course will then produce more publications. As Samuel Johnson said, “The greatest part of a writer’s time is spent in reading, in order to write; a man will turn over half a library to make one book.” What is the online equivalent? I suppose it would have to be endless surfing of bibliographies, databases, and PDFs. Research only ends when your attention span falters or a deadline awaits.
Thus, the most important component of a repository is not the interface layer or the business logic. To adapt Bill Clinton’s 1992 campaign slogan: It’s the data, stupid. Not necessarily datasets, but the base unit of scholarly research: the citation. I’d like to credit Mark Leggott and his Islandora team for restarting the collection process around a complete citation set.
At Poly, we’re engaged in a project to collect all the citations of the faculty. We’ve turned over many databases, CVs, consortial collections, and citation tools. It is our belief that by using the librarian’s tools of authority control and complete records, we’ll have something in the aggregate that faculty will be delighted by and that the Office of Institutional Research will be able to use too. As Joseph Nadan, inventor of EZ Pass (and a Poly professor), said to me, “It was great to find my long-lost article on REVIS.”
We think Occam would be proud of our impulse to use Zotero for capture, Mendeley for PDF extraction, and EndNote for duplicate detection. Using these razors, we now have the flexibility to put the data into our library website, which uses Drupal, and into any current or future systems that can make good use of the data for preservation, collection analysis, or citation ranking. All of these tools have robust support for export formats, so we’re not in danger of recreating the Roach Motel.
We’re also excited about adapting tools devised by the open source community to library purposes. Solr was invented at CNET to handle large volumes of news items, Drupal was invented to share information about wireless networks, and the Biblio module for Drupal was created to share a single institute’s publications. Since software has the shelf life of a banana, we’re focusing on being smarter monkeys. And we’re watching BibApp, VIVO, and Islandora make exciting headway on orthogonal problems. We’re not there yet, but we’re making good progress. I look forward to discussing this further. Thank you.
If you can read the above text in two minutes, then you’re faster than I was.
Innovation Fair from SPARC on Vimeo.