Consuming Zotero Libraries into Drupal with Feeds/XPath Parser

Part II of this post can be found here.

The Problem

Recently, organizers of the Berkshire Conference on the History of Women approached us because they wanted Drupal to display a Zotero group library populated with references to relevant digital scholarship projects, along with the tags the conference organizers had assigned to them.

Figuring out an Approach

The Zotero plugin for Drupal requires Biblio, which was a little more furniture than we wanted in our Drupal site. In addition, it’s been a little while since the module’s received any love (stay tuned for news on that front)! New Jack Librarian’s 2012 post on loading Zotero into a webpage with PHP suggested a compelling and quick solution, but it didn’t really suit this non-bibliography-ish data, and it didn’t let us leverage the Zotero tags in Drupal taxonomies.

So, we sidestepped the plugin and used Drupal's Feeds XPath Parser instead, which lets you use XPath to select XML nodes in an incoming feed and map them to fields in a Drupal content type.

The Setup

Here’s where we ended up: a page whose content has been pulled in from a Zotero library. I’m not linking to the page because it’s likely to change location.

1. First, we needed the right content from Zotero. If you go to a Zotero Group Library, view the whole library, and then subscribe to the feed, the default query is as follows:

https://api.zotero.org/groups/[group-id]/items/top?start=0&limit=25

So, that only returns the first 25 items, and, more significantly, it doesn't include the tags.

After playing with the API documentation, we settled on the following query:

https://api.zotero.org/groups/[group-id]/items/top?start=0&format=atom&content=mods
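
A quick note on those parameters: format=atom asks for an Atom feed, and content=mods embeds MODS-formatted metadata in each entry's content element, which is where the tags (and everything else we wanted) live. Heavily trimmed, and with the MODS namespace shown under the same default: prefix used elsewhere in this post (the exact serialization you see may differ), each entry in the feed looks roughly like this:

 <entry>
   <title>[title]</title>
   <zapi:key>[item-key]</zapi:key>
   <content zapi:type="mods">
     <default:mods>
       <default:titleInfo><default:title>[title]</default:title></default:titleInfo>
       <default:subject><default:topic>[tag]</default:topic></default:subject>
     </default:mods>
   </content>
 </entry>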

In the MODS records, tags are mapped as follows:

 <default:subject>
   <default:topic>[tag]</default:topic>
 </default:subject>
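
For reference, the XPath that pulls those tags out of a single MODS record (assuming, as elsewhere in this post, that the MODS namespace is registered under the default: prefix) is just:

 default:subject/default:topic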

2. Once we could get the content we needed from Zotero, we built two content types: one to process the feed, and the other to store the resulting nodes.

Our "Zotero Custom Feed Item" was set up with all the fields we needed to leverage in a view at the end. 

3. After our content types were set up, we navigated to Structure > Feeds importers > Add importer, created a Feeds importer, and attached it to the Zotero Custom Feed Processor content type we created in the last step. Here are our basic settings for the feeds importer.

Of particular interest here are the mappings, which use XPath expressions as their source. Note also that we needed a unique element; we used the Title, which is not really ideal, but worked well for our content. Any unique field will do. Also important: we wanted Zotero tags to be matched against the Drupal taxonomy vocabulary "Tags," and, when a term doesn't exist, we wanted the importer to create it.
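
To make that concrete, here's a rough sketch of the kind of mappings involved. The Title and Tags rows reflect what we actually did, the others are illustrative, and the xpathparser:N source names are just the placeholder sources the XPath parser module generates; their actual queries get filled in during the next step.

 xpathparser:0  ->  Title (marked unique)
 xpathparser:1  ->  Tags (taxonomy term, auto-create terms enabled)
 xpathparser:2  ->  URL field on the item content type
 xpathparser:3  ->  Body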

4. Once our importer was set up, we created a Zotero Custom Feed Processor node. This is where we put our actual XPath queries and the URL of the feed we wanted to parse, and then imported the content.
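
Here's a sketch of what those queries look like, again with the MODS namespace under the default: prefix. The context query selects each MODS record in the feed, and the per-source queries are relative to it; everything beyond the title and tag queries is illustrative.

 Context:       //default:mods
 xpathparser:0  default:titleInfo/default:title
 xpathparser:1  default:subject/default:topic
 xpathparser:2  default:location/default:url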

* Oxygen wrote the XPaths for me.

When you import from your feed, you can dump out the results of your queries, which helps with troubleshooting if you're not getting the results you expect.

Finishing touches

We used the Feeds Image Grabber module to pull an image into most of the content. It’s a crapshoot: sometimes the image is spot on, and sometimes it’s pretty random. I’m sure there’s tuning that could be done. Though it’s a little outdated, there’s a great screenshot tutorial for FIG here.

We set up a view to display our content, and used Taxonomy Menu Block to give us a navigation panel for the tags. Feeds Tamper helped us clean up some of the output.

One problem we didn’t anticipate was getting the content to auto-update. As far as we can tell, you need to map a unique element to the node ID in order to get nodes to update (and to prevent a feed from creating duplicates). We tried with zapi:key from our feed, but it threw a bunch of SQL errors relating to content typing and uniqueness. We tried several other Feeds Tamper plugins to generate a hash or otherwise uniquely identify items in the feed, but so far, no dice.
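
For anyone who wants to pick up where we left off: zapi:key lives in the Zotero API namespace (http://zotero.org/ns/api) rather than in MODS, so that namespace has to be registered before a query like the following will match anything:

 //zapi:key

Our understanding is that mapping a source like that to Feeds' GUID target and marking it unique is the usual way to let an importer update existing nodes rather than create duplicates, but as noted above, we haven't gotten it working cleanly here.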

Sadly, we can't spend more time on this at the moment, but thankfully there are minimal changes to the library at present (and it’s a two-click operation to delete all existing nodes and refresh from the feed manually). We're hoping to return to this before the conference, because our plan is to use a similar process to bring in other projects we find and keep the conference feed alive, and it would be great for that to auto-update.

For now, I was really happy for the opportunity to play with the XPath parser plugin. Please feel free to contact me with errors and addenda, or just to commiserate if you're facing a similar set of challenges!