Wednesday, October 11, 2017 - 11:26

The Digital Scholarship Unit is happy to promote the University of Toronto's campaign for Cyber Security Month. A page with resources for staying safe in the U of T context is available online. Our colleagues in IITS are always a great source of information on information security at the Scarborough campus. The DSU is always happy to work with you and your subject liaison to create more secure workflows and backup strategies. Please contact us if you are interested in learning more about our preservation and security strategies for the special digital collections held at the UTSC Library.

Thursday, September 28, 2017 - 15:44

The 16th Tamil Internet Conference was held at UTSC in late August, and two DSU personnel gave presentations on Digital Scholarship Tools and Techniques and Towards Building Ontologies for Linked Open Data in the Tamil Context (presentation in Tamil). Learn more by accessing the slides via these links in Tspace. 



Wednesday, September 13, 2017 - 13:57

We are always excited to see the students return to campus for the start of fall term. This fall, the DSU is involved in several pedagogical initiatives in the classroom. We're proud to support the Nearby Studies initiative through Chris Berkowitz by maintaining a custom course website, through which students submit Oral Histories to a growing corpus of material covering the lives of locals, as well as write and comment in a cross-disciplinary space. We are also happy to be present in Anne Milne's English courses this fall, through the DSU-hosted Hogarth Project. This project will be adding another set of images for students to annotate. Jayeeta Sharma will also be using materials hosted on her EHRN site in the classroom this fall. We have also heard from others who are using the materials from our Pedagogy Project in the classroom this year. We are grateful to the UTSC Library Liaisons for all of their hard work making these collaborations a success. If you are interested in seeing how DSU infrastructure or materials can help further digital pedagogy in the classroom, contact your Liaison Librarian.


Wednesday, August 30, 2017 - 10:37

The DSU have been collaborating with the Islandora CLAW team by taking part in the discussions, providing use cases, participating in the sprints and contributing to testing. For the last two weeks, Natkeeran and Marcus have also been taking part in the Islandora CLAW Sprint.


During the spring, Nat developed an Ansible role to install ActiveMQ for the claw-playbook, which is used to easily install Islandora CLAW.


About Islandora and Islandora CLAW

Islandora is free and open source software that several institutions, companies and individuals build collaboratively. If you are interested in learning more about the platform or contributing to its development, please contact us or the CLAW team.


Islandora CLAW (CLAW Linked Asset WebFramework) is a next-generation digital repository platform. It is being designed to work with Drupal 8, Fedora 4, Apache Solr, Blazegraph and other related technologies. It is a full-fledged Linked Data application platform.


Tuesday, August 15, 2017 - 12:35

We are very excited to see the initial release of the Move to Islandora Kit tool (v0.9.0). Our very own developer Marcus Barnes along with Mark Jordan from Simon Fraser University are the current maintainers for this module.

The Move to Islandora Kit is an extensible PHP command-line tool for converting source content and metadata into packages suitable for importing into Islandora (or other digital repository and preservation systems). We find it to be a very powerful tool with many potential applications for migrations across different systems.

Release page:

Cite the code:


Thursday, July 20, 2017 - 12:40

The unit is proud to announce that we have an article published in the most recent issue of Code4lib (37). Our article is titled Annotation-based enrichment of Digital Objects using open-source frameworks and describes the technical details of the Unit's work on the Web Annotation Utility Module. This software supports several projects for the library, and has benefited from the support and use of UTSC Faculty.

Wednesday, July 5, 2017 - 15:44

There is now a release of our Oral History Module and Web Annotation Utility Module compliant with the 7.x-1.9 version of Islandora. Take a look, and let us know if you notice any issues.

Oral History Module

Project page Release page On Github

Web Annotation Utility Module

Project page Release page On Github

Wednesday, June 21, 2017 - 10:53

This week we're attending the Joint Conference on Digital Libraries, an international conference jointly sponsored by IEEE-CS and ACM. This year, the conference is being held at the University of Toronto. Kim served on the organizing committee as Tutorial Chair to evaluate and select tutorial proposals for the conference. The schedule is packed with a number of interesting sessions with great topics such as web archiving, indexing and enhancing digital collections. Check out the schedule here.

Wednesday, May 24, 2017 - 15:43

We're back from IslandoraCon in Hamilton, ON and had a great time collaborating with over 100 local and international library and memory institution colleagues. Our presentations on the Oral History Solution Pack and Web Annotation Solution Pack are now in Tspace. Thanks as always to our UTSC Faculty collaborators for providing the use cases for this software. There is a lot to look forward to in Islandora, including a Fedora 4 compliant version that is now in alpha!  

Wednesday, May 10, 2017 - 13:25

If you're looking for Library DSU members next week, most of us are out of the office at IslandoraCon in Hamilton, Ontario from May 15-19. We'll be hacking at the Hackfest, presenting on both the Web Annotation Utility Module and Oral History Solution Pack, and generally meeting and greeting the librarians, developers, and systems administrators we work with all the time and rarely get to see face to face.

Wednesday, April 26, 2017 - 14:10

If you are at the University of Toronto and attending this year’s TechKnowFile conference (“everything IT at U of T”) then we'll be presenting on the unit in IC 212 (4-4:50).

If you’d like to see our slide deck from the presentation, we’ll upload it this week to our Tspace collection.

Wednesday, April 12, 2017 - 11:19

We’re in full-on Islandora testing mode in preparation for the upcoming 7.x-1.9 release and the release of our Web Annotation Utility Module and our Oral History Solution Pack prior to IslandoraCon.

As part of our testing, we found we needed the ability to have multiple users simultaneously accessing the same VM. Our systems administrator showed us a neat trick for those of you using Islandora VMs for testing and development. So, here it is:

Irfan's cool trick for letting others into your VM:

1. With your Virtual Machine off, go to settings/network. There are slots for 4 adapters. By default the drop-down is set to "NAT". Change this to "Bridged Adapter" and start the machine.

2. Log in to the machine using the interface provided by the VM and find your IP address by running ifconfig -a | grep inet

3. Provide this address to others. The IP address followed by :8000 is Drupal for the VM provided by the Islandora release team.

Note the following:

You now log in over ssh as vagrant@IPaddress.

Your IP address might change, and you may have to find the address again.

Your network may change, in which case you have to run sudo /etc/init.d/networking restart to update the machine's IP address.

This may have some unintended effects when performing Drupal functions.
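The output of ifconfig -a | grep inet usually lists several addresses (loopback, IPv6, the bridged adapter). The following is a minimal sketch, not part of the original trick, of how the address to hand out could be picked from that output. The sample output, the function names and the http:// scheme are assumptions made for illustration.

```python
import re
from typing import List

# Matches "inet addr:10.0.2.15" (older net-tools) as well as "inet 10.0.2.15".
# "inet6" lines are skipped because the pattern requires a space right after "inet".
INET_PATTERN = re.compile(r"\binet (?:addr:)?(\d{1,3}(?:\.\d{1,3}){3})")


def shareable_addresses(ifconfig_output: str) -> List[str]:
    """Return the IPv4 addresses in the output, leaving out loopback (127.x.x.x)."""
    found = INET_PATTERN.findall(ifconfig_output)
    return [ip for ip in found if not ip.startswith("127.")]


def drupal_url(ip: str, port: int = 8000) -> str:
    """Build the address colleagues would open in a browser (http is assumed)."""
    return f"http://{ip}:{port}"


# Hypothetical output of: ifconfig -a | grep inet
SAMPLE = """\
          inet addr:192.168.1.23  Bcast:192.168.1.255  Mask:255.255.255.0
          inet6 addr: fe80::a00:27ff:fe4e:66a1/64 Scope:Link
          inet addr:127.0.0.1  Mask:255.0.0.0
"""

for ip in shareable_addresses(SAMPLE):
    print(drupal_url(ip))
```

If more than one address is printed, the one on the same subnet as your colleagues' machines is most likely the bridged adapter.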

Overall, YMMV, but this has been very useful to us when testing. 

Wednesday, March 29, 2017 - 14:51

Our own Kim Pham has returned from Code4Lib 2017 in Los Angeles, where she presented a unit poster describing our work on the Web Annotation Framework (a solution pack with a release coming soon!). You can view the Poster in Tspace.


Thursday, March 16, 2017 - 12:38

We're happy to announce that a new publication by the unit, titled Supporting Oral Histories in Islandora, is available in the January issue of Code4Lib.

"Since 2014, the University of Toronto Scarborough Library’s Digital Scholarship Unit (DSU) has been working on an Islandora-based solution for creating and stewarding oral histories (the Oral Histories solution pack). Although regular updates regarding the status of this work have been presented at Open Repositories conferences, this is the first article to describe the goals and features associated with this codebase, as well as the roadmap for development. An Islandora-based approach is appropriate for addressing the challenges of Oral History, an interdisciplinary methodology with complex notions of authorship and audience that both brings a corresponding complexity of use cases and roots Oral Histories projects in the ever-emergent technical and preservation challenges associated with multimedia and born digital assets. By leveraging Islandora, those embarking on Oral Histories projects benefit from existing community-supported code. By writing and maintaining the Oral Histories solution pack, the library seeks to build on common ground for those supporting Oral Histories projects and encourage a sustainable solution and feature set."

Check us out in issue 35: online and open source! If you're interested in the module, check out the code on Github.

Thursday, April 14, 2016 - 08:59

[The following post is shared on behalf of Haoran Wang, one of the practicum students hosted by the Digital Scholarship Unit this term.]

Make Metadata Discoverable via the OAI-PMH in WorldCat

For the past few months, I have been working as a practicum student at the University of Toronto Scarborough Library, figuring out how to use the OAI module to help the Digital Scholarship Unit (DSU) make their metadata discoverable in WorldCat via the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). From the course INF2186 Metadata Schemas & Apps at U of T, I already had some basic knowledge of how to share metadata with a local, state, or regional digital metadata repository, and how to expose current metadata for OAI harvesting. This tutorial walks through how I did this, step by step.

Let’s start with some basic terms.

Step 0 - Terms to Get Started

Open Archives Initiative (OAI) is an initiative to develop and promote interoperability standards that aim to facilitate the efficient dissemination of content.

OAI Protocol for Metadata Harvesting (OAI-PMH) is a lightweight harvesting protocol for sharing metadata between services. In the OAI context, harvesting refers specifically to the gathering together of metadata from a number of distributed repositories into a combined data store.

There are two classes of participants in the OAI-PMH framework:

Data Providers administer systems that support the OAI-PMH as a means of exposing metadata.

Service Providers use metadata harvested via the OAI-PMH as a basis for building value-added services.

Data Providers (open archives, repositories) provide free access to metadata, and may, but do not necessarily, offer free access to full texts or other resources. OAI-PMH provides an easy to implement, low barrier solution for Data Providers.

Service Providers use the OAI interfaces of the Data Providers to harvest and store metadata. Note that this means that there are no live search requests to the Data Providers; rather, services are based on the harvested data via OAI-PMH. Service Providers may select certain subsets from Data Providers (e.g., by set hierarchy or date stamp). Service Providers offer (value-added) services on the basis of the metadata harvested, and they may enrich the harvested metadata in order to do so.
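To make the "combined data store" idea concrete, here is a small sketch of what a Service Provider might do after harvesting: merge records from several repositories into one local store and search that store instead of querying the Data Providers live. The record layout, identifiers and function names are hypothetical, and the sketch assumes all datestamps use the same granularity.

```python
from typing import Dict, Iterable, List, Tuple

# A harvested record: (OAI identifier, datestamp, metadata fields).
Record = Tuple[str, str, Dict[str, str]]


def merge_into_store(store: Dict[str, Record], harvested: Iterable[Record]) -> None:
    """Add harvested records to the store, keeping the newest copy of each identifier."""
    for record in harvested:
        identifier, datestamp, _fields = record
        existing = store.get(identifier)
        # ISO 8601 dates (YYYY-MM-DD) sort correctly as plain strings.
        if existing is None or datestamp > existing[1]:
            store[identifier] = record


def search(store: Dict[str, Record], term: str) -> List[str]:
    """Search the local store only; no request goes back to a Data Provider."""
    term = term.lower()
    return sorted(
        identifier
        for identifier, _datestamp, fields in store.values()
        if any(term in value.lower() for value in fields.values())
    )


# Hypothetical harvests from two repositories.
repo_a = [("oai:repo-a.example.org:1", "2016-01-10", {"title": "Oral history interview"})]
repo_b = [
    ("oai:repo-b.example.org:7", "2016-02-01", {"title": "Hogarth print"}),
    ("oai:repo-a.example.org:1", "2016-03-05", {"title": "Oral history interview (revised)"}),
]

store: Dict[str, Record] = {}
merge_into_store(store, repo_a)
merge_into_store(store, repo_b)
print(search(store, "oral history"))
```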

Basic functioning of OAI-PMH

The OAI-PMH protocol is based on HTTP. Responses are encoded in XML syntax. OAI-PMH supports any metadata format encoded in XML. Dublin Core is the minimal format specified for basic interoperability.
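As an illustration of what "responses are encoded in XML" means in practice, the sketch below parses a hand-written GetRecord response carrying a Dublin Core (oai_dc) record. The namespaces are the standard OAI-PMH and Dublin Core ones. The repository URL, identifier and field values are made up for the example.

```python
import xml.etree.ElementTree as ET

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hand-written sample response; every value in it is hypothetical.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2016-04-14T12:00:00Z</responseDate>
  <request verb="GetRecord">http://repository.example.org/oai2</request>
  <GetRecord>
    <record>
      <header>
        <identifier>oai:repository.example.org:demo_1</identifier>
        <datestamp>2016-04-01</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample oral history interview</dc:title>
          <dc:creator>Example, Author</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </GetRecord>
</OAI-PMH>"""


def parse_records(xml_text):
    """Return one dict per <record>, with its header values and Dublin Core fields."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iterfind(".//oai:record", NS):
        header = record.find("oai:header", NS)
        fields = {}
        for element in record.iterfind("oai:metadata/oai_dc:dc/*", NS):
            name = element.tag.split("}", 1)[1]  # drop the "{namespace}" part
            fields.setdefault(name, []).append((element.text or "").strip())
        records.append({
            "identifier": header.findtext("oai:identifier", namespaces=NS),
            "datestamp": header.findtext("oai:datestamp", namespaces=NS),
            "fields": fields,
        })
    return records


for parsed in parse_records(SAMPLE_RESPONSE):
    print(parsed["identifier"], parsed["fields"])
```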


The diagram below is the overview and structure model of OAI-PMH.

Step 1 - Set Up Your OAI Module

The DSU currently use Islandora, an open-source, OAIS-based digital preservation repository and asset management system built on Drupal. First, go to the DSU home page and select Islandora → Islandora Utility Modules → Islandora OAI from the navigation bar.

Then, the OAI module allows you to configure your URL path to the Repository. In this example, the base URL is If you want to see more records on your base URL, input the number you want to see under the “Maximum Response Size”. The default number here is 20 records per response.
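Because each response is capped at the maximum response size, a harvester has to page through larger result sets. In OAI-PMH this is done with a resumptionToken: the repository returns a token with each partial list, and the harvester sends it back to get the next page. The sketch below shows that loop. The fetch callable and the two fake pages are hypothetical stand-ins for real HTTP requests, so the example runs without a network.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, Iterator

OAI = "{http://www.openarchives.org/OAI/2.0/}"


def harvest_identifiers(fetch: Callable[[Dict[str, str]], str]) -> Iterator[str]:
    """Yield every record identifier, following resumption tokens page by page."""
    params = {"verb": "ListIdentifiers", "metadataPrefix": "oai_dc"}
    while True:
        root = ET.fromstring(fetch(params))
        for header in root.iter(OAI + "header"):
            yield header.findtext(OAI + "identifier")
        token = root.findtext(f".//{OAI}resumptionToken")
        if not token:
            break  # an empty or missing token marks the last page
        # Follow-up requests carry only the verb and the token.
        params = {"verb": "ListIdentifiers", "resumptionToken": token}


def fake_fetch(params: Dict[str, str]) -> str:
    """Stand-in for an HTTP GET; serves two hypothetical pages."""
    if "resumptionToken" in params:
        ids, token = ["oai:example:3"], ""
    else:
        ids, token = ["oai:example:1", "oai:example:2"], "page-2"
    headers = "".join(f"<header><identifier>{i}</identifier></header>" for i in ids)
    return (
        '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><ListIdentifiers>'
        f"{headers}<resumptionToken>{token}</resumptionToken>"
        "</ListIdentifiers></OAI-PMH>"
    )


print(list(harvest_identifiers(fake_fetch)))
```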

Click the Configure button below to find more settings for the OAI request handler.

In the OAI request handler, select dc.identifier.thumbnail. If selected, a URL to the object's thumbnail will be added as a dc:identifier.thumbnail if the object has a thumbnail.

The DSU currently use MODS for all generic content going forward. In the past the DSU used Dublin Core, but Islandora natively prefers MODS, which is more flexible for complex objects. Every field that you want to display in WorldCat has to be mapped to Dublin Core, so I chose to transform MODS to Dublin Core.
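To give a feel for what mapping MODS to Dublin Core involves, here is a heavily simplified crosswalk sketch. It is not the XSLT that Islandora uses. The five mappings below are a small illustrative subset, and the sample MODS record is invented.

```python
import xml.etree.ElementTree as ET

MODS = "http://www.loc.gov/mods/v3"
DC = "http://purl.org/dc/elements/1.1/"
OAI_DC = "http://www.openarchives.org/OAI/2.0/oai_dc/"

# Use readable prefixes when the result is serialized.
ET.register_namespace("dc", DC)
ET.register_namespace("oai_dc", OAI_DC)

# (path inside the MODS record, Dublin Core element); illustrative subset only.
CROSSWALK = [
    ("m:titleInfo/m:title", "title"),
    ("m:name/m:namePart", "creator"),
    ("m:originInfo/m:dateIssued", "date"),
    ("m:abstract", "description"),
    ("m:subject/m:topic", "subject"),
]


def mods_to_dc(mods_xml: str) -> ET.Element:
    """Build an oai_dc:dc element from the MODS fields listed in CROSSWALK."""
    mods = ET.fromstring(mods_xml)
    dc = ET.Element(f"{{{OAI_DC}}}dc")
    for path, dc_name in CROSSWALK:
        for element in mods.iterfind(path, {"m": MODS}):
            if element.text and element.text.strip():
                ET.SubElement(dc, f"{{{DC}}}{dc_name}").text = element.text.strip()
    return dc


# Invented MODS record for the example.
SAMPLE_MODS = """
<mods xmlns="http://www.loc.gov/mods/v3">
  <titleInfo><title>Sample interview</title></titleInfo>
  <name><namePart>Example, Author</namePart></name>
  <originInfo><dateIssued>2016</dateIssued></originInfo>
</mods>
"""

print(ET.tostring(mods_to_dc(SAMPLE_MODS), encoding="unicode"))
```

A MODS field that has no entry in the crosswalk is silently dropped, which is one reason to review the mapping for every field you want to appear in WorldCat.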

Services like WorldCat expect links back to the object, such as a Handle URL. If your metadata doesn't have this, self-transforming XSLTs can be used to add specific elements tailored to individual needs.

Make sure you save all the settings in the end by clicking the Save Configuration button.

Step 2 - Test Your Base URL

OAI-PMH supports six request types (known as "verbs"). You can use them by simply adding these verbs after the base URL.

URLs for GET requests have keyword arguments appended to the base URL, separated from it by a question mark [?]. For example, the URL of a GetRecord request to the DSU base URL could be:
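The way such a URL is assembled can be sketched as follows. The base URL and record identifier here are placeholders, not the DSU's actual values. The verb, identifier and metadataPrefix argument names come from the OAI-PMH specification.

```python
from urllib.parse import urlencode


def build_request_url(base_url: str, verb: str, **arguments: str) -> str:
    """Append the verb and its keyword arguments to the base URL after a '?'."""
    query = urlencode({"verb": verb, **arguments})
    return f"{base_url}?{query}"


# Placeholder base URL and identifier, for illustration only.
url = build_request_url(
    "http://repository.example.org/oai2",
    "GetRecord",
    identifier="oai:repository.example.org:demo_1",
    metadataPrefix="oai_dc",
)
print(url)
```

Using urlencode rather than string concatenation means special characters in the identifier, such as the colons, are percent-encoded correctly.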


Here is an explanation of all six request types:

GetRecord: retrieves an individual metadata record from a repository.

Identify: retrieves information about a repository.

ListIdentifiers: retrieves only record headers rather than full records.

ListMetadataFormats: retrieves the metadata formats available from a repository.

ListRecords: harvests records from a repository.

ListSets: retrieves the set structure of a repository.
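Each verb accepts a different set of arguments. The sketch below encodes the required and optional arguments as I understand them from the OAI-PMH 2.0 specification and checks a request before it is sent. The function name and message wording are my own. badVerb and badArgument are the error codes a repository would return for the same problems.

```python
from typing import Dict, List

# Arguments every request with this verb must carry.
REQUIRED = {
    "Identify": set(),
    "ListMetadataFormats": set(),
    "ListSets": set(),
    "ListIdentifiers": {"metadataPrefix"},
    "ListRecords": {"metadataPrefix"},
    "GetRecord": {"identifier", "metadataPrefix"},
}

# Arguments that may be added.
OPTIONAL = {
    "Identify": set(),
    "ListMetadataFormats": {"identifier"},
    "ListSets": set(),
    "ListIdentifiers": {"from", "until", "set"},
    "ListRecords": {"from", "until", "set"},
    "GetRecord": set(),
}

# List verbs can be continued with a resumptionToken, which must stand alone.
RESUMABLE = {"ListIdentifiers", "ListRecords", "ListSets"}


def check_request(verb: str, arguments: Dict[str, str]) -> List[str]:
    """Return a list of problems with the request; an empty list means it looks valid."""
    if verb not in REQUIRED:
        return [f"badVerb: {verb!r} is not an OAI-PMH verb"]
    names = set(arguments)
    if verb in RESUMABLE and "resumptionToken" in names:
        if names != {"resumptionToken"}:
            return ["badArgument: resumptionToken must be the only argument"]
        return []
    problems = [f"badArgument: missing {name}" for name in sorted(REQUIRED[verb] - names)]
    problems += [
        f"badArgument: unexpected {name}"
        for name in sorted(names - REQUIRED[verb] - OPTIONAL[verb])
    ]
    return problems


print(check_request("GetRecord", {"identifier": "oai:example:1"}))
print(check_request("ListRecords", {"metadataPrefix": "oai_dc", "set": "demo"}))
```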

After you have exposed content types and some fields, your repository is available at /oai2

Some example requests are as follows:

Step 3 - Build a Gateway from WorldCat to Add Records from OAI-PMH

In order to use the Gateway, be sure that the following conditions are met:  

Your OAI-PMH compliant repository is running.

You have one or more existing collections with metadata fields mapped to Dublin Core and/or Qualified Dublin Core (dcterms).

You have an OCLC-supplied Key for the Gateway.

If your institution does not already have a Gateway account, go to the Gateway registration page to register your account.

In a few days, OCLC will send a welcome email that includes user credentials that you can use to log in to the Gateway and add additional users. Once you have received your Gateway user credentials, you can log in to the Gateway and begin synchronizing metadata with WorldCat from your OAI repositories.

After you have registered your account with the Gateway, you need to associate your repositories with the appropriate Gateway key.

1. Go to the Gateway login page and log in.

2. If you’re not already in the Manage Account tab, click to select it now.

3. Click Keys and Repositories.

4. Click to select the key for which you want to add repositories.

5. Click Add Repository. You’ll see a display something like this:


6. When the Add Repository window appears, enter the OAI-PMH base URL for the selected repository. Then click Test.

7. When the repository has been tested successfully, Gateway displays the message “All OAI tests passed.” You can now click Add to associate the repository with your Key.

8. After you have successfully added the repository, you’ll be able to edit and manage settings for the repository you just added.



In the Repository area of the page, the following information is displayed:  

Institution symbol (OCLC symbol)

Gateway license key

URL, name – The OAI-PMH base URL and name of this repository

Type – To change the repository type, use the pull-down menu to select one of the following: CONTENTdm (pre version 5), DSpace, Fedora, Eprints, Digital Commons or other. After changing the type, you must click Change to save your choice.

You can use the Show Sets in Collection List? pull-down menu to configure the way the Gateway harvests content from a repository.

Your OAI repository allows you to manage sets (collections of records) separately in the Gateway. Using sets is the default approach. In the Gateway, a set name is the same as a collection name.

By default, Show Sets… is Yes. This default setting allows you to set up different metadata maps for each collection (or set) in a repository.

If you want to create a single metadata map for all records in your OAI repository, regardless of what collection the records are in, you can select No from the pull-down menu. Selecting No will create a special collection named Entire Repository. When you create a metadata map for that special collection, your mappings apply to the entire repository.  

Note: If you select No, you cannot subsequently undo that setting in the Gateway. For this reason, we strongly recommend that you do not change the default setting. Moreover, with multiple sets you may choose to apply one profile to several (or all) of the sets at any time.

In this example, I select No to apply mappings in the entire repository.


9. Since your license key may be used by more than one Gateway user, you can assign users with that key to particular collections. The users can then map metadata and synchronize with WorldCat for each collection to which they are assigned. Then you have to select the type of record processing for this collection and prepare your collection for synchronization with WorldCat through the Gateway.

Then, on the home page, you will find several sections:

Collection Details – In addition to the general information displayed for this collection, you can set the WorldCat Record Processing type, collection-level record, and more.

Sync Details – You can edit the synchronization schedule for this collection, view its synchronization history, or view a synchronization status report.

Metadata Map – You can click the link to edit the collection’s metadata map.


Congratulations! Now you will be able to see your collections in WorldCat.

The entire repository is available at:

QA Analysis for Current Repository

Finally, I also did a quality assurance analysis for DSU’s repository. As you can see, the total DC completeness is 73.12%. Some collections need to add dates.
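For readers who want to run a similar check, the sketch below shows one plausible way to compute a completeness figure: the share of (record, element) slots that hold at least one non-empty value across the 15 simple Dublin Core elements. This definition is an assumption and may differ from the method behind the figure above. The sample records are invented.

```python
from typing import Dict, List, Sequence

# The 15 elements of simple Dublin Core.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)

Fields = Dict[str, List[str]]


def has_value(record: Fields, element: str) -> bool:
    """True when the record holds at least one non-blank value for the element."""
    return any(value.strip() for value in record.get(element, []))


def completeness(records: Sequence[Fields], elements: Sequence[str] = DC_ELEMENTS) -> float:
    """Percentage of (record, element) slots that are filled."""
    if not records or not elements:
        return 0.0
    filled = sum(has_value(record, element) for record in records for element in elements)
    return 100.0 * filled / (len(records) * len(elements))


def records_missing(records: Sequence[Fields], element: str) -> List[int]:
    """Positions of the records that lack the given element, e.g. 'date'."""
    return [index for index, record in enumerate(records) if not has_value(record, element)]


# Invented records for the example.
sample = [
    {"title": ["Interview one"], "date": ["2015-06-01"]},
    {"title": ["Interview two"], "date": [""]},
]

print(completeness(sample, elements=("title", "date")))
print(records_missing(sample, "date"))
```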

Enjoy your harvesting!

