Thursday, October 2, 2014 - 11:40

Date: Monday October 6th, 2014

Time: 4:00 pm to 5:30 pm

Location: Doris McCarthy Gallery, University of Toronto Scarborough 

Co-presented by the Art History Program, Chemistry Program, and the Doris McCarthy Gallery

Calling all art and science students!

Attend this FREE workshop to learn about the role of conservation in technical art history - led by Art Gallery of Ontario conservators:

Margaret Haupt - Deputy Director, Collections Management and Conservation

Maria Sullivan - Manager, Conservation


Monday, September 1, 2014 - 20:53

I feel no shame in the sense of accomplishment that I got from learning how to make a solution pack module for Islandora!

Working at the DSU has given me the chance to explore the possibilities and push the boundaries of what I'm capable of. It was great because I came out of the process learning a lot more about the mechanics behind Islandora, Fedora and Drupal, and how the existing processes facilitate the lifecycle of a digital object in the system.

To me, a solution pack module is like the blueprint used to set up a digital object factory in Islandora city™. The module specifies what objects to produce, how they should look, and what data types and packages are associated with an object. This is all done in the programming languages understood by Drupal and Islandora.

When it is installed, the Living Research Lab solution pack designed for the DSU creates 5 collections (Births, Mice, Protocols, Publications, Experiments) that are used to manage and add objects. The objects in each collection are ingested using specially designed forms based on the Darwin Core standard (Mouse, Birth, Protocol, Experiment, Publication, and Data Session). What I've done is a really basic setup. You can and should add hooks and configure so much more to streamline the way your data is ingested, processed and accessed.

The github link for the Living Research Lab Solution Pack can be found here:

I should note that so far the module works well in development mode, but it is in definite need of debugging and refining before it can actually be used in a production environment.

It began with XML

In one of our projects we needed to build custom XML forms to store metadata in Islandora. We soon realized that customization would be really important in order to automate some of the actions and to improve usability. Building a solution pack was the next logical step to explore customization.

Tips to get started

- Look at other solution packs! When you're starting with nothing, it helps to look at something. What I did was look at simple examples, going through and breaking down the code line by line, and trying to trace when each step is called in Islandora.
- Draw. Creating diagrams made it easier for me to make connections between the little parts that contribute to the module's functionality.

What I used*

- Sandbox Virtual Machine Image - the testing environment
- A shared folder from a local folder to the VM - that way I could modify the module on my desktop
- Drupal's Devel module (selected Display $page array, Display machine names of permissions and modules, Krumo backtrace) - used to debug and quickly re-install the module
- Islandora Documentation
- Sublime Text - the code editor
- Module examples - comparing across multiple solution packs helps when you're trying to start your own:
  - Islandora Porcus - basic and heavily commented
  - Biological Entity - usage of Darwin Core and multiple content models
  - Biodiversidad - lots of customization, and uses Darwin Core

Update 2014-09-11 - other useful resources:

PHP code checker:
Documentation:

The start of something

The first step I took to understand module building was to create a map to see where and how variables and files were being referenced. What I soon discovered is that most files, functions and variables are initially called from the .module file. The .module file is the main file that contains the configuration, hooks and variables. It's in there that you add hooks, and it's where you wield a lot of power if you know what you're doing.

The map below is a neater version of my original sketch. It lists all of the hooks from the .module file and the files they refer to. It also shows which component each affects in the Islandora system architecture.

Here are my original notes that went with the map when I was going through the Islandora Porcus module. I wrote out the names of all of the files contained in the module and within those files I wrote out all of the function names and tried to figure out what they did:

islandora_porcus.install > install and uninstall the module under Modules; module_load_include draws from Islandora modules, usually .inc
islandora_porcus.module > everything comes together > SEE BELOW > info under Modules
./css/islandora_porcus.css > for Object View Page, called from .tpl.php
./images/piggie.png > for Object View Page, called from ./includes/
./includes/ > SEE BELOW > upload form; admin config page when you go to Islandora > click on Module
./js/islandora_porcus.js > for Object View Page, from .tpl.php
./theme/islandora-porcus.tpl.php > Object View Page; when you create a new object this is what appears
./xml/islandora_porcus_form_mods.xml > the form!

Questions: So content models are customized for your own datastreams and the way you want things to be? And what is a hook, even - when you go to a page, something (a function, a template) is activated?

function islandora_porcus_menu() > the admin menu item, getting things started and loading
function islandora_porcus_theme($existing, $type, $theme, $path) > the Theme is the Template - the Object View Page
function islandora_porcus_preprocess_islandora_porcus(array &$variables) > sets up variables to be placed within the template (.tpl.php) files. From Drupal 7 they apply to templates and functions
function islandora_porcus_islandora_porcusCModel_islandora_view_object($object, $page_number, $page_size) > first content model association, if object is a Cmodel that you specify here
function islandora_porcus_islandora_porcusCModel_islandora_ingest_steps(array $configuration) > hooks ingest steps
function islandora_porcus_islandora_porcusCModel_islandora_object_ingested($object) > calls
function islandora_porcus_islandora_xml_form_builder_forms() > load form
function islandora_porcus_islandora_content_model_forms_form_associations() > form association, for the form
function islandora_porcus_islandora_required_objects(IslandoraTuque $connection) > construct content model object, ingests it
function islandora_porcus_create_all_derivatives(FedoraObject $object) > derivative spec for the .txt uploaded object, potential file conversion
function islandora_porcus_transform_text($input_text) > for the .txt uploaded object
function islandora_porcus_upload_form(array $form, array &$form_state) > http://localhost:8181/islandora/object/porcus%3Atest/manage/overview/ingest this page to upload the file
function islandora_porcus_upload_form_submit(array $form, array &$form_state) > submit into what datastream

islandora-porcus.tpl.php connects to things from preprocess theme

Even if you don't want to create a module from scratch, this should help you modify existing modules in little ways that could make using Islandora easier.  Feel the power.

[*] My secret wish for tutorials is to list tools and strategies for troubleshooting.  That would help immensely.

Wednesday, August 27, 2014 - 10:28

Working at the Digital Scholarship Unit this summer has been amazing. I can't believe how complementary it has been to my education. I have learned so much in a practical sense, but also in the much broader sense of real workplace experience in a library with digital initiatives. It's hard to list all the skills and knowledge I have acquired from such an immersive experience, but let me talk a bit about the highlights of this summer.

Right out of the gate I was given the daunting task of preparing metadata for 7000-plus photographs in UTSC's Photographic Services Collection, one of the archival collections currently housed at the DSU. The end goal of the item-level description was to create detailed metadata to be used in an online collection that was intended to be representative of the collection and to complement UTSC's 50th Anniversary celebrations in 2015.

My first task was to design a workflow and selection criteria for the series. I was assisted by two excellent work-study students, Vrishti Dutta and Mary-Ellen Brown, who took on the very exciting work of individually numbering each image. This was extremely time consuming but very necessary, as it allowed us to easily find and process specific images. The numbering, measuring and metadata creation took approximately one month.

During this time I was also selecting images for the future online collection. Though as my eyes started to hurt and my brain grew sluggish, I inexplicably stopped this practice halfway through. This turned out to be a blessing in disguise. By the time I finished processing everything I was able to look at the collection differently. Having seen every image, I had gathered a better understanding of what would best represent the collection and was able to create five sub-series. The sub-series, or subjects, are Student Life, Academic Life, Campus, Faculty and Staff, and Community. It was very exciting when we finally reached the last box and I realised that I had written metadata for over 7000 images. My feelings of brain-melting and elation are preserved via Twitter and reproduced here for future generations.

But even through all the brain hurt I could tell that this project was so useful. Here I was able to create and execute a curation and digitization workflow, which gave me a better understanding of what it means to work with digital initiatives. After the metadata was finished, we used OpenRefine and exported the MODS spreadsheet lines as .XML files. I had assigned each image a digital identifier, so we used that identifier to name both the .XML files and the .TIFs of the images, creating neat little packages of images and their metadata. These packages were then ingested into Islandora over the course of a week and made available online.
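The identifier-matching step lends itself to a small script. This is only an illustrative sketch, not the workflow we actually ran - the function and directory names are made up - but it shows the idea of bundling each .xml record with its .tif by shared identifier:

```python
import shutil
from pathlib import Path

def package_objects(xml_dir, tif_dir, out_dir):
    """Pair each metadata record with its image by shared identifier
    (e.g. utsc_0001.xml + utsc_0001.tif) into per-object folders."""
    out = Path(out_dir)
    out.mkdir(exist_ok=True)
    for xml_file in Path(xml_dir).glob("*.xml"):
        tif_file = Path(tif_dir) / (xml_file.stem + ".tif")
        if not tif_file.exists():
            # an unmatched record means a numbering or naming slip-up
            print(f"no image for {xml_file.stem}, skipping")
            continue
        bundle = out / xml_file.stem
        bundle.mkdir(exist_ok=True)
        shutil.copy(xml_file, bundle / xml_file.name)
        shutil.copy(tif_file, bundle / tif_file.name)
```

Each resulting folder is then one neat ingest package.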

In addition to processing the series I was also able to assist with archival reference questions. Some people were interested in finding photographs for the 50th Anniversary celebration projects. One user was interested in discovering why Pierre Elliott Trudeau had visited UTSC between 1979 and 1980; we were able to find a photograph and a newspaper article, though no reason was given. Another user was hoping to find a photograph of her sister performing in UTSC's student theatre for her upcoming 50th birthday party. When we were able to find a previously unseen photograph, she was so happy there were tears and hugs. My previous experience with archives had been as a researcher, so it was exciting to be given the chance to be on the other side of the exchange.

To round out the experience I had the opportunity to attend workshops, camps and conferences. First came a workshop on AtoM and Archivematica (see earlier blog post). A couple weeks back I volunteered to help out at Islandora Camp GTA. In between coffee breaks I was able to sit in on the admin track sessions. The sessions were expertly run and were very useful whether you were just starting out or building on a foundation. I've already been working with the Islandora system for some time now, but with the help of the sessions I was able to better understand what I had been working with all this time. I look forward to continuing to learn about all the customizable options in Islandora's offerings.

Lastly, I was able to attend many of the sessions offered at the Digital Pedagogy Institute put on by the Digital Scholarship Unit at UTSC. There I learned about all the incredible educators, librarians, and faculty members making use of digital tools in their classrooms. So many of the presenters were immersed in engaging projects, and it was refreshing to see such novel uses for technology in the interest of learning and teaching. One great takeaway was how lucky we are at the University of Toronto to have administrators and librarians dedicated to digital initiatives; this kind of openness will only become more important as we continue to move in that direction. One attendee of the DPI told me that her takeaway was TTYL (Talk To Your Librarian). I'm constantly blown away by the expertise and commitment of people working in the fields where libraries, archives and technology intersect.

Some final points I've gleaned from my time here are general but nonetheless important for someone starting in this field.

1) Professionalism, attention to detail and the ability to communicate well are key.

2) Staying connected to colleagues via social media is extremely useful; there are many interesting discussions and helpful tips that go down on Twitter on a daily basis.

3) Whatever you don’t know, you can learn.

4) If you have a question, Google it first. Chances are someone else has already answered it.

5) Volunteer at conferences. They are excellent opportunities to learn and meet people. I can’t believe I made it this far without volunteering.

6) The archivist/librarian divide is often arbitrary in a digital context, and the rivalry is dumb.

I am so happy to have been given the opportunity to work at the DSU this summer. I feel as though my experiences here have taught me new skills and given me the confidence to pursue fields where I may have been uncomfortable before. I feel privileged to have worked with so many incredible people here, that they have let me pick their brains, and that they have offered me excellent advice for the future. Thanks to them, I have now secured part-time employment as the Digital Curation Intern at Information Technology Services (ITS) at Robarts Library this school year. I am confident that my time at the DSU has prepared me for new challenges both at ITS and after graduation. I am so very thankful for everything they have done for me this summer.

Rachelann Pisani

Wednesday, August 6, 2014 - 11:37

by Sara Allain

Islandora Camp GTA kicked off with nearly 40 librarians, developers, and archivists gathered at the University of Toronto Scarborough. Fortified by coffee and muffins, we got down to the business of getting to know each other. Campers hailed from throughout Ontario and the East Coast, as well as Oklahoma, Michigan, Ohio, and Florida, and represented a diverse range of use cases and experience levels. The group ran the gamut from people who'd heard the word "Islandora" thrown around but had never touched the platform to folks who've been developing/administering Islandora for years. Leading us through the day's activities were Nick Ruest (York University), Jordan Dukart (discoverygarden), Kirsta Stapelfeldt (UTSC), and David Wilcox (Duraspace).

Introductions included the question, "If you were a sandwich, what kind of sandwich would you be?"

- Turkey Reuben, an acquired taste (our own Kirsta)
- Ice cream sandwich, because he's also interested in Android development (University of Toronto ITS's Ken Yang)
- Poutine, because she'd always choose poutine over a sandwich (University of Toronto ITS's Kelli Babcock)
- Montreal smoked meat with wayyyy too much stuff in the middle (DuraSpace's David Wilcox)

(That's all I can remember - if you recall any more, leave them in the comments!)

New Release and Future Developments

Nick described the new modules and tools that are contained within the newest release, Islandora 7.x-1.3. In particular, he talked about the suite of modules - Checksum and Checksum Checker, FITS, BagIt, and PREMIS - that make Islandora so much stronger as a preservation system. Some excellent work here.

David talked about Fedora 4, which is a major rearchitecting of the repository software that Islandora works on. David highlighted the way that Fedora will now structure data within the repository, as well as the linked data capabilities and performance enhancements. Jordan Dukart talked about Drupal 8, which is much more object-oriented. 

Community Overview

We looked at some cool things being done with Islandora in the community:

- Nick presented York U's browse map using Solr queries + Leaflet.js
- MJ Suhonos presented Ryerson's usage stats tracking module
- Nick talked about the Islandora Deployments github repo, a place where developers can write out their deployment stories, which he created with Mark Jordan during Open Repositories 2014
- Mark Jordan's Background Processes Discussion Paper, looking at what's going on under the hood when Islandora ingests objects
- University of South Carolina's Moving Image Research Collections, which uses PBCore and runs a lot of non-standard processes
- York U's Solr Views galleries - specifically, dogs and cats
- Using Fedora Connector to represent Islandora objects via Omeka

Interest Groups

There are currently four Islandora Interest Groups. The groups are formed and maintained by community members (read: anyone who wants to convene one) in order to address specific problems or questions related to the software and/or the community.

Nick Ruest, Donald Moses, and Mark Jordan convened the Preservation Interest Group to standardize and steward some of the new preservation modules in Islandora, including Checksum, FITS, BagIt, Vault, and PREMIS. The Preservation Interest Group is also working with Archivematica (/Artefactual Systems).

The Documentation Interest Group, convened by Kirsta, Kelli Babcock, and Gabriela Mircea, is focused on improving the Islandora documentation wiki as well as creating new documentation for training and development purposes.

David convened the Fedora 4 Interest Group to help plan how Islandora will integrate with the new version of the repository during the next phase of development.

The newest group is the Archival Interest Group, convened by me, which focuses on how archivists and archival collections interact with Islandora, incorporating questions of training, development, and linked services.


Random notes and recurring themes:

- Deployment - specifically, issues with deployment that are consistent across implementations
- Integration with other systems, specifically archival description systems (AtoM and ArchivesSpace) and Omeka
- Ontologies, migrations, systems, integration with other systems, deployment
- Good idea/bad idea: multi-sites
- Drupal 8 and Fedora 4, and how Islandora 7/8 releases will play with one or both of these
- "It depends"
Friday, June 27, 2014 - 13:21

by Sara Allain

Lately we've been trying to come up with a better way to create metadata for batch ingestion into Islandora. We just started preparing the UTSC Photographic Services Collection to go online - our lovely Young Canada Works summer student, Rachel, has been diligently selecting a few hundred candidates for the first phase of digitization - and it makes sense to start creating the metadata as well, so that once we have digital surrogates we can quickly bundle it all into Islandora via batch ingest. Since metadata creation/manipulation takes up a lot of my day, I started thinking about the most effective way to create XML using a workflow that would be optimal for our students, our systems, and me.

This is fairly long and detailed, so feel free to jump to the bottom for the highlights.

We often work with faculty and other people outside of the unit to create metadata for the various digital scholarship projects that we steward. Spreadsheets are an easy and accessible way for faculty, students, researchers - whomever - to come to grips with structured data. Things are tidy, they're easy to manipulate, we can derive CSV files - but most importantly, our project collaborators are familiar with how they work. There's no learning curve. We use a range of products from Excel to LibreOffice to Google Drive to do this - whatever's most suited to the project.

Step 1 - Set Up Your Spreadsheet

We're using MODS for all generic content going forwards - in the past we used Dublin Core, but Islandora natively prefers MODS and it's more flexible for complex objects. (We may use other schemas for subject-specific content in the future, like Darwin Core for biodiversity data, which will be an interesting blog post in itself.) I set up a Google spreadsheet that uses human-friendly versions of the smallest child elements in MODS as column headers; that specific spreadsheet doesn't reflect all the fields in MODS that are available, so think of it as an infinitely extensible collection mechanism. In truth, it doesn't even matter what the headers are, as long as they map easily to MODS and the content is consistent.

Step 2 - Add Some Metadata

This step is pretty simple. We have generic guidelines for creating metadata - things like "Transcribe title from the object or create a title that describes the object." or "Use the format YYYY-MM-DD." Our goal in the DSU is to intervene as little as possible into this process. Usually all we'll do is a bit of clean-up before making it publicly available. You can see the instructions that we provide for users as comments if you hover over the column headers on the spreadsheet.

Step 3 - Import into Google Refine

Open Refine (also called Google Refine) allows you to perform sophisticated manipulations on tabular data. It supports regular expressions and a host of other ways to mash up your info. Once you have the program installed, it works in the Chrome browser. One word of warning, though - a desktop install can only handle so many rows of content before it will die on you. It's possible to allocate more memory if the program is having trouble parsing the data that you import.

The import process is simple - export the spreadsheet from Google as .xls, then import into Google Refine using the Create Project function. It looks like this:

Make sure that your data is rendering properly in the preview window and click on Create Project. You'll end up with - surprise! - another spreadsheet, this time in Open Refine.

Step 4 - Refine the Data

You might want to take this time to refine your data, since that's the whole point of Open Refine. You can do things like removing trailing spaces or splitting columns as needed. In the Google spreadsheet, for example, the Subject field includes multiple entities delimited by semicolons; Open Refine will do the work of isolating each of these into a separate column for you, if you should so desire. As mentioned above, it supports regular expressions and is very powerful at manipulating data.
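As an illustration of the kind of clean-up Refine does, here is a rough Python sketch of the multi-valued Subject split described above (the function name and row structure are hypothetical; Refine itself does this through its UI, no code required):

```python
def split_subjects(rows, column="Subject", delimiter=";"):
    """Split a multi-valued cell (e.g. 'Students; Campus; 1970s')
    into a list of trimmed values, much as Open Refine's
    split-multi-valued-cells operation does."""
    for row in rows:
        row[column] = [v.strip() for v in row[column].split(delimiter) if v.strip()]
    return rows
```

The same trailing-space problem mentioned above disappears as a side effect of the strip.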

Step 5 - Export as MODS

This is the trickiest part, and by "trickiest" I mean surprisingly simple once you've figured it out. Open Refine has several options for exporting data; the one I use to export as MODS is Templating. When you click on it, you get a form that looks like this:

Within the exporter, you can build any schema you desire. On the left is the editable template and on the right is a preview of how your file will look once it's exported. In this case we want MODS, which was easy to model. You simply need to add the proper tags around the jsonize tags. Here is a template for Open Refine that will show you exactly what to put where - the only thing that might need to be changed is the content within the square brackets in the jsonize tag - the bolded word here: {{jsonize(cells["Title"].value)}} (this is the column header from your spreadsheet). The exporter with the MODS template applied looks like this:
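For readers who prefer code to screenshots, here is a hedged Python sketch of what the template is doing: wrapping each cell value in the matching MODS tags, with XML escaping standing in for the jsonize call. The column names and the choice of MODS elements are just examples, not our full template:

```python
from xml.sax.saxutils import escape

def row_to_mods(row):
    """Render one spreadsheet row as a minimal MODS record, the way the
    Open Refine template wraps each {{jsonize(cells["..."].value)}}
    placeholder in MODS tags."""
    return (
        '<mods xmlns="http://www.loc.gov/mods/v3">\n'
        f'  <titleInfo><title>{escape(row["Title"])}</title></titleInfo>\n'
        f'  <originInfo><dateCreated>{escape(row["Date"])}</dateCreated></originInfo>\n'
        '</mods>'
    )
```

The point is the same as in the exporter: the schema lives in the template, and the spreadsheet only has to supply consistent cell values.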

Click export and you'll get a big .txt file of structured data that you can work with - once you save it as .xml it will be valid MODSXML. I like to split that huge file using xml_split, part of the XML::Twig package, but there are any number of different ways of doing it. Zip your individual MODS records up with your objects and everything is ready to batch ingest into Islandora!
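If you'd rather not install XML::Twig, a rough Python equivalent of the xml_split step might look like this (assuming you've wrapped the exported records in a single root element and saved the file as .xml; the function name is made up):

```python
import xml.etree.ElementTree as ET
from pathlib import Path

def split_mods(big_file, out_dir):
    """Write each <mods> record in a combined file to its own numbered
    .xml file - roughly what xml_split does for us."""
    ns = "{http://www.loc.gov/mods/v3}"
    out = Path(out_dir)
    out.mkdir(exist_ok=True)
    root = ET.parse(big_file).getroot()
    count = 0
    for record in root.iter(ns + "mods"):
        count += 1
        ET.ElementTree(record).write(str(out / f"mods_{count:04d}.xml"),
                                     encoding="UTF-8", xml_declaration=True)
    return count
```

Either way, you end up with one MODS record per object, ready to pair with its image.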




This spreadsheet will make metadata creation easy.

Open Refine will make metadata editing easy.

This template will make exporting MODS from Open Refine easy.

Everything is now easy.

Thursday, June 12, 2014 - 14:16

The UTSC Library, in collaboration with the Centre for Digital Scholarship, the Office of the Dean and VP Academic, and the University of Toronto Libraries Chief Librarian’s Office, is organizing a THATCamp.

More and more frequently, professors are creating courses that are centered around digital projects and incorporate digital tools into their teaching. Part of the larger Digital Pedagogy Institute, this THATCamp will allow participants to discuss best practices around teaching courses that are centered on digital methods, and digital tools that improve and facilitate research. It is hoped that a variety of case studies will be presented and discussed in order to bring to light best practices surrounding these emerging methodologies, and the skills that faculty members and librarians need to develop in order to maximize their impact on undergraduates in this specific area.

For more information and to register, please see:

When: Friday, August 15th, 2014


Where: University of Toronto Scarborough Campus, 1265 Military Trail, Toronto, ON, M1C 1A4.

If you have any questions about THATCamp Digital Scholarship Institute, please contact us at

Thursday, June 12, 2014 - 08:14

by Sara Allain

We're really excited that our poster, entitled "Bye Bye, CONTENTdm: a migration to Islandora", was a co-winner for best poster at Open Repositories 2014! Almost 60 posters were presented at the conference on a huge range of subjects. We're incredibly proud to be part of such a diverse and intelligent group of people.

The poster was co-authored by Lingling Jiang, Kim Pham, Kirsta Stapelfeldt, Paulina Rousseau, and myself. Check it out on Slideshare.

Huge congratulations as well to our co-winners Minna Marjamaa, Tiina Tolonen, and Anna-Liisa Holmstrom, whose work on the Theseus Open Repository is inspiring.

Thursday, June 12, 2014 - 04:12

by Sara Allain

We're away at Open Repositories this week (taking lots of notes, so watch out for our blog posts after we all get back to Canada). Everybody is staying up too late since the days are so long, and I've been working on mapping the tweets of attendees. It's still a work in progress, but you can check out the mapping on my personal website.


Friday, June 6, 2014 - 10:42

This past week I had the opportunity to attend a free information session put on by the Toronto Area Archivist Group (TAAG) and the University of Toronto Archivist Group (UTAG). As a new summer student employee of the Digital Scholarship Unit, it was a great opportunity for someone trying to break into the world of digital archival initiatives and scholarship. Courtney Mumma, MAS/MLIS, of Artefactual Systems Inc. led the session and introduced the group to Archivematica.

“Archivematica is a free and open-source digital preservation system that is designed to maintain standards-based, long-term access to collections of digital objects. Archivematica is packaged with the web-based content management system AtoM for access to your digital objects.

Archivematica uses a micro-services design pattern to provide an integrated suite of software tools that allows users to process digital objects from ingest to access in compliance with the ISO-OAIS functional model. Users monitor and control the micro-services via a web-based dashboard. Archivematica uses METS, PREMIS, Dublin Core and other best practice metadata standards.” [1]

The session was held in the Thomas Fisher Rare Book Library at the University of Toronto, St. George Campus. Sitting among the floors and floors of unique and beautiful books set up an interesting dynamic between the analog, the digital, and the initiative to bring them together. The workshop started with a demonstration showing steps that may be included in a basic workflow; Mumma explained the output capabilities and used a staggering number of acronyms, which, as I gather, is par for the course in this field. She did an excellent job of explaining the program, and the demonstration helped to guide those of us who were new to it. Even with Mumma's skill as a presenter there was a lot of information to process, and it was impossible to grasp all of the program's capabilities in the time given. Thankfully, they have a detailed wiki that explains the basic capabilities of the program.

Archivematica, as Mumma said, “Allows an archivist to remain an archivist,” by facilitating appraisal (a forthcoming feature), preservation and metadata creation. What Archivematica has done is gather together all the best open-source tools, what they call micro-services, to allow for individual configuration to the specifications and needs of individual repositories. In providing an individually configurable program this allows archivists to use the program without fussing around with the multiple and varied individual tools for discrete tasks. Archivematica is also compatible with most storage and access systems.

Archivematica can be downloaded for free and used for free, and it is open-source. It also comes with a detailed user manual and an online forum where users can discuss issues and post questions. In theory, it can be used without costs. However, for those who are uncomfortable with more robust technologies, the set-up and maintenance may be daunting without the help of an IT department. Thus, Artefactual Systems offers Archivematica set-up, configuration, tutorials and maintenance services and more, at a cost. The services provided are extensive and highly valuable but they bring up the issue that is plaguing all heritage organizations these days: money.

Artefactual is upfront about their costs (they can be found here). But many archives or library departments are small and have small budgets and some institutions do not have access to the kind of IT support needed for the DIY option. While we recognise the digital future and want to move toward it, sometimes it seems insurmountable in terms of resources.

During the workshop there was some talk about how to spread out the costs amongst institutions willing to engage Artefactual's services. For example, the Council of Prairie and Pacific University Libraries (COPPUL) has formed a consortium to employ and pay for Artefactual's services amongst its members. At the workshop, there was some talk of Ontario institutions also employing Artefactual's services consortially.

Overall, the workshop was very informative and promising. It shows that there are great initiatives and great interest in the move toward digital. It is exciting to see where the push toward digital will bring archival institutions and how it will shape the heritage professions. Thanks are due to TAAG and UTAG for putting on this session, and thank you also to Courtney Mumma and Artefactual Systems for the opportunity to learn more about your services and resources.

Wednesday, May 28, 2014 - 11:20

Oh, they do so many things they never stop. Oh, the things they do there, my stars.

Why hello, I'm the new contract hire at the DSU since May. So far it's been lovely - I love the work pace and I immediately felt like I was a part of the team. The first thing I worked on here was to get content into their shiny new online repository (3 weeks my senior). I was to move all of the metadata from the Doris McCarthy Image Collection contained in CONTENTdm (the old asset management software) into Islandora (the new asset management software). My aim is to be as transparent as possible in the hopes that this will be of value to someone such as myself starting out in libraries and working with library data and metadata. Of course, I will be more than happy to answer any questions if you too share a similar pain.

Hey, let's make it easy. The code is available on github.

What we had:

- Doris McCarthy Simple DC export (.xml)
- Rename map (document)

What we used:

- Oxygen XML Editor (30-day trial)
- text editors (we used Sublime Text, Excel)
- XML::Twig (install via CPAN)

Scripts:

- XSLT rename map (.xsl)
- rename (.sh)
- LOC DC2MODS (.xsl)

To start:

- the exported XML file from CONTENTdm, in Simple DC, containing 750 records
- the renaming map project document listing each old filename and its new filename (compiled manually)

To end:

one individual .xml record in MODS for each associated .tif object (they need to have the same filename in order to be properly batch ingested using the Large Image Content Model)


1. Create a rename map: write an XML stylesheet (XSLT) that matches the text within <dc:source>, looks up the current name, and replaces it with its corresponding identifier.
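The rename map itself was an XSLT lookup table, but the logic is easy to sketch in Python with the standard library (the map entry, record structure, and function name below are hypothetical, not taken from the actual collection):

```python
# Sketch of the rename-map step: swap each <dc:source> value for its new
# identifier, driven by a manually compiled old-name -> new-name mapping.
# The map entry and record structure here are made-up examples.
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
RENAME_MAP = {"old_scan_001.tif": "mccarthy_0001"}  # hypothetical entry

def apply_rename_map(xml_text: str) -> str:
    ET.register_namespace("dc", DC_NS)
    root = ET.fromstring(xml_text)
    for source in root.iter(f"{{{DC_NS}}}source"):
        if source.text in RENAME_MAP:
            source.text = RENAME_MAP[source.text]
    return ET.tostring(root, encoding="unicode")
```

In the real workflow the lookup lived in the stylesheet, so Oxygen could run it over the whole 750-record export in a single transformation.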

2. Run the rename transformation in Oxygen XML Editor -> 750 records (no loss)

a. ~20 identified duplicates: some had identical identifiers, and some objects simply had two metadata records associated with them (e.g., one record included an OCR transcription while its duplicate did not)
b. ~30 container metadata records had no mapping name, so they weren't transformed - acceptable

3. Split the files -> 750 files (no loss): using XML::Twig's xml_split tool
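xml_split did the heavy lifting for us, but conceptually this step just writes each child record of the big export out as its own document. A rough stdlib-Python equivalent (the file naming is my own choice, not xml_split's):

```python
# Sketch of the split step: one output .xml file per child record of the
# export's root element (xml_split from XML::Twig did this for us).
import xml.etree.ElementTree as ET
from pathlib import Path

def split_records(xml_text: str, out_dir: Path) -> int:
    """Write each child of the root to its own file; return the count."""
    out_dir.mkdir(parents=True, exist_ok=True)
    root = ET.fromstring(xml_text)
    count = 0
    for record in root:
        count += 1
        out_path = out_dir / f"record-{count:04d}.xml"
        out_path.write_text(ET.tostring(record, encoding="unicode"))
    return count
```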

4. Rename the split files -> 730 records, as predicted from 2b: using the rename shell script (.sh)
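Our rename script was a small shell script; the same map-driven batch rename can be sketched in Python (directory layout and map entries are hypothetical). Files with no map entry are simply skipped, which is how the ~30 container records stayed behind:

```python
# Sketch of the rename step: rename each split record file so it matches
# its .tif object's filename, using the same old -> new mapping as before.
from pathlib import Path

def rename_files(src_dir: Path, rename_map: dict) -> int:
    """Rename mapped files in place; return how many were renamed."""
    renamed = 0
    for path in src_dir.iterdir():
        new_name = rename_map.get(path.name)
        if new_name:  # files without a mapping are left untouched
            path.rename(src_dir / new_name)
            renamed += 1
    return renamed
```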

5. Transform the metadata records from DC to MODS -> 730 records (no loss from step 4): using Oxygen XML Editor. LOC provides template XSLTs for MODS transformations, which we modified to match our CONTENTdm metadata export.
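The actual transform was LOC's DC-to-MODS XSLT run in Oxygen. As a heavily reduced illustration of the kind of mapping it performs (dc:title to mods:titleInfo/mods:title, dc:creator to mods:name/mods:namePart), here is a toy Python sketch; only two fields are handled, and the sample record is invented:

```python
# Toy sketch of a DC -> MODS field mapping; the real conversion used
# LOC's template XSLT, which covers far more elements and attributes.
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
MODS_NS = "http://www.loc.gov/mods/v3"

def dc_to_mods(dc_xml: str) -> str:
    dc = ET.fromstring(dc_xml)
    ET.register_namespace("mods", MODS_NS)
    mods = ET.Element(f"{{{MODS_NS}}}mods")
    title = dc.findtext(f"{{{DC_NS}}}title")
    if title:  # dc:title -> mods:titleInfo/mods:title
        title_info = ET.SubElement(mods, f"{{{MODS_NS}}}titleInfo")
        ET.SubElement(title_info, f"{{{MODS_NS}}}title").text = title
    creator = dc.findtext(f"{{{DC_NS}}}creator")
    if creator:  # dc:creator -> mods:name/mods:namePart
        name = ET.SubElement(mods, f"{{{MODS_NS}}}name")
        ET.SubElement(name, f"{{{MODS_NS}}}namePart").text = creator
    return ET.tostring(mods, encoding="unicode")
```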

6. Ready for ingest: single image + XML package (book batches too; steps not included). Yay!



- In almost every step, something didn't work: you will need to go back a few steps, fix it, and proceed.
- It's difficult to figure out the order in which to do everything; don't be afraid to try a different way (doing one step first may cause more problems than doing it in another order, e.g., deciding whether to do the DC-to-MODS transform first or wait until the very end).
- Cleanup is crucial at every step: the more time you devote to cleanup early in your workflow, the easier the rest of the process will be.

All in all, it's been a very exciting month, not just for me but for everyone at the DSU.  Or maybe it's always like this...

Tuesday, May 27, 2014 - 13:29

The UTSC Library, with the support of the UTSC and York University libraries, is proud to be the host of Islandora Camp GTA.

In lieu of our usual summer Islandora Camp in PEI, our 2014 Canadian camp is going to the big city. #iCampGTA will take place on the campus of the University of Toronto Scarborough from August 6-8, 2014. If you have any questions about the GTA Camp, please contact us.

- Register
- Schedule
- Accommodations and Travel
- Call for Proposals
- T-Shirt Logo Contest - win your registration!

See additional details on the Islandora Website. Hope to see you there!

Tuesday, May 27, 2014 - 13:23

Register for this event

We invite proposals for Digital Pedagogy and the Undergraduate Experience: An Institute

Proposals should contain a title, an abstract (of approximately 250 words) plus list of works cited, and the names, affiliations, and website URLs of presenters; fuller papers will be solicited after acceptance of proposals, for circulation in advance of the gathering to registered participants.

Alternatively, you can propose a workshop related to Digital Pedagogy, with the same stipulations as above. 

Please send proposals before 30th June 2014 to Paulina Rousseau.

Tuesday, May 27, 2014 - 12:41

Register for this event

See the Institute and THATCamp Schedule


Instructional Centre - Rooms IC-300, 302, 306, 1265 Military Trail, Toronto 

Join us for an event that considers the way digital scholarship is changing the landscape of undergraduate pedagogy!

Emerging technologies have had an immense impact on the way that research is now conducted by scholars in academic disciplines. There is a move toward the use of computers, applications, and larger, non-discrete data sets for what is increasingly termed “digital scholarship.” Thanks to these advanced developments in computing, research in all fields has taken on a much more collaborative nature, resulting in experimentation with research outputs in new formats and the creation of new intellectual products. These major changes in research methodology mandate the development of new skill sets, both in faculty and in the training of students. As such, Digital Literacy and Pedagogy must become a priority for undergraduate and graduate students, as well as for faculty members, who must inherit and participate in new, digitally-mediated methodologies.

This institute will explore the potential impact that Digital Pedagogy can have on student experience, with specific focus on the undergraduate level. This will include the following topics:

- How can digital research methodologies be used to improve undergraduate engagement?
- What are the best methods for teaching students digital skills so that they can participate in the creation of digital research? What has proven to be successful?
- What political and ideological decisions do educators involved in digital scholarship need to make in order to benefit students, preparing them for work beyond the academy, and how can this influence the formation of canons that might help stabilize the field?
- How can faculty members shift from transmitting knowledge to facilitating projects, co-inquiring and co-learning with students in activity-centered projects?

This institute will bring together digital scholars with considerable expertise in the area of Digital Pedagogy, and will consist of plenary sessions, informational sessions, hands-on workshops involving digital tools, and workshops focusing on methods for integrating digital pedagogy into both specific courses and the larger curriculum. It will also involve presentations from students who have participated in the development of digital scholarship projects, in hopes of gaining insight into how the integration of these skill sets improves the learning experience and job readiness.

Featured events will include plenary talks by Rebecca Frost Davis, Director of Instructional and Emerging Technologies (St. Edward's University, Fellow at NITLE), and Lisa Spiro (Executive Director, Digital Scholarship Services, Rice University, Former Director of NITLE labs).

The institute will close with THATCamp, which will allow scholars to discuss the most pertinent issues that concern them in the realm of digital pedagogy, and will be preceded by Islandora Camp. 

This Institute is generously sponsored by the UTSC Library Chief Librarian, the UTSC Office of the Dean and Vice President Academic, the Office of the University Chief Librarian, University of Toronto, and UTSC's Centre for Digital Scholarship. 

See our Call for Proposals, due by June 30th, 2014.

Register for this event

Sunday, May 11, 2014 - 18:09

I'm away from the Digital Scholarship Unit this week in semi-sunny London,  as an instructor for Islandora Camp UK. Here are some of my notes:

Note: Fresh and Maybe Flawed.

The final day of Islandora Camp reunited the Developer and Administrator tracks.

People were trickling to the whiteboard to record their Github handles for addition to the Islandora Github Organization. 

Last night’s late-nighters are among those who got up this morning for a run through the city. After everybody got some coffee, we started with Alan Stanley’s presentation on producing digital editions. 

The presentation comes out of the Editing Modernism In Canada (EMIC) project and its partners. I worked on EMIC in earlier days, and it was interesting to see the progress - particularly the integration of Desmond Schmidt/Austese Work, and the CollateX tool. In FedoraCommons, versions of a work are being stored as separate objects, against which these tools are run to detect differences and help in the construction of digital editions. I need to get back to a review of the AUSTese workbench to explore what's been happening in the Digital Humanities community. 

The module that Alan (and discoverygarden) is building also provides WYSIWYG TEI creation through the Canadian Writing and Research Collaboratory (CWRC)’s CWRCWriter application. Alan says that “it works, but it needs a lot of tweaking,” meaning that we have a little while to wait before this project is generalized and released to the community, but it's very exciting to see in action. 

Donald’s Form Builder session came next; Form Builder is a big and complex tool, and Donald promises to post his slides from this and other presentations on the same page as the conference schedule. Beyond teaching the tool and its interface, Donald facilitated conversation about encoding validation steps for forms through the interface. Validation is currently in the hands of the form creator, or encoded by hand; it would be great to see a more generalized solution. Since camp, there’s been an interesting post on the lists about validation for specific form fields.

After this, it was time for a break, followed by the awards - here are our recipients!

- “Old School Strength” to Draženko Celjak for VM installation on Windows XP (the brave soul).
- “Continuous Passion About Integration” to Simon Fox, from the Freshwater Biological Association, future Travis expert and sheriff of Islandora code.
- “Friendly Traveller” to Ken Kim from Next Aeon Korea, for coming such a long way and being such a collegial camper.
- “The Spirit Award” to Anna Jordanous, with many thanks for her hard work making camp a success, everything from finding us a space to bringing power cords and coordinating the social event.

We took a group pic at this point. You might have seen it on Twitter.

After this, we were on to the community presentations.

Luis Martinez-Uribe, from Fundación Juan March, talked about how discoverygarden helped his organization, which distributes funding and provides stewardship for Spain’s cultural life, set up Islandora. There were a lot of interesting things about this presentation: FJM chose Islandora because the project was led by a librarian (Mark Leggott), many of the views in the site were generated via exports from Archivists' Toolkit, and there are a lot of custom views for content, some using third-party tools (like the popular Simile widget). FJM has also used Fusion Tables in Google to make some neat visualizations about the artists that have been showcased over the years. The project is a testament to the value of structured data; as Luis says, “It actually pays off to prepare the data.”

FJM is also interesting because they’ve gone to a completely different display layer (not Drupal). Being a Windows shop without any in-house PHP expertise, they developed a .net FJM-Islandora Library that replaces Tuque. We saw the library in action on a large collection of exhibition catalogues dating back to the early 1970s.

This is around the time that Nick turns around with twinkling eyes and says: “I wonder if I can get d3 to work with Solr” - I’m still watching his twitter feed to see what emerges. 

The last presentation before lunch was from Caleb Derven at the University of Limerick. He's spent the last few years developing infrastructure. Although many of the repositories in Ireland run DSpace, Hydra, or bespoke front ends for Fedora, Caleb worked with discoverygarden to build out 20TB of storage affiliated with an Islandora installation, citing Islandora as a more flexible approach better suited to the types of staff and expertise at his institution. He’s interested in EAD support in Islandora, and I sadly had to run to feed Henry before I got to hear the rest of his presentation. I felt like Caleb and I spent most of the camp trying to get together for a conversation about Islandora and archives, but we weren't successful.

Next up were two presentations from the Freshwater Biological Association (FBA) outlining their approach to RDF. First, Nicholas Bywell showed off FBA's Object Linker module, which integrates with fixed vocabularies and provides autocomplete against a preferred-term collection. The group creates authorities for terms using MADS, but notes that one could modify it to use SKOS pretty easily. Anna Jordanous followed; her presentation introduced the group’s use of the Linked API from Epimorphics and sparked a discussion of how to take data in a spreadsheet and produce linked data. 

After the FBA presentations, Donald Moses introduced the new IR code, which is probably worthy of its own blog post. It’s a really big suite of modules, with good support for ingest of citations from things like DOI, PMID, EndNote, and RIS, through to display that leverages CSL stylesheets and the creation of custom bibliographies. Then it was time for a quick break.

After the break, Ken Kim talks about how his group has photographed thousands of Korean artifacts and made them available in a Drupal website that supports eight languages. We end camp with a discussion of the future, including the implications of Fedora 4 and Drupal 8 - while nobody had any clear timetables or deadlines, the commitment of the community to a future Islandora is pretty clear, as is the desire for a good upgrade path. For now, there are lots of new sites going up daily in Islandora 7, and I’m amazed at how far the code has come since my first camp in 2010. 

Watch for the presentation slides to go up and we'll tweet when we see them. 

If this sounds interesting, come to Islandora Camp GTA!

Thursday, May 8, 2014 - 16:32

I'm away from the Digital Scholarship Unit this week in semi-sunny London,  as an instructor for Islandora Camp UK. Here are some of my notes:

Note: Fresh and Maybe Flawed.

The second day of Islandora Camp has ended, and I’ve had far less time to take notes. This is because we split into our separate sessions (administrators and developers) and Donald Moses and I were instructing in earnest. Though we missed our developer friends, we pushed on into a deep-dive of the Islandora administrative interface. 

Typically, the administrative track starts off with an overview of basic site-building and user management functions in Drupal (a hurdle for some Islandora administrators), moves to a review of Islandora permissions (and an overview of FedoraCommons), and ends with Solr. This day-long session was designed by Melissa Anez, the Islandora Foundation Project & Community Manager. Donald Moses and I both admire the graceful approach Melissa has taken in designing hands-on sessions that ease people into the sometimes daunting world of Drupal, FedoraCommons, and Solr, and how these applications meet in the Islandora ecosystem. 

That said, this admiration didn’t stop us from getting diverted (sorry Melissa!) into discussions of media management, the philosophy behind Islandora’s extension of pre-existing Drupal modules, the art of authoring namespace prefixes, and desirable server setups (to vagrant or not to vagrant?).

Because we could not get enough of being smooshed together in small underground spaces, camp finished off with a lovely dinner at the bottom of Covent Garden. The dev and admin tracks were reunited with much comparison of personal histories and accents, and plans for Islandora. It sounds like the dev track also went well. I came back to the hotel with my family (one-year-olds don't really like talking about Islandora), but most of camp is still out there in the city, painting the town Islandora-t-shirt red.

I forgot how much I get out of these camps, and how great it is (after 4 years of Islandora) to see new faces interspersed with established Islandorians. What a lovely bunch of people!

If this sounds interesting, come to Islandora Camp GTA!