Tuesday, May 31, 2011
Way back at the start of the project, one of the aims we set ourselves as project partners was to be able to demonstrate increased engagement with repositories as a result of our simplified deposit model. To show increased engagement, we needed to be able to calculate the number of unique depositors (that is, who is doing the uploading), but this wasn't the simplest data to extract.
DSpace doesn't have a report with which to provide this data, and it isn't possible to extract it from the Elements database. However, we knew that when an item was uploaded, along with the metadata for the item and the file, data about who had made the submission was entered into the dc.description.provenance field. How to get at this and turn it into useful numerical data took a little time to work out.
Our IT Services systems administrator performed a search within DSpace looking for matching text strings within the dc.description.provenance field. Once he had this, he provided me with a (very much tidied up) text file which I was able to import into spreadsheet software to begin turning into numerical values. Since the data we wanted was cumulative, it didn't take long to make the appropriate calculations (especially since the number of depositors is relatively small at the moment).
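For anyone wanting to script the spreadsheet step, the cumulative count could be sketched roughly as below. This is not what we actually ran. It assumes each tidied line of the text file looks like "Submitted by Name (email) on YYYY-MM-DD..." (the exact wording of the provenance field may differ in your DSpace version), and the function name and sample lines are made up for illustration.

```python
import re

# ASSUMPTION: each provenance line looks like
# "Submitted by Name (email) on 2010-07-14T09:30:00Z ...".
# Adjust the pattern to match the text your own DSpace instance writes.
PROVENANCE = re.compile(
    r"Submitted by .*?\((?P<email>[^()\s]+@[^()\s]+)\) on (?P<month>\d{4}-\d{2})"
)


def cumulative_unique_depositors(lines):
    """Return {month: running total of distinct depositors seen so far}."""
    # Record the earliest month in which each depositor appears.
    first_seen = {}
    for line in lines:
        match = PROVENANCE.search(line)
        if match is None:
            continue  # not a submission line, skip it
        email = match.group("email").lower()
        month = match.group("month")
        if email not in first_seen or month < first_seen[email]:
            first_seen[email] = month

    # Count new depositors per month, then accumulate in date order.
    new_per_month = {}
    for month in first_seen.values():
        new_per_month[month] = new_per_month.get(month, 0) + 1

    totals = {}
    running = 0
    for month in sorted(new_per_month):
        running += new_per_month[month]
        totals[month] = running
    return totals


# Hypothetical sample lines, not real repository data.
sample = [
    "Submitted by A Person (a.person@example.org) on 2010-07-14T09:30:00Z No. of bitstreams: 1",
    "Submitted by B Person (b.person@example.org) on 2010-07-20T11:00:00Z No. of bitstreams: 1",
    "Submitted by A Person (a.person@example.org) on 2010-09-02T15:45:00Z No. of bitstreams: 2",
    "Submitted by C Person (c.person@example.org) on 2010-09-03T08:10:00Z No. of bitstreams: 1",
]
print(cumulative_unique_depositors(sample))
```

Keying on the email address rather than the name avoids counting the same person twice if their name is entered differently on separate deposits. Months in which nobody new deposits simply do not appear in the result.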
This was a messy process, with some risk of inaccuracy because the extraction was manual. It is fine as an interim measure whilst something better is investigated (hello, DSpace developers?), but I do have to wonder whether there is an easier way to get at the data. It also occurs to me that this time-consuming process might work for one-off data extractions but would be unsustainable for regular collection of data over longer periods. However, we do now have statistics on how many people are actually engaging with our repository (not to pre-empt the final report of the project, but: July 2010 = 24, April 2011 = 61), and this gives us some evidence of how well we're doing, not only in terms of new users but also in terms of sustained engagement.
With many thanks to my IT colleague, David Goddard, for handling the extraction and cleaning up of the data.
Sarah Molloy (Queen Mary)