Blog posts

International Digital Curation Conference 2015

Monday 9 – Thursday 12 February 2015 saw data management and curation professionals and researchers descend on London for the 10th annual International Digital Curation Conference (IDCC), held at 30 Euston Square. Although IDCC is focussed on “digital curation”, in recent years it has become the main annual conference for the wider research data management community.

This year’s conference theme was “Ten years back, ten years forward: achievements, lessons and the future for digital curation”.

Day 1: Keynotes, demos and panel sessions

Tony Hey opened the conference with an overview of the past 10 years of e-Science activities in the UK, highlighting the many successes along with the lack of recent progress in some areas. Part of the problem is that the funding for data curation tends to be very local, while the value of the shared data is global, leading to a “tragedy of the commons” situation: people want to use others’ data but aren’t willing to invest in sharing their own. He also had some very positive messages for the future, including how a number of disciplines are evolving to include data scientists as an integral part of the research process:

How we worked/how we work

Next up was a panel session comparing international perspectives from the UK (Mark Thorley, NERC), Australia (Clare McLaughlin, Australian Embassy and Mission to the EU) and Finland (Riita Maijala, Department for HE and Science Policy, Finland). It was interesting to compare the situation in the UK, which is patchy at best, with Australia, which has had a lot of government funding in recent years to invest in research data infrastructure for institutions and the Australian National Data Service. This funding has resulted in excellent support for research data within institutions, fully integrated at a national level for discovery. The panel noted that we’re currently moving from a culture of compliance (with funder/publisher/institutional policies) to one of appreciating the value of sharing data. There was also some discussion about the role of libraries, with the suggestion that it might be time for academic librarians to go back to an earlier role which is more directly involved in the research process.

After lunch was a session of parallel demos. On the data archiving front, Arkivum’s Matthew Addis demonstrated their integration with ePrints (similar workflows for DSpace and others are in the works). There was also a demo of the Islandora framework which integrates the Drupal CMS, the Fedora Core digital repository and Solr for search and discovery: this lets you build a customised repository by putting together “solution packs” for different types of content (e.g. image data, audio, etc.).

Eprints Arkivum workflow

The final session of the day was another panel session on the subject of “Why is it taking so long?”, featuring our own Torsten Reimer alongside Laurence Horton (LSE), Constanze Curdt (University of Cologne), Amy Hodge (Stanford University), Tim DiLauro (Johns Hopkins University) and Geoffrey Bilder (CrossRef), moderated by Carly Strasser (DataCite). This produced a lively debate about whether the RDM culture change really is taking a long time, or whether we are in fact making good progress. It certainly isn’t a uniform picture: different disciplines are definitely moving at different speeds. A key problem is that at the moment a lot of the investment in RDM support and infrastructure is happening on a project basis, with very few institutions making a long-term commitment to fund this work. Related to this, research councils are expecting individual research projects to include their own RDM costs in budgets, and expecting this to add up to an infrastructure across a whole institution: this was likened to funding someone to build a bike shed and expecting a national electricity grid as a side effect!

There was some hope expressed as well though. Although researchers are bad at producing metadata right now, for example, we can expect them to get better with practice. In addition, experience from CrossRef shows that it typically takes 3–4 years from delivering an infrastructure to the promised benefits starting to be delivered. In other words, “it’s a journey, not a destination”!

Day 2: research and practice papers

Day 2 of the conference proper was opened by Melissa Terras, Director of UCL Centre for Digital Humanities, with a keynote entitled “The stuff we forget: Digital Humanities, digital data, and the academic cycle”. She described a number of recent digital humanities projects at UCL, highlighting some of the digital preservation problems along the way. The main common problem is that there is usually no budget line for preservation, so any associated costs (including staff time) reduce the resources available for the project itself. In addition, the large reference datasets produced by these projects are often in excess of 1TB. This is difficult to share, and made more so by the fact that subsets of the dataset are not useful — users generally want the whole thing.

The bulk of day 2, as is traditional at IDCC, was made up of parallel sessions of research and practice papers. There were a lot of these, and all of the presentations are available on the conference website, but here are a few highlights.

Some were still a long way from implementation, such as Lukasz Bolikowski’s (University of Warsaw) “System for distributed minting and management of persistent identifiers”, based on Bitcoin-like ideas and doing away with the need for a single ultimate authority (like DataCite) for identifiers. In the same session, Bertram Ludäscher (University of Illinois Urbana-Champaign) described YesWorkflow, a tool that allows researchers to mark up their analysis scripts in such a way that the workflow can be extracted and presented graphically (e.g. for publication or documentation).

Daisy Abbot (Glasgow School of Art) presented some interesting conclusions from a survey of PhD students and supervisors:

  • 90% saw digital curation as important, though 60% of PhD holders and 80% of students report little or no expertise
  • Generally students are seen as having the most responsibility for managing their data, but supervisors assign themselves more of the responsibility than the students do
  • People are much more likely to use those close to them (friends, colleagues, supervisors) as sources of guidance, rather than publicly available information (e.g. DCC, MANTRA, etc.)

In a packed session on education:

  • Liz Lyon (University of Pittsburgh) described a project to send MLIS students into science/engineering labs to learn from the researchers (and pass on some of their own expertise);
  • Helen Tibbo (University of North Carolina) gave a potted history of digital curation education and training in the US; and
  • Cheryl Thompson (University of Illinois Urbana-Champaign) discussed their project to give MLIS students internships in data science.

To close the conference proper, Helen Hockx-Yu (Head of Web Archiving, British Library) talked about the history of web archiving at the BL and their preparation for non-print legal deposit, which came into force on 6 April 2013 through the Legal Deposit Libraries (Non-Print Works) Regulations 2013. They now have two UK web archives:

  • An open archive, which includes only those sites permitted by licenses
  • The full legal deposit web archive, which includes everything identified as a “UK” website (including .uk domain names and known British organisations), and is only accessible through the reading room of the British Library and a small number of other access points.

Workshops

Data Carpentry

Software Carpentry is a community-developed course to improve the software engineering skills and practices of self-taught programmers in the research community, with the aim of improving the quality of research software and hence the reliability and reproducibility of the results. Data Carpentry is an extension of this idea to teaching skills of reproducible data analysis.

One of the main aims of a Data Carpentry course is to move researchers away from ad hoc analysis in Excel and towards programmable tools such as R and Python to create documented, reproducible workflows. Excel is a powerful tool, but the danger when using it is that all manipulations are performed in-place and the result is often saved over the original spreadsheet. This can destroy the raw data without leaving any documentation of what was done to arrive at the processed version. Instead, using a scripting language to perform analysis enables the analysis to be done without touching the original data file while producing a repeatable transcript of the workflow. In addition, using freely available open-source tools means that the analysis can be repeated without a need for potentially expensive licenses for commercial software.
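To illustrate the difference, here is a minimal sketch (in Python with pandas, though the same idea applies equally to R) of a scripted workflow: the raw file is read but never overwritten, and the script itself documents every processing step. The file and column names are invented for the example.

```python
import pandas as pd

raw = pd.read_csv("survey_raw.csv")          # original data, never modified

cleaned = (
    raw.dropna(subset=["response"])          # drop incomplete records
       .assign(response=lambda d: d["response"].str.strip().str.lower())
)

summary = cleaned.groupby("group")["score"].mean()

cleaned.to_csv("survey_cleaned.csv", index=False)   # processed copy, kept separate
summary.to_csv("survey_summary.csv")
```

Re-running the script regenerates the processed files from the untouched raw data, which is exactly the property that an in-place Excel edit loses.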

The Data Carpentry workshop on Wednesday offered the opportunity to experience Data Carpentry from three different perspectives:

  • Workshop attendee
  • Potential host and instructor
  • Training materials contributor

We started out with a very brief taste of what a Data Carpentry workshop attendee might experience. The course would usually be run over two days, starting with some advanced techniques for doing data analysis in Excel, but in the interest of time we went straight into using the R statistical programming language. We went through the process of setting up the R environment, before moving on to accessing a dataset (based on US census data) that enables the probability of a given name being male or female to be estimated.
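The workshop exercise itself used R, but the underlying idea is simple enough to sketch in a few lines of Python: given census-style counts of names by sex, estimate how likely a given name is to belong to a man or a woman. The file and column names below are assumptions, not the actual workshop dataset.

```python
import pandas as pd

# Census-style table, assumed to have columns: name, sex ("M"/"F"), count
names = pd.read_csv("census_names.csv")

def gender_probability(first_name: str) -> pd.Series:
    """Return the proportion of records for this name, split by sex."""
    subset = names[names["name"].str.lower() == first_name.lower()]
    counts = subset.groupby("sex")["count"].sum()
    return counts / counts.sum()

print(gender_probability("Leslie"))   # an ambiguous name gives probabilities near 50/50
```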

The next section of the workshop involved a discussion of how the training was delivered, during which we came up with a list of potential improvements to the content. During the final part, we had an introduction to GitHub and the Git version control system (which are used by Software/Data Carpentry to manage community development of the learning materials), and then split up into teams to attempt to address some of our suggested improvements by editing and adding content.

I found this last part particularly helpful, as I (in common with several of the other participants) have often wanted to contribute to projects like this but have worried about whether my contribution would be useful. It was therefore very useful to have the opportunity to do so in a controlled environment with guidance from someone intimately involved with the project.

In summary, Data Carpentry and Software Carpentry both appear to be valuable resources, especially given that there is an existing network of volunteers available to deliver the training and the only cost then is the travel and subsistence expenses of the trainers. I would be very interested in working to introduce this here at Imperial.

Jisc Research Data Spring

Research Data Spring is a part of Jisc’s Research at Risk “co-design” programme, and will fund a series of innovative research data management projects led by groups based in research institutions. This funding programme is following a new pattern for Jisc, with three progressive phases. A set of projects will be selected to receive between £5,000 and £20,000 for phase 1, which will last 4 months. After this, a subset of the projects will be chosen to receive a further £5,000 – £40,000 in phase 2, which lasts 5 months. Finally, a subset of the phase 2 projects will receive an additional £5,000 – £60,000 for phase 3, lasting 6 months. You can look at a full list of ideas on the Research At Risk Ideascale site: these will be pitched to a “Dragon’s Den”-style panel at the workshop in Birmingham on 26/27 February.

The Research Data Spring workshop on Thursday 12 February was an opportunity to meet some of the idea owners and for them to give “elevator pitch” presentations to all present. There was then plenty of time for the idea owners and other interested people to mingle, discuss, give feedback and collaborate to further develop the ideas before the Birmingham workshop.

Ideas that seem particularly relevant to us at Imperial include:

End of 2014 Open Access news

Just in time before the College closes for the Christmas break I have found the time to write my overdue summary of recent developments in the world of open access and scholarly communication. Below are a few of the headlines and developments that caught my eye during the last couple of months.

Cost of Open Access

Commissioned by London Higher and SPARC Europe, Research Consulting have published Counting the Costs of Open Access. Using data provided by universities, including Imperial College, it concludes that there was a £9.2m cost to UK research organisations for achieving compliance with RCUK’s open access policy in 2013/14. The main conclusions are quoted below – the estimated costs for meeting REF open access requirements are particularly interesting given that HEFCE do not provide any funding for their policy, which is in some ways even more ambitious:

  • The time devoted to OA compliance is equivalent to 110 full-time staff members across the UK.
  • The cost of meeting the deposit requirements for a post-2014 REF is estimated at £4-5m per annum.
  • Gold OA takes 2 hours per article, at a cost of £81.
  • Green OA takes just over 45 minutes, at a cost of £33.

Pinfield, Salter and Bath published: The ‘total cost of publication’ in a hybrid open-access environment. The study analyses data from 23 UK institutions, including Imperial College, covering the period 2007 to 2014. It finds that while the mean value of APCs has been relatively stable, ‘hybrid’ subscription/OA journals were consistently more expensive than fully-OA journals. Modelling shows that APCs now constitute 10% of the total cost of ownership for publishing (excluding administrative costs).

EBSCO’s 2015 Serials Price Projection Report assumes price increases of 5-7%, not including a recommended additional 2-4% to allow for currency fluctuations.

John Ulmschneider, Librarian at the Virginia Commonwealth University, estimates that with current price increases the cost for subscription payments would “eat up the entire budget for this entire university in 20 years”. Partly in response to that, VCU has launched its own open access publishing platform.

UK Funder News

Arthritis Research UK, Breast Cancer Campaign, the British Heart Foundation (BHF), Cancer Research UK, Leukaemia & Lymphoma Research, and the Wellcome Trust have joined together to create the Charity Open Access Fund (COAF). COAF operates in essentially the same way as the WT fund it replaces.

An article summarising responses to the RCUK review of open access cites the Wellcome Trust saying that sanctions could accelerate the implementation of open access.

The Wellcome Trust published a list of journals that do not provide a compliant publishing option.

International Funder News

A new Danish open access strategy sets the goal of reaching Open Access for 80% of all publicly funded peer-reviewed articles by 2017, rising to 100% by 2022.

The Open Access policy of the Austrian FWF requires CC BY (if Gold OA) and deposit in a sustainable repository on publication. The FWF covers APCs up to a limit of €2500.

Research Information published a summary of international developments around open access: The Research Council of Norway is making funding available to cover up to 50% of OA publishing charges. The Chinese Academy of Sciences and the National Natural Science Foundation of China require deposit of papers in an OA repository within 12 months of publication. The Mexican president has signed an act to provide “Mexicans with free access to scientific and academic production, which has been partially or fully financed by public funds”.

Publishers and Open Access

In November, negotiations between Elsevier and the Dutch universities broke down following an Elsevier proposal that “totally fails to address this inevitable change [to open access]”. The universities have since reached an agreement with Springer; negotiations with Elsevier have resumed.

The launch of Science Advances, a journal of the American Association for the Advancement of Science (AAAS), prompted strong criticism of the AAAS approach to open access. Over a hundred scientists signed an open letter criticising AAAS for charging $1000 for the CC BY license as well as $1500 for papers longer than ten pages – on top of a $3000 base APC. This has been picked up by media including the New Statesman.

The Nature Publishing Group has had two major OA-related headlines. Generally well received was the announcement that NPG would switch the prestigious Nature Communications to full open access. On the other hand, the move to give limited read access to articles has been widely criticised as “beggar access” and a step back for open access: NPG allow those with a subscription to give others viewing (not printing) access to papers through proprietary software.

An open letter signed by nearly 60 open access advocates, publishers, library organisations and civil society bodies warns against model licenses governing copyright on open access articles proposed by the International Association of Scientific, Technical & Medical Publishers (STM). The letter says the STM licences “would limit the use, reuse and exploitation of research” and would “make it difficult, confusing or impossible to combine these research outputs with other public resources”. The STM licenses are seen as incompatible with Creative Commons licences.

Jisc and Wiley have negotiated a deal that provides credits for article processing charges (APCs) to universities that license Wiley journal content and have a Wiley OA account.

UKSG – Untying the knots and joining the dots – 20th November 2014

This year’s UKSG one day conference focused on how researchers are being supported in the changing scholarly communications landscape. The day brought together academics, librarians, publishers and funders to discuss how we can work together to achieve open access requirements as painlessly as possible. What follows is a summary of the event; the whole day was also filmed, so you can catch up on the talks at the UKSG website.

The day began with Ben Johnson from HEFCE who told the story of how open access came to the attention of the UK government when David Willetts was unable to access the journal articles required to write his book. From Willetts to the Finch Report to the new REF policy, universities are now being pushed into action to ensure publications are made open access and impact of research is demonstrated. HEFCE and other UK funders are making it clear that if research is to have an impact on policy people within government need access to it.

Simon Hubbard from the University of Manchester spoke next about the complicated process of making a paper open access, reporting on research to your funder and storing your research data in the appropriate place. Even for a researcher with an active interest in open access publishing, the burden of bureaucracy can be off-putting, especially when it feels like entering the same information over and over again into different systems. Finally, Simon had a few recommendations to improve the open access workflow: remove academics from the process as they only slow things down; better and more unified systems; and a simpler message from funders and publishers.

A final highlight of the morning came from Ian Carter at the University of Sussex, who spoke from the perspective of university management and strategic planning. Ian started by summarising the pressures that researchers find themselves under, from conducting “world-class” research, to providing value for money to students paying much higher fees than ever before, to compliance with varying funder policies. To achieve all of this there must be behavioural change from researchers, for example making their work more accessible through open access, and additional support from institutions to ensure these goals align with their overall strategy. Dissemination, communication and impact were identified as some of the most important aims for both researchers and institutions.

The second half of the day saw the librarian’s perspective from Martin Wolf at the University of Liverpool; he believes librarians have a better understanding of the overall picture and how different stakeholders interact. Librarians often find themselves interpreting both funders’ policies and publishers’ open access options for researchers. However, in addition to this advocacy work, librarians seem to be getting increasingly stuck in the detail – for example the minutiae of a publisher’s copyright policy – and are too risk averse when it comes to promoting open access. Comments from publishers after this session implied that early career researchers are asking very basic questions about open access, so there is still a lot of work to be done.

The last few sessions were lightning talks from providers of altmetrics tools: Digital Science, Kudos and Plum Analytics. These are just three of the many new products designed to capitalise on the impact agenda, and aim to help researchers increase and measure the impact of their publications.

Overall, the day was very useful and demonstrated the various perspectives on research and publication, including changing expectations from all stakeholders involved in the process. It’s clear that while the post-REF2014 policy has been a disruptive force, change was already beginning in the areas of open access, alternative metrics and demonstrating the impact of research.

You can find a summary of Tweets from the day here, collected by Ann Brew, our Maths and Physics librarian.

Lucy Lambe
Ann Brew
Philippa Hatch
Michael Gainsford

Open Access Button

Last night saw the launch of the Open Access Button to coincide with worldwide Open Access week. The team behind the Open Access Button aim to help researchers, students and the general public access research papers that are behind paywalls and beyond their means.

The idea came from two medical students who were frustrated at not being able to access all the research they wanted to read, and who found that the average cost to read a paywalled article was $30. Although the team has expanded to include partnerships with Cottage Labs, Jisc and more, there are still a large number of students donating their time to the project. Work began on the Button last year with a beta project that saw 5,000 people report hitting almost 10,000 paywalls or being denied access.

The new version of the Open Access Button is a plug-in for your browser that works as a button you click any time you cannot access an article due to a paywall. The system registers information about the article and your location to create a map of researchers who need access to information.

Open Access Button Paywall Map
Image credit: Open Access Button CC-BY-SA

The Open Access Button will try to find a free-to-access version of the article, for example a pre-print deposited in an institutional or subject repository. If an alternative version cannot be found, the Button will then email the author to let them know that someone wants to access their research but can’t, and suggests that the author deposit a copy in a repository.

Upon clicking the button, users are asked to enter a few sentences about why they want to read the article and what they could do if the research was available open access. The creators hope to use this information for open access advocacy, and to create stories that connect researchers, their work and readers around the world.

Keep up to date with the project on Twitter @OA_Button

1:AM London Altmetrics Conference 25-26 September 2014

Held at the Wellcome Collection in London and organised by Altmetric.com and the Wellcome Trust, this was the very first conference to focus solely on alternative metrics and their use by funders, universities and researchers.

The first day began with an introduction from seven different altmetrics providers to their products. Although similar, they each do something slightly different in how they measure their metrics and present them.

Below is a summary of the event, with a more comprehensive blog available from the organisers here.

Altmetrics, by AJ Cann (https://www.flickr.com/photos/ajc1/6795008004). Licensed CC BY-SA 2.0

How are people using altmetrics now?

During this session we heard from a range of stakeholders, including representatives from the Jisc funded project IRUS, a university-publisher collaborative project, and an academic who studies altmetrics as part of his research.

IRUS is using article level metrics to answer the question: are people using university repositories? The answer is yes, and IRUS can help repository managers to benchmark their repository contents and use. IRUS allows an institution to check the quality of its metadata, and also provides COUNTER compliant statistics that can be trusted.

Snowball Metrics is a university-driven and Elsevier-facilitated project that has produced a number of “recipes” designed to help universities use altmetrics for benchmarking. This takes metrics beyond the individual paper or researcher, and allows the university to assess a department as a whole. However, altmetrics alone are not good enough to judge scholarly quality.

Finally Mike Thelwall, based at the University of Wolverhampton, presented his research group’s findings. Mike has been investigating how altmetrics relate to citation scores and overall has found a positive but weak correlation. Twitter seems to lead to more publicity for a paper, but doesn’t necessarily lead to more citations; Mendeley’s reader count has a much stronger correlation with citations.
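As a rough illustration of the kind of analysis behind such findings, a rank correlation between an altmetric indicator and citation counts can be computed in a few lines. The numbers below are invented purely to demonstrate the method and are not data from the study.

```python
from scipy.stats import spearmanr

# Invented example data: one value per paper
tweets    = [3, 0, 12, 1, 45, 2, 7, 0, 20, 5]
mendeley  = [10, 2, 40, 5, 80, 8, 25, 3, 60, 15]
citations = [4, 1, 15, 2, 30, 3, 9, 1, 22, 6]

# Spearman's rho measures how well the two rankings agree
rho_tw, p_tw = spearmanr(tweets, citations)
rho_me, p_me = spearmanr(mendeley, citations)
print(f"Twitter vs citations:  rho={rho_tw:.2f} (p={p_tw:.3f})")
print(f"Mendeley vs citations: rho={rho_me:.2f} (p={p_me:.3f})")
```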

What’s going on in the communication of research?

This session gave us a great opportunity to hear from two active researchers on how they communicate their research to an academic audience and beyond. What was apparent was that Renée Hlozek, a postdoctoral researcher, had a lot more time to spend not only on actual research, but also on creative ways to communicate her research to a wider audience. For example, she is active on Twitter, blogs and is a current TED Senior Fellow.

As a professor, Bjorn Brembs spends more time on teaching and university administration. This means he struggles to find time to promote his research more widely, for example on social media. This is just one example of the importance of context when it comes to interpreting altmetrics: a researcher’s work could attract very different altmetric scores depending on the stage of their career.

Impact assessment in the funding sector: the role of altmetrics

This session first heard from James Wilsdon, who is chairing the steering group on the role of metrics in research assessment for HEFCE. The group called for evidence from publishers, researchers and other stakeholders and received over 150 responses. There are loud voices both for and against altmetrics, and the full response would be published on the HEFCE website in early 2015.

Representatives from three different funders then spoke, including the Wellcome Trust, Science Foundation Ireland and the Association of Medical Research Charities. All three identified the need for researchers to show evidence of engagement with a wider audience and providing greater value for money. Altmetrics have the potential to give funders a lot more information about the research they fund by highlighting attention to articles before they are cited. However, Ruth Freeman from Science Foundation Ireland warned against using altmetrics in isolation, and Adam Dinsmore from Wellcome agreed that the altmetrics “score”  is less important than the conversations happening online.

Altmetrics and publishers

The publishers who spoke identified what they saw as the two primary uses for altmetrics in publishing. First, they allow the author to track how popular their work is; second, altmetrics can help with discoverability. Both PLoS and Springer are planning to use altmetrics to create cross-journal highlights for specific subject areas, for example Neurostars from Springer.

The open access publisher PLoS was the first publisher to introduce article level metrics. Jennifer Lin explained that PLoS plan to do more to reveal the stories behind the numbers. To do this they need to advocate for improvements to article metadata, and they see ORCID as something that will help disambiguate author information.

Workshops

During the final session of the conference, we attempted to reach some final conclusions and also to think about what developments we would like to see in the future. There were three main points:

  1. The need for standardisation was identified – there are a number of different organisations that are collecting and measuring alternative metrics. Some standardisation is necessary to ensure the results are comparable and trustworthy.
  2. A lot of data is being collected, but there are a lot of improvements to be made in the interpretation and use of the data. The use of altmetrics by funders, REF, etc. should be as transparent as possible.
  3. In all cases, the use of altmetrics should include a consideration of context, and should help to build a story of impact that can be followed from the lab to publication to policy implementation.

Altmetrics at Imperial

Symplectic and Spiral both feature altmetrics from Altmetric.com, displayed as a colourful “donut”. You can see an example in Spiral here. Clicking on the icon will take you to the Altmetric page for that article, where you can explore the Tweets and blogs that have mentioned it.

What is ORCID, and what is Imperial College doing about it?

Imagine you need to track down the author of an academic paper, and all you know is that their name is “J. Smith”. If the area of research is specific enough or if J. Smith has referenced their article on their website it may not be too hard, but otherwise you might struggle. Are you looking at the Jane Smith from Computing, a J. F. Smith working on HPC or Professor James F. Smith – or are the latter two maybe the same person?

Now, you may think your name is unique enough in your area of research to suggest that others can easily find you. Unfortunately, even that does not always guarantee success. Take for instance Henry Rzepa, a chemistry professor at Imperial College London. When you search for his name on the DataCite site, you will find a “Henry S Rzepa”, “Henry S. Rzepa”, “Rzepa, Henry” and “Rzepa, Henry S.” Is it safe to assume they are all the same Henry?

Problems like these are not uncommon when trying to identify creators of academic outputs, and different languages, typos, spelling conventions etc. add to the difficulty. ORCID, the Open Researcher and Contributor ID, was designed to address this issue by making authors of research outputs easy to identify through a digital identifier – the ORCID.

ORCID logo

ORCID essentially does two things for authors. First, it gives them a unique identifier (say 0000-0002-8635-8390) which they can add to outputs to claim authorship. Secondly, ORCID provides a registry to which the outputs can link: http://orcid.org/0000-0002-8635-8390. The author owns the profile in the registry and can decide what information to make publicly available – this is a personal identifier and the owner has full control over it, not the host institution.
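One small technical aside: the final character of an ORCID iD is a checksum over the preceding 15 digits (ISO 7064 MOD 11-2, as described in ORCID’s documentation), so a malformed iD can be spotted locally. A minimal sketch, using the iD quoted above as a test case:

```python
def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for digit in base_digits:
        total = (total + int(digit)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)

orcid = "0000-0002-8635-8390"
digits = orcid.replace("-", "")
assert orcid_check_digit(digits[:15]) == digits[15]   # the example iD validates
```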

ORCID not only helps academics to be identifiable as authors of an output, it also promises to automate admin work for them. When funding bodies implement ORCID – something not only UK funders are currently looking into – it may be possible to just share the ORCID instead of generating publication lists and/or filling in forms that could be auto-populated from ORCID. ORCID is still fairly new, but there are already some practical implementations. For example, Symplectic Elements, the CRIS system used at Imperial, supports ORCID: when it comes across an article with an ORCID it can automatically add this article to a publication list, with no need for the author to claim the article manually (this feature will soon be rolled out across the College).

Despite being a relatively new initiative, ORCID has already seen considerable uptake across the globe. There are now 850,000 authors registered, and about a thousand of them have used an Imperial College email account to do so – this demonstrates that authors see ORCID as a valuable service.

In order to support its academic community, Imperial College became a member of ORCID. ORCID is a not-for-profit membership organisation, so it is not at risk of being bought by another company. The College has also joined a pilot community of UK universities working with Jisc, ARMA and ORCID to develop best practice, share approaches and increase uptake of ORCID. This post briefly outlines what we are planning to do as part of this project. It forms part of the Jisc reporting requirements, so it also deals with some of the technicalities such as the budget.

Jisc logo

Aims, Objectives and Final Output(s) of the project

The aim of the project is to increase awareness and uptake of ORCID across the scholarly community at Imperial College. The objectives are:

  • Communicate the benefits of ORCID at Imperial College.
  • Roll out an updated version of Symplectic Elements that supports ORCID.
  • Work with Symplectic to improve the ORCID implementation, in particular setting of institutional affiliation and sharing of information during the registration process.
  • Plan and implement bulk generation of identifiers as a service to staff who haven’t already registered.

The aim and objectives will be achieved within the context of the College’s Open Access project, and as part of the community of the Jisc-ARMA ORCID pilot project.

Wider Benefits to Sector & Achievements for Host Institution

Imperial College is keen to support its staff to make best use of digital technology in research practice and scholarly communication. ORCID is a solution that helps academics to claim authorship of scholarly outputs, and to be easily and uniquely identifiable as authors. It addresses the problem of the ambiguity of person names, and opens up the potential to improve sharing of research information across systems and organisations – in particular with funding bodies and publishers. This could have the potential to save all parties time and effort and to increase the quality of data relating to research outputs.

The expected benefits for the sector are as follows:

  • Documenting the experiences with bulk generation to enable others to decide whether it is the right model for them;
  • Contributing to improved ORCID support in Symplectic Elements;
  • Increasing awareness of ORCID with College partner organisations;
  • Strengthening ORCID by adding another institutional membership;
  • Contributing to ORCID’s momentum by increasing uptake in the scholarly community;
  • Sharing communications, guidelines, publicity and promotional materials;
  • Sharing experience of integration with institutional systems.

Project Team Relationships

  • Project owner: ORCID Project Board, on behalf of the Open Access Publishing group and the Provost’s Board
  • Project Director: Steven Rose (Vice Dean of the Faculty of Natural Sciences)
    Responsible for the business case, benefits realisation and responding to escalated issues
  • Project Manager (Business Delivery): Torsten Reimer (Open Access Project Manager, Research Office)
    Responsible for communicating with the Project Board, assisting with eliciting business requirements and ensuring the proposed solution meets the business need. Also responsible for rolling-out the solution and organising communications.
  • Business Advisors: Ruth Harrison (Team Leader (Education & Research Support), Library), Henry Rzepa (Professor, Chemistry) & Ian McArdle (Research Systems and Information Manager, Research Office)
    Responsible for providing the requirements and business scenarios to help define and test the solution.
  • Senior Supplier: David Ebert (Programme Manager, ICT)
  • Project Manager (Technical Delivery): Sue Flockhart (Project Manager/Analyst, ICT)
  • Developers as required

Projected Timeline

  • Engagement with pilot programme: May-January 2015
  • Initial investigation: May-July
  • Technical planning: August-September
  • Communications: September-January
  • Technical delivery (bulk creation): October (estimate)
  • Review and final report: January 2015

Budget

The College covers most of the cost of ORCID implementation from its own budget, in particular the project management and the ORCID membership fee. The Jisc project budget is used for engagement with the pilot programme, including blogging and providing a case study, and participating in relevant events (£4K), and for supporting the technical development and roll out of bulk generation via the ORCID API (£6K).

Open Access Sector News, June-July 2014

It is time for another round-up of news relating to open access and scholarly communication – here is a summary of interesting things that caught my eye during the past few weeks. I would like to highlight one of the miscellaneous items: analysing its publications, Chalmers University found that the open access articles deposited in the institutional repository have a 22% higher field normalized citation rate than the non-OA articles. So if you would like your citation rates to have a similar increase, why not deposit in Spiral, Imperial’s repository?

Policy

David Willetts has been replaced by Greg Clark as minister for universities and science. Whether this will have an impact on the government’s OA policies remains to be seen, but Willetts has been an active supporter of open access.

Harvard was one of the earliest universities to adopt an open access policy under which academics grant the university a non-exclusive licence to distribute publications – this allows deposit in the institutional repository regardless of publishers’ OA policies. Other American universities have implemented similar policies – Caltech recently made such an announcement – and we now also see universities outside the US following Harvard: KAUST has adopted a similar OA policy.

Research Fortnight published a summary of FOI requests on RCUK open access compliance. “The average rate across the 27 universities that responded was 49 per cent, just above RCUK’s target. However, at least 11 universities have not hit the target—and the real number may well be higher, given that 57 universities did not respond.” Regarding publishers actually delivering OA, Research Fortnight conclude that “on average, 8 per cent of articles that should have been made open access had not been, and that 12 per cent carried no clear indication of their open-access or publishing-licence status.” As RCUK have now made the details of their review public, we are in the process of bringing together the relevant data at Imperial College; a challenge is to identify which of the roughly 10K articles annually published by our academics fall under the RCUK policy, and also the articles where the authors have paid for open access from their own budgets or gone down the green route in an external repository without alerting the College.

Wiley Exchanges published an interview with Mark Thorley, who coordinates OA and RDM for RCUK. He is overall positive about progress and defends gold OA, but he is also concerned about fluctuating embargoes and some universities “acting in an ‘anti-Gold’ OA manner”. He commends UCL on launching a new OA university press. It is interesting to compare his comments with those of the Wellcome Trust’s Robert Kiley, who at a recent Jisc-CNI event was very critical of hybrid journals and publishers’ progress in switching to OA publishing.

Subscriptions, Gold OA and cost of scholarly publishing

A study on Evaluating big deal journal bundles finds that journal pricing is not necessarily related to size or research outputs of the subscribing universities. It also concludes that academia is receiving significantly less value from commercial publishers than from non-profits: “Among the commercial publishers in our study, Elsevier’s prices per citation are nearly 3 times those charged by the nonprofits, whereas Emerald, Sage, and Taylor & Francis have prices per citation that are roughly 10 times those of the nonprofits.” The study has been picked up by media, including the Guardian.

Notes of the RCUK International Meeting on Open Access (20th March 2014) have been made public. They contain some interesting information. For example, the Wellcome Trust has imposed sanctions in respect to its OA policy 62 times in the last twelve months (none at Imperial College), and the overall compliance rate is now 66%. Cameron Neylon estimates that gold OA is now 20-25% of the global market. The paper has a useful summary of global OA activities, by country.

Stuart Lawson suggests that the Finch Report may have missed evidence when it estimated the average price of APCs. According to his article, a comprehensive study that had not been acknowledged in the report set the average APC at about 1/3 of the price eventually published by Finch. As the current APCs from hybrid journals fall into the much higher bracket given by Finch, Lawson speculates that the Finch estimate may have become a “self-fulfilling prophecy”.

A lecturer in New Zealand has taken inspiration from Tim Gowers and has sent FOI requests about subscription payments to NZ universities.

UC Davis have set out a plan for a Mellon-funded project to investigate the institutional costs of gold open access, partnering with North American universities including Harvard. We are doing something similar, but on a smaller scale, as part of the OA project.

Other open access related news

A Swedish study shows that OA articles deposited in the repository of the Chalmers University of Technology have a 22% higher field normalized citation rate than the non-OA articles.

Taylor & Francis apologised for interfering with and delaying the publication of an article that criticised the profits of major academic publishers. The apology came after the editorial board considered resigning over what could be perceived as censorship.

An analysis of data from the Directory of Open Access Journals (DOAJ) shows that the US has the highest number of OA journals, followed by Brazil and the UK. The UK is the country with the second highest percentage of OA journals that charge a fee – about 64%, compared to Germany’s 30% and Egypt’s 87%. Egypt ranking highly is probably due to the OA publisher Hindawi being based in Cairo, whereas the high percentage of OA journals that charge authors a fee in the UK may be due to funders’ focus on paid-for OA.
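DOAJ makes its journal metadata available for download, so figures like these can in principle be reproduced with a short script. The sketch below assumes a CSV export with columns named "Country" and "APC"; the actual column names in the export may differ.

```python
import pandas as pd

journals = pd.read_csv("doaj_journals.csv")   # one row per journal (assumed export)

# Flag journals that charge an APC, then aggregate per country
journals["charges_apc"] = journals["APC"].astype(str).str.strip().str.lower() == "yes"
by_country = journals.groupby("Country")["charges_apc"].agg(total="count", charging="sum")
by_country["pct_charging"] = 100 * by_country["charging"] / by_country["total"]

print(by_country.sort_values("total", ascending=False).head(10))
```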

The Research Council of Norway is making money available to cover 50% of open access publishing costs in a new, five-year funding scheme.

“A Subversive Proposal”, a message sent by Stevan Harnad encouraging academics to make their publications freely available online, has just had its 20th anniversary. It is generally seen as one of the founding documents of the OA movement. Being asked what he would change if he were to write it today, Harnad responded: “Knowing now, in 2014, that researchers won’t do it of their own accord, I would have addressed the proposal instead to their institutions and funders.”

Open Access Sector News, April-May 2014

It is time for another round-up of news relating to open access and scholarly communication – here is a summary of interesting things that caught my eye during the past few weeks.

FOI request reveals cost of scholarly communication

In late April, the well-known mathematician Tim Gowers published the responses to an FOI request to the Russell Group universities. In the request, he asked how much each university “currently spends annually for access to Elsevier journals”. Due to the terms of the agreements between publishers and libraries, universities are not allowed to make this information public or share it with their staff. Because of those legal concerns, some universities initially declined the request (here is some context as to why), but with data from the LSE added on 31st May we do now have numbers from each university. Rumour has it that FOI requests regarding payments to other publishers are in preparation, so it may very well be that we will have more data by the end of the year. This would give us a more holistic view of the cost of scholarly communication and allow us to assess the value the academic sector gets for the money. Following the publication of the data, concerns have been raised about the overall amount that universities pay, but to me the more interesting question may be about the relative value that publishers add to the process. Tim Gowers’ blog post includes some information on how his colleagues at Cambridge view this; it may be one of the longest posts I have read, but it is worth having a look at if you are interested in these topics.

The publication of the FOI data has led to a broader discussion on subscriptions internationally. Zen Faulkes has correlated subscription payments in the UK and US to student enrolment numbers (no significant link) and income from students.

The Open Knowledge Foundation blog contains reflections on the data, including calculations resulting in the claim that switching to an open access model where all articles are paid for by the author/university would result in a 76% reduction of the overall cost of publishing. It should be noted that the calculations are based on the APC for one open access journal and some assumptions that may or may not be accurate, so one could easily arrive at a vastly different result – both higher and lower. As I have discussed in a previous blog post, it is clear though that the publication charges for OA journals are lower than those of journals that are funded from both OA charges and subscriptions (the so-called “hybrid” journals).

Establishing the overall cost of Gold OA publications is surprisingly difficult as the money comes from many sources, including individual research grants; if you want to delve further into this, have a look at a blog post by the Australian Open Access Support Group. They estimate that in 2013 Australian researchers may have spent US $9m on gold OA, as opposed to $4m across the Netherlands.

Other countries are ahead of the UK in collating subscription data. For example, German libraries spent €170m on books and €130m on subscriptions in 2011, with an average of €660k per library on subscriptions, according to Bjorn Brembs. From the Gowers data it would appear that UK universities on average pay in the region of 40% more than the German libraries. However, the UK data only includes Elsevier and the Russell Group universities, so we are talking about the largest subscription deals, and you have to factor in the exchange rate and different tax regimes – we will have to wait for data from further FOI requests to allow a more systematic comparison.

One of the concerns in the sector is so-called “double dipping”, where institutions that have already paid for subscriptions then also pay open access publication charges (APCs) for individual articles in “hybrid” journals. There is not yet a working model for how to address this, but SAGE Publications and Jisc Collections have announced that they are working together to develop one. SAGE is offering discounted APCs to subscribers, and from 2015 will globally discount subscription charges for journals with 5%+ gold OA articles. Journals below that threshold will be reviewed individually. While discounts will probably be seen as positive, global discounts effectively result in the UK using its research budget to subsidise subscriptions abroad. IOP are also launching offsetting schemes.

Further news

HEFCE have released an invitation to tender for an Economic analysis of business models for open-access monographs.

If you are interested in an overview of what UK universities are doing about OA, have a look at a series of OA case studies published by Jisc Collections.

The University of Edinburgh has released new data on its open access activities. 23% of publications listed in the university’s research information system are available as open access. The percentage of journal articles is higher – in 2013, for instance, 51% of all humanities articles were made available as OA. Since the beginning of this year, they have on average published around 50 paid-for (“gold”) Open Access articles per month with funding from the Wellcome Trust or RCUK. MIT has recently announced that since 2009 37% of papers published by their academics have been made available through their repository, a number they hope to increase significantly.

If you want some reasons why open access publishing is positive, have a look at a summary of a presentation given by Alma Swan in Bournemouth: “The case for Open Access within a university”.

Not everyone is convinced of open access though. Scholarly societies in particular are concerned about the impact OA might have on their business models. EDP Open released a report on Learned Society Attitudes towards Open Access (PDF) that summarises these attitudes. A majority of societies think OA might put some of them in financial jeopardy, and two-thirds are looking for help, especially with regard to funders’ mandates. Interestingly, about two-thirds would also like to offer gold OA publishing.

Open Access News, March-April 2014: HEFCE OA policy and Wellcome APC data

For the College’s Open Access Publishing group I put together a semi-regular digest of news and recent developments around Open Access and related topics. As this might be of interest to others, we have decided to make it available via the blog too. For more information on OA, take a look at the Open Access website of the College Library.

General News

HEFCE have released their Open Access policy. We will discuss this in more detail later, but this policy is likely to be a game changer as far as Open Access in the UK is concerned.

The Research Information Network have released a report on Monitoring Progress in the Transition to Open Access, including proposals for a framework of indicators to monitor progress towards open access. Jisc have, informally, confirmed that their OA Monitor project is likely to address at least part of this if institutions find this useful.

From April 2014 onwards, the National Institute for Health Research will expect peer-reviewed articles to be made available as Gold OA, expecting full compliance within four years.

Wellcome and NIH are withholding grant payments when OA obligations are not met (Imperial scholars have not been affected by this).

The University of Konstanz has broken off license negotiations with Elsevier and will no longer subscribe to any Elsevier content. “The publisher’s prices are too high, said university Rector Ulrich Rüdiger in a statement, and the institution ‘will no longer keep up with this aggressive pricing policy and will not support such an approach.’ […] Adding to tensions, the university hinted, was a feeling that academia is essentially paying twice for its own work. ‘Universities are in a way forced to purchase a good back in the form of expensive subscription fees – a good which is actually produced by their own scientists,’ said Petra Hätscher, a university administrator, in a statement.”

The Open Access Scholarly Publishers Association has suspended Springer’s membership because of systematic problems with the editorial process at Springer revealed by the so-called “Open Access sting”.

Jisc, RLUK, RCUK, Wellcome Trust and others published a report that examines the potential risks associated with the APC open access market (APC = Article Processing Charge for OA articles). The economic analyses undertaken provided a strong indication that the fully open access journal market is functioning well, creating pressure for journals to moderate the price of APCs. On the other hand, the current hybrid market was found to be extremely dysfunctional, with significantly higher charges and low levels of uptake. Indeed, the average APC in a hybrid journal was found to be almost twice that for a born-digital fully open access journal ($2,727 compared to $1,418). The authors suggest different approaches, including only paying APCs to hybrid journals that offer reductions on subscription payments, or setting caps on APCs in relation to the quality and range of services offered by the journal.

Wellcome Trust releases data on Article Processing Charges for Open Access

The Wellcome Trust released the full data on the APC spend 2012-13. A community effort led to that data being cleaned up (Google doc spreadsheet) and analysed within a few days. The analysis revealed that the average APC paid by Wellcome is £1,820.
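The cleaned-up spreadsheet lends itself to quick analysis; the sketch below shows the sort of summary involved (average APC, and spend per publisher). The column names "Publisher" and "APC paid" are assumptions – the shared spreadsheet’s actual headings may differ.

```python
import pandas as pd

apcs = pd.read_csv("wellcome_apc_2012_13.csv")   # cleaned community spreadsheet (assumed)

print("Average APC: £{:.0f}".format(apcs["APC paid"].mean()))

by_publisher = (apcs.groupby("Publisher")["APC paid"]
                    .agg(["count", "sum", "mean"])
                    .sort_values("sum", ascending=False))
print(by_publisher.head(10))   # publishers receiving the most APC money
```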

In her analysis, Michelle Brook from the Open Knowledge Foundation highlighted that most of the money goes to hybrid journals:

In Oct 2012 – Sept 2013, academics spent £3.88 million to publish articles in journals with immediate online access – of which £3.17 million (82 % of costs, 74 % of papers) was paying for publications that Universities would then be charged again for. For perspective, this is a figure slightly larger than the Wellcome Trust paid in 2012/2013 on their Society & Ethics portfolio. Only £0.70 million of the charity’s £3.88m didn’t have any form of double charging (ie, was published in a “Pure Open Access” journal) – with this total being dominated by articles published in PLOS and BioMed Central journals (68 % of total ‘pure’ hybrid journal costs, 80 % of paper total).

Ernesto Priego is concerned that high APCs may effectively just shift the serials crisis from the library to the research budget, and that arts and humanities researchers in particular might be priced out of publishing. He has created a visualisation of the lowest and highest APCs charged by 11 publishers (image licensed CC BY SA 3.0):

Lowest and highest APCs levied by 11 major publishers, by Ernesto Priego

Analysis of the Wellcome data has also identified issues with the licensing information on publishers’ websites:

  • Michelle Brook has shown that Wiley-Blackwell wrongly claim that CC BY licenses do not allow others to re-use the article commercially.
  • Peter Murray-Rust has identified several cases where Elsevier has put OA content behind paywalls, charged for the full text or mislabelled the license. This has been picked up by Times Higher and Elsevier have admitted that they mischarged 50 people for use of OA content; they are refunding money.

Building on the community effort, Wellcome have released a statement on the APC data. They thanked the community and criticised publishers for not delivering the quality of service expected. It is worth quoting this in more detail:

Inevitably, with a dataset of over 2000 articles, published by 94 different publishers, problems have been identified. These include:

  • Content remaining hidden behind a publisher pay-wall;
  • Content freely available on the publisher site, but not available in PMC/Europe PubMed Central;
  • Missing, incorrect, or contradictory licence information;
  • CC-BY licensed articles still linked to sites such as the Copyright Clearance Centre, where readers may be charged for re-using open content.

In summary we contacted 20 publishers in relation to 150 articles (approximately 7% of the total number of articles for which an APC had been paid).

We expect every publisher who levies an open access fee to provide a first class service to our researchers and their institutions. […] Even though there are only a small number of articles that the Wellcome Trust has paid to be open access that have remained behind a pay-wall, this is not an acceptable situation in any instance.

The bigger issue concerns the high cost of hybrid open access publishing, which we have found to be nearly twice that of born-digital fully open access journals. We need to find ways of balancing this by working with others to encourage the development of a transparent, competitive and reasonably priced APC market.