Blog posts

Building Research Software Communities

Building Research Software Communities: Running a workshop on community building and sustainability for the research software community

Michelle Barker, Jeremy Cohen, Daniel Nüst, Toby Hodges, Serah Njambi Rono, Lou Woodley

On Wednesday 17th March 2021, around 50 individuals from a wide range of different countries and time zones came together for the first of two 2-hour sessions that formed our “Building Research Software Communities: How to increase engagement in your community” workshop.

Run as part of the SORSE Series of Online Research Software Events, this workshop brought together an organising team consisting of 3 members of the international research software community and a group of speakers including experts in community engagement and sustainability. In this blog post we provide an overview of the workshop and some of the key messages and outcomes.

Why run a communities workshop?

The workshop’s three organisers – Michelle Barker, Jeremy Cohen and Daniel Nüst – between them have experience of starting and running, or participating in, a range of research software communities at local, regional, national and international levels. Having observed that many research software communities face similar challenges when getting started or trying to sustain activities, they set up the workshop with the aim of helping to address these issues.

Scientific or research communities are often set up by enthusiastic individuals who are keen to help their peers, raise the profile of their field and provide opportunities for training, knowledge exchange and networking. After what is often an extremely promising start with many people engaging and lots of attendees at initial events, it’s quite common for a community to lose momentum and for numbers to reduce to a small but committed group of people. Community organisers may begin to wonder where they went wrong, what they could have done differently and why people are not participating in the same numbers. Many people in the research software community are now involved in developing or helping to run communities (such as national Research Software Engineering (RSE) organisations) or want to initiate grassroots activities, but they are often without the experience or training to do so. The aim of this workshop was to try and offer some ideas, guidance and training from a group of scientific community engagement and sustainability experts to help begin to address this.

The workshop

The workshop included a combination of lightning talks and longer sessions run by leading experts in community engagement and sustainability from the Center for Scientific Collaboration and Community Engagement (CSCCE) and The Carpentries. You can find the full agenda on the workshop webpage. It was attended by participants with varying degrees of responsibility for, or interest in, managing research software communities.

Starting with a group of 4 lightning talks to set the scene, we heard from both current and former community managers representing communities at different stages of development. This provided a great opportunity to hear about some challenges faced but also success stories. Following the introductory lightning talks we had our first collaborative session of the workshop with Daniel Nüst running a short group feedback session.

Your three biggest community challenges

In the feedback session, participants were split into breakout groups, each with their own collaborative document, and invited to discuss and note down the three biggest challenges they’ve experienced/observed as a community manager or member. The results from this session helped to guide the discussion during subsequent sessions. The session provided a wide range of interesting and helpful responses which were summarised into five core areas:

  • Engagement – Keeping community members interested and engaged; managing challenges around limited time availability and workload issues
  • Incentives – Different environments (e.g., online / in-person) provide different motivation or incentives to participate in a community; what benefits/opportunities/activities incentivise participation?
  • Expectations – Be realistic about what a community can offer or what to expect from a community
  • Communication – Keeping community members informed; reaching out to potential new members; highlighting community aims and activities, etc.
  • Participation – Will people participate? How long will they participate for? How do you maintain participation?

Describing member engagement with the CSCCE’s Community Participation Model

After a chance for the workshop participants to discuss their community challenges, Lou Woodley, Director of the CSCCE, ran a session looking at “Describing member engagement with CSCCE’s Community Participation Model”. One of the key elements of this session was the presentation of the CSCCE’s Community Participation Model, which defines four modes of member engagement that can take place within a community and one meta-mode – the champion mode discussed on day two. Community participants generally begin to engage with the community in the consume mode, taking in the materials that are made available through, for example, presentations at events and online content such as newsletters. Levels of engagement can build through contribution and scaffolded collaboration to the highest level of engagement – co-creation – where participants work within the context of the existing bounds of community activity to create something new.

Download a guidebook describing the model in full here.

Community Champions

After a quick recap of the previous day’s material, the workshop slot on day 2 kicked off with a session from the CSCCE on Community Champions. The champion mode is the fifth mode in CSCCE’s Community Participation Model and highlights member engagement by emergent leaders within a community, who take on roles to maintain, grow and evolve the community’s activities. This might look like co-chairing working groups, serving on a code of conduct committee or spreading the word about the community to recruit new members. Lou Woodley highlighted the principles behind developing community champions and the important role that they can play in supporting community sustainability and ongoing engagement – something a community manager is unlikely to be able to do alone.

Community Sustainability

The final session of the workshop was a collaborative session on community sustainability run by Toby Hodges and Serah Njambi Rono from The Carpentries. Toby and Serah highlighted the challenges in ensuring community sustainability and presented various ideas to help address them. Using a collaborative document, a number of thoughts and comments were gathered from workshop participants in response to some important questions around the topic of sustainability. You can read more details about this workshop session in Toby and Serah’s “Pondering on the Question of Community Sustainability” post on The Carpentries blog and see a video of the session on the workshop’s SORSE event page.

Useful Links and Further information

This workshop was run as part of the SORSE “Series of Online Research Software Events”. The SORSE series has now finished but you can take a look back at other SORSE events, many of which cover related topics, and see videos from many of the events via the SORSE Programme page.

Videos from parts of this communities workshop are available on the workshop’s SORSE event page and further details including the full agenda and session descriptions are available on the workshop website.

Why not join the CSCCE’s Community of Practice on Slack? It’s a great place to gain new knowledge about community development, engagement and sustainability and to share your experiences and questions.

 

The content of this blog post is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) licence. The post has also been published on the de-RSE and CSCCE blogs.

Research Software Directories

This is a summary of a SORSE discussion session, presented by:

  • Mark Woodbridge, Imperial College London
  • Vanessa Sochat, Stanford University
  • Jurriaan Spaaks, Netherlands eScience Center

And featuring contributions from:

  • Malin Sandström, INCF
  • Alexander Struck, Humboldt University of Berlin

Introduction

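The discussion session “Research Software Directories: What, Why, and How?” was held on September 16 during SORSE, an International Series of Online Research Software Events. As presenters, we each shared efforts to develop and maintain software directories: catalogues to showcase the software outputs of an institution or community. The directories presented were:

  • The Research Software Directory by the Netherlands eScience Center (research-software.nl)
  • The Research Software Directory by Imperial College London
  • The Research Software Encyclopedia
  • GitHub Search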

Each of the above has its own advantages and disadvantages, or is scoped for particular use cases. For example, research-software.nl provides a robust application for serving detailed metrics and metadata for software, but it requires more manual entry. The Research Software Encyclopedia is automated and does not require hosting, but it lacks the same level of metadata. The Imperial College London and GitHub Search research software directories are much quicker to deploy, but might be too simple for some use cases. The directories are discussed in detail in the following sections. In addition to this set, we suggest the reader take a look at the Awesome Registries list to find additional examples.

How many participants use software directories?

We were quite surprised at the results of asking attendees the extent to which they had contributed to or used software directories. Of a total of 27 participants, 43% had used a directory for a relevant project, 27% had submitted software to a directory, and 58% indicated neither of the above.

Presentations

The Research Software Directory by Netherlands eScience Center

Jurriaan’s presentation started off by explaining why the Netherlands eScience Center had a need for what eventually became the Research Software Directory. First, as the Netherlands eScience Center grew beyond, say, 20 or so engineers, tracking what software was available in-house became too difficult to do ad hoc, despite the fact that all of their repositories are publicly accessible on GitHub. Secondly, the eScience Center strives to be as open as possible, and they thought it was important to be able to show the outside world where the taxpayer’s money had gone. Lastly, the eScience Center has a continuous need to keep track of various metrics, both for reporting to their funders (SURF and NWO) and for helping management make informed business decisions.

Jurriaan then demonstrated the eScience Center’s instance of the Research Software Directory. While walking the viewers through the design, he explained how the design of the product pages helps site visitors on their way towards adopting the software presented on each page.

When designing the Research Software Directory, specific attention was paid to how an instance is filled with data, how this data is curated, and how to do this in a way that can be sustained over time. To this end, the Research Software Directory harvests much of its information automatically, for example using APIs to GitHub (code development platform), Zenodo (archiving service), and Zotero (reference manager). This setup allows engineers employed by the Netherlands eScience Center to stay mostly in their comfort zone (i.e. GitHub). They just need to make sure to follow best practices such as having publicly accessible repositories, making releases on Zenodo using the automated integration, and including software citation metadata (CFF) in their repositories. Given that they already do much of that anyway, making an entry in the Research Software Directory can be achieved in a few clicks via the Admin interface that the Research Software Directory provides.
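
As an illustration of the software citation metadata mentioned above, the following is a minimal sketch of rendering a small CITATION.cff file from Python. It is not taken from the Research Software Directory itself. The function name `render_citation_cff`, the example title and author, and the choice of `cff-version: 1.2.0` are assumptions made for this example; only the standard library is used, and string values are quoted with `json.dumps` because JSON string literals are also valid YAML double-quoted scalars.

```python
import json


def render_citation_cff(title, authors, version=None, doi=None):
    """Render a minimal CITATION.cff document as text.

    `authors` is a list of (given_names, family_names) tuples.
    The function name and argument layout are hypothetical.
    """
    quote = json.dumps  # JSON string literals are valid YAML double-quoted scalars
    lines = [
        "cff-version: 1.2.0",  # assumed schema version for this sketch
        'message: "If you use this software, please cite it using these metadata."',
        f"title: {quote(title)}",
        "authors:",
    ]
    for given, family in authors:
        lines.append(f"  - given-names: {quote(given)}")
        lines.append(f"    family-names: {quote(family)}")
    # Optional keys are only written when a value is supplied.
    if version:
        lines.append(f"version: {quote(version)}")
    if doi:
        lines.append(f"doi: {quote(doi)}")
    return "\n".join(lines) + "\n"


# Hypothetical example values, not a real software package.
print(render_citation_cff("Example Tool", [("Ada", "Lovelace")], version="1.0.0"))
```

Writing the returned text to a file named CITATION.cff in the root of a repository is the usual placement; which fields a given directory instance actually harvests is not specified here.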

The Research Software Directory has proven to be a great resource for managing the organization, for providing funders with relevant metrics, and for increasing the visibility of tools. Despite these upsides, there are some downsides as well. For example, it has proven difficult to carve out enough time to curate the prose on the product pages, leading to text snippets that are sometimes too difficult to read for visitors not yet familiar with the software that the product page presents. A second problem is maintenance of the Research Software Directory software itself: the software stack includes more than 40 techniques, methods, and tools, in various languages and using a variety of frameworks. It has proven difficult to find developers who are familiar enough with all of these to be effective at maintaining the site. While this has not led to any significant downtime in the 3 years research-software.nl has been running, the eScience Center intends to start reducing the software stack in the very near future. Furthermore, they are investigating whether it’s feasible to provide Research Software Directories as a service.

The Research Software Directory by Imperial College London

Mark Woodbridge demonstrated Imperial College’s Research Software Directory, explaining how it was developed to present a manually curated list of GitHub and GitLab repositories – motivated by a desire to rapidly catalogue and demonstrate the breadth of software developed at Imperial. It is also intended to encourage collaboration by assisting researchers to identify existing expertise and projects at Imperial.

The chosen approach has resulted in a system which is easy to maintain – both in operational complexity and in adding entries to the directory (even if the latter does depend on some familiarity with git and GitHub, i.e. making a commit and pull request). This simplicity comes at a price: it depends on Algolia (a freemium service), has limited features, and is not easy to customise. It also relies on manual curation and repository metadata: due to limited bandwidth and lack of incentives, developers rarely submit or annotate software themselves. Finally, it lacks the polish and level of detail that you might expect of a public-facing showcase.

The system has, however, achieved its aims in effectively showcasing research software and developers at Imperial, and has provided a set of metadata enabling the identification of trends ranging from preferred languages to fast-growing fields of research. A suite of standalone utility scripts ensures that the contact details and project web pages remain up-to-date, and that new repositories by known developers are added to the directory in a timely manner.
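
To give a flavour of how directory metadata can surface preferred languages, here is a minimal sketch that tallies the primary language across a list of entries. The entry layout (dictionaries with `name` and `language` keys) and the sample data are assumptions for illustration and do not reflect the actual format used by Imperial’s directory.

```python
from collections import Counter


def language_counts(entries):
    """Count how many directory entries list each primary language."""
    # Entries without a recorded language are skipped rather than counted.
    return Counter(e["language"] for e in entries if e.get("language"))


# Hypothetical sample entries.
entries = [
    {"name": "tool-a", "language": "Python"},
    {"name": "tool-b", "language": "C++"},
    {"name": "tool-c", "language": "Python"},
    {"name": "tool-d", "language": None},
]

counts = language_counts(entries)
print(counts.most_common(1))  # [('Python', 2)]
```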

The Research Software Encyclopedia

The Research Software Encyclopedia (RSEPedia) is a community-driven, open source directory that provides a means to communicate about software. It consists of three components: a set of criteria and taxonomy items used to describe or otherwise communicate about software categorization preferences, a database, and a command line client to interact with those components. The criteria and taxonomy items are maintained in their own GitHub repository, https://github.com/rseng/rseng, and render to an interface to allow for exploration and visualization. Importantly, the site for these items hosts a weekly software showcase, allowing the community to learn more about open source libraries that might be useful for their work. The terms are also served programmatically through a RESTful application programming interface (API) that makes them readily available for the RSEPedia software, which is also provided on GitHub (https://github.com/rseng/rse).

Using the software, an individual or institution can easily generate a database and interface for a set of software they care about, and can inspect, add, search, or otherwise interact with metadata. While relational databases can be created, the community-maintained database is a flat-file database hosted on GitHub (https://rseng.github.io/software) that allows an interested contributor to browse and annotate software with criteria and taxonomy items in an online interface. Annotation only takes a few clicks, and the process to make changes and update the database is fully automated via GitHub Actions. Annotation in bulk is also easy to do locally by cloning the software repository, starting the annotation interface, and opening a pull request with the changes.

Importantly, although annotation can help to share ideas about software, it is not required to make the RSEPedia useful. Because it offers a way to communicate about software by asking questions, along with the software showcase, the RSEPedia may meet your needs if you just need a way to describe what you are looking for (e.g., for a grant or journal) or just want to make your set of software easily searchable.

GitHub Search is a derivation of the Research Software Directory by Imperial College London, but it removes the Algolia dependency. It builds its list of software repositories from the GitHub API’s list of repositories for an organization and is served directly from GitHub Pages. This means that deployment is easy: it comes down to creating the repository with a GitHub Action that rebuilds the pages at some frequency.
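
The idea of deriving a directory from an organization’s repository list can be sketched as follows. This is not the GitHub Search project’s own code. It assumes the public GitHub REST endpoint `GET /orgs/{org}/repos` with `per_page` and `page` query parameters; the helper names and the choice to drop forks and private repositories are assumptions made for this example.

```python
import json
import urllib.request


def org_repos_url(org, page=1, per_page=100):
    """Build the GitHub REST API URL for one page of an organization's repositories."""
    return f"https://api.github.com/orgs/{org}/repos?per_page={per_page}&page={page}"


def to_entries(repos):
    """Turn raw repository payloads into simple directory entries.

    Forks and private repositories are dropped (an assumption for this sketch),
    and the result is sorted by name, ignoring case.
    """
    entries = [
        {
            "name": repo["name"],
            "url": repo["html_url"],
            "description": repo.get("description") or "",
        }
        for repo in repos
        if not repo.get("fork") and not repo.get("private")
    ]
    return sorted(entries, key=lambda entry: entry["name"].lower())


def fetch_org_entries(org):
    """Page through the organization's repositories (needs network access)."""
    repos, page = [], 1
    while True:
        with urllib.request.urlopen(org_repos_url(org, page), timeout=30) as response:
            batch = json.load(response)
        if not batch:  # an empty page means there is nothing left to fetch
            break
        repos.extend(batch)
        page += 1
    return to_entries(repos)
```

A scheduled job could call `fetch_org_entries` with an organization name and write the result to a JSON file that a static page then renders. Unauthenticated requests to the GitHub API are rate limited, so a real deployment would probably pass a token.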

Discussions

After the presentations, attendees were divided into three groups for a 20-minute discussion session. All three groups saw lively discussions covering a plethora of relevant subjects, a selection of which is included below.

How do software directories interact with high performance computing (HPC)?

Several attendees work as administrators for HPC systems, so the question of the relationship between software directories and HPC centers quickly came up. Indeed, these centers typically maintain a large catalog of software for a user base, and it could be beneficial to link this software catalog, or the strategy used to maintain it, with a software directory. For example, if you are familiar with Spack or EasyBuild, you could imagine an integration that uses a software directory to look up metadata or generate user-friendly documentation pages. The pages might have install instructions, examples, and optimization hints for different architectures.

Guix-HPC is an effort built around the GNU Guix package manager to enable reproducible HPC environments. It may interact with existing instances of Research Software Directories.

Curation policies

The main concern related to the “curation” of software directories was the criteria for inclusion. A lively discussion addressed the definition of “research software”, particularly in relation to scale and licensing. In the broadest sense there was agreement in principle that it could refer to any tool or library used to produce scientific results.

In terms of scale, attendees working in life sciences research emphasized that research software in their context could be a standalone script, and software directories should therefore “scale down” appropriately. Scripts of this type may be less substantial but their quality could well be assessed similarly to more prototypical projects in terms of documentation, design for re-use and version control.

Licensing was a more challenging topic – an argument was made for directories enabling users to find any tool that might accelerate research, including commercial software – as long as an appropriate licence was available.

In broader terms, there was consensus that curators should avoid making assumptions about software applicability and relevance, even if they do have domain knowledge. More important than strict policies is effective annotation and filters so that users can apply their own criteria when searching for relevant software.

Searching for software

Searching for software presents its own challenges, as a Research Software Directory (RSD) only presents local results and many other platforms would need to be consulted for an exhaustive overview of relevant packages. Here, some registry lists prove to be helpful, for example Awesome Research Software Registries.

The purpose and minimum features of Research Software Directories

Participants identified discoverability as a major issue in relation to research software, particularly for domain specialists (i.e. end-users). This led to the following features being considered of primary importance:

  • Metadata clearly explaining the purpose and value of individual software tools in non-technical terms. The community is currently working on metadata standards like CFF or CodeMeta.
  • Contact details for the authors of the software in case further advice or support is required
  • Installation and getting started instructions
  • Guidance on how to cite the software
  • Licensing terms. This was discussed not only in relation to terms of use but also, for non-free software, ensuring cost-efficiencies by avoiding unilateral purchasing decisions and promoting the use or procurement of shared/group licences.
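
The minimum features listed above can be expressed as a small record type with a completeness check. This is only a sketch of one way to model them: the class name `DirectoryEntry`, its field names and the sample values are hypothetical and do not correspond to CFF, CodeMeta or any directory’s actual schema.

```python
from dataclasses import dataclass, fields


@dataclass
class DirectoryEntry:
    """One software entry carrying the minimum features discussed above."""
    name: str
    summary: str = ""          # purpose and value in non-technical terms
    contact: str = ""          # how to reach the authors
    getting_started: str = ""  # installation and first steps
    citation: str = ""         # how to cite the software
    licence: str = ""          # licensing terms


def missing_features(entry):
    """Return the names of fields that are still empty, in declaration order."""
    return [f.name for f in fields(entry) if not getattr(entry, f.name).strip()]


# Hypothetical, partially filled entry.
entry = DirectoryEntry(name="example-tool", summary="Aligns image stacks.", licence="MIT")
print(missing_features(entry))  # ['contact', 'getting_started', 'citation']
```

A curator could run such a check over all entries to see where annotation effort is most needed, without rejecting incomplete submissions outright.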

Many other features may benefit researchers, for example linking from an RSD entry to its accompanying paper and data, as suggested in the “Generalist Repository Comparison Chart”, or listing received software citations, as implemented in swMATH.

Organization-based registry vs community-based registry

Some registries out there are scoped to serve an organization, whereas other registries like ascl.net or bio.tools aim to serve an entire research community. An advantage of the latter is increased traffic to the registry, and real benefits for users to browse the registry to see if somebody else in the community already created a solution. However, because the social structure across the community is quite loose, it will be more difficult to keep people involved, to discover new tools that could be added to the registry, and to make sure that the language used on the registry’s pages is understandable by everyone in the community. Furthermore, governance of the instance will be more difficult. For example, within the community there may exist different opinions on what metadata should be kept, and weighing these opinions will be more difficult in a larger community than a small one.

In contrast, organizational registries are easier to run and govern: discovering tools that could be added is (or used to be) a matter of hanging out at the coffee machine and asking your colleague what they are working on right now. Helping your colleague enter their data, and making sure they do it correctly, is easier as well, and some good old-fashioned peer pressure can be applied if needed. Funding policies currently do not mandate the publication of research software in the way that Horizon 2020 required for research data (where possible).

Further resources

Recommendations and Next Steps

By discussing topics of curation, federation, technology and sustainability of research software directories with a wider audience, this discussion session hoped not only to promote the benefits of such directories and encourage their deployment, but also to identify issues and gather ideas to address them. From the discussion above, it’s clear that there are interesting projects and updates to existing directories that might be pursued.

Remote working for researchers and developers

This post was compiled by Mark Woodbridge, Jeremy Cohen and Tony Yang of Imperial College’s Research Software Community.

As COVID-19 drives us into uncharted territory, many of us at Imperial will be having our first ever experience of working off-campus for an extended period of time. It, of course, depends on our role, but many members of the College community will be no strangers to mobile working – pitching up at one of the many campus cafes, breakout spaces or a coffee shop, getting out our laptop or mobile device and switching very quickly into a state of focused work. Maybe finishing those next couple of paragraphs of a paper or report, fixing that annoying bug in our scientific code that someone just reported, or responding to an urgent technical query from a collaborator. Sometimes a change of space or environment provides just that little shift in perspective that you need to help solve that challenging technical problem, or get the right wording for that difficult section of the paper, much more quickly than if you’d sat in your office staring at your screen for hours!

Over the coming weeks, we’ll be facing a rather different reality of remote working which is likely to involve spending a significant amount of time working in one space, without the flexibility that comes from being on a large campus. While our primary concern is going to be for the health and safety of our family, friends and colleagues, many of us will also have concerns about how we’ll manage to work effectively in these difficult times. We may have worries about feeling isolated, about maintaining our research efficiency and quality, about meeting deadlines, or more generally about how things will change in our day-to-day working lives as our routines are uprooted completely!

Within Imperial’s Research Software Community, many of us are software developers (Research Software Engineers) or academics/researchers who spend a significant amount of time writing software. The software developer community has embraced remote working over recent years and there are now many examples of companies that operate an entirely remote model with individual developers distributed around the world. If you’re a developer with a laptop and a good internet connection, location is no longer a barrier. In the research community, things are a little different and while many of us will be aware of cases where individuals spend the bulk of their time working remotely, discussion, collaboration and the opportunities posed by ad hoc meetings in the common room make working in a campus environment important and beneficial. Nonetheless, one huge benefit of the wide-ranging use of remote working in the software community is the wealth of tools, advice and examples now out there that make lone, remote working much easier.

A few members of Imperial’s research software community have come together (remotely!) to provide some tips, examples and advice that we hope might be helpful if you’re working remotely. There are many similar articles online but here we’ve tried to provide some thoughts and examples from our own experiences and we hope that these will be particularly relevant to members of the College and its research community. We’ve marked resources only accessible to Imperial members with an asterisk.

1) Communicating with colleagues

Even if you don’t consider yourself to be the most outgoing person, you shouldn’t underestimate the importance of communication with colleagues or collaborators when you’re working alone. If we’re on campus most of the time, we probably have many informal chats with others in our office, people we bump into in the corridor or coffee room, etc.

Consider scheduling at least one 30-minute catch-up with one or two colleagues each day. It doesn’t need to be time wasted through unstructured chat, although even this sort of meeting can be really valuable in helping you to feel connected and ultimately helping to improve your wellbeing. The College recommends and supports the use of Teams but many other solutions are available.

Some other suggestions:

  • Deliberately check in with others and ask how they are – especially if you know they are isolated.
  • Video calling, however uncomfortable to start with, can go some way towards replicating the interactions we’re used to in the campus environment.
  • Celebrate and share achievements, however large or small – from bug-fixes to new releases of your code!
  • Try remote pair programming or debugging: e.g. Live Share
  • Take part in an online community. Our local Research Software Community is on Slack*. There is a new Remote Working Wellbeing* group on Yammer. Outside Imperial many Meetup groups and other events (e.g. CW20) are now going online.
  • Reach out to others: whether housemates, or your local community via Facebook or other virtual groups. Consider volunteering where it is appropriate to do so.
  • Contribute to an open source project. Open source projects (such as The Turing Way) tend to have an established and inviting online community. If it is complementary to your work, and you have the capacity to do so, then making a contribution – even fixing a typo – can be a very fulfilling experience and introduce you to a broader community.

2) Maximising focus

Some of us will be used to working from home for at least one day a week – perhaps in an environment that enables us to concentrate at least as well as in the office. But many of us won’t have anything resembling a home-office (or even a desk!) and may have caring or other responsibilities that are difficult to combine with sustained focused work. Generic advice is therefore almost impossible to provide, but here are some ideas:

  • When working in isolation without scheduled meetings or other engagements it can be easy to confuse time spent working with actual productive hours. Try setting alarms or using a timer for focused periods.
  • Messaging apps are great for keeping in touch but can also provide a stream of interruptions. Decide when you’ll be online and offline and set/indicate your status appropriately. And conversely, be mindful about how and when you contact others.
  • If you’re able to control your hours and environment then take advantage: work when you’re most productive, listen to music for programming, ambient sounds… or simply concentrate in potentially unfamiliar (but welcome!) peace and quiet.
  • Delineate your working day (and your workspace) – consciously decide when you’re working and when you’re not, and somehow communicate this to those around you.
  • You may need to be especially creative if you do have caring responsibilities. Don’t be afraid to adopt a working pattern or shifts different to those of your colleagues – as long as everyone is aware and can continue to communicate effectively. You may find these Parent Scheme resources helpful.
  • Take breaks, rehydrate, try to eat healthily (especially considering the reduced physical activity you may be getting), and try to get some fresh air. Imperial has a virtual running club if you’re looking for some motivation!
  • Take advantage of the time saved by not commuting: perhaps by taking up a new hobby – ideally something that exercises a different part of your brain! Consider trying meditation: the College Chaplaincy is offering remote sessions and there is also a Mindfulness group* on Yammer.

3) Working comfortably

Without a home office and the availability of the usual alternatives such as libraries, shared workspaces or even coffee shops it can be difficult to find a comfortable place to work for prolonged periods. Ideally find more than one place where you can work and then alternate – even if one is the sofa! Experiment: improvise a standing desk (maybe putting your stockpile to good use…). Take breaks to relieve any tension and give your body a break by stretching or trying some beginner’s yoga. However, if you feel that your health and/or productivity is affected then don’t hesitate to talk to your supervisor or to Occupational Health, who have published some tips for remote working.

A new way of working

We hope that some of these ideas can ease the transition that will undoubtedly be challenging for some of us. But it’s also an opportunity to reassess how we work and how it fits around the rest of our lives. So try to establish clear boundaries between work and relaxation time and spaces, make yourself comfortable, and connect with colleagues, friends and family where you can. Also remember to take enough time off and do not work for prolonged periods without breaks in order to avoid burnout. Transitioning into remote working is a process and an initial reduction in productivity is to be expected. Aim to develop a routine, but in the meantime be patient and experiment. Don’t worry, you will soon learn how you work most productively, and hopefully pick up some good habits for the longer term! But if you do struggle then be sure to communicate, take advantage of the many resources out there that can provide help, and ask for advice and assistance if necessary.

Keep safe and we wish you lots of productive (remote) coding, paper writing or research!

Further reading

Did we miss any useful resources? Join the discussion on Slack* or let us know @ImperialRSE!

* As a post targeted primarily at members of the Imperial College London community, this article includes some links that will be accessible only to members of the College. These are marked with an asterisk. Nonetheless, we have included many publicly accessible links and if you are not a member of the College community, we hope you’ve found the content interesting and helpful.

 

The content of this blog post is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) licence.

Running Jupyter notebooks on Imperial College’s compute cluster

We were really glad to see James Howard (NHLI, Faculty of Medicine) announcing on Twitter that he’d published a Kaggle kernel to accompany his recent publication on MR image analysis for cardiac pacemaker identification using neural networks via PyTorch and torchvision. Sharing code in this way is a great way to promote open research, enable reproducibility and encourage re-use.

Figure 3 from Cardiac Rhythm Device Identification Using Neural Networks

We thought it might be helpful to explain how to run similar notebooks on Imperial’s cluster compute service, given that it can provide some benefits while you’re developing code:

  • Your code and data remain securely on-premise, thanks to the RCS Jupyter Service and Research Data Store
  • You can run parallel interactive and non-interactive jobs that span several days, across multiple GPUs

With James’ permission we’ve lightly modified his notebook and published it in an exemplar repository alongside some instructions to run it on the compute cluster. We hope this can help others to use a combination of Conda, Jupyter and PBS in order to conduct GPU-accelerated machine learning on infrastructure managed by the College’s Research Computing Service – without incurring any cost at the point of use.

Many thanks to James Howard for sharing his notebook and reviewing our instructions.

RSLondonSouthEast 2020

RSLondonSouthEast 2020, the annual gathering for Research Software Engineers based in or around London, took place on the 6th February at the Royal Society. The College was strongly represented by contributions from RSEs based at Imperial.

Full talks:

Lightning talks:

Posters:

Jeremy Cohen introduces RSLondonSouthEast 2020 at the Royal Society

Jeremy Cohen (Department of Computing) was the chair of the organising committee. Stefano Galvan (Department of Mechanical Engineering), Alex Hill (Department of Infectious Disease Epidemiology) and Jazz Mack Smith (Department of Metabolism, Digestion and Reproduction) served on the programme committee.

Many thanks to all the committee members and everyone who presented, submitted proposals or attended on the day, and to EPSRC and the Society of Research Software Engineering for their support. For more information from the event check Jeremy’s full report, RESIDE’s blog post or #RSLondonSE2020 on Twitter.

Quilting with Julia, or how to combine parallelism and derived types for high performance computing

Research and quilting have a similar Zen in that both combine and build upon multiple prior works. But the workflow is difficult to reproduce in research software: how can we combine group X’s state-of-the-art ODE solver with group Z’s state-of-the-art parallel linear algebra to create Y’s new biology model when they all use different libraries and conventions? This is the problem that Julia tackles head on, thanks to its innovative type system and multiple dispatch. In “Shared Memory Parallelization of Banded Block-Banded Matrices” we describe how to combine the parallelization capabilities from one package (SharedArrays) with the specialized matrix type of another (BlockBandedMatrices.jl) – without modifying the internals of either.

This work follows on from a NumFOCUS sponsored collaboration at Imperial College between the Research Computing Service and Sheehan Olver in the Department of Mathematics.
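For readers who don’t use Julia, here is a loose analogy in Python of what “combining without modifying the internals” can look like. It is a minimal sketch, not the code from the article, and it uses neither Julia package: `BandedMatrix` and `tridiagonal` are hypothetical names invented for this illustration, and Python’s duck typing stands in for Julia’s type system and multiple dispatch. The matrix class only cares about structure (which entries are non-zero), while the caller chooses the storage that holds the numbers.

```python
from array import array


class BandedMatrix:
    """Hypothetical square matrix that stores only the entries within
    `bandwidth` of the main diagonal. The storage object is supplied by
    the caller and only needs to support len(), indexing and assignment."""

    def __init__(self, n, bandwidth, storage):
        self.n = n
        self.bandwidth = bandwidth
        self.width = 2 * bandwidth + 1  # stored entries per row
        if len(storage) != n * self.width:
            raise ValueError("storage has the wrong length")
        self.data = storage

    def _index(self, i, j):
        # Map a (row, column) pair inside the band to a flat storage index.
        return i * self.width + (j - i + self.bandwidth)

    def __getitem__(self, ij):
        i, j = ij
        if abs(i - j) > self.bandwidth:
            return 0.0  # entries outside the band are implicitly zero
        return self.data[self._index(i, j)]

    def __setitem__(self, ij, value):
        i, j = ij
        if abs(i - j) > self.bandwidth:
            raise IndexError("cannot store a value outside the band")
        self.data[self._index(i, j)] = value

    def matvec(self, x):
        """Multiply by a vector, visiting only the entries inside the band."""
        result = []
        for i in range(self.n):
            lo = max(0, i - self.bandwidth)
            hi = min(self.n, i + self.bandwidth + 1)
            result.append(sum(self[i, j] * x[j] for j in range(lo, hi)))
        return result


def tridiagonal(storage_factory, n):
    """Build the classic (-1, 2, -1) matrix on whatever storage is supplied."""
    matrix = BandedMatrix(n, 1, storage_factory(3 * n))
    for i in range(n):
        matrix[i, i] = 2.0
        if i > 0:
            matrix[i, i - 1] = -1.0
        if i < n - 1:
            matrix[i, i + 1] = -1.0
    return matrix


# The same structured type works on two unrelated storage backends.
list_backed = tridiagonal(lambda size: [0.0] * size, 4)
array_backed = tridiagonal(lambda size: array("d", [0.0] * size), 4)
```

Neither the storage classes nor `BandedMatrix` had to change to work together. Swapping in a shared-memory buffer would follow the same pattern, which is roughly the role SharedArrays plays in the Julia work, although Julia additionally selects specialised methods based on the combined types.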

A review of the RSE team’s activities in 2019

2019 has been another very busy and productive year for the RSE team in the Research Computing Service at Imperial College. Our core mission is to accelerate the research conducted at Imperial through collaborative software development, and we have now completed 24 projects since our inception 2 years ago with 75% of our first-year projects resulting in follow-on engagements. We’ve highlighted 5 of our most fruitful collaborations on our new webpages, which also provide more information about the team and the services we offer. We are about to appoint our fifth team member, reflecting the value we’ve offered to research projects (and proving that there is a career pathway for RSEs!).

In addition to our project work we’ve assisted researchers at over 40 RCS clinics this year and played a strong supporting role in Imperial’s Research Software community, from Hacktoberfest to departmental events. We’ve developed two brand new Graduate School courses in Research Software Engineering (to be delivered next term) and have helped deliver 4 Software Carpentry workshops. We’ve also played an increasingly active role in promoting the benefits of RSE (and the role itself) to relevant stakeholders in the College. This has complemented our broader engagement activities: acting as expert reviewers for JOSS submissions, contributing to numerous OSS projects, presenting at 3 international RSE conferences (deRSE19, UKRSE19 and NL-RSE19), and promoting our work via blogging, social media and attendance at several other relevant events – locally (e.g. RSLondonSouthEast 2019) and nationally (e.g. CW19, CIUK).

RSE19 conference photograph
The team (amongst many other RSEs!) at UKRSE19. Photo courtesy @RSEConUK.

We continue to develop tools and infrastructure to support RSE within the College. The nascent Research Software Directory aims to showcase the breadth of software developed at Imperial, encouraging collaboration, re-use and citation. We’re also attempting to give software a stronger position amongst research outputs through our current work on the Research References Tracking Tool (R2T2) and helping researchers submit their software to Spiral via Symplectic. Finally, we continue to share advice and guidance on how to adopt better RSE practices, such as QA and CI.

As we look forward and further develop the Research Computing Service’s RSE capacity and expertise we’d like to thank all the academics who have trusted us with their projects, and all the researchers who’ve taken the time to explain their work and have enthusiastically embraced good software engineering practices. We’re looking forward to another 12 months of strengthening RSE at Imperial!

1st Research Software Winter Seminars and Roundtable

On Thursday 12th of December the Research Computing Service joined the College’s Research Software Community in celebrating the 1st Research Software Winter Seminars and Roundtable, the final event of another great year of building research software at Imperial. The event had two goals: first, to celebrate the research software-related achievements of the RS Community during 2019, and second, to plan the activities and goals for the year that is about to start.

The seminar session featured nine exciting talks, ranging from a review of the activities of the Community during 2019 and the training opportunities in computing and data science skills, to technical talks on the use of complex analysis pipelines for RNA sequencing and the extension of open source software with custom features.

This is the full list of talks, including several relevant links:

After the talks, there was a roundtable discussion chaired by Diego Alonso, with a panel including Elsa Angelini, Jeremy Cohen, Phoebe Pearce and Mark Woodbridge, to help answer some questions about what the audience would like to see from the Community next year, how we can communicate with each other better and who can get involved to make those things happen. There were many excellent contributions from the audience, who were also very engaged and eager to see the community grow and take an active role in it.

Among the activities that were discussed – and that gained volunteers to help make them a reality – were the creation of a Slack workspace as an instantaneous, bidirectional communication channel within the community (already up and running; sign up now!) and the recruitment of RSE Champions in the different communities (PhD students, postdocs, etc.) to promote Community events and bring more people aboard or to assist with the organisation of departmental events.

The event concluded with informal drinks and nibbles in the ICT Kitchen – including mulled wine! – where the enthusiastic attendees and speakers mingled together and shared experiences and plans for the future.

There are plenty of things going on and 2020 is due to see a very bright RS Community at Imperial!

NL-RSE19

On 20 November 2019 Mark Woodbridge and Jeremy Cohen represented Imperial College at NL-RSE19, the first annual conference of the Netherlands Research Software Engineer community.

NL-RSE19 poster session

Their presentation, Strength in Numbers: Growing RSE Capacity at Imperial College London (10.5281/zenodo.3548308) described the expanding groups involved in RSE at Imperial, their respective activities, and how examples of these are fostering collaboration and awareness across the College. They also took the opportunity to display a poster first shown at UKRSE19 that highlights key aspects of these initiatives. The talk and poster generated much interest and resulted in productive discussions with members of the NL-RSE community in relation to building inclusive communities, long-term support for research software, personal development opportunities for RSEs, and how best to support the broad range of research typically carried out in larger institutions.


Many thanks to the organisers (in particular Niels Drost and Ben van Werkhoven of the Netherlands eScience Center) for the opportunity to engage with the vibrant and rapidly growing RSE community in the Netherlands.

From Researcher to RSE: My Career Path

Diego Alonso Álvarez is a Senior Research Software Engineer in the Research Computing Service at Imperial College London. In this post he reflects on his career so far, from post-doctoral researcher to working as a full-time software engineer since joining Imperial’s RSE team in November 2018.

1. Setting the scene: who I am and why I am writing this

I am a research software engineer (RSE) but until just one year ago, I was a post-doctoral researcher in the Department of Physics at Imperial College London. Before I forget what being a researcher was like, I am writing down my experiences of both career paths and the pros and cons of each of them. This has been an exciting opportunity for me to reflect on my own career and why I made the decisions I made. Hopefully it will also be something interesting for others to read and, possibly, benefit from.

It is worth emphasising that this blog post is about me and how I have experienced both roles. This is not, by any means, an unbiased description of the academic and the RSE careers, nor is it an attempt to describe what being a researcher or an RSE generally is, the latter being a hot topic of discussion in the RSE community anyway. Some people will find my experiences mirroring some of their own; others will identify completely with everything; others will consider my whole story completely alien and nothing to do with their own.

Either way, let’s begin!

2. My career as a researcher

2.1. The context

It took me a while to realise I was a researcher. Indeed, I do not think I thought of myself as a researcher until after finishing my PhD and starting my first postdoc in Edinburgh, back in 2012, probably because I had not experienced the whole “research world” until then.

However, I certainly was a researcher before that. For 6 years, since 2006, I carried out innovative research in the field of quantum semiconductor nanostructures for novel infrared photodetectors and solar cells. I am not sure if my PhD supervisors were very permissive or if I was very independent, but in any case, I generally worked a lot on my own and did things my way, normally quite successfully.

Going to Edinburgh immediately after finishing my PhD was an intermediate step. As with the PhD, I was pretty independent there and could work any way I wanted, whenever I wanted, as long as I produced scientifically sound results. But others were not so independent. I could see around me (within the same group and in others) much more demanding constraints and bitter discussions on who should author what and in what order, on how many hours someone had been using some equipment, etc. I never experienced any of that myself.

By the time I went to Edinburgh, I had already submitted a Royal Society Newton International Fellowship application (failed!) and a Marie Curie Fellowship application (success!) to come to Imperial College London. This was my first crash course on research: impossibly long applications; incredibly long waiting times; zero or very limited feedback if not successful. In any case, I was successful in the end, so here I came. I worked happily as part of the Quantum Photovoltaics group, first as a Marie Curie Fellow, then as a research fellow associated with a European project and finally as a plain postdoctoral researcher associated with an EPSRC project. Not exactly going up the academic ladder.

During these years, I followed the “book of the researcher”: I enjoyed facing the challenges of creating new experiments and shedding some light on novel, potentially groundbreaking data, I published a few tens of papers, collaborated with many institutions and travelled worldwide presenting my work at the top conferences in the field of solar energy. I also did some undergraduate teaching, student supervision, lab management, a lot of coding – both for research and for outreach – and wrote applications for fellowships and lectureships. Often, all of it at the same time, multitasking.

Cutting a long story short, I was not successful with any of the lectureship applications, nor with the fellowships, so my career was not really going anywhere. All of them were very time consuming to prepare – the last fellowship took me a whole year; all of them took a long time to be resolved – in one case I had to write to find out what was going on; and in none did I receive any feedback beyond “it was very competitive.” Well, I already knew that. What I wanted to know was where I was weaker, to further develop that area and have better chances the next time.

The bitterness in all of this comes not so much from failing as from the complete absence of any gain from those failures. There was no learning experience. They were very time consuming just to reach dead ends. And the same applies to rejected papers or collaborations that end up going nowhere.

2.2. Pros and cons

So, after this long digression to put my opinions into context, I have come up with the following list of pros and cons of life in research. They are not in any particular order, but it should be pretty obvious by now to which I give more weight.

2.2.1. Pros

– Freedom of working any time of the day and day of the week: Results matter more than hours worked

For me, this is one of the biggest benefits of life as a researcher, but also a double-edged sword. It requires you to be honest about what to do and by when, and then do it. And also, for your supervisor or line manager to demand and value those results appropriately. Otherwise, no one will do anything. Or you will need to work many more hours in order to get the work done.

– Work for your own benefit and reputation

This is a bit vague, as it could be “for the benefit of humankind”, but I think there is a bit of selfishness and desire to be recognised in any researcher.

– Limited supervision and/or accountability

Clearly dependent on who your supervisor is, but in my experience, I rarely had to give any explanation of what I had been doing beyond the outcomes (aka papers, conferences, etc.) we had agreed and that were expected.

– Very clear career progression path

PhD → Postdoc → Research Fellow → Lecturer → Reader → Professor. Some steps are slightly different depending on the institution, but roughly speaking, the same anywhere, and with more or less clear responsibilities and benefits.

– A lot of opportunities to learn soft skills

Soft skills being anything that is not in your job specification, that you can use somewhere else and that, for some reason, you spend most of the time doing. It is important to note that soft skills become relevant only when you think about changing roles.

2.2.2. Cons

– Unhealthy competitiveness between researchers for publishing first, accessing or controlling a laboratory, order of authors in a paper…

I did not experience this personally, but I saw it happen to friends and it is one of the most counterproductive and damaging things for anyone’s mental health. An absolute motivational killer.

– Extreme pressure to publish and get grants

What to say about this? The vicious circle of publishing to get grants to keep publishing to do… what? Very often research misses the point completely: papers and grants are a means to an end, not the end themselves, and in doing otherwise, the result is poorer, emptier research, and a waste of resources and money.

– Very long feedback loops between doing something and having a response to it

My favourite and probably the reason I lost interest in research. I cannot emphasise enough the absolute waste of time, and the anxiety, that all of these lead times produce in a researcher:

– Grant/fellowship submission → Resolution of the call
– Submitting a paper → Having the paper published
– Publication of papers → Anyone actually benefiting from them

– Too narrow research topic resulting in limited scope for learning new things

This is hard to spot while you are inside, but the truth is that we become experts in things so absolutely specific that if we want to learn anything else slightly off track, we cannot. Two things happen: (1) you rarely have time to do it because you already have plenty of things on your plate and (2) the community of that other field will not accept you because you have not been working on that topic for ages and, therefore, are not an expert. I tried to do it, moving from solar cells to batteries and energy storage. It did not work.

– Often required to spread too thin

Affects all levels of the academic ladder. The upper steps are more related to managing too many people, too many project proposals and too many connections and potential partners; the lower steps to pursuing side research lines and activities beyond the real topics of their jobs because they cannot say “no” to whatever comes from above. Another source of stress (on top of everything else) preventing you from focusing on getting things done.

– Often requires working many hours outside normal working hours

The dark side that comes with the freedom of working hours. Things have to be done, for better or worse.

3. My career as a research software engineer

3.1. The context

The first question to answer is how I ended up being a research software engineer. Sure, I applied for a vacancy I saw somewhere, but it is interesting to describe how I found out about the vacancy in the first place, because it is a clear example of where RSEs might be coming from in many cases.

I was presenting the solar cell simulation package I had been working on, Solcore, to some potential users at Imperial’s Department of Materials. After the presentation and the discussion, one of the attendees told me that Imperial’s Research Software Engineer team could help me polish the software and solve some of its issues and limitations. I had never heard of such a team, but it sounded useful. I took note of the web page and a few days later joined the Imperial RSE community mailing list. I have to admit I never followed up that lead and ignored any communication from that mailing list. Until a few months later, when I had a look at it by chance and saw a vacancy for a research software engineer position.

Reading through the job description was quite an eye-opener. This job was not only very close to things that I had been doing, informally, as a researcher; it was about things I really enjoyed doing! Sure, there were a few technical skills I did not have – and I still do not – but overall, it seemed an amazing fit for me. And it was a permanent position. This had an enormous weight, considering my personal situation of having a few-months-old baby and having spent the last decade on relatively short (1-3 years each) fixed-term contracts. So, I applied… and got the position!

The job as an RSE could not be more different to the one as a researcher, at least from the point of view of the working environment and daily routine. Imperial’s RSE team is part of the Research Computing Service, in turn part of Information and Communication Technologies, a massive department in charge of maintaining and improving all of Imperial’s computing infrastructure. We all work in a large open plan office and the look and feel is way more professional than the – often – messy researchers’ offices. Everyone there – including us – has a pretty regular and consistent schedule, with the office mostly empty by 5 pm.

The work itself is faster, much, much faster: we have concrete goals to achieve, concrete steps to get there, concrete deliverables. It does not matter if we are talking about developing new code to support the research of a certain group, refactoring an old, hard to maintain piece of software, preparing a workshop for a conference (there are indeed great RSE conferences!), or the materials for a training event. We are paid to provide a specific service to a client under some constraints (money, time, scope) and we have to deliver, be efficient and straight to the point. This dynamism is not stressful at all. Quite the contrary: it is relaxing to have specific steps to take to go to a specific place in a specific time. Tasks are short, feedback comes fast, and reviewing performance (your own or that of the software you have been working on) is also very fluid.

Also contrary to what I would have thought before, there is plenty of scope for learning new things and being creative when applying solutions to the problems you have to face. Indeed, I have certainly learnt way more in the last year as an RSE than in the previous few years as a researcher.

Not everything glows, of course. Especially as a beginner in the field without any formal training whatsoever in computing, I sometimes struggled with concepts or tools that were taken for granted. Software design patterns could be one of them, correct use (and understanding) of git another, code debugging using proper debugging tools and not “print” statements, basic concepts of parallel computing… All of that comes with practice, of course, but when things move so fast and time is so precious, you certainly fear not living up to expectations or wasting others’ time when they have to solve your issues.

I have just become a Senior Research Software Engineer. That suggests I have done my job well – of which I am really proud! – but also points to how fast and how differently things can happen outside the academic ladder.

3.2. Pros and cons

Pros and cons have been mostly described already, but to be consistent and add a few more to each category, here is a more exhaustive list.

3.2.1. Pros

– Still enjoying the academic environment and life on campus

I still work at the University, in touch with researchers, embedded in the academic environment, the students, the food outlets… It is the comfort zone, familiar to me, and that makes things much easier.

– Fast pace, with short reporting times and feedback from clients or colleagues

As described above, this is the absolute opposite of life as a researcher and, therefore, my favourite point in favour of working as an RSE. You can feel that things happen and change in real time, that there is a real impact and specific feedback guiding you to the next steps that week, or the next, or the following month, at most.

– A new, growing community with limitless possibilities to stand out

There are many RSEs but the community itself is quite young. The professional bodies are being formed right now, the conferences are just a few editions old, the structure of the RSE career path is… fuzzy. There are plenty of things to be done and opportunities to make a difference, to be a pioneer.

– Broad field with many tools, techniques and practices to learn (and growing)

The field of information technology is huge and growing. Even if you constrain yourself to those things specifically useful for the projects or tasks you are involved in at any given time, you will not run out of options for learning.

– Very open and collaborative community with limited competitiveness

While researchers certainly collaborate with each other, there is always a sense of competition, of being the first to publish something or get new results. RSEs seem to be much more relaxed about that. They are enthusiastic about sharing their ideas and expertise in different formats and contexts. They like concepts like sustainability, transparency, open software, open research, collaborative events like hackathons, online forums… In this respect, RSEs are what researchers should have been in the first place.

– 9-to-5 job

As much as I valued the freedom of working in academia, I have come to value more the rigorous 9-to-5 job I am enjoying as an RSE, without any need to work during weekends or in the evenings, or to mull over work-related issues while commuting.

– Comes in many flavours

The job of RSEs is quite broad and you can easily focus on those aspects that are more fulfilling to you, like teaching and training, coding, HPC or community engagement, for example. Most likely, you will also have to cover some of the other aspects but, at least in my case, I certainly have scope for customising the work I would like to do.

3.2.2. Cons

– Strict criteria for which projects one can work on, with limited scope to pursue personal projects or exploratory ideas

This is one of the catches of the job. You are very involved with research and what researchers do… but you are not one of them. Even if you have brilliant software ideas that you would like to explore and put into practice – even if they fall into the remit of what an RSE would do – you cannot do them because that is not what you are paid to do. This is particularly annoying for me now that I know a million ways of improving the software I developed as a researcher and I simply cannot devote time to do that.

– Rigorous accounting of working hours and the exact activities carried out during the week

This is more an annoyance than an actual negative aspect of the job. Given that you work as a service to others, the time you spend on each task has to be carefully accounted for. Sometimes this is easy, but at other times – especially on days when you are less productive for whatever perfectly sensible reason – accounting for all your time might cause some anxiety.

– Salaries equivalent to those of academic researchers, but much lower than those of similar positions in industry

This is a general issue in academia, including for researchers: we are often paid much less than our counterparts working in the private sector. And probably there is not much to be done about it. For RSEs this difference might be more outrageous when you see the starting salaries for similar roles in companies like Google, but I think that we, in academia, have some other, maybe less tangible, benefits.

4. Summary

To conclude, I think it is clear by now that I am very happy in my role as an RSE. I did enjoy – massively – my time as a researcher; I learnt a lot of things, some useful, others not so much; it gave me the opportunity to travel all around the world, presenting my work in amazing places I would have never visited otherwise; meeting great, very clever people…

But in the end, the lack of progression in my career, the cumulative negative aspects I was putting together and – by all means – my own personal situation, made me move on and take that opportunity that popped up out of the blue. This first year as an RSE has convinced me it was the right decision.

 

To find out more about Research Software Engineering at Imperial College and opportunities to join the RSE team, visit our webpage or follow us on Twitter.