Last active August 29, 2015 14:17
Software Sustainability Collaborations Workshop, Oxford, 25-27 March 2015


Live notes, so an incomplete, partial record of what actually happened.

Tags: collabw15

My asides in []


Top tips at start of @SoftwareSaved unconference #collabw15 - my kind of event!

— Alys Brett (@alysbrett) March 25, 2015
Andrew Hudson-Smith (CASA), Discipline Hopping - Interdisciplinary Working in a World of Big Data


"But What Do You Actually Do?" - addressing the really big questions at #collabw15

— SSI - (@SoftwareSaved) March 25, 2015
The city as a lab - taking live feeds from the world.

Deeply interdisciplinary and aimed at doing some good. Learnt to do quick soundbites, to get a policy...

The problem of copyright...


Andrew Hudson-Smith on the power of getting people to gather around physical representations of information #collabw15

— Alys Brett (@alysbrett) March 25, 2015
Learnt to ignore those who think blogging is pointless.

Fine line in creating public understanding, given how the emphasis of the work gets shifted.

Anyone in the #IoT space knows about this? Allowed to attach 'memory' (story) to @oxfamgb donations. #collabw15

— Boris Adryan (@BorisAdryan) March 25, 2015
Fun talk from @digitalurban at #collabw15 Ignore your older and not-betters, make things, break the rules.

— Mark Hayes (@mah1002) March 25, 2015
I believe that's @digitalurban's final slide. "Break down the silo culture." - true on so many levels! #collabw15

— Boris Adryan (@BorisAdryan) March 25, 2015
Lightning Talks

Titles and topics of @SoftwareSaved #collabw15 lightning talks are here, including 1 slide per presenter:

— Boris Adryan (@BorisAdryan) March 25, 2015
Automated, version controlled lecture notes. #collabw15

— Mike Croucher (@walkingrandomly) March 25, 2015
Haines: trying to teach software engineering process, then have students do some programming to show they've got it... But if you assess the programming side, they focus on the programming, not the process.

Varma: running Docker in production is hard...

Larisa: FLOSS Manuals

Computational evolutionary simulation

Marta Ribeiro: autogenerate data management plan!

Netherlands eScience Centre: cross-project pairing to facilitate learning

Promoting knowledge sharing: more than 1 engineer per project, more than 1 project per engineer. Mateusz Kuzak #collabw15

— Alys Brett (@alysbrett) March 25, 2015
David Perez-Suarez: “sharing is broken - we know how to connect the data, but not the scientists” +1 #cw15

— Lonely Joe Parker (@LonelyJoeParker) March 25, 2015
Jim Hensman - software as a catalyst for interdisciplinarity

Hot things emerging from lightning talks: Docker; IPython Notebook; community.

Rita Hendricusdottir is introducing us to Elixir UK and its bioinformatics training programs. #collabw15

— SSI - (@SoftwareSaved) March 25, 2015
A big hand for @normanmorrison’s presentation #cw15 #collabw15

— Lonely Joe Parker (@LonelyJoeParker) March 25, 2015
Collaborative Ideas session

Intro -- discussion -- writeup (which is a pitch for hackathon)

  • Alan Williams: Representing data
  • Caitlin Bentley: Open development
  • Nanlin Jin: Data mining
  • Richard Johnson: Material science - X-ray things
  • Joannis Bistinas: Climate data

Citeable relationships -- capturing relationships between users, data, software --

Name: The Gitometer

Context/domain: Version control and knowledge of software and data creation/iteration process.

Problem: We don't know the biases that exist around changes to software. These biases are important.

Solution: Additional visualisations that show...

  • Geographical and gender data from Git profiles for software
  • Percentage of user requests resolved (and who is doing the resolving)
  • Commit comments that contain useless/unhelpful phrases, flagging the code/perpetrator appropriately
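[A rough sketch of how the commit-comment flagging might work, in Python. The phrase list, the two-word minimum and the function names are my own assumptions for illustration, not something the group specified. It takes (author, message) pairs, however you get them out of Git, and counts the unhelpful ones per author.]

```python
import re

# Hypothetical patterns for messages treated as unhelpful (an assumption,
# not a list agreed at the workshop).
UNHELPFUL_PATTERNS = [
    r"^fix(es|ed)?$",
    r"^update(s|d)?$",
    r"^wip$",
    r"^stuff$",
    r"^changes?$",
    r"^\.+$",
]


def is_unhelpful(message, min_words=2):
    """Return True if a commit message looks too vague to be useful."""
    text = message.strip().lower()
    if not text:
        return True
    if any(re.match(pattern, text) for pattern in UNHELPFUL_PATTERNS):
        return True
    # Very short messages rarely explain why a change was made.
    return len(text.split()) < min_words


def flag_commits(commits):
    """Count unhelpful messages per author from (author, message) pairs."""
    counts = {}
    for author, message in commits:
        if is_unhelpful(message):
            counts[author] = counts.get(author, 0) + 1
    return counts
```

[Calling `flag_commits` on a project's history would give the per-author counts that a visualisation could then plot. Whether naming the "perpetrator" is a good idea is a separate question.]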

Discussion Session 1

What software do you use to help you collaborate, and how? What tools and commons do I use in my team to help share data and results?

Diaspora: a social network you control.

dat: sharing and versioning data

Discussion Session 1 Reporting Back

Discussion Topics

Audience Questions

Should we consider Software Engineering as a discipline within interdisciplinary research? Yes!

Can we crowd source materials for teaching Software Engineering? Yes! - but using other people's stuff is hard - helped if we create materials in small atoms - Software Carpentry a useful model - we are well below a tipping point - universities might resist this due to IP...

What would a good funding scheme for interdisciplinary research which uses software look like? Different...

What are the skills required to automate a workflow which uses different components? Best practices -- lots of thinking up front.

What are the best ways of communicating knowledge between disciplines in interdisciplinary research? What venues for presenting interdisciplinary research (journals, conferences, etc) exist? Problems of communication, trust, team units. Solutions: an opportunity to choose to be interdisciplinary. Don't just work in pairs but work through hubs.

How do you work with legacy code where implementation is tied to application? Common because of not thinking about the future; grants don't encourage avoiding the development of things that become legacy.

What tools and commons do I use in my team to help share data and results? Decide what to collaborate with, pick a set of tools, stick with them!


Panel Q&A on interdisciplinary working

Today's subject for the Q&A - "What IS Interdisciplinarity?" #collabw15

— SSI - (@SoftwareSaved) March 26, 2015
A little bit of a grasp of a lot of areas. Ground-breaking nature comes from the piecing together, not the work in the individual areas. Problems and interests are king, following them to an end wherever that takes you. There is an expectation that you have a discipline. But of course these are constructs. Some subjects develop techniques, some subjects apply them.

"Disciplines are these strange constructs that have grown up over the centuries. I'm post-disciplinary." #collabw15

— SSI - (@SoftwareSaved) March 26, 2015
Narrative of infrastructure is often around something being used rather than starting with the research problems that need supporting. Some disciplines like solutions, some like problems. This makes collaboration hard.

"Computer scientists have a very deep urge to provide a solution to things. Social scientists don't." #collabw15

— SSI - (@SoftwareSaved) March 26, 2015
Being an imposter can be empowering. Interdisciplinary folks like teaching but are often unsupported in that (are silos imposed by the teaching rather than the research?)

"There are a lot of people here who like teaching but aren't supported in this by their institutions." #collabw15

— SSI - (@SoftwareSaved)

Discussion Session 2

How does software support reproducibility of results without requiring expert domain knowledge?

What are the five most important things learnt during this discussion:

  1. Climate Science is hard

  2. What is reproducibility? (validating or repeating or trying to understand enough to reuse, for example?)

  3. What is a non-expert domain?

  4. Documentation!

  5. Software lessens the burden

What would an open access mandate for software look like? Yes! Though despite the pace of change re OA, software has lagged behind...

What are the best ways of communicating technical requirements from researchers to developers? Are there examples of good common vocabularies that have been created to aid interdisciplinary projects? Researchers' requirements may not be the best solution for the problem.

What are the management challenges in interdisciplinary research, how do we structure goals and rewards to involve the whole project? Need soft skills. Often hard to tell if a collaborator is able to get over a problem, so talking is important.

What are the best formats for sharing data and results across disciplines? Important thing is metadata.

"We should retire the idea that published papers are the fundamental output mode for research" @dorchard @SoftwareSaved #collabw15

— Tom Pollard (@tompollard) March 26, 2015
Demo Sessions

Here are my slides from the Demo Session on Paper Hackathons! #paperhack #collabw15

— Derek Groen (@whydoitweet) March 26, 2015
96 attendees. 42 talks. 48 breakouts.

Searchable twitter archive for #collabw15 /via @mhawksey

— Jez Cope (@jezcope) March 26, 2015
Main themes from #collabw15

— Torsten Reimer (@torstenreimer) March 26, 2015
Embrace imposter syndrome.

The paper is dead, long live the research object!... Or we are not at a tipping point yet but something is up in this space.

Reproducibility is hard.

All the tweets from day 2 of the Collaborations Workshop 2015 #CollabW15:

— SSI - (@SoftwareSaved) March 26, 2015
Hackathon Intro

Useful File Name Maker


Be clear about the problem you aim to solve and the way in which the solution will be realised

  • The Problem. The easiest way to describe the relationship between data is using clear, semantic filenames and directory structures. But people are busy and/or lazy. I am busy and/or lazy. This means that the relationship between the data, data derived from that data, and the decisions made during deriving are not always captured (which is a problem for both the person generating the data and people who want to reuse that data).
  • The Solution. An automated file name generator that takes in an original name and a description of how the derived file you intend to create relates to the original file, mines that description to suggest an appropriate human-readable file name, and puts everything in a readme.

Be clear about the skills you think the group needs

  • No idea. But more than just me

Be clear about the benefit and impact of the idea to attract people to join your group!

  • Benefit. It flags the problem of provenance between data. I envisage this as an educational tool more than a useful everyday tool (given that it would intervene in or slow down an automated process). That said, David De Roure wants it.
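[To make the pitch concrete, a minimal Python sketch of the idea. It assumes that "mining" the description just means dropping common words and keeping the first few that remain. The stop-word list, the function names and the readme line format are all hypothetical, and nothing here was built at the workshop.]

```python
import re
from pathlib import Path

# Hypothetical stop-word list; a real tool would want something richer.
STOP_WORDS = {
    "a", "an", "the", "of", "to", "from", "and", "in",
    "with", "by", "for", "was", "were", "is",
}


def suggest_name(original, description, max_words=4):
    """Suggest a human-readable name for a file derived from `original`."""
    path = Path(original)
    words = re.findall(r"[a-z0-9]+", description.lower())
    keywords = [w for w in words if w not in STOP_WORDS][:max_words]
    # Fall back to a generic label if the description yields no keywords.
    suffix = "-".join(keywords) if keywords else "derived"
    return f"{path.stem}_{suffix}{path.suffix}"


def readme_entry(original, description):
    """Build one readme line recording how the derived file was made."""
    new_name = suggest_name(original, description)
    return f"{new_name}: derived from {original}. {description.strip()}"
```

[For example, `suggest_name("survey.csv", "Removed the rows with missing ages")` gives `survey_removed-rows-missing-ages.csv`. Appending the `readme_entry` line to a readme file is left to the caller.]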

The Other Pitches

Bank of England Visualisation challenge -- W3C provenance model


Project Website

Find distractions (both silly and modelled on life). Find microgames. Find memory games.

Moving between git branches

From master git directory:

git checkout -b gh-pages origin/gh-pages [makes branch visible]

git checkout master [moves back to the master from gh-pages]

git checkout gh-pages [moves to the gh-pages bit]

[Annoying person gif]

300 x 300

Hackday Pitches

Robin/Janneke/Raquel 'Recipy'

Oliver/Alison/Mikhail 'Twacademia'

Aly/Jo/Rob/Phil 'Data on Acid'

Neil/Alexandro 'Showing impact of software'

Bruno/Boris 'Bioinformatics workflows in Node-RED'

Gerard 'Scientific Wrappers'

Graham 'Exploring W3C provenance model with Annalist'

Alexander, Saral, Sarah 'Docker-based GAP distribution'

Peter 'Citable IPython notebooks running in Docker containers for reproducible computation research' Using Docker, Jupyter, IPython Notebook

Some admin...

Creative Commons Licence
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Exceptions: embeds to and from external sources, and direct quotations from speakers

