TL;DR - Citations are a terrible technology for evaluation ... equally important is the design of novel alternatives

Why do people cite papers? Eugene Garfield (1996) offers a broad list of reasons:

  • Paying homage
  • Giving credit
  • Identifying methodology
  • Background reading
  • Correcting own work
  • Correcting others’ work
  • Criticising previous work
  • Alerting re: forthcoming work
  • Providing leads
  • Authenticating data
  • Identifying the original paper
  • Arguing with others
  • Disputing the work of others
  • etc.

Many have given similar taxonomies. The Citation Typing Ontology (CiTO) by David Shotton is a fabulous recent effort. For simplicity's sake, I like Small's five basic motivations for citing a resource:

  • Refute (negate previous knowledge claim)
  • Note or acknowledge work of others
  • Review - either by comparison, or summarization
  • Apply - through method, theory, etc.
  • Support - for findings or claims being advanced

My own work in the earth sciences has shown that data citations rarely have a sentiment attached to them (Weber and Mayernik 2014) ... instead they document the use of a resource, or acknowledge the existence of a comparable resource that could have been used. In short, the 1,000 citations we reviewed fit one of these five simple categories VERY easily.
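To make that coding exercise concrete, here is a minimal sketch (in Python) of what tagging citations with Small's five functions might look like. The enum and the sample data are purely illustrative - this is not the instrument from the Weber and Mayernik study.

```python
from collections import Counter
from enum import Enum

# Small's (1982) five citation functions, as listed above.
class CitationFunction(Enum):
    REFUTE = "refute"    # negate a previous knowledge claim
    NOTE = "note"        # acknowledge the work of others
    REVIEW = "review"    # compare with, or summarize, prior work
    APPLY = "apply"      # use a method, theory, etc.
    SUPPORT = "support"  # back the findings or claims being advanced

# Hypothetical hand-coded sample: each data citation gets one tag.
coded_citations = [
    CitationFunction.NOTE,
    CitationFunction.APPLY,
    CitationFunction.NOTE,
    CitationFunction.SUPPORT,
]

# Tally how often each function shows up in the coded sample.
print(Counter(c.value for c in coded_citations))
# Counter({'note': 2, 'apply': 1, 'support': 1})
```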

All of which is to say: I am very comfortable with the idea that data (and software) citations play a documentary role - they acknowledge a resource, rather than provide a recipe for reproducibility or an extended commentary on the value / utility of that resource.

--

Here is the overall argument that I want to make - data / software citations are terrible at bestowing credit upon individuals and institutions [1]. THEY ARE OPAQUE, DIFFICULT TO INTERPRET, AND EXPENSIVE TO OBTAIN.

Mapping credit and attribution (or tenure and career advancement) to such an antiquated technology only further entrenches citation analysis as a viable form of scholarly evaluation.

For all of the success of altmetrics (and I am in no way arguing against the very smart innovations that have been made in that space), there have been relatively few genuinely new metrics.

And that may just be a matter of time ... but what we need to spend time and effort on is designing those new metrics, not simply using new data sources in the same old ways.

Jorge Hirsch did not design the H-Index for nefarious reasons - he designed it out of frustration.
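For anyone who hasn't run into it: the H-Index is the largest number h such that an author has h papers cited at least h times each. A minimal sketch of the computation (the citation counts are made up):

```python
def h_index(citation_counts: list[int]) -> int:
    """Largest h such that h papers each have at least h citations."""
    h = 0
    for rank, cites in enumerate(sorted(citation_counts, reverse=True), start=1):
        if cites >= rank:
            h = rank  # the rank-th most-cited paper still clears the bar
        else:
            break
    return h

# Five papers cited 10, 8, 5, 4, and 3 times yield h = 4:
# four papers have at least 4 citations, but not five with at least 5.
print(h_index([10, 8, 5, 4, 3]))  # -> 4
```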

Liz Wallace didn't theorize the Bechdel Test (one of the greatest metrics of contemporary culture) because she was bored - she did it to make an important point about gender representation in contemporary film.
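Part of what makes the Bechdel Test so effective as a metric is that it is a transparent, checkable rule rather than an opaque count. A toy encoding of its three criteria (the data model here is entirely hypothetical):

```python
from dataclasses import dataclass

@dataclass
class Conversation:
    between_named_women: bool  # two or more named women talking to each other
    about_a_man: bool

@dataclass
class Film:
    named_women: int
    conversations: list[Conversation]

def passes_bechdel(film: Film) -> bool:
    """1) two+ named women, 2) who talk to each other, 3) about something other than a man."""
    if film.named_women < 2:
        return False
    return any(c.between_named_women and not c.about_a_man for c in film.conversations)

film = Film(named_women=2,
            conversations=[Conversation(between_named_women=True, about_a_man=False)])
print(passes_bechdel(film))  # -> True
```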

The challenge isn't to create a culture of "data / software citation" - it is to design NOVEL alternatives.

Footnote

[1]: They aren't great for reproducibility either - but that is a different post.

Works cited

Garfield, E. (1996). When to cite. Library Quarterly, 66(4): 449-458.

Shotton, D. (2010). CiTO, the Citation Typing Ontology. Journal of Biomedical Semantics, 1(Suppl 1): S6. http://www.jbiomedsem.com/content/1/S1/S6

Small, H. G. (1982). Citation context and citation analysis. In B. Dervin & M. J. Voigt (Eds.), Progress in Communication Sciences. Norwood, NJ: Ablex.
