vanatteveldt/example.jats.xml

## example.jats.xml
<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.2 20190208//EN" "JATS-archivearticle1.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" dtd-version="1.2" article-type="other">
<front>
<journal-meta>
<journal-id/>
<journal-title-group>
<journal-title>Computational Communication Research</journal-title>
</journal-title-group>
<issn publication-format="electronic">2665-9085</issn>
<publisher>
<publisher-name>Amsterdam University Press</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.5117/CCR2019.1.001.VANA</article-id>
<title-group>
<article-title>A Roadmap for Computational Communication
Research</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<string-name>Wouter van Atteveldt</string-name>
<aff id="aff-1">
<institution-wrap>
<institution>Vrije Universiteit Amsterdam, LJS
Nieuwsmonitor</institution>
</institution-wrap>
</aff>
</contrib>
<contrib contrib-type="author">
<string-name>Drew Margolin</string-name>
<aff id="aff-2">
<institution-wrap>
<institution>Cornell University</institution>
</institution-wrap>
</aff>
</contrib>
<contrib contrib-type="author">
<string-name>Cuihua Shen</string-name>
<aff id="aff-3">
<institution-wrap>
<institution>UC Davis</institution>
</institution-wrap>
</aff>
</contrib>
<contrib contrib-type="author">
<string-name>Damian Trilling</string-name>
<aff id="aff-4">
<institution-wrap>
<institution>University of Amsterdam</institution>
</institution-wrap>
</aff>
</contrib>
<contrib contrib-type="author">
<string-name>René Weber</string-name>
<aff id="aff-5">
<institution-wrap>
<institution>UC Santa Barbara</institution>
</institution-wrap>
</aff>
</contrib>
</contrib-group>
<pub-date date-type="pub" publication-format="electronic">
<year>2019</year>
</pub-date>
<permissions>
<copyright-statement>© The author(s)</copyright-statement>
<copyright-year>2019</copyright-year>
<copyright-holder>The author(s)</copyright-holder>
<license>
<ali:license_ref xmlns:ali="http://www.niso.org/schemas/ali/1.0/">https://creativecommons.org/licenses/by/4.0/</ali:license_ref>
<license-p>CC-BY 4.0</license-p>
</license>
</permissions>
<abstract><title>Abstract</title><p>
Computational Communication Research (CCR) is a new open access journal dedicated
to publishing high quality computational research in communication
science. This editorial introduction describes the role that we envision
for the journal. First, we explain what computational communication
science is and why a new journal is needed for this subfield. Then, we
elaborate on the type of research this journal seeks to publish, and
stress the need for transparent and reproducible science. The relation
between theoretical development and computational analysis is discussed,
and we argue for the value of null-findings and risky research in
additive science. Subsequently, the (experimental) two-phase review
process is described. In this process, after the first double-blind
review phase, an editor can signal that they intend to publish the
article conditional on satisfactory revisions. This starts the second
review phase, in which authors and reviewers are no longer required to
be anonymous and the authors are encouraged to publish a preprint of
their article, which will be linked as a working paper from the journal.
Finally, we introduce the four articles that, together with this
Introduction, form the inaugural issue.
</p></abstract>
<kwd-group kwd-group-type="author">
<kwd>computational communication science</kwd>
<kwd>computational social science</kwd>
<kwd>open science</kwd>
<kwd>research transparency</kwd>
</kwd-group>
</article-meta>
</front>
<body>
<sec id="what-is-computational-communication-science">
  <title>What is Computational Communication Science?</title>
  <p>An increasing part of our daily life is organized and experienced
  online, from connecting with friends and reading news to shopping,
  entertainment, and even dating. Most of these online actions leave
  ‘digital traces’ that offer unprecedented opportunities for scholars
  to explore, theorize, and test hypotheses about the way humans think,
  behave, and interact. In addition, human artifacts and knowledge such
  as scholarly and non-scholarly articles, records of historical events,
  song lyrics, stories, etc., that provide rich information on the
  context of human behavior, are increasingly available in digital form.
  Most of these online ‘digital traces’ are communicative in nature.
  Therefore, communication science, perhaps more than any other social
  science, is in a promising position to leverage these rich data
  sources to form a better understanding of human communication and
  behaviour.</p>
  <p>Computational Communication Science (CCS) is the label applied to
  the emerging subfield that investigates the use of computational
  algorithms to gather and analyse big and often semi- or unstructured
  data sets to develop and test communication science theories. In
  recent years, scholarly interest in this subfield increased
  dramatically, as evidenced, for instance, by the strong growth of the
  Computational Methods Division within the International Communication
  Association (ICA), the largest international representation of
  communication scholars. This interest is also reflected in the new open
  access journal Computational Communication Research, in which this
  article is published, and in the many recent and upcoming special issues
  on computational communication science and related topics.</p>
  <p>Method and theory development are necessarily synergistic. New
  methods, from the telescope to DNA sequencing, have often been
  instrumental to scientific progress by changing our perception of
  reality and allowing new questions to be asked. New methodologies and
  analytical approaches can lead to new findings which in turn can be
  used to formulate or refine theories. At the same time, theories
  suggest research questions that inspire the development of new
  methodologies. Neither methodological nor theoretical development is
  superior in science. With its unique set of strengths and weaknesses,
  CCS is in a position to complement the traditional methodological
  toolkit and enhance the paradigm of method-theory synergy in
  communication science. For instance, going from self-reports in lab
  settings to modeling actual behavior in its natural social setting can
  alleviate many of the external and ecological validity issues of
  experimental studies. Moving from small-N cross-sectional surveys or
  panels with long time intervals to large-N real-time measurements can
  help overcome the internal validity problems of current observational
  studies. Finally, although large data sets do not guarantee high
  quality data, more data points can help overcome problems of low
  statistical power and allow the researcher to zoom in on specific
  subpopulations or test more complex models than is possible with
  traditional behavioral studies.</p>
  <p>That said, there are a number of specific challenges that will need
  to be addressed in a vibrant and critical community of computational
  communication scientists if CCS is to reach its full potential.
  First, the ownership of many of the required data sets by (social)
  media companies and other commercial entities threatens the
  accessibility of data and the reproducibility of studies. Second,
  “big” data sets are often a by-product of naturally occurring
  behaviour, and may not be representative of the actual behavior of
  interest: expressed attitudes on, for instance, Twitter, review
  websites, or dating apps might be quite different from the attitudes
  in the general public. Third, computational methods are not immune
  from replicability problems. A high number of researcher degrees of
  freedom combined with a lack of currently established standards for
  many new methods can jeopardize the scholarly scrutiny which is
  essential in assuring additive science and replicability. Finally, CCS
  requires unique skill sets (e.g. programming, data handling) which may
  lead to a rethinking of our educational programs and the institutional
  incentives for developing and maintaining these skill sets.</p>
  <p>These considerations show that to be successful, CCS will have to
  emphasize research transparency, reproducibility, and collaboration.
  Research transparency and reproducibility are needed to generate
  long-term trust in this new paradigm. Collaboration among a diverse
  set of stakeholders is needed to create synergies between
  methodological and theoretical progress, develop and maintain complex
  computational software, update criteria for hiring, tenure, and grant
  approvals, and provide researchers with access to proprietary data
  sets.</p>
</sec>
<sec id="why-do-we-need-a-new-journal">
  <title>Why do we Need a New Journal?</title>
  <p>Why do we need a new journal to tackle these challenges? While some
  may view computational research as simply a methodological extension
  to existing communication research techniques and topics, we believe
  it creates a broad and integrated set of opportunities and challenges
  for the field that include debates over epistemology, ethics and the
  role of publication in the scientific process. To address these
  opportunities and challenges an integrated, communal effort is needed
  to develop, debate, and demonstrate best practices–that is, to develop
  relevant paradigms–that guide future research.</p>
  <p>Such work can continue, as it has over the past decade, in articles
  scattered among the top communication journals and computational
  social science conference proceedings. However, we believe there are
  important advantages to providing a specific outlet that addresses all
  facets of this conversation. First, many papers can contribute to
  important conversations within the computational community but,
  understandably, are not recognized as valuable by general interest or
  other, topic-specific journals. Thus, the best judges of their
  contribution are editors and reviewers who share an interest and
  understanding of the relevant issues. Second, as much as computational
  communication studies provide unique opportunities, they also face
  unique challenges. As a consequence, the evaluation criteria applied
  to computational communication studies can differ significantly from
  those applied in other sub-fields. Some traditional criteria may not
  be strict enough for computational work. For example, obtaining large
  samples with sometimes hundreds of thousands of observations is
  usually not a problem for computational studies, but it renders classical
  hypothesis testing problematic (“everything is significant”). Yet
  other criteria may be too restrictive, such as the still widespread
  tendency not to publish null findings. Reviewers selected mostly on
  substantive expertise may not appreciate these unique challenges in
  computational studies. This can lead both to methodologically flawed
  articles being accepted, and to good computational work being rejected
  because it is held to the standards of classical methodology.</p>
  <p>The third motivation for the journal is to actively promote a
  consistent and coherent set of standards for addressing these unique
  challenges. The challenges of computational communication research
  apply across theoretical topics, methodological best practices, and
  ethical commitments. Inevitably, some of the ideal best practices will
  come into conflict. For example, accessibility and reproducibility can
  often conflict with ethical concerns. Here the journal can serve as
  both a forum to organize the conversation around these topics as well
  as a place to work towards and implement an emerging consensus.
  Finally, we recognize that the research topics of a computational
  communication research journal are intrinsically tied to a set of
  computational technologies that are rapidly developing. We thus
  believe it is important that a computational communication research
  journal invites and welcomes innovations and discoveries that have the
  potential to push the envelope in state-of-the-art communication
  science, but also come with an elevated risk of failure. Scientific
  research is driven by a sound rationale and method, and should be
  inherently risky. We envision CCR to be on the leading edge of risky
  proposals to social scientific practice, with the hope that our
  collective successes (and failures) can inform the communication field
  more broadly.</p>
</sec>
<sec id="what-kind-of-research-does-ccr-seek">
  <title>What Kind of Research Does CCR Seek?</title>
  <p>A journal needs to develop and articulate a clear picture of what
  it is looking for to guide the decisions of authors, reviewers, and
  editors.<xref ref-type="fn" rid="fn1">1</xref> CCR welcomes research
  that contributes to our theoretical understanding of human
  communication. We define a theoretical contribution as one that is
  additive to prior work by altering the field’s existing understanding
  of and expectations for communication phenomena. These contributions
  are best achieved by formulating hypotheses and research questions
  that are risky, that is, include claims that are not self-evident and
  in fact are likely to be wrong. In this context, finding support for
  well argued, unlikely claims is a good strategy to make a theoretical
  contribution. Replications and studies that test the soundness and
  boundary conditions of existing theory also qualify as good
  strategies. Of course, a logical consequence of pursuing risky
  research is that computational scholars will see rejections or
  null-findings of their claims more often than their support. Given a
  well-argued claim, reliable and valid measures, as well as a sound
  analytical methodology, CCR is committed to valuing null-findings as a
  contribution that increases knowledge. If computational scholars
  honestly report what – against their expectation and best-practice
  efforts – has not worked, then others can learn, build on these
  efforts, and thereby contribute to additive science. This said, there
  are three primary ways in which articles can contribute:</p>
  <list list-type="order">
    <list-item>
      <p>By applying computational methods to new or existing
      theoretical questions. Importantly, CCR’s emphasis on additive
      contributions means that research need not exclusively test
      hypotheses nor feel compelled to produce significant results.
      Nonetheless, whether deductive or inductive, analysis should be
      clearly linked to substantive theoretical questions and what is
      already known, or suspected to be known, with regard to them.
      Claims and conclusions should be explicit – naming boundary
      conditions and alternative explanations – and, of course, well
      supported by the data. Showing that a theory is at odds with data
      is a relevant finding, but only if alternative explanations can be
      reasonably ruled out, and if accompanied by a clear argument
      indicating why the theory should have been applicable.</p>
    </list-item>
    <list-item>
      <p>By developing, adapting, and/or validating methods. For this,
      the researcher needs to show that the method/tool is reliable and
      valid; that it is useful for understanding communication; and that
      it is better (by some measure) than existing tools that do that
      task. In most cases, tools or method papers should include
      quantitative validation on a gold-standard data set that was not
      used for development and that is representative of some use case
      relevant to communication research.</p>
    </list-item>
    <list-item>
      <p>By creating or adapting datasets and making them accessible and
      searchable. Shared datasets are important because they make it
      easier to compare and replicate research by offering a common
      point of reference. In publishing a description of a data set, it
      should be clear how it was gathered and preprocessed. Where
      possible, the raw data and cleaning procedure should be published
      alongside the final data set. Data should be as open and
      accessible as possible. For data that cannot be fully shared for
      legal or privacy reasons, as much as possible of the data should
      be shared openly (e.g. metadata, annotations, and/or anonymized
      versions), and where possible a procedure for acquiring the
      sensitive data should be given that is in principle accessible to
      all researchers.</p>
    </list-item>
  </list>
  <p>CCR demands transparent and reproducible research. Computational
  analyses require many choices regarding design, preprocessing, and
  parameter tuning, and transparency is needed to allow scrutiny of
  these choices. As digital data and analysis code can be shared easily,
  computational research can be at the forefront of the open science
  philosophy. Most articles in CCR should be accompanied by an online
  appendix in a form that encourages reproducibility and reusability.
  For tool and software contributions, we expect software to be
  published open-source on GitHub or an equivalent service and in the
  repository that is standard for the programming language used, e.g. PyPI
  or CRAN. For articles presenting substantive and/or methodological
  analysis results and data contributions, we expect an online research
  compendium published on GitHub or an equivalent service. Such a
  compendium contains the data, code, and results, and makes it explicit
  how the code is used to derive the results from the raw data. By
  publishing this on GitHub rather than depositing it in a service such
  as DataVerse, the code can be a living document rather than just a
  snapshot. Reproducibility and persistence are guaranteed by additionally
  storing the final (and, if applicable, raw) data on DataVerse and
  archiving the named release of the repository corresponding to the
  publication. An optional template for such a compendium, including
  code for automatically testing and generating containers, will be made
  available through the CCR website.</p>
</sec>
<sec id="the-ccr-review-process">
  <title>The CCR Review Process</title>
  <p>Like most journals in our field, CCR will publish articles only
  after a rigorous peer-review process. However, in addition to
  employing a new substantive scope, open access publication, and
  openness for data and tool publications, CCR is also introducing a
  procedural innovation: a “two-phase review process” in the way
  articles are published.</p>
  <p>In the first phase, a traditional double-blind ‘adversarial’ review
  takes place, where the central task for the reviewer and editors is to
  judge whether a manuscript is (potentially) publishable: is it
  high-quality, novel (including direct replications), and relevant? The
  outcome of phase one is either rejection or an <italic>intent to
  publish</italic>: a conditional decision to accept the manuscript for
  publication dependent on satisfactory revisions. After this intent to
  publish decision, the author is encouraged to publish the manuscript
  via an open science archive like SocArXiv. The journal website will
  link to this manuscript as a ‘working paper’. Any revisions in this
  phase are not required to be blinded. The reviewers also get the
  option to be publicly identified on the article if published.</p>
  <p>The purpose of this two-phased approach is to better align the
  incentives of authors and reviewers so that work is published both
  more quickly and with higher quality. Specifically, the job of the
  first phase is to identify valuable, if not yet wholly optimized,
  research. Blind review, and the somewhat adversarial nature of the
  process, are essential in this phase to distinguish high quality
  submissions. Once there is agreement on the overall value of the
  manuscript, however, the preprint process is designed to alleviate
  authors’ anxiety (and potential hostility) regarding the status of
  their manuscript, as well as to encourage reviewers to focus on
  concrete, constructive changes rather than marshalling arguments to
  ‘kill’ the paper.</p>
  <p>Additionally, we offer the option of pre-registering research.
  While it may not be equally applicable to all types of computational
  research, it can be a useful tool to help our goal of avoiding bias
  against null-findings. We therefore will also accept registered
  reports as submissions, in which an introduction, theory, and methods
  are specified in advance, but data have not been collected and
  analyzed yet. In this case, the first phase of the review process is
  conducted on the basis of the preregistered report, meaning that the
  report will be sent out for review and an intent to publish the final
  article can be given on the basis of this review, independent of
  research outcomes but of course conditional on robust and transparent
  methodology in accordance with the preregistration. We encourage the
  use of preregistration services such as the Open Science Framework or
  aspredicted.org and/or the dissemination of the registered report as a
  preprint once intent to publish is given.</p>
  <p>This two-phase process and use of registered reports is
  experimental by design and should be seen as a first step in moving
  towards a more interactive and less adversarial review system. It is
  not clear how well it will work. Nonetheless it is one of the
  commitments of CCR to try new ideas that might improve the convoluted,
  and generally under-examined, publishing process.</p>
</sec>
<sec id="introduction-to-the-first-issue">
  <title>Introduction to the First Issue</title>
  <p>The articles in this first issue present a snapshot of all aspects
  of computational communication research. One article presents the Interface for
  Communication Research (iCoRe), a user-friendly web interface to
  access, explore, and analyze the Global Database of Events, Language
  and Tone (GDELT). This interface makes it easier to work with GDELT to
  answer substantive communication questions, as well as enhancing the
  transparency and replicability of such work by providing a
  standardized query interface. The authors demonstrate in three
  theory-driven case studies the usefulness of iCoRe.</p>
  <p>Another article uses Structural Topic Models to show how the Twitter feeds of
  newspapers differ from their online content. This study shows how
  state-of-the-art analysis techniques can be used to study journalistic
  choices and how they differ for different audiences and contexts.</p>
  <p>A further article presents an open source browser plug-in that its authors use to observe
  both the content and context of the consumption of (public) Facebook
  posts. They also present a proof-of-concept study that, although
  highlighting the technical and social difficulties of recruiting
  participants for digital tracking studies, does show how the
  interaction with posts can be recorded, including scrolling, liking,
  and clicking links within a post.</p>
  <p>The remaining article used state-of-the-art recommender system techniques to create
  personalized health communication messages in a longitudinal study.
  Their results show that personalized messages have an improved effect
  compared to either showing the overall most preferred message or a
  random message.</p>
  <p>Taken together, these four articles represent substantive
  computational scholarship in journalism, health communication, and
  framing research. In addition, these articles contribute to making
  data and computational tools more accessible to communication
  scholars. We are confident that this is just the beginning of a stream
  of great research articles, and we look forward to your contributions
  and reviews.</p>
</sec>
</body>
<back>
<fn-group>
  <fn id="fn1">
    <label>1</label><p>Defining the niche and scope of CCR is an ongoing
    effort, and updated versions of this section will be posted on the
    journal website.</p>
  </fn>
</fn-group>
</back>
</article>