<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.2 20190208//EN" "JATS-archivearticle1.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" dtd-version="1.2" article-type="other">
<front>
<journal-meta>
<journal-id/>
<journal-title-group>
<journal-title>Computational Communication Research</journal-title>
</journal-title-group>
<issn publication-format="electronic">2665-9085</issn>
<publisher>
<publisher-name>Amsterdam University Press</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.5117/CCR2019.1.001.VANA</article-id>
<title-group>
<article-title>A Roadmap for Computational Communication
Research</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<string-name>Wouter van Atteveldt</string-name>
<aff id="aff-1">
<institution-wrap>
<institution>Vrije Universiteit Amsterdam, LJS
Nieuwsmonitor</institution>
</institution-wrap>
</aff>
</contrib>
<contrib contrib-type="author">
<string-name>Drew Margolin</string-name>
<aff id="aff-2">
<institution-wrap>
<institution>Cornell University</institution>
</institution-wrap>
</aff>
</contrib>
<contrib contrib-type="author">
<string-name>Cuihua Shen</string-name>
<aff id="aff-3">
<institution-wrap>
<institution>UC Davis</institution>
</institution-wrap>
</aff>
</contrib>
<contrib contrib-type="author">
<string-name>Damian Trilling</string-name>
<aff id="aff-4">
<institution-wrap>
<institution>University of Amsterdam</institution>
</institution-wrap>
</aff>
</contrib>
<contrib contrib-type="author">
<string-name>René Weber</string-name>
<aff id="aff-5">
<institution-wrap>
<institution>UC Santa Barbara</institution>
</institution-wrap>
</aff>
</contrib>
</contrib-group>
<pub-date date-type="pub" publication-format="electronic">
<year>2019</year>
</pub-date>
<permissions>
<copyright-statement>© The author(s)</copyright-statement>
<copyright-year>2019</copyright-year>
<copyright-holder>The author(s)</copyright-holder>
<license>
<ali:license_ref xmlns:ali="http://www.niso.org/schemas/ali/1.0/">https://creativecommons.org/licenses/by/4.0/</ali:license_ref>
<license-p>CC-BY 4.0</license-p>
</license>
</permissions>
<abstract><title>Abstract</title><p>
Computational Communication Research (CCR) is a new open access journal dedicated
to publishing high quality computational research in communication
science. This editorial introduction describes the role that we envision
for the journal. First, we explain what computational communication
science is and why a new journal is needed for this subfield. Then, we
elaborate on the type of research this journal seeks to publish, and
stress the need for transparent and reproducible science. The relation
between theoretical development and computational analysis is discussed,
and we argue for the value of null-findings and risky research in
additive science. Subsequently, the (experimental) two-phase review
process is described. In this process, after the first double-blind
review phase, an editor can signal that they intend to publish the
article conditional on satisfactory revisions. This starts the second
review phase, in which authors and reviewers are no longer required to
be anonymous and the authors are encouraged to publish a preprint of
their article, which will be linked as a working paper from the journal.
Finally, we introduce the four articles that, together with this
Introduction, form the inaugural issue.
</p></abstract>
<kwd-group kwd-group-type="author">
<kwd>computational communication science</kwd>
<kwd>computational social science</kwd>
<kwd>open science</kwd>
<kwd>research transparency</kwd>
</kwd-group>
</article-meta>
</front>
<body>
<sec id="what-is-computational-communication-science">
<title>What is Computational Communication Science?</title>
<p>An increasing part of our daily life is organized and experienced
online, from connecting with friends and reading news to shopping,
entertainment, and even dating. Most of these online actions leave
‘digital traces’ that offer unprecedented opportunities for scholars
to explore, theorize, and test hypotheses about the way humans think,
behave, and interact. In addition, human artifacts and knowledge such
as scholarly and non-scholarly articles, records of historical events,
song lyrics, stories, etc., that provide rich information on the
context of human behavior, are increasingly available in digital form.
Most of these online ‘digital traces’ are communicative in nature.
Therefore, communication science, perhaps more than any other social
science, is in a promising position to leverage these rich data
sources to form a better understanding of human communication and
behaviour.</p>
<p>Computational Communication Science (CCS) is the label applied to
the emerging subfield that investigates the use of computational
algorithms to gather and analyse big and often semi- or unstructured
data sets to develop and test communication science theories. In
recent years, scholarly interest in this subfield increased
dramatically, as evidenced, for instance, by the strong growth of the
Computational Methods Division within the International Communication
Association (ICA), the largest international representation of
communication scholars. One testament to this interest is the new open
access journal Computational Communication Research, in which this
article is published, and the many recent and upcoming special issues
on computational communication science and related topics.</p>
<p>Method and theory development are necessarily synergistic. New
methods, from the telescope to DNA sequencing, have often been
instrumental to scientific progress by changing our perception of
reality and allowing new questions to be asked. New methodologies and
analytical approaches can lead to new findings which in turn can be
used to formulate or refine theories. At the same time, theories
suggest research questions that inspire the development of new
methodologies. Neither methodological nor theoretical development is
superior in science. With its unique set of strengths and weaknesses,
CCS is in a position to complement the traditional methodological
toolkit and enhance the paradigm of method-theory synergy in
communication science. For instance, going from self-reports in lab
settings to modeling actual behavior in its natural social setting can
alleviate many of the external and ecological validity issues of
experimental studies. Moving from small-N cross-sectional surveys or
panels with long time intervals to large-N real-time measurements can
help overcome the internal validity problems of current observational
studies. Finally, although large data sets do not guarantee high
quality data, more data points can help overcome problems of low
statistical power and allow the researcher to zoom in on specific
subpopulations or test more complex models than is possible with
traditional behavioral studies.</p>
<p>That said, there are a number of specific challenges that will need
to be addressed in a vibrant and critical community of computational
communication scientists if CCS is to realize its full potential.
First, the ownership of many of the required data sets by (social)
media companies and other commercial entities threatens the
accessibility of data and the reproducibility of studies. Second,
“big” data sets are often a by-product of naturally occurring
behaviour, and may not be representative of the actual behavior of
interest: expressed attitudes on, for instance, Twitter, review
websites, or dating apps might be quite different from the attitudes
in the general public. Third, computational methods are not immune
from replicability problems. A high number of researcher degrees of
freedom combined with a lack of currently established standards for
many new methods can jeopardize the scholarly scrutiny which is
essential in assuring additive science and replicability. Finally, CCS
requires unique skill sets (e.g. programming, data handling) which may
lead to a rethinking of our educational programs and the institutional
incentives for developing and maintaining these skill sets.</p>
<p>These considerations show that to be successful, CCS will have to
emphasize research transparency, reproducibility, and collaboration.
Research transparency and reproducibility are needed to generate
long-term trust in this new paradigm. Collaboration among a diverse
set of stakeholders is needed to create synergies between
methodological and theoretical progress, develop and maintain complex
computational software, update criteria for hiring, tenure, and grant
approvals, and provide researchers with access to proprietary data
sets.</p>
</sec>
<sec id="why-do-we-need-a-new-journal">
<title>Why do we Need a New Journal?</title>
<p>Why do we need a new journal to tackle these challenges? While some
may view computational research as simply a methodological extension
to existing communication research techniques and topics, we believe
it creates a broad and integrated set of opportunities and challenges
for the field that include debates over epistemology, ethics and the
role of publication in the scientific process. To address these
opportunities and challenges an integrated, communal effort is needed
to develop, debate, and demonstrate best practices – that is, to develop
relevant paradigms – that guide future research.</p>
<p>Such work can continue, as it has over the past decade, in articles
scattered among the top communication journals and computational
social science conference proceedings. However, we believe there are
important advantages to providing a specific outlet that addresses all
facets of this conversation. First, many papers can contribute to
important conversations within the computational community but,
understandably, are not recognized as valuable by general interest or
other, topic specific journals. Thus, the best judges of their
contribution are editors and reviewers who share an interest and
understanding of the relevant issues. Second, as much as computational
communication studies provide unique opportunities, they also face
unique challenges. As a consequence, the evaluation criteria applied
to computational communication studies can differ significantly from
those applied in other sub-fields. Some traditional criteria may
not be strict enough for computational work. For example, obtaining large
samples with sometimes hundreds of thousands of observations is
usually not a problem for computational studies, but renders classical
hypothesis testing problematic (“everything is significant”). Yet
other criteria may be too restrictive, such as the still widespread
tendency not to publish null findings. Reviewers selected mostly on
substantive expertise may not appreciate these unique challenges in
computational studies. This can lead both to methodologically flawed
articles being accepted, and to good computational work being rejected
because it is held to the standards of classical methodology.</p>
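<p>The “everything is significant” problem mentioned above can be
illustrated with a minimal sketch. It is not taken from the articles in
this issue: the effect size (d = 0.02), the group sizes, and the
function name are hypothetical, and the two-group z-test with equal
group sizes and unit variance is a simplifying assumption.</p>
<code language="python">
```python
import math


def two_sample_p_value(effect_size: float, n_per_group: int) -> float:
    """Two-sided p-value of a z-test for a standardized mean difference.

    Assumes two equally sized groups with unit variance, so the test
    statistic is z = d * sqrt(n / 2).
    """
    z = abs(effect_size) * math.sqrt(n_per_group / 2)
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(z / math.sqrt(2))


# Hypothetical, substantively negligible effect (Cohen's d = 0.02).
D = 0.02
for n in (100, 10_000, 500_000):
    print(f"n per group = {n:9,}: p = {two_sample_p_value(D, n):.4g}")
```
</code>
<p>Under these assumptions, the same negligible difference is far from
significant with 100 observations per group and overwhelmingly
significant with 500,000, which is why a p-value alone says little
about substantive relevance in very large samples.</p>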
<p>The third motivation for the journal is to actively promote a
consistent and coherent set of standards for addressing these unique
challenges. The challenges of computational communication research
apply across theoretical topics, methodological best practices, and
ethical commitments. Inevitably, some of the ideal best practices will
come into conflict. For example, accessibility and reproducibility can
often conflict with ethical concerns. Here the journal can serve as
both a forum to organize the conversation around these topics as well
as a place to work towards and implement an emerging consensus.
Finally, we recognize that the research topics of a computational
communication research journal are intrinsically tied to a set of
computational technologies that are rapidly developing. We thus
believe it is important that a computational communication research
journal invites and welcomes innovations and discoveries that have the
potential to push the envelope in state-of-the-art communication
science, but also come with an elevated risk of failure. Scientific
research is driven by a sound rationale and method, and should be
inherently risky. We envision CCR to be on the leading edge of risky
proposals to social scientific practice, with the hope that our
collective successes (and failures) can inform the communication field
more broadly.</p>
</sec>
<sec id="what-kind-of-research-does-ccr-seek">
<title>What Kind of Research Does CCR Seek?</title>
<p>A journal needs to develop and articulate a clear picture of what
it is looking for to guide the decisions of authors, reviewers, and
editors.<xref ref-type="fn" rid="fn1">1</xref> CCR welcomes research
that contributes to our theoretical understanding of human
communication. We define a theoretical contribution as one that is
additive to prior work by altering the field’s existing understanding
of and expectations for communication phenomena. These contributions
are best achieved by formulating hypotheses and research questions
that are risky, that is, include claims that are not self-evident and
in fact are likely to be wrong. In this context, finding support for
well argued, unlikely claims is a good strategy to make a theoretical
contribution. Replications and studies that test the soundness and
boundary conditions of existing theory also qualify as good
strategies. Of course, a logical consequence of pursuing risky
research is that computational scholars will see rejections or
null-findings of their claims more often than their support. Given a
well argued claim, reliable and valid measures, as well as a sound
analytical methodology, CCR is committed to value null-findings as a
contribution that increases knowledge. If computational scholars
honestly report what – against their expectation and best-practice
efforts – has not worked, then others can learn, build on these
efforts, and thereby contribute to additive science. This said, there
are three primary ways in which articles can contribute:</p>
<list list-type="order">
<list-item>
<p>By applying computational methods to new or existing
theoretical questions. Importantly, CCR’s emphasis on additive
contributions means that research need not exclusively test
hypotheses nor feel compelled to produce significant results.
Nonetheless, whether deductive or inductive, analysis should be
clearly linked to substantive theoretical questions and what is
already known, or suspected to be known, with regard to them.
Claims and conclusions should be explicit – naming boundary
conditions and alternative explanations – and, of course, well
supported by the data. Showing that a theory is at odds with data
is a relevant finding, but only if alternative explanations can be
reasonably ruled out, and if accompanied by a clear argument
indicating why the theory should have been applicable.</p>
</list-item>
<list-item>
<p>By developing, adapting, and/or validating methods. For this,
the researcher needs to show that the method/tool is reliable and
valid; that it is useful for understanding communication; and that
it is better (by some measure) than existing tools that do that
task. In most cases, tools or method papers should include
quantitative validation on a gold-standard data set that was not
used for development and that is representative of some use case
relevant to communication research.</p>
</list-item>
<list-item>
<p>By creating or adapting datasets and making them accessible and
searchable. Shared datasets are important because they make it
easier to compare and replicate research by offering a common
point of reference. In publishing a description of a data set, it
should be clear how it was gathered and preprocessed. Where
possible, the raw data and cleaning procedure should be published
alongside the final data set. Data should be as open and
accessible as possible. For data that cannot be fully shared for
legal or privacy reasons, as much as possible of the data should
be shared openly (e.g. metadata, annotations, and/or anonymized
versions), and where possible a procedure for acquiring the
sensitive data should be given that is in principle accessible to
all researchers.</p>
</list-item>
</list>
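<p>The quantitative validation asked of method and tool papers in the
second item above can be sketched as follows. This is only an
illustration under stated assumptions: the labels are invented, the
function name is hypothetical, and precision, recall, and F1 are just
one common choice of validation metrics, not a requirement of the
journal.</p>
<code language="python">
```python
def precision_recall_f1(gold, predicted, positive):
    """Precision, recall and F1 for one class against gold-standard labels."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    pairs = list(zip(gold, predicted))
    true_pos = sum(1 for g, p in pairs if g == positive and p == positive)
    false_pos = sum(1 for g, p in pairs if g != positive and p == positive)
    false_neg = sum(1 for g, p in pairs if g == positive and p != positive)
    predicted_pos = true_pos + false_pos
    gold_pos = true_pos + false_neg
    # Guard against division by zero when a class is never predicted/present.
    precision = true_pos / predicted_pos if predicted_pos else 0.0
    recall = true_pos / gold_pos if gold_pos else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical held-out gold standard (manual codes) and tool output.
# The gold set must not have been used while developing the tool.
gold_labels = ["neg", "pos", "pos", "neg", "pos", "neg", "neg", "pos"]
tool_labels = ["neg", "pos", "neg", "neg", "pos", "pos", "neg", "pos"]

precision, recall, f1 = precision_recall_f1(gold_labels, tool_labels, "pos")
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```
</code>
<p>Reporting such scores for both the new tool and an existing
alternative on the same held-out data is one way to support the claim
that the new tool is better by some measure.</p>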
<p>CCR demands transparent and reproducible research. Computational
analyses require many choices regarding design, preprocessing, and
parameter tuning, and transparency is needed to allow scrutiny of
these choices. As digital data and analysis code can be shared easily,
computational research can be at the forefront of the open science
philosophy. Most articles in CCR should be accompanied by an online
appendix in a form that encourages reproducibility and reusability.
For tool and software contributions, we expect software to be
published open-source on GitHub or an equivalent service and in the
repository that is normal for the programming language used, e.g. PyPI
or CRAN. For articles presenting substantive and/or methodological
analysis results and data contributions, we expect an online research
compendium published on GitHub or an equivalent service. Such a
compendium contains the data, code, and results, and makes it explicit
how the code is used to derive the results from the raw data. By
publishing this on GitHub rather than depositing it in a service such
as DataVerse, the code can be a living document rather than just a
snapshot. Reproducibility and persistence are guaranteed by storing the
final (and if applicable, raw) data on DataVerse in addition, and
archiving the named release of the repository corresponding to the
publication. An optional template for such a compendium, including
code for automatically testing and generating containers, will be made
available through the CCR website.</p>
</sec>
<sec id="the-ccr-review-process">
<title>The CCR Review Process</title>
<p>Like most journals in our field, CCR will publish articles only
after a rigorous peer-review process. However, in addition to
employing a new substantive scope, open access publication, and
openness for data and tool publications, CCR is also introducing a
procedural innovation: a “two-phase review process” in the way
articles are published.</p>
<p>In the first phase, a traditional double blind ‘adversarial’ review
takes place, where the central task for the reviewer and editors is to
judge whether a manuscript is (potentially) publishable: is it
high-quality, novel (including direct replications), and relevant? The
outcome of phase one is either rejection or an <italic>intent to
publish</italic>: a conditional decision to accept the manuscript for
publication dependent on satisfactory revisions. After this intent to
publish decision, the author is encouraged to publish the manuscript
via an open science archive like SocArXiv. The journal website will
link to this manuscript as a ‘working paper’. Any revisions in this
phase are not required to be blinded. The reviewers also get the
option to be publicly identified on the article if published.</p>
<p>The purpose of this two-phased approach is to better align the
incentives of authors and reviewers so that work is published both
more quickly and with higher quality. Specifically, the job of the
first phase is to identify valuable, if not yet wholly optimized,
research. Blind review, and the somewhat adversarial nature of the
process, are essential in this phase to distinguish high quality
submissions. Once there is agreement on the overall value of the
manuscript, however, the preprint process is designed to alleviate
authors’ anxiety (and potential hostility) regarding the status of
their manuscript, as well as to encourage reviewers to focus on
concrete, constructive changes rather than marshalling arguments to
‘kill’ the paper.</p>
<p>Additionally, we offer the option of pre-registering research.
While it may not be equally applicable to all types of computational
research, it can be a useful tool to help our goal of avoiding bias
against null-findings. We therefore will also accept registered
reports as submissions, in which an introduction, theory, and methods
are specified in advance, but data have not been collected and
analyzed yet. In this case, the first phase of the review process is
conducted on the basis of the preregistered report, meaning that the
report will be sent out for review and an intent to publish the final
article can be given on the basis of this review, independent of
research outcomes but of course conditional on robust and transparent
methodology in accordance with the preregistration. We encourage the
use of preregistration services such as the Open Science Framework or
aspredicted.org and/or the dissemination of the registered report as a
preprint once intent to publish is given.</p>
<p>This two-phase process and use of registered reports is
experimental by design and should be seen as a first step in moving
towards a more interactive and less adversarial review system. It is
not clear how well it will work. Nonetheless it is one of the
commitments of CCR to try new ideas that might improve the convoluted,
and generally under-examined, publishing process.</p>
</sec>
<sec id="introduction-to-the-first-issue">
<title>Introduction to the First Issue</title>
<p>The articles in this first issue present a snapshot of all aspects
of computational communication research. One article presents the Interface for
Communication Research (iCoRe), a user-friendly web interface to
access, explore, and analyze the Global Database of Events, Language
and Tone (GDELT). This interface makes it easier to work with GDELT to
answer substantive communication questions, as well as enhancing the
transparency and replicability of such work by providing a
standardized query interface. The authors demonstrate in three
theory-driven case studies the usefulness of iCoRe.</p>
<p>A second article uses Structural Topic Models to show how the Twitter feeds of
newspapers differ from their online content. This study shows how
state-of-the-art analysis techniques can be used to study journalistic
choices and how they differ for different audiences and contexts.</p>
<p>A third article presents an open source browser plug-in that its authors use to observe
both the content and context of the consumption of (public) Facebook
posts. They also present a proof-of-concept study that, although
highlighting the technical and social difficulties of recruiting
participants for digital tracking studies, does show how the
interaction with posts can be recorded, including scrolling, liking,
and clicking links within a post.</p>
<p>A fourth article uses state-of-the-art recommender system techniques to create
personalized health communication messages in a longitudinal study.
The results show that personalized messages have an improved effect
compared to either showing the overall most preferred message or a
random message.</p>
<p>Taken together, these four articles represent substantive
computational scholarship in journalism, health communication, and
framing research. In addition, these articles contribute to making
data and computational tools more accessible to communication
scholars. We are confident that this is just the beginning of a stream
of great research articles, and we look forward to your contributions
and reviews.</p>
</sec>
</body>
<back>
<fn-group>
<fn id="fn1">
<label>1</label><p>Defining the niche and scope of CCR is an ongoing
effort, and updated versions of this section will be posted on the
journal website.</p>
</fn>
</fn-group>
</back>
</article>