
## om_if_paper_reviews.md

      
Reviews for Opinion Mining and Information Fusion: A survey

Authors: Jorge A. Balazs and Juan D. Velásquez

Reviewer 1

The paper is very nice and fits perfectly within the scope of the journal. It has a suitable organization and complete content, and it is accessible to non-experts in this topic. I recommend the acceptance of this paper.

Reviewer 2

In this paper the authors present a review of Opinion Mining in the Information Fusion context, gathering an appreciable number of references related to these subjects.
The paper is meaningful and the references are appropriate. However, some minor aspects can be improved in order to enhance the overall readability and presentation.

The 5th and 6th paragraphs of the introduction could be swapped or at least combined. Throughout the paper the concepts of OM and IF seem to be completely separated and are only combined at the end. It would be better to relate IF to OM before introducing the second concept.
I miss a categorization of the references in Section 2. A table or a figure would greatly help the reader to form a map of the relations among the presented references: which concepts relate or differentiate them? Are they pure reviews, or do they have an experimental setup (survey)? And so on.
Presenting the concepts of OM and IF after the related work section seems strange to me. A reader new to OM will be forced to jump directly to Section 3 in order to understand many concepts of Section 2.1. This also applies to IF, albeit less crucially. Reordering and adapting the sections would be greatly appreciated.
Section 5 could also benefit from a final figure or table with the references organized by their main type (say Sections 5.1 to 5.2.2), enabling quick reference.

Reviewer 3

This paper is a survey of the state of the art regarding the use of information fusion (IF) techniques for opinion mining (OM). The paper is structured in the following way: after the introduction, Section 2 briefly describes previous reviews on OM (1.5 pages) and IF (2 pages). Section 3 is a quite exhaustive review of the state of the art in OM (8.5 pages), including recent developments. Section 4 (1.5 pages) is a kind of continuation of the state of the art on IF, describing four papers in more detail. Section 5 (6.5 pages) is the core of the paper, a survey on the application of IF techniques to OM systems, taking into account both fusion of data and fusion of processes. Section 6 (less than 1 page) concludes the paper. The paper is well written and the text is clear.
Almost half of the paper is devoted to reviewing the state of the art on OM (Sects. 2.1 and 3). In particular, 14 previous survey papers on OM are cited (references 1-3 and 11-21) and described in Sect. 2.1. They were published between 2008 and 2014. It should be justified why a new survey on OM is needed now.
To increase the value of this new survey, I suggest including a table (or similar artifact) in Section 3 to indicate the different corpora that have been used over the years to evaluate the performance of opinion mining systems. For each corpus, you should indicate the most relevant characteristics and a list of papers in which it is used. It would be particularly relevant to include recent datasets on social media, such as SemEval 2013-2015 and CLEF RepLab 2013-2014. It would be valuable to include non-English datasets such as TASS 2013-2014 for Spanish (http://www.daedalus.es/TASS2014/). You should investigate whether similar datasets exist for other languages.
Due to its reduced size, Sect. 4 could be merged with Sect. 2.2. As an alternative, it could be inserted at the beginning of Sect. 5 to serve as a kind of introductory material for that section.
Sect. 5 lacks a discussion section describing the different IF techniques used in the OM systems described. More importantly, it would be of great help for future OM system designers if you proposed a framework for the application of IF techniques to OM, or at least a draft of such a framework. The reason for this is that, as an OM practitioner, if after reading the survey I consider that IF could be useful for my future research, I am not provided with insight on how to accomplish this integration of OM and IF.
Minor remarks: on page 9 (second paragraph) you say that "search queries to the Twitter API are limited to 180 per 15-minute time window". In fact, these limits apply to the REST API only. In the case of the Streaming API, rate limits are not clearly set:
"Twitter does not make public the number of connection attempts which will cause a rate limiting to occur, but there is some tolerance for testing and development. A few dozen connection attempts from time to time will not trigger a limit. However, it is essential to stop further connection attempts for a few minutes if a HTTP 420 response is received. If your client is rate limited frequently, it is possible that your IP will be blocked from accessing Twitter for an indeterminate period of time." (see dev.twitter.com/streaming/overview/connecting visited April 13, 2015)
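
As an illustration of the two limits discussed in this remark, the following is a minimal client-side sketch in Python. It is not taken from the paper or from any Twitter library: `WindowLimiter` and `backoff_delay` are hypothetical names, the 60-second base delay and 960-second cap are assumptions chosen to match the "few minutes" wording in the quote, and no real API call is made. The first part enforces the 180-requests-per-15-minutes window; the second computes how long to pause after an HTTP 420 response.

```python
from collections import deque


class WindowLimiter:
    """Allow at most `limit` calls in any `window`-second span (hypothetical helper)."""

    def __init__(self, limit=180, window=15 * 60):
        self.limit = limit
        self.window = window
        self.calls = deque()  # timestamps of calls still inside the window

    def wait_time(self, now):
        """Seconds to wait before a call at time `now` is allowed (0.0 if allowed)."""
        # Drop timestamps that have left the window.
        while self.calls and now - self.calls[0] >= self.window:
            self.calls.popleft()
        if len(self.calls) < self.limit:
            return 0.0
        # The oldest call must age out before another one is allowed.
        return float(self.window - (now - self.calls[0]))

    def record(self, now):
        """Register that a call was made at time `now`."""
        self.calls.append(now)


def backoff_delay(status, attempt, base=60.0, cap=960.0):
    """Delay in seconds after a response; doubles per attempt on HTTP 420.

    `base` and `cap` are assumed values, not documented limits.
    """
    if status != 420:
        return 0.0
    return min(cap, base * 2 ** attempt)
```

A caller would check `wait_time(time.time())` before each REST search request, sleep for the returned number of seconds, and then call `record`. For streaming connections it would sleep for `backoff_delay(status, attempt)` before reconnecting.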