@jokerkeny
Last active September 19, 2018 04:23
Solution to run wk2-Tutorial-TextMining.Rmd

If you just want to see the final report without knitting it yourself, download this HTML file (open the link and press Ctrl+S while logged in to your Columbia Gmail account):
https://drive.google.com/open?id=15pokdPPv8trdWEai-40uUwn2O34KDmJ2

Here are solutions to some common problems you may hit when you try to run or knit "wk2-Tutorial-TextMining.Rmd".
Hope they help.

Install/Load packages

During the first chunk (package installation and loading), you may see warnings (which can usually be ignored) or errors (which prevent a package from installing or loading).
We suggest running library("rvest"); library("tibble"); ... one line at a time to find out which packages have not been installed properly. Then run install.packages("<TheTroublePackageName>") for each of them and look up the resulting error message on Google.
Sometimes the output already contains directions for fixing the error; just follow them.
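The line-by-line check above can also be done in one pass. This is only a minimal sketch: the package vector lists just the packages named in this guide, so replace it with the full list from the first chunk of the .Rmd file.

```r
# Assumed package list (only the packages named in this guide);
# replace it with the ones loaded in the first chunk of the tutorial.
packages <- c("rvest", "tibble", "qdap", "topicmodels")

# requireNamespace() returns TRUE or FALSE instead of stopping with an error,
# so every package gets checked even if an earlier one fails to load.
loaded <- vapply(
  packages,
  function(pkg) suppressWarnings(requireNamespace(pkg, quietly = TRUE)),
  logical(1)
)

# Names of the packages that still need attention.
failed <- names(loaded)[!loaded]
print(failed)
```

Each name printed here is a candidate for install.packages("<TheTroublePackageName>"); an empty result means every listed package loaded.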

Java problem (for the "qdap" package)

When you run install.packages("qdap") or library("qdap"), you may see an error similar to the one below:

- Error: package or namespace load failed for ‘qdap’:
-  .onLoad failed in loadNamespace() for 'rJava', details:
-   call: dyn.load(file, DLLpath = DLLpath, ...)
-   error: unable to load shared object '/Library/Frameworks/R.framework/Versions/3.4/Resources/library/rJava/libs/rJava.so':
-   dlopen(/Library/Frameworks/R.framework/Versions/3.4/Resources/library/rJava/libs/rJava.so, 6): Library not loaded: /Library/Java/JavaVirtualMachines/jdk-9.jdk/Contents/Home/lib/server/libjvm.dylib
-   Referenced from: /Library/Frameworks/R.framework/Versions/3.4/Resources/library/rJava/libs/rJava.so
-   Reason: image not found

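That's because loading qdap also loads rJava, which needs a Java installation that R can find. In the message above, rJava is looking for a JDK 9 library (libjvm.dylib) that is not there.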

Solutions for Mac OS 10.5 or later

Step 0 Uninstall existing JDK

(If you are familiar with JRE settings, just set $JAVA_HOME to your /jre directory and skip Steps 0, 1, and 2.)
Open the macOS Terminal from Launchpad, then run (without the $ sign):

$ cd /Library/Java/JavaVirtualMachines
$ sudo rm -rf *

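This removes every JDK currently installed on your machine, so that the following steps start from the same clean environment; otherwise they may not work. Make sure the cd command succeeded before you run rm, because rm -rf * deletes everything in whatever directory you are currently in.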

Step 1 Install JDK (Java Development Kit)

Download the JDK 8 installer that corresponds to your system here and install it.

Step 2 Set the $JAVA_HOME environment variable

Add export JAVA_HOME=$(/usr/libexec/java_home)/jre to ~/.bash_profile.
If you are not familiar with editing text in the terminal, follow the steps below.
First, in your terminal (the macOS Terminal, not the R console), run

$ vim ~/.bash_profile

In the file that opens, press Shift+G to jump to the last line, press o to open a new line below it, then add the following line:

export JAVA_HOME=$(/usr/libexec/java_home)/jre

After that, press Escape, type :wq, and press Return (Enter) to save and quit.

Step 3 Configure Java for R

Returning to the terminal, run

$ source ~/.bash_profile
$ sudo R CMD javareconf

When you run R CMD javareconf, the output may contain an xcrun error like the one below:

xcrun: error: invalid active developer path (/Library/Developer/CommandLineTools), missing xcrun at: /Library/Developer/CommandLineTools/usr/bin/xcrun

In that case, reinstall the Xcode Command Line Tools by running:

$ xcode-select --install

Then run $ sudo R CMD javareconf again. If the xcrun error persists, try $ sudo xcode-select -switch / and then run $ sudo R CMD javareconf once more.

Finally, restart R/RStudio and re-run install.packages("qdap") or library("qdap") in the R console. The problem should now be solved.
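After restarting, a quick check in the R console can show whether R now finds Java. This is a sketch, not part of the original tutorial; it only reads the JAVA_HOME variable set in Step 2 and tries to load rJava, the package that failed in the error message above.

```r
# JAVA_HOME as seen by this R session (an empty string means it is not set here).
java.home <- Sys.getenv("JAVA_HOME")
print(java.home)

# requireNamespace() returns FALSE instead of stopping if rJava cannot be loaded.
rjava.ok <- suppressWarnings(requireNamespace("rJava", quietly = TRUE))

if (rjava.ok) {
  message("rJava loads, so library(\"qdap\") is worth trying again.")
} else {
  message("rJava still fails to load; re-check Steps 1 to 3.")
}
```

An empty JAVA_HOME is not necessarily a problem as long as rJava loads, since applications started from Launchpad do not read ~/.bash_profile.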

Solutions for Linux(Ubuntu)

The solution for Linux systems is similar to the macOS one, except for the JDK installation, for which you can run:

$ sudo add-apt-repository ppa:webupd8team/java
$ sudo apt-get update
$ sudo apt-get install oracle-java8-installer

Also, instead of adding export JAVA_HOME=$(/usr/libexec/java_home)/jre, you need to add export JAVA_HOME=$JAVA_HOME/jre

topicmodels error

You may also hit a compilation error when installing the "topicmodels" R package because the GSL library is missing. In that case, you need to set up the GSL library on your system.
For Ubuntu, you can run $ sudo apt-get install libgsl-dev

Runtime Error

different nrow

When running line #97, you may encounter the following problem:

> speech.list=cbind(speech.list, speech.url)
Error in data.frame(..., check.names = FALSE) : 
  arguments imply differing number of rows: 125, 126

That is because the .csv metadata file is out of date: a new speech has since appeared on the presidency website, so the number of scraped URLs no longer matches the number of rows read from the file.
You can replace the original file in /data with the csv file below:
https://drive.google.com/open?id=1GjdnFRQS9_AKL00bcdV10sgYQPe9hhCU
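To see the mismatch before cbind() fails, you can compare the row counts first. The sketch below uses two tiny made-up data frames as stand-ins for speech.list and speech.url, so the values are purely illustrative.

```r
# Toy stand-ins for the tutorial objects (hypothetical data, not the real speeches).
speech.list <- data.frame(President = c("A", "B", "C"), stringsAsFactors = FALSE)
speech.url  <- data.frame(urls = c("u1", "u2", "u3", "u4"), stringsAsFactors = FALSE)

# cbind() on data frames needs the same number of rows on both sides.
rows.match <- nrow(speech.list) == nrow(speech.url)

if (rows.match) {
  speech.list <- cbind(speech.list, speech.url)
} else {
  message("Row counts differ: ", nrow(speech.list), " vs ", nrow(speech.url),
          ". The csv file in /data is probably out of date.")
}
```

In the tutorial itself you would skip the first two assignments and run the check on the real objects just before line #97.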

cannot open file '../data/fulltext/inaugGeorgeWashington-1.txt'

The error message:

cannot open file '../data/fulltext/inaugGeorgeWashington-1.txt': No such file or directory
Error in file(file, if (append) "a" else "w") : 
  cannot open the connection

That's because you downloaded the ADS_Teaching/Tutorials/wk2-TextMining/ folder, which doesn't contain ../data/fulltext. Instead, you should use "ADS_Teaching/Tutorials/wk2-TextMining.zip".
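If you are not sure which download you have, a short check in the R console tells you whether the folder is there. This sketch assumes your working directory is the folder that holds the .Rmd file, since the path in the error message is relative to it.

```r
# Path taken from the error message; it is relative to the working directory.
fulltext.dir <- "../data/fulltext"
has.fulltext <- dir.exists(fulltext.dir)

if (has.fulltext) {
  message("Found ", fulltext.dir, " containing ",
          length(list.files(fulltext.dir)), " file(s).")
} else {
  message("Missing ", fulltext.dir, " (looked from ", getwd(), ").")
}
```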

names do not match

> speech.list=rbind(speech.list, Trump.speeches)
Error in match.names(clabs, names(xi)) : 
  names do not match previous names

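If you View(speech.list), you may see that the header President has become "锘縋resident". The extra characters come from the byte order mark (BOM) at the start of the csv file, which gets read as part of the first column name.
This problem is common on Windows.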

Solution:

For every read.csv call, add the argument fileEncoding = "UTF-8-BOM", for example:
line #82 inaug.list=read.csv("../data/inauglist.csv", stringsAsFactors = FALSE, fileEncoding = "UTF-8-BOM")
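The effect of this argument can be reproduced on a small file. The sketch below writes a made-up two-column csv that starts with a UTF-8 BOM into a temporary file (the contents are hypothetical, not the tutorial data) and reads it both ways.

```r
# Write a tiny csv file that starts with a UTF-8 byte order mark (EF BB BF).
tmp <- tempfile(fileext = ".csv")
con <- file(tmp, open = "wb")
writeBin(as.raw(c(0xEF, 0xBB, 0xBF)), con)
writeBin(charToRaw("President,Term\nGeorge Washington,1\n"), con)
close(con)

# Without fileEncoding, the BOM may end up glued to the first column name.
plain <- read.csv(tmp, stringsAsFactors = FALSE)
# With fileEncoding = "UTF-8-BOM", R drops the BOM while reading.
fixed <- read.csv(tmp, stringsAsFactors = FALSE, fileEncoding = "UTF-8-BOM")

print(names(plain))
print(names(fixed))  # "President" "Term"

unlink(tmp)
```

What names(plain) shows depends on your system and locale, which is why the broken header looks different on different machines.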

Knitr (YAML) error

When you knit the document, the following error may occur:

Error in yaml::yaml.load(string, ...) : 
  Scanner error: mapping values are not allowed in this context at line 2, column 23
Calls: <Anonymous> ... parse_yaml_front_matter -> yaml_load_utf8 -> <Anonymous>
Execution halted

Solution:

For Windows, you can replace the YAML by:

---
title: 'Tutorial (week 2) B: text mining'
output:
  html_document:
    toc: true
    toc_depth: '2'
---

For Linux, you can use:

---
title: "Tutorial (week 2) B: text mining"
output:
  html_notebook:
    toc: true
    toc_depth: 2
---

Or just select "Knit to HTML" from the drop-down triangle next to the Knit button.

Others

If you still run into problems, we recommend updating (uninstalling and reinstalling) R and RStudio to the newest versions, then running update.packages() in the R console.
Or you can search for your problem on Piazza or Google.

Feel free to leave a comment if you have any problems with these.
