cagdasyetkin/quiz2.txt

## quiz2.txt
1. Explain in your words what the unnest_tokens function does
It is a function from the tidytext library that restructures text into one token per row. It splits a text column (our input) into tokens such as words, so it does the tokenization for us.
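
To make the "one token per row" idea concrete, here is a minimal sketch in plain Python. It is not how tidytext implements it; the helper name `one_token_per_row` and the sample lines are made up for illustration, and the regular expression is only a rough stand-in for a real word tokenizer.

```python
import re

def one_token_per_row(rows, text_key="text", token_key="word"):
    """Hypothetical helper: return one row per word, keeping the other columns."""
    out = []
    for row in rows:
        # Lowercase the text and keep runs of letters, digits and apostrophes.
        for token in re.findall(r"[a-z0-9']+", row[text_key].lower()):
            # Copy every column except the original text column.
            new_row = {k: v for k, v in row.items() if k != text_key}
            new_row[token_key] = token
            out.append(new_row)
    return out

# Made-up input: one row per line of text.
lines = [
    {"line": 1, "text": "It was the best of times,"},
    {"line": 2, "text": "it was the worst of times."},
]

tokens = one_token_per_row(lines)
for row in tokens:
    print(row)
```

Each output row keeps the line number it came from, which is what makes later counting and joining by word straightforward.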

2. Explain in your words what the gutenbergr package does
Project Gutenberg, with the help of volunteers, digitizes books whose copyright has expired. The gutenbergr R package makes these books available to R users: we can download and process them with this library.

3. Explain in your words how sentiment lexicons work
They are like dictionaries that match words with a sentiment or emotion, for example by classifying them into positive, negative and neutral categories. Once we match the words in our text with a lexicon, we can start analyzing the frequencies. Even if we don't know the language in which the text was written, we can get an overall understanding, as long as the lexicon covers that language.

4. How does inner_join provide sentiment analysis functionality?
We match the words in our text with the sentiments in the lexicon. Many words in our text may be missing from the lexicon, and many words in the lexicon may never appear in our text. inner_join gives us the intersection: only the words that occur in both, each paired with its sentiment.
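
The matching step can be sketched in plain Python as follows. This only illustrates the inner-join idea and is not the dplyr implementation; the four-word lexicon and the word list are invented for the example, and real lexicons have thousands of entries.

```python
from collections import Counter

# Hypothetical tiny lexicon: word -> sentiment.
lexicon = {
    "happy": "positive",
    "good": "positive",
    "sad": "negative",
    "terrible": "negative",  # never appears in the text below
}

# Made-up tokenized text, one word per entry.
words = ["the", "good", "dog", "was", "happy", "not", "sad", "good"]

# Inner join: keep only words present in BOTH the text and the lexicon.
# "the", "dog", "was", "not" drop out (not in the lexicon);
# "terrible" drops out (not in the text).
matched = [(word, lexicon[word]) for word in words if word in lexicon]

# Count how often each sentiment occurs among the matched words.
counts = Counter(sentiment for _, sentiment in matched)

print(matched)
print(counts)
```

Because unmatched words simply disappear, the result depends heavily on how well the lexicon covers the vocabulary of the text.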

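5. Explain in your words what tf-idf does
It tells us how important a word is to one text within a collection of texts. It multiplies the term frequency (how often the word occurs in that text) by the inverse document frequency (which shrinks for words that occur in many of the texts), so words that are common in one text but rare in the others score highest.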
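
As a rough illustration, tf-idf can be computed by hand in a few lines of Python. This sketch assumes the common definition tf = count / total words in the document and idf = ln(number of documents / number of documents containing the word); the two tiny "documents" and their labels are invented for the example.

```python
import math
from collections import Counter

# Made-up documents, already tokenized into words.
docs = {
    "austen": "the letter the sister the ball".split(),
    "wells": "the martian the machine the machine".split(),
}

def tf_idf(docs):
    """Return {document: {word: tf-idf score}} for tokenized documents."""
    n_docs = len(docs)
    # Document frequency: in how many documents does each word appear?
    doc_freq = Counter()
    for words in docs.values():
        doc_freq.update(set(words))
    scores = {}
    for name, words in docs.items():
        counts = Counter(words)
        total = len(words)
        scores[name] = {
            word: (count / total) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        }
    return scores

scores = tf_idf(docs)
for name, word_scores in scores.items():
    # Show the highest-scoring words first.
    ranked = sorted(word_scores.items(), key=lambda item: item[1], reverse=True)
    print(name, ranked)
```

A word such as "the" that appears in every document gets an idf of ln(1) = 0, so its tf-idf is zero no matter how often it occurs, while words unique to one document rise to the top.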

6. Explain why you may want to do tokenization by bigram

7. Please install the following packages, if you have not already:

1. tidyverse
2. tidytext
3. gutenbergr

Pick two or more authors you are familiar with, download their texts using the gutenbergr package, and do a basic analysis of word frequencies and tf-idf.

# Everything before 7 is due by Monday. 7 can be delivered by Friday. But don't do that :) hand them all in by Monday.