Mindy Ng mindyng

## binary_search_tree.py
# A binary search tree (BST) is a binary tree where the value of each node is greater than or equal to the values in all the nodes in that node's left subtree and smaller than the values in all the nodes in that node's right subtree.

# Write a function that, efficiently with respect to time used, checks if a given binary search tree contains a given value.

# For example, for the following tree:

# n1 (Value: 1, Left: null, Right: null)
# n2 (Value: 2, Left: n1, Right: n3)
# n3 (Value: 3, Left: null, Right: null)
# Call to contains(n2, 3) should return True since a tree with root at n2 contains number 3.
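
One possible answer is sketched below. The `Node` class and its field names are assumptions made for illustration, since the original gist does not show how nodes are represented. The search walks down a single path from the root, so it takes time proportional to the height of the tree rather than to the number of nodes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    # Hypothetical node layout; the exercise only names Value, Left and Right.
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def contains(root: Optional[Node], value: int) -> bool:
    """Return True if the BST rooted at `root` holds `value`."""
    node = root
    while node is not None:
        if value == node.value:
            return True
        # Smaller values can only be in the left subtree, larger ones in the right.
        node = node.left if value < node.value else node.right
    return False


# The example tree from the problem statement.
n1 = Node(1)
n3 = Node(3)
n2 = Node(2, n1, n3)
print(contains(n2, 3))  # True
```

An iterative loop is used instead of recursion so that a very deep, unbalanced tree cannot hit Python's recursion limit.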

## app_session.sql
/*App usage data are kept in the following table:

TABLE sessions
  id INTEGER PRIMARY KEY,
  userId INTEGER NOT NULL,
  duration DECIMAL NOT NULL

Write a query that selects userId and average session duration for each user who has more than one session.*/
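
A query along these lines might satisfy the task: group by `userId`, keep only groups with more than one row, and average `duration`. The sketch below runs it against an in-memory SQLite database from Python. The sample rows and the `averageDuration` alias are assumptions for illustration and are not the original example case.

```python
import sqlite3

QUERY = """
SELECT userId, AVG(duration) AS averageDuration
FROM sessions
GROUP BY userId
HAVING COUNT(*) > 1;
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sessions ("
    " id INTEGER PRIMARY KEY,"
    " userId INTEGER NOT NULL,"
    " duration DECIMAL NOT NULL)"
)
# Hypothetical rows: user 1 has two sessions, user 2 has only one.
conn.executemany(
    "INSERT INTO sessions (id, userId, duration) VALUES (?, ?, ?)",
    [(1, 1, 10), (2, 2, 18), (3, 1, 14)],
)
rows = conn.execute(QUERY).fetchall()
print(rows)  # [(1, 12.0)]
```

`HAVING` is used rather than `WHERE` because the filter applies to the per-user session count, which only exists after grouping.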

-- Example case create statement:

## Capstone Project I - Data Wrangling
The first step after downloading and unzipping the dataset was to import all 8 separate .csv files and load each one into
its own pandas data frame. Each data frame has one review per row and 4 columns (from left to right): “Review Score”,
“Tail of Review URL”, “Review Title” and “Review Text”.

All reviews were then combined into one large data frame to make data wrangling easier, such as when applying functions to it.

The “Review Score” and “Review Text” columns were then separated out as their own variables, since these would be the main
objects handled by the Machine Learning algorithm.
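
The three steps above could look roughly like the sketch below. The in-memory CSV snippets stand in for the real files, and the assumption that the files have no header row is mine; neither the file names nor the file layout is given in the write-up.

```python
import io

import pandas as pd

COLUMNS = ["Review Score", "Tail of Review URL", "Review Title", "Review Text"]

# Hypothetical stand-ins for the 8 .csv files; real code would pass file paths.
csv_sources = [
    io.StringIO('5,/review/a1,"Great","Works well"\n1,/review/a2,"Bad","Broke fast"\n'),
    io.StringIO('3,/review/b1,"Okay","Does the job"\n'),
]

# One data frame per file, one review per row (assumes no header row).
frames = [pd.read_csv(src, header=None, names=COLUMNS) for src in csv_sources]

# Stack everything into a single data frame with a fresh 0..n-1 index.
reviews = pd.concat(frames, ignore_index=True)

# Pull out the two columns that are used later by the model.
scores = reviews["Review Score"]
texts = reviews["Review Text"]
print(reviews.shape)  # (3, 4)
```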


## gist:3b97e11092140310253cb56a619f1324
The problem: I want to label each product review as positive, negative, or neutral based on the words used in it. (Given a review, the goal is to predict the user’s attitude.)

According to Wikipedia, sentiment analysis (sometimes known as opinion mining or emotion AI) refers to the use of natural language processing, text analysis, computational linguistics, and biometrics to systematically identify, extract, quantify, and study affective states and subjective information.
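
To make the idea of word-based sentiment concrete, here is a minimal sketch that counts words from two tiny lexicons. The word lists and the `label_sentiment` function are hypothetical and are not the approach used in the project, which would more likely need a larger lexicon or a trained classifier.

```python
import re

# Hypothetical toy lexicons, for illustration only.
POSITIVE = {"great", "love", "excellent", "good", "perfect"}
NEGATIVE = {"bad", "broke", "terrible", "poor", "disappointed"}


def label_sentiment(review: str) -> str:
    """Label a review by comparing counts of positive and negative words."""
    words = re.findall(r"[a-z']+", review.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(label_sentiment("Great product, I love it"))              # positive
print(label_sentiment("Terrible quality, it broke in a week"))  # negative
print(label_sentiment("It arrived on Tuesday"))                 # neutral
```

A pure word count like this ignores negation ("not good") and sarcasm, which is one reason to treat it only as a baseline.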


My client would be amazon.com or some other e-commerce giant that would like to know which of their products are highly liked. Then the company can invest accordingly. Based on my analysis, the company would make certain products more available or recommend similar products in order to retain and grow their customer base.

For negative sentiments, the client could research the drivers behind them, especially in relation to competitors. If there is negative conversation, the client could reach out to those reviewers.

With sentime

## gist:62c76312d969da12997572a41314513a
I was at first unwilling to attend tonight's meetup because it seemed like it was more for people who were still exploring
Data Science as a career. And I was already committed to Data Science. So maybe not tonight.

However, the point of my going to these meetups was not just to hear some advice, but to meet people and network with others in the Data Science space. And I was able to meet Michelle Kelsey, who is part of IBM's Watson Cognitive team. This really excited me because I was first exposed to IBM's Watson Cognitive through Serena's Watson. She was able to feed her play data into Watson, which predicted the best move(s) for her next game.

The same cognitive solution was demonstrated with the CogniToys Dino, a toy that learns, remembers and responds through dialog with the
user. This got the best of me. Now I want my own CogniToys Dino. I ended up meeting Michelle face-to-face as planned, got her business card and took up her offer to meet her back in the city (SF) sometime in April to discuss more

## gist:35b418e11ff480815841080bbb7cf71d
Validating A/B Test Results for Yammer


Possible Causes of Increased Messages in Treatment Group
1. Metric may need to be redefined
2. Calculations may be incorrect
3. Users were not assigned randomly, which would make the test set-up faulty by introducing bias
4. A confounding factor that is hard to detect may be affecting the test results
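
As an illustration of how cause 3 might be checked, the sketch below compares the share of one user type in each test group. The attribute ("new" vs. "existing" users) and the counts are invented for the example and are not taken from the Yammer data. A large gap on an attribute fixed before the experiment would suggest the assignment was not random.

```python
from collections import Counter

# Hypothetical (group, user_type) assignments; not data from the Yammer test.
assignments = (
    [("control", "new")] * 40
    + [("control", "existing")] * 60
    + [("treatment", "new")] * 5
    + [("treatment", "existing")] * 95
)


def share_by_group(rows, user_type):
    """Return the fraction of users of `user_type` within each test group."""
    totals = Counter(group for group, _ in rows)
    hits = Counter(group for group, kind in rows if kind == user_type)
    return {group: hits[group] / totals[group] for group in totals}


shares = share_by_group(assignments, "new")
print(shares)  # {'control': 0.4, 'treatment': 0.05}
```

With properly randomized groups the two shares should be close; a formal follow-up could be a chi-square test on the same counts.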


## gist:baca4114da1af15a43fca0458e40c147
  Last night, I got to meet a huge variety of people interested in data analytics/science! I really enjoyed meeting both people who had recently whetted their appetite for data science and people seasoned in the field, for whom writing 1,000 lines of code is a breeze and who talk about R as if they were more fluent in it than in English. A neat surprise was the presence of Math and Economics professors, who added their input to the discussion on whether Peer Assisted Learning would help raise performance levels in STEM classes offered at Sacramento State University. This talk taught me that a result from an experiment can always be questioned, and that experiment design can be reassessed even after the experiment has been completed. Therefore, even after drawing conclusions from my own experiment, keeping an open mind for feedback would be advisable.

  The second talk was the one I had more interest in, since I have been wondering how to pick the model that best addresses my Capstone Project problem. I had a