@vicenteg
Created June 28, 2014 10:50
yelp data ingest
DROP TABLE IF EXISTS yelp_business;
DROP TABLE IF EXISTS yelp_reviews;
-- Business table. stars is FLOAT because Yelp business ratings come in
-- half-star steps (e.g. 3.5), which an INT column cannot store.
CREATE TABLE IF NOT EXISTS yelp_business
(city STRING, review_count INT, name STRING, neighborhoods ARRAY<STRING>,
type STRING, business_id STRING, full_address STRING, hours STRING,
state STRING, longitude FLOAT, stars FLOAT, latitude FLOAT, attributes STRING, open BOOLEAN, categories STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
-- Review table. The date column is backticked because DATE is a reserved
-- keyword in newer Hive releases.
CREATE TABLE IF NOT EXISTS yelp_reviews
(funny INT, useful INT, cool INT, user_id STRING, review_id STRING,
text STRING, business_id STRING, stars INT, `date` DATE, type STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
LOAD DATA LOCAL INPATH '/home/vagrant/yelp/yelp_academic_dataset_business.tsv' OVERWRITE INTO TABLE yelp_business;
LOAD DATA LOCAL INPATH '/home/vagrant/yelp/yelp_academic_dataset_review.tsv' OVERWRITE INTO TABLE yelp_reviews;
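-- Optional sanity check, not part of the original script: a minimal sketch,
-- assuming both LOAD DATA statements above succeeded. LOAD DATA only moves the
-- files into place; fields are parsed at query time, and a field that does not
-- match its column type generally reads back as NULL. Comparing the row count
-- with the NULL count of a typed column gives a quick hint of schema mismatches.
-- The aliases (businesses, null_stars, reviews, null_dates) are illustrative names.
SELECT COUNT(*) AS businesses, COUNT(*) - COUNT(stars) AS null_stars FROM yelp_business;
SELECT COUNT(*) AS reviews, COUNT(*) - COUNT(`date`) AS null_dates FROM yelp_reviews;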