@milimetric
Created October 4, 2013 19:57
Hive script to create an internal table and insert hourly data aggregated at the daily level.
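The script reads from a table named milimetric_pagecounts, which this gist does not define. As a rough sketch only, an hourly source table of that name might look like the following. The column names come from the query in the script; the hour partition, the column types, the EXTERNAL keyword, and the LOCATION path are all assumptions made for illustration.

```sql
-- Hypothetical definition of the hourly source table (not part of the original gist).
-- Assumed: one partition per hour, space-delimited files, and the path below.
CREATE EXTERNAL TABLE IF NOT EXISTS milimetric_pagecounts(
    project string,
    page    string,
    views   int,
    bytes   bigint
)
PARTITIONED BY (year int, month int, day int, hour int)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ' '
LOCATION '/user/milimetric/pagecounts'
;
```

With a layout like this, each hourly file would still have to be registered as a partition (for example with ALTER TABLE ... ADD PARTITION) before the aggregation query could see it.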
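-- Start from a clean slate so the script can be re-run.
DROP TABLE IF EXISTS milimetric_pagecounts_daily;

-- Internal (managed) table holding one row per project, page, and day.
-- views and bytes are bigint because sum() returns bigint and daily byte totals can exceed the int range.
CREATE TABLE IF NOT EXISTS milimetric_pagecounts_daily(
    project string,
    page    string,
    views   bigint,
    bytes   bigint,
    year    int,
    month   int,
    day     int
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ' '
LOCATION '/user/milimetric/pagecounts_daily'
;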

-- Collapse the hourly rows for 2013 into one row per project, page, and day.
INSERT INTO TABLE milimetric_pagecounts_daily
SELECT project,
       page,
       sum(views) AS views,
       sum(bytes) AS bytes,
       year,
       month,
       day
  FROM milimetric_pagecounts
 WHERE year = 2013
 GROUP BY project, page, year, month, day
;
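
Once the insert has finished, a quick sanity check can confirm that the daily table was populated. The query below is a minimal sketch, not part of the original gist; it uses only the table and column names defined in the script, and the aliases pages and total_views are introduced here for illustration.

```sql
-- Count rows and total views per day in the aggregated table.
SELECT year,
       month,
       day,
       count(*)   AS pages,
       sum(views) AS total_views
  FROM milimetric_pagecounts_daily
 WHERE year = 2013
 GROUP BY year, month, day
 ORDER BY year, month, day
;
```

Since the aggregation only sums the hourly rows, total_views for a given day should match sum(views) over the same day in milimetric_pagecounts.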