Skip to content

Instantly share code, notes, and snippets.

@nikvas0
Created March 14, 2019 18:13
Show Gist options
  • Save nikvas0/c60ecb9c37d4a61b4cd924090ff6a806 to your computer and use it in GitHub Desktop.
Save nikvas0/c60ecb9c37d4a61b4cd924090ff6a806 to your computer and use it in GitHub Desktop.
##############################################################################################################################
# insert into ... select * from datasets.hits_v1
###########################################################
no index: 0 rows in set. Elapsed: 20.463 sec. Processed 8.87 million rows, 8.46 GB (433.65 thousand rows/s., 413.45 MB/s.)
3 x ngrambf(3, 512, 2, 0) (URLDomain, SearchPhrase, Title): 0 rows in set. Elapsed: 42.752 sec. Processed 8.87 million rows, 8.46 GB (207.57 thousand rows/s., 197.90 MB/s.)
3 x ngrambf(4, 512, 1, 0) (URLDomain, SearchPhrase, Title): 0 rows in set. Elapsed: 39.391 sec. Processed 8.87 million rows, 8.46 GB (225.28 thousand rows/s., 214.78 MB/s.)
1 x ngrambf(4, 512, 1, 0) (Title): 0 rows in set. Elapsed: 35.820 sec. Processed 8.87 million rows, 8.46 GB (247.74 thousand rows/s., 236.19 MB/s.)
1 x ngrambf(4, 512, 1, 0) (URLDomain): 0 rows in set. Elapsed: 25.047 sec. Processed 8.87 million rows, 8.46 GB (354.29 thousand rows/s., 337.79 MB/s.)
1 x tokenbf(512, 1, 0) (Title): 0 rows in set. Elapsed: 27.339 sec. Processed 8.87 million rows, 8.46 GB (324.59 thousand rows/s., 309.47 MB/s.)
1 x tokenbf(512, 1, 0) (URLDomain): 0 rows in set. Elapsed: 22.537 sec. Processed 8.87 million rows, 8.46 GB (393.74 thousand rows/s., 375.40 MB/s.)
##############################################################################################################################
ThinkPad-E570 :) show create table hits_3gram
SHOW CREATE TABLE hits_3gram
┌─statement────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ CREATE TABLE default.hits_3gram ( WatchID UInt64, JavaEnable UInt8, Title String, GoodEvent Int16, EventTime DateTime, E
ventDate Date, CounterID UInt32, ClientIP UInt32, ClientIP6 FixedString(16), RegionID UInt32, UserID UInt64, CounterClass
Int8, OS UInt8, UserAgent UInt8, URL String, Referer String, URLDomain String, RefererDomain String, Refresh UInt8, Is
Robot UInt8, RefererCategories Array(UInt16), URLCategories Array(UInt16), URLRegions Array(UInt32), RefererRegions Array(U
Int32), ResolutionWidth UInt16, ResolutionHeight UInt16, ResolutionDepth UInt8, FlashMajor UInt8, FlashMinor UInt8, Flash
Minor2 String, NetMajor UInt8, NetMinor UInt8, UserAgentMajor UInt16, UserAgentMinor FixedString(2), CookieEnable UInt8,
JavascriptEnable UInt8, IsMobile UInt8, MobilePhone UInt8, MobilePhoneModel String, Params String, IPNetworkID UInt32, Tr
aficSourceID Int8, SearchEngineID UInt16, SearchPhrase String, AdvEngineID UInt8, IsArtifical UInt8, WindowClientWidth UIn
t16, WindowClientHeight UInt16, ClientTimeZone Int16, ClientEventTime DateTime, SilverlightVersion1 UInt8, SilverlightVers
ion2 UInt8, SilverlightVersion3 UInt32, SilverlightVersion4 UInt16, PageCharset String, CodeVersion UInt32, IsLink UInt8,
IsDownload UInt8, IsNotBounce UInt8, FUniqID UInt64, HID UInt32, IsOldCounter UInt8, IsEvent UInt8, IsParameter UInt8,
DontCountHits UInt8, WithHash UInt8, HitColor FixedString(1), UTCEventTime DateTime, Age UInt8, Sex UInt8, Income UInt8,
Interests UInt16, Robotness UInt8, GeneralInterests Array(UInt16), RemoteIP UInt32, RemoteIP6 FixedString(16), WindowName
Int32, OpenerName Int32, HistoryLength Int16, BrowserLanguage FixedString(2), BrowserCountry FixedString(2), SocialNetwor
k String, SocialAction String, HTTPError UInt16, SendTiming Int32, DNSTiming Int32, ConnectTiming Int32, ResponseStartTim
ing Int32, ResponseEndTiming Int32, FetchTiming Int32, RedirectTiming Int32, DOMInteractiveTiming Int32, DOMContentLoadedT
iming Int32, DOMCompleteTiming Int32, LoadEventStartTiming Int32, LoadEventEndTiming Int32, NSToDOMContentLoadedTiming Int3
2, FirstPaintTiming Int32, RedirectCount Int8, SocialSourceNetworkID UInt8, SocialSourcePage String, ParamPrice Int64, Pa
ramOrderID String, ParamCurrency FixedString(3), ParamCurrencyID UInt16, GoalsReached Array(UInt32), OpenstatServiceName St
ring, OpenstatCampaignID String, OpenstatAdID String, OpenstatSourceID String, UTMSource String, UTMMedium String, UTMCam
paign String, UTMContent String, UTMTerm String, FromTag String, HasGCLID UInt8, RefererHash UInt64, URLHash UInt64, CLI
D UInt32, YCLID UInt64, ShareService String, ShareURL String, ShareTitle String, `ParsedParams.Key1` Array(String), `Pars
edParams.Key2` Array(String), `ParsedParams.Key3` Array(String), `ParsedParams.Key4` Array(String), `ParsedParams.Key5` Arra
y(String), `ParsedParams.ValueDouble` Array(Float64), IslandID FixedString(16), RequestNum UInt32, RequestTry UInt8, INDEX
domain URLDomain TYPE ngrambf(3, 512, 2, 0) GRANULARITY 1, INDEX sp SearchPhrase TYPE ngrambf(3, 512, 2, 0) GRANULARITY 1, I
NDEX t Title TYPE ngrambf(3, 512, 2, 0) GRANULARITY 1) ENGINE = MergeTree() PARTITION BY toYYYYMM(EventDate) ORDER BY (CounterI
D, EventDate, intHash32(UserID)) SAMPLE BY intHash32(UserID) SETTINGS index_granularity = 8192 │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
1 rows in set. Elapsed: 0.008 sec.
ThinkPad-E570 :) show create table hits_no_index
SHOW CREATE TABLE hits_no_index
┌─statement────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ CREATE TABLE default.hits_no_index ( WatchID UInt64, JavaEnable UInt8, Title String, GoodEvent Int16, EventTime DateTime,
EventDate Date, CounterID UInt32, ClientIP UInt32, ClientIP6 FixedString(16), RegionID UInt32, UserID UInt64, CounterCl
ass Int8, OS UInt8, UserAgent UInt8, URL String, Referer String, URLDomain String, RefererDomain String, Refresh UInt8,
IsRobot UInt8, RefererCategories Array(UInt16), URLCategories Array(UInt16), URLRegions Array(UInt32), RefererRegions Arra
y(UInt32), ResolutionWidth UInt16, ResolutionHeight UInt16, ResolutionDepth UInt8, FlashMajor UInt8, FlashMinor UInt8, Fl
ashMinor2 String, NetMajor UInt8, NetMinor UInt8, UserAgentMajor UInt16, UserAgentMinor FixedString(2), CookieEnable UInt8
, JavascriptEnable UInt8, IsMobile UInt8, MobilePhone UInt8, MobilePhoneModel String, Params String, IPNetworkID UInt32,
TraficSourceID Int8, SearchEngineID UInt16, SearchPhrase String, AdvEngineID UInt8, IsArtifical UInt8, WindowClientWidth
UInt16, WindowClientHeight UInt16, ClientTimeZone Int16, ClientEventTime DateTime, SilverlightVersion1 UInt8, SilverlightV
ersion2 UInt8, SilverlightVersion3 UInt32, SilverlightVersion4 UInt16, PageCharset String, CodeVersion UInt32, IsLink UInt
8, IsDownload UInt8, IsNotBounce UInt8, FUniqID UInt64, HID UInt32, IsOldCounter UInt8, IsEvent UInt8, IsParameter UInt8
, DontCountHits UInt8, WithHash UInt8, HitColor FixedString(1), UTCEventTime DateTime, Age UInt8, Sex UInt8, Income UInt
8, Interests UInt16, Robotness UInt8, GeneralInterests Array(UInt16), RemoteIP UInt32, RemoteIP6 FixedString(16), WindowN
ame Int32, OpenerName Int32, HistoryLength Int16, BrowserLanguage FixedString(2), BrowserCountry FixedString(2), SocialNet
work String, SocialAction String, HTTPError UInt16, SendTiming Int32, DNSTiming Int32, ConnectTiming Int32, ResponseStart
Timing Int32, ResponseEndTiming Int32, FetchTiming Int32, RedirectTiming Int32, DOMInteractiveTiming Int32, DOMContentLoad
edTiming Int32, DOMCompleteTiming Int32, LoadEventStartTiming Int32, LoadEventEndTiming Int32, NSToDOMContentLoadedTiming I
nt32, FirstPaintTiming Int32, RedirectCount Int8, SocialSourceNetworkID UInt8, SocialSourcePage String, ParamPrice Int64,
ParamOrderID String, ParamCurrency FixedString(3), ParamCurrencyID UInt16, GoalsReached Array(UInt32), OpenstatServiceName
String, OpenstatCampaignID String, OpenstatAdID String, OpenstatSourceID String, UTMSource String, UTMMedium String, UTM
Campaign String, UTMContent String, UTMTerm String, FromTag String, HasGCLID UInt8, RefererHash UInt64, URLHash UInt64,
CLID UInt32, YCLID UInt64, ShareService String, ShareURL String, ShareTitle String, `ParsedParams.Key1` Array(String), `P
arsedParams.Key2` Array(String), `ParsedParams.Key3` Array(String), `ParsedParams.Key4` Array(String), `ParsedParams.Key5` A
rray(String), `ParsedParams.ValueDouble` Array(Float64), IslandID FixedString(16), RequestNum UInt32, RequestTry UInt8) ENG
INE = MergeTree() PARTITION BY toYYYYMM(EventDate) ORDER BY (CounterID, EventDate, intHash32(UserID)) SAMPLE BY intHash32(UserI
D) SETTINGS index_granularity = 8192 │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
1 rows in set. Elapsed: 0.356 sec.
ThinkPad-E570 :) truncate table hits_no_index
TRUNCATE TABLE hits_no_index
Ok.
0 rows in set. Elapsed: 0.158 sec.
ThinkPad-E570 :) truncate table hits_3gram
TRUNCATE TABLE hits_3gram
Ok.
0 rows in set. Elapsed: 0.366 sec.
ThinkPad-E570 :) insert into hits_3gram select * from datasets.hits_v1
INSERT INTO hits_3gram SELECT *
FROM datasets.hits_v1
Ok.
0 rows in set. Elapsed: 42.752 sec. Processed 8.87 million rows, 8.46 GB (207.57 thousand rows/s., 197.90 MB/s.)
********** - 3 ngrambf(3, 512, 2, 0)
ThinkPad-E570 :) truncate table hits_no_index
TRUNCATE TABLE hits_no_index
Ok.
0 rows in set. Elapsed: 0.158 sec.
ThinkPad-E570 :) truncate table hits_3gram
TRUNCATE TABLE hits_3gram
Ok.
0 rows in set. Elapsed: 0.366 sec.
ThinkPad-E570 :) insert into hits_no_index select * from datasets.hits_v1
INSERT INTO hits_no_index SELECT *
FROM datasets.hits_v1
Ok.
0 rows in set. Elapsed: 20.463 sec. Processed 8.87 million rows, 8.46 GB (433.65 thousand rows/s., 413.45 MB/s.)
********** - no index
ThinkPad-E570 :) create table hits_4gram as datasets.hits_v1
CREATE TABLE hits_4gram AS datasets.hits_v1
Ok.
0 rows in set. Elapsed: 0.436 sec.
ThinkPad-E570 :) alter table hits_4gram add INDEX domain URLDomain TYPE ngrambf(4, 512, 1, 0) GRANULARITY 1, add INDEX sp Searc
hPhrase TYPE ngrambf(4, 512, 1, 0) GRANULARITY 1, add INDEX t Title TYPE ngrambf(4, 512, 1, 0) GRANULARITY 1
ALTER TABLE hits_4gram
ADD INDEX
domain URLDomain TYPE ngrambf(4, 512, 1, 0) GRANULARITY 1,
ADD INDEX
sp SearchPhrase TYPE ngrambf(4, 512, 1, 0) GRANULARITY 1,
ADD INDEX
t Title TYPE ngrambf(4, 512, 1, 0) GRANULARITY 1
Ok.
0 rows in set. Elapsed: 0.454 sec.
ThinkPad-E570 :) insert into hits_4gram select * from datasets.hits_v1
INSERT INTO hits_4gram SELECT *
FROM datasets.hits_v1
Ok.
0 rows in set. Elapsed: 39.391 sec. Processed 8.87 million rows, 8.46 GB (225.28 thousand rows/s., 214.78 MB/s.)
ThinkPad-E570 :) truncate table hits_4gram
TRUNCATE TABLE hits_4gram
Ok.
0 rows in set. Elapsed: 0.181 sec.
ThinkPad-E570 :) alter table hits_4gram drop index sp, drop index t
ALTER TABLE hits_4gram
DROP INDEX sp,
DROP INDEX t
Ok.
0 rows in set. Elapsed: 0.259 sec.
ThinkPad-E570 :) insert into hits_4gram select * from datasets.hits_v1
INSERT INTO hits_4gram SELECT *
FROM datasets.hits_v1
Ok.
0 rows in set. Elapsed: 25.047 sec. Processed 8.87 million rows, 8.46 GB (354.29 thousand rows/s., 337.79 MB/s.)
ThinkPad-E570 :) alter table hits_4gram drop index domain
ALTER TABLE hits_4gram
DROP INDEX domain
Ok.
0 rows in set. Elapsed: 0.553 sec.
ThinkPad-E570 :) alter table hits_4gram ADD INDEX t Title TYPE tokenbf(512, 1, 0) GRANULARITY 1
ALTER TABLE hits_4gram
ADD INDEX
t Title TYPE tokenbf(512, 1, 0) GRANULARITY 1
Ok.
0 rows in set. Elapsed: 0.489 sec.
ThinkPad-E570 :) insert into hits_4gram select * from datasets.hits_v1
INSERT INTO hits_4gram SELECT *
FROM datasets.hits_v1
Ok.
0 rows in set. Elapsed: 27.339 sec. Processed 8.87 million rows, 8.46 GB (324.59 thousand rows/s., 309.47 MB/s.)
ThinkPad-E570 :) alter table hits_4gram drop index t, ADD INDEX d URLDomain TYPE tokenbf(512, 1, 0) GRANULARITY 1
ALTER TABLE hits_4gram
DROP INDEX t,
ADD INDEX
d URLDomain TYPE tokenbf(512, 1, 0) GRANULARITY 1
Ok.
0 rows in set. Elapsed: 0.280 sec.
ThinkPad-E570 :) truncate table hits_4gram
TRUNCATE TABLE hits_4gram
Ok.
0 rows in set. Elapsed: 0.307 sec.
0 rows in set. Elapsed: 0.307 sec.
ThinkPad-E570 :) insert into hits_4gram select * from datasets.hits_v1
INSERT INTO hits_4gram SELECT *
FROM datasets.hits_v1
Ok.
0 rows in set. Elapsed: 22.537 sec. Processed 8.87 million rows, 8.46 GB (393.74 thousand rows/s., 375.40 MB/s.)
ThinkPad-E570 :) alter table hits_4gram drop index d, ADD INDEX t Title TYPE ngrambf(4, 512, 1, 0) GRANULARITY 1
ALTER TABLE hits_4gram
DROP INDEX d,
ADD INDEX
t Title TYPE ngrambf(4, 512, 1, 0) GRANULARITY 1
Ok.
0 rows in set. Elapsed: 0.529 sec.
ThinkPad-E570 :) truncate table hits_4gram
TRUNCATE TABLE hits_4gram
Ok.
0 rows in set. Elapsed: 0.183 sec.
ThinkPad-E570 :) insert into hits_4gram select * from datasets.hits_v1
INSERT INTO hits_4gram SELECT *
FROM datasets.hits_v1
Ok.
0 rows in set. Elapsed: 35.820 sec. Processed 8.87 million rows, 8.46 GB (247.74 thousand rows/s., 236.19 MB/s.)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment