Thanks, TakeZoe.
I need to write a Scala program that combines data from two different files.
The file starting with assets_ contains asset impressions; each line represents one asset impression. The file starting with ad-events_ contains various kinds of ad events; each line represents one ad event. We are interested in only two kinds of ad events, 'view' and 'click'; all other ad events should be ignored. The type of an ad event is given by the value of the field 'e', for example "e":"view" and "e":"click" in the JSON messages.
Each line also has another id called 'page view id'. This id is specified by json key 'pv'. You can find json key values similar to "pv":"7963ee21-0d09-4924-b315-ced4adad425f" in both the files.
The aim is to join the data in the two files using "pv". The program needs to parse both files and combine their data to report, for each page view id, how many asset impressions, views and clicks are present across the two files. The input files are gzip-compressed (.gz).
If possible, could you help me with this task?
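To make the task concrete, here is a minimal sketch of one way it might be approached, not a definitive solution. It assumes Scala 2.13 or later (for `scala.util.Using`), that every line is a flat JSON object, and that "pv" and "e" are plain string fields, so a regex is used in place of a real JSON parser. The names `PageViewJoin`, `Counts`, `combine`, `gzLines` and `run` are made up for illustration.

```scala
import java.io.FileInputStream
import java.util.zip.GZIPInputStream
import scala.io.Source
import scala.util.Using

object PageViewJoin {
  // Counters kept for each page view id (hypothetical name).
  final case class Counts(assets: Int = 0, views: Int = 0, clicks: Int = 0)

  // Simplification: extract a top-level string field with a regex.
  // A real JSON library would be more robust for nested or escaped data.
  private def field(name: String) = ("\"" + name + "\"\\s*:\\s*\"([^\"]*)\"").r
  private val Pv = field("pv")
  private val Ev = field("e")

  def pvOf(line: String): Option[String] = Pv.findFirstMatchIn(line).map(_.group(1))
  def eventOf(line: String): Option[String] = Ev.findFirstMatchIn(line).map(_.group(1))

  // Count asset impressions, views and clicks per "pv".
  def combine(assetLines: Iterator[String], eventLines: Iterator[String]): Map[String, Counts] = {
    // Every asset line with a "pv" counts as one impression.
    val withAssets = assetLines.flatMap(pvOf).foldLeft(Map.empty[String, Counts]) { (acc, pv) =>
      val c = acc.getOrElse(pv, Counts())
      acc.updated(pv, c.copy(assets = c.assets + 1))
    }
    // Only 'view' and 'click' events are counted; everything else is ignored.
    eventLines.foldLeft(withAssets) { (acc, line) =>
      (pvOf(line), eventOf(line)) match {
        case (Some(pv), Some("view")) =>
          val c = acc.getOrElse(pv, Counts())
          acc.updated(pv, c.copy(views = c.views + 1))
        case (Some(pv), Some("click")) =>
          val c = acc.getOrElse(pv, Counts())
          acc.updated(pv, c.copy(clicks = c.clicks + 1))
        case _ => acc
      }
    }
  }

  // Open a .gz file, hand its lines to f, and close the file afterwards.
  def gzLines[A](path: String)(f: Iterator[String] => A): A =
    Using.resource(
      Source.fromInputStream(new GZIPInputStream(new FileInputStream(path)), "UTF-8")
    )(src => f(src.getLines()))

  // Paths are supplied by the caller; no file names are assumed here.
  def run(assetsPath: String, eventsPath: String): Map[String, Counts] =
    gzLines(assetsPath) { assets =>
      gzLines(eventsPath) { events => combine(assets, events) }
    }
}
```

This sketch keeps every page view id seen in either file. If only ids that occur in both files are wanted, the result could be filtered afterwards, for example by keeping entries with `assets > 0` and at least one view or click.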