Kepler EMEA infectious

# Load 30 days of ip files into RDW
parameter :today do
2.day.ago.to_date
end
execute do
files = NOP(:ARC).join("appnexus/**/ips_*.txt.gz")
sources = files.reject do |file|
require 'ipaddr'
parameter :today do
2.day.ago.to_date
end
helper :new_ips do
NOP(:RDW).from(:"logs__#{today.ymd}").select_map { distinct(:ip) }
end
require 'ipaddr'
parameter :today do
2.day.ago.to_date
end
helper :logs_union do
((today - 30.days)..today).map do |date|
"""
SELECT ip
local ad = tonumber(ARGV[1])
local BL = 1
local weight = 0
-- bid price constant will be 50p : 500000 /1000 = 500
-- give up bid price constant will be 10p : 100000/1000 = 100
local give_up_bid = 100
-- Check if there is a weight
if tonumber(redis.call('zscore', 'ad:weights', ad)) == nil then
return nil
end
local XB = 100000 -- give up bid threshold = 10p
local BL = 1
local ad = tonumber(ARGV[1])
local weight = 0
-- Check if there is a weight
if tonumber(redis.call('zscore', 'ad:weights', ad)) == nil then
return nil
end
-- This is PSEUDOCODE for calculating the bid price according to events, e.g. clicks/conversions or whatever IM deems an event (except impressions/auctions)
-- This code has NOT been tested and should be treated as a guideline
-- Using this code for dev or prod is at the risk of whoever is using/deploying it :)
-- Ant!
--- Equation in the documentation
--Bde = 1000 x Event_rate x deal_goal x lookback_window_price
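As a worked illustration of the equation above, here is a minimal Ruby sketch. The method name `event_bid`, the keyword argument names, and the sample values are assumptions made for this example. The source does not state the units of each factor, so the numbers only show the arithmetic.

```ruby
# Hypothetical helper: name, arguments and sample values are illustrative
# assumptions, not taken from the original script.
# Computes Bde = 1000 x Event_rate x deal_goal x lookback_window_price.
def event_bid(event_rate:, deal_goal:, lookback_window_price:)
  1000 * event_rate * deal_goal * lookback_window_price
end

# Made-up inputs: 0.2% event rate, deal goal of 5.0, lookback price of 1.0.
bid = event_bid(event_rate: 0.002, deal_goal: 5.0, lookback_window_price: 1.0)
puts bid
```

The factor of 1000 is kept exactly as written in the equation. Whether it converts a per-impression value to a per-thousand value is not confirmed by the source.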
ETL question:
How do I transform an extract so that each input row produces multiple output rows? For example, each extracted row below contains an array, and I want each array member to become a separate row.
E.g. here is one row of an extract. Its 'splits' array has two members (each a hash).
{:name=>"segment_feed", :hour=>"2013_02_05_19", :timestamp=>"20130205204446",
:splits=>[{"part"=>"0", "status"=>"new", "checksum"=>"3980ec0b30f78e15782df5dc29ec89e4"},
{"part"=>"1", "status"=>"new", "checksum"=>"fec249e666448b236ea6a4367563ccd6"}]}
I want the following two rows in the load (as many rows as there are split parts):
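The target rows are cut off in this preview, so their exact shape is not known. As a rough sketch of the general technique, plain Ruby's `flat_map` can fan one input row out into one output row per split. The helper name `expand_splits` and the choice to copy the parent fields onto every output row are assumptions, and this does not use the ETL framework's own DSL.

```ruby
# One extracted row, copied from the example above.
row = {
  :name => "segment_feed", :hour => "2013_02_05_19", :timestamp => "20130205204446",
  :splits => [
    { "part" => "0", "status" => "new", "checksum" => "3980ec0b30f78e15782df5dc29ec89e4" },
    { "part" => "1", "status" => "new", "checksum" => "fec249e666448b236ea6a4367563ccd6" }
  ]
}

# Hypothetical helper (not part of the ETL framework): emit one output row
# per member of :splits, copying the parent fields onto each output row.
def expand_splits(rows)
  rows.flat_map do |input|
    parent = input.reject { |key, _| key == :splits }
    input[:splits].map do |split|
      parent.merge(part: split["part"], status: split["status"], checksum: split["checksum"])
    end
  end
end

output = expand_splits([row])
output.each { |r| p r }
```

`flat_map` is used rather than `map` so the result is a single flat list of rows instead of one nested array per input row.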
# Fetch a list of siphons for download from AppNexus
extract :DWAPI do
get 'Siphon'
limit 2
resolve do |response|
rows = []
response.each do |input|
name = input[:name]
Problem:
Drastically fewer segment rows since 22nd May.
Some facts:
1. The raw AppNexus segment files we download from the AppNexus data siphon are still about the same size as before.
2. The extracted segment files are now much smaller (1/1000th of their former size).
3. Re-running the extract on older files also reduces the extracted file size.
=> The extract is causing the problem.
(https://github.com/infectious/etl/blob/master/etl/apn/data_siphon/segments/extract.rb)
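One way to put numbers on facts 1 to 3 is to compare row counts between a raw file and its extracted counterpart. The sketch below assumes both files are gzip-compressed, line-oriented text, which the source does not confirm for segment files. The helper `gz_line_count` and the temporary demo files are made up for illustration.

```ruby
require 'zlib'
require 'tempfile'

# Hypothetical helper: count the lines in a gzip-compressed text file.
def gz_line_count(path)
  count = 0
  Zlib::GzipReader.open(path) do |gz|
    gz.each_line { count += 1 }
  end
  count
end

# Demo on two temporary files standing in for a raw and an extracted file.
raw = Tempfile.new(['raw', '.gz'])
extracted = Tempfile.new(['extracted', '.gz'])
Zlib::GzipWriter.open(raw.path) { |gz| 1000.times { |i| gz.puts "row #{i}" } }
Zlib::GzipWriter.open(extracted.path) { |gz| gz.puts "row 0" }

# Fraction of raw rows that survive the extract.
ratio = gz_line_count(extracted.path).to_f / gz_line_count(raw.path)
puts ratio
```

Running the same comparison on a file from before and after 22nd May would show whether the drop happens inside the extract step.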