@neilpw
Created March 27, 2019 01:31
Load Activity Logs with Keyset Pagination
# This script uses keyset pagination to load all of a Copper account's
# activities without running afoul of the performance degradation the Activities
# API experiences at even relatively low pagination offsets.
#
# The approach, loosely, is as follows:
#
# The API sorts activities by activity date, in descending order. We'll start
# by requesting the first page of activities, i.e. the most recent ones.
#
# For the next request, instead of increasing page_number, we'll set
# 'maximum_activity_date' (which was unspecified in the first request) to the
# *minimum* activity date of the activities returned in the first request.
#
# We'll continue to do this until all records have been loaded.
require 'net/http'
require 'uri'
require 'json'
API_URL = 'https://api.prosperworks.com/developer_api/v1/activities/search'.freeze
API_URI = URI.parse(API_URL)
HEADERS = {
  'Content-Type' => 'application/json',
  'X-PW-AccessToken' => 'abcd1234',
  'X-PW-Application' => 'developer_api',
  'X-PW-UserEmail' => 'some@email.test'
}.freeze
PAGE_SIZE = 10
# Initially, 'maximum_activity_date' will be null, which will give us the
# most recent activities.
#
maximum_activity_date = nil
loop do
  request_body = {
    'maximum_activity_date' => maximum_activity_date,
    'page_size' => PAGE_SIZE
  }

  response = Net::HTTP.post(API_URI, request_body.to_json, HEADERS)

  # Stop on a non-2xx response (bad credentials, rate limiting, etc.) rather
  # than trying to treat an error body as a page of activities.
  #
  unless response.is_a?(Net::HTTPSuccess)
    raise "Activities request failed: #{response.code} #{response.message}"
  end

  activities = JSON.parse(response.body)

  # An empty response means we've reached the end of the results.
  #
  break if activities.empty?

  # This is where you do something with the activities.
  #
  # Keep in mind that, when loading an account's full activity stream (as
  # opposed to an individual record's activity stream), there may be tens or
  # hundreds of thousands of results. If possible, process each page of results
  # as it arrives, rather than e.g. accumulating every activity into a single
  # array and handling them all at once.
  #
  # Once that's done, we move along to the next iteration of the loop.

  # Extract the activity date from each activity in the response.
  #
  activity_dates = activities.map { |a| a['activity_date'] }

  # Capture the lowest (i.e. the earliest) activity date. We'll use this as the
  # maximum_activity_date for the next request.
  #
  maximum_activity_date = activity_dates.min
end
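
The script does not say whether 'maximum_activity_date' is an inclusive or an exclusive bound. If it is inclusive, the activity carrying the minimum date would be returned again on the next request, and the loop above would never see an empty page. The following is a minimal sketch of one way to guard against that, not part of the original script: it wraps the same keyset loop in a lazy Enumerator, so each page can be processed as it arrives, and it drops activities already seen at the boundary date. The helper name `each_activity_page`, the `fetch_page` callable, and the assumption that every activity carries a unique 'id' field are all illustrative.

```ruby
require 'set'

# Hypothetical helper (not part of the original script).
#
# fetch_page is any callable that takes a maximum_activity_date (or nil) and
# returns an Array of activity Hashes, newest first. Each Hash is assumed to
# have 'id' and 'activity_date' keys.
def each_activity_page(fetch_page)
  Enumerator.new do |pages|
    maximum_activity_date = nil

    # IDs of activities already yielded whose date equals the current
    # maximum_activity_date. Only these can reappear on the next page.
    boundary_ids = Set.new

    loop do
      activities = fetch_page.call(maximum_activity_date)

      # Drop boundary activities that were already yielded. With an exclusive
      # bound this filters nothing; with an inclusive bound it removes repeats.
      fresh = activities.reject { |a| boundary_ids.include?(a['id']) }

      # No new activities means we've reached the end of the results.
      break if fresh.empty?

      pages << fresh

      next_max = activities.map { |a| a['activity_date'] }.min

      # Start a new boundary set whenever the cursor moves to an earlier date.
      boundary_ids = Set.new if next_max != maximum_activity_date
      activities.each do |a|
        boundary_ids << a['id'] if a['activity_date'] == next_max
      end

      maximum_activity_date = next_max
    end
  end
end
```

To use it with the script above, `fetch_page` could be a lambda that performs the `Net::HTTP.post` call and returns the parsed array; `each_activity_page(fetch_page).each { |page| ... }` then handles one page at a time. One limitation remains whichever way the bound works: if PAGE_SIZE or more activities share a single activity date, a cursor based on the date alone cannot step past them, so a larger page size (or a tie-breaking field, if the API offers one) would be needed.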