Skip to content

Instantly share code, notes, and snippets.

@clemens
Last active January 4, 2021 13:47
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save clemens/ef43d53cc6a7e8f34357dc04f0639092 to your computer and use it in GitHub Desktop.
Save clemens/ef43d53cc6a7e8f34357dc04f0639092 to your computer and use it in GitHub Desktop.

Here are a few things that I found odd when working extensively with the Pipedrive API.

Brief description of use case

For context, here's a brief description of our use case:

  • Our BI team runs various reports across the company. For some reports, we need certain data from Pipedrive.
  • Therefore we've built a tool that on a regular basis fetches data from the Pipedrive API and puts it into a Postgres database (which Tableau can easily interface with).
  • Part of the data fetching involves dynamically building the schema for the tables, so that we properly handle custom fields added to Pipedrive.

Here, we've run into problems because the Pipedrive API seems to not be entirely consistent when it comes to field types and also have other issues.

Field type-related issues

Inconsistent handling of enum fields

enum fields seem to generally have integer values and string labels. Notable exceptions in the activities resource:

  • done has boolean IDs (true and false)
  • busy_flag – same thing
  • type has string values (e.g. "call", "meeting")

Consequently, this needs manual remapping when scripting around field types.

Inconsistent handling of date/time fields

date and time fields are handled inconsistently. The most obvious examples are in the activities resource:

  • due_date says it's a date and it actually is (YYYY-MM-DD)
  • add_time says it's a time, but it really is a datetime (YYYY-MM-DD HH:MM:SS)
  • due_time says it's a time and it is, if you consider "time" to be a "time of day" (HH:MM) rather than a more database-like definition of time (= datetime)
  • duration says it's a time but apparently it goes beyond values of 23:59 (e.g. we have an activity with a value of 48:00 here, which presumably means 2 days)

I think this should be handled consistenly: If there's a type date, then it shouldn't sometimes be used as a date and sometimes as a datetime; if there's a type time, it's fine that it's treated as a "time of day", but then it shouldn't be used to represent a duration that can go beyond valid ranges for a time of day (= 00:00 to 23:59).

Inconsistent handling of blank values

Usually, blank values seem to be set as null which is what I'd expect. However, some fields in activities seem to contain an empty string instead (e.g. "duration": "", "lead_title": "", "marked_as_done_time": "" etc.).

Missing boolean type

I'm missing a distinct boolean field type – both for custom fields as well as Pipedrive's own fields (e.g. done and busy_flag in the activities resource). I think true/false concepts are sufficiently common to deserve their own data type.

Field Option IDs

IDs of field options are usually integers – except for status of deals where they seem to be strings.

Other field-related oddities

Some other minor oddities that I came across:

  • Several fields that are called *_id counterintuitively return full objects (including the ID) instead of only the ID (as the name would suggest)
  • lack of naming conventions, e.g.
    • user_id given as a parameter references entries of the users resource, but for organizations it's org_id
    • a reference to the users resource has field type user (e.g. creator_user_id in activities), but one to persons has type people (e.g. person_id in activities)
  • Pipedrive's own numerical fields seem to be int, but our custom fields are generally seem to be double
  • All fields have IDs except for monetary fields' currency

Other issues

Lack of granularity when filtering dates/times

It's a bummer that the "time ago" filters stop at 1 week ago and I can't thus filter for values > 6 hours ago, < 30 minutes or similar.

Pagination on "active" resources

There seems to be no "stable" way to handle pagination on large, "active" resources. Example:

  • I want to fetch multiple pages of activities.
  • I fetch page 1 from the API.
  • While fetching page 1, a new activity gets added.
  • When I fetch page 2, I get a duplicate activity that I've already fetched at page 1 and the newly added activity never gets fetched.
  • The problem only becomes worse as the result set and/or number of active users increases.

Of course, it's possible to work around this using a filter, i.e. before running the paginated requests, create a filter with update_time = the-start-time-of-the-process and then delete the filter once it's done. But this is quite cumbersome => there should be a better way (e.g. something that functions like a cursor in databases).

Cumbersome to get Deal Flow

To get data regarding deal flow, we need to fetch the flow for every single deal individually which is cumbersome, inefficient and time-consuming. It would be beneficial to have at least one of the following options:

  • optionally include flow when fetching deals, i.e. something like /deals?include_flow=true (potentially even with additional settings like /deals?include_flow=true&flow_items=change)
  • fetch flows for multiple deals in one request, i.e. something like /deals/flow?deal_ids=1,2,3 or alternatively for all deals matching a certain filter
  • batch requests (similar to what e.g. Salesforce has with its Batch API, see https://developer.salesforce.com/docs/atlas.en-us.api_rest.meta/api_rest/resources_composite_batch.htm)

Filters API

Creating filters via the API is a bit odd. Why is the complexity of the mandatory and / or root filters not hidden behind some abstraction?

Exports via API

It's a bummer that I can't programatically create and download exports.

@daniti
Copy link

daniti commented Jan 4, 2021

Great stuff! Thank you for this 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment