Here are a few things that I found odd when working extensively with the Pipedrive API.
For context, here's a brief description of our use case:
- Our BI team runs various reports across the company. For some reports, we need certain data from Pipedrive.
- Therefore we've built a tool that on a regular basis fetches data from the Pipedrive API and puts it into a Postgres database (which Tableau can easily interface with).
- Part of the data fetching involves dynamically building the schema for the tables, so that we properly handle custom fields added to Pipedrive.
Here, we've run into problems because the Pipedrive API seems to not be entirely consistent when it comes to field types and also have other issues.
enum
fields seem to generally have integer values and string labels. Notable exceptions in the
activities
resource:
done
has boolean IDs (true
andfalse
)busy_flag
– same thingtype
has string values (e.g."call"
,"meeting"
)
Consequently, this needs manual remapping when scripting around field types.
date
and time
fields are handled inconsistently. The most obvious examples are in the
activities
resource:
due_date
says it's adate
and it actually is (YYYY-MM-DD
)add_time
says it's atime
, but it really is a datetime (YYYY-MM-DD HH:MM:SS
)due_time
says it's atime
and it is, if you consider "time" to be a "time of day" (HH:MM
) rather than a more database-like definition of time (= datetime)duration
says it's atime
but apparently it goes beyond values of 23:59 (e.g. we have an activity with a value of48:00
here, which presumably means 2 days)
I think this should be handled consistenly: If there's a type date
, then it shouldn't sometimes
be used as a date and sometimes as a datetime; if there's a type time
, it's fine that it's
treated as a "time of day", but then it shouldn't be used to represent a duration that can go
beyond valid ranges for a time of day (= 00:00
to 23:59
).
Usually, blank values seem to be set as null
which is what I'd expect. However, some fields in activities
seem to contain an empty string instead (e.g. "duration": ""
, "lead_title": ""
, "marked_as_done_time": ""
etc.).
I'm missing a distinct boolean
field type – both for custom fields as well as Pipedrive's own fields (e.g. done
and busy_flag
in the activities
resource). I think true
/false
concepts are sufficiently common to deserve their own data type.
IDs of field options are usually integers – except for status
of deals
where they seem to be
strings.
Some other minor oddities that I came across:
- Several fields that are called
*_id
counterintuitively return full objects (including the ID) instead of only the ID (as the name would suggest) - lack of naming conventions, e.g.
user_id
given as a parameter references entries of theusers
resource, but fororganizations
it'sorg_id
- a reference to the
users
resource has field typeuser
(e.g.creator_user_id
inactivities
), but one topersons
has typepeople
(e.g.person_id
inactivities
)
- Pipedrive's own numerical fields seem to be
int
, but our custom fields are generally seem to bedouble
- All fields have IDs except for monetary fields' currency
It's a bummer that the "time ago" filters stop at 1 week ago
and I can't thus filter for values > 6 hours ago
, < 30 minutes
or similar.
There seems to be no "stable" way to handle pagination on large, "active" resources. Example:
- I want to fetch multiple pages of activities.
- I fetch page 1 from the API.
- While fetching page 1, a new activity gets added.
- When I fetch page 2, I get a duplicate activity that I've already fetched at page 1 and the newly added activity never gets fetched.
- The problem only becomes worse as the result set and/or number of active users increases.
Of course, it's possible to work around this using a filter, i.e. before running the paginated
requests, create a filter with update_time = the-start-time-of-the-process
and then delete the
filter once it's done. But this is quite cumbersome => there should be a better way (e.g. something
that functions like a cursor in databases).
To get data regarding deal flow, we need to fetch the flow for every single deal individually which is cumbersome, inefficient and time-consuming. It would be beneficial to have at least one of the following options:
- optionally include flow when fetching deals, i.e. something like
/deals?include_flow=true
(potentially even with additional settings like/deals?include_flow=true&flow_items=change
) - fetch flows for multiple deals in one request, i.e. something like
/deals/flow?deal_ids=1,2,3
or alternatively for all deals matching a certain filter - batch requests (similar to what e.g. Salesforce has with its Batch API, see https://developer.salesforce.com/docs/atlas.en-us.api_rest.meta/api_rest/resources_composite_batch.htm)
Creating filters via the API is a bit odd. Why is the complexity of the mandatory and
/ or
root filters not hidden behind some abstraction?
It's a bummer that I can't programatically create and download exports.
Great stuff! Thank you for this 👍