@ak--47
Last active October 26, 2022 19:51
Mixpanel + Amplitude Differentiators

MIXPANEL + AMPLITUDE: data model differentiators


IDENTITY RESOLUTION

How each tool stitches the sequence of events 1 ➡️ 2 ➡️ 3 ➡️ 4 into one user when the events carry multiple uuids:

AMPLITUDE (uses an inference pattern)

{ e: 1,
  p: { device_id : "foo" }
}

{ e: 2,
  p: { device_id : "foo", user_id: "bar" }
}

{ e: 3,
  p: { user_id: "bar" }
}

{ e: 4,
  p: { device_id : "foo" }
}
Pros:
  • more similar to db tables with 2 primary uuid keys
Cons:
  • not possible to merge user_id for joins across apps
  • some use cases will not follow the inference pattern (many users on one device)
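
To make the inference pattern concrete, here is a minimal sketch of how the four events above could be attributed to one user. It is a simplified model, not Amplitude's actual algorithm: the function name `resolveByInference` and the rule "a device belongs to the first user_id seen alongside it" are assumptions made for illustration.

```javascript
// Simplified model of inference-based identity resolution.
// Assumption: a device_id is attributed to the first user_id seen with it.
function resolveByInference(events) {
  const deviceToUser = new Map();

  // Pass 1: learn device -> user links from events that carry both ids.
  for (const { p } of events) {
    if (p.device_id && p.user_id && !deviceToUser.has(p.device_id)) {
      deviceToUser.set(p.device_id, p.user_id);
    }
  }

  // Pass 2: attribute every event to a user, falling back to the raw
  // device id when no link was ever observed.
  return events.map(({ e, p }) => ({
    e,
    resolved: p.user_id ?? deviceToUser.get(p.device_id) ?? p.device_id,
  }));
}

const amplitudeEvents = [
  { e: 1, p: { device_id: "foo" } },
  { e: 2, p: { device_id: "foo", user_id: "bar" } },
  { e: 3, p: { user_id: "bar" } },
  { e: 4, p: { device_id: "foo" } },
];

const resolved = resolveByInference(amplitudeEvents);
console.log(resolved.map((r) => r.resolved)); // [ 'bar', 'bar', 'bar', 'bar' ]
```

In this model, a second user logging in on device "foo" would not change the inferred owner, so anonymous events from that device would still be attributed to "bar". That is the "many users on one device" problem listed above.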

MIXPANEL (uses an explicit pattern):

{ e: 1,
  p: { $distinct_id : "foo" }
}

{ e: 2,
  p: { $distinct_id : "bar" }
}

{ e: 3,
  p: { $distinct_id: "bar" }
}

{ e: 4,
  p: { $distinct_id : "foo" }
}

{
  e: "$identify", // or "$merge"
  p: { $anon_id: "foo", $identified_id: "bar" }
}
Pros:
  • possible to represent complex identity chains (up to 500 ids in a cluster)
  • segment handles $identify events OOTB
  • $identify and $merge events are not billed
Cons:
  • requires special $identify events to connect user streams
  • more events than rows in db
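
The explicit pattern can be pictured as a union-find over ids: each $identify event joins two ids into one cluster, and every event is then read through its cluster's canonical id. The sketch below is a simplified model, not Mixpanel's implementation. `buildClusters` is a hypothetical helper, and the payload shape follows the example above.

```javascript
// Simplified model of explicit identity merging with a union-find structure.
// Assumption: each "$identify" / "$merge" event links exactly two ids.
function buildClusters(events) {
  const parent = new Map();

  const find = (id) => {
    if (!parent.has(id)) parent.set(id, id);
    let root = id;
    while (parent.get(root) !== root) root = parent.get(root);
    parent.set(id, root); // shortcut for later lookups
    return root;
  };

  for (const { e, p } of events) {
    if (e === "$identify" || e === "$merge") {
      // Point the anonymous id's cluster at the identified id's cluster.
      parent.set(find(p.$anon_id), find(p.$identified_id));
    }
  }
  return find;
}

const mixpanelEvents = [
  { e: 1, p: { $distinct_id: "foo" } },
  { e: 2, p: { $distinct_id: "bar" } },
  { e: 3, p: { $distinct_id: "bar" } },
  { e: 4, p: { $distinct_id: "foo" } },
  { e: "$identify", p: { $anon_id: "foo", $identified_id: "bar" } },
];

const canonicalId = buildClusters(mixpanelEvents);
const stream = mixpanelEvents
  .filter(({ p }) => p.$distinct_id)
  .map(({ e, p }) => ({ e, user: canonicalId(p.$distinct_id) }));
console.log(stream.map((s) => s.user)); // [ 'bar', 'bar', 'bar', 'bar' ]
```

Because the link is its own event, it can arrive after the events it connects, and a cluster can keep growing as more ids are identified.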

UPDATING OLD EVENTS

AMPLITUDE (no native support)

export ➡️ fix ➡️ re-upload (new project)

MIXPANEL (match uuid, time, + $insert_id ... and send again!)

{
  e: "foo",
  p: {
    "distinct_id" : "bar",
    "time": 123456,
    "$insert_id" : "789-654-321",
    "lucky number": 8
  }
}

  ... some time later

{
  e: "foo",
  p: {
    "distinct_id" : "bar",
    "time": 123456,
    "$insert_id" : "789-654-321",
    "lucky number": 42
  }
}

Mixpanel keeps the latest ingested value 👍 and de-dupes the old one. (In the works: a more scalable solution that allows any attribute to be changed.)
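
A minimal sketch of that "last write wins" behavior, using the two payloads above. This is a simplified model rather than Mixpanel's ingestion code: `dedupe` is a hypothetical helper, and including the event name in the match key is an assumption on top of the uuid, time, and $insert_id mentioned here.

```javascript
// Simplified model of $insert_id de-duplication: events that share the same
// event name, distinct_id, time, and $insert_id collapse to one record,
// and the most recently ingested copy wins.
function dedupe(ingested) {
  const latest = new Map();
  for (const event of ingested) {
    const { distinct_id, time, $insert_id } = event.p;
    const key = JSON.stringify([event.e, distinct_id, time, $insert_id]);
    latest.set(key, event); // later copies overwrite earlier ones
  }
  return [...latest.values()];
}

const ingested = [
  { e: "foo", p: { distinct_id: "bar", time: 123456, $insert_id: "789-654-321", "lucky number": 8 } },
  { e: "foo", p: { distinct_id: "bar", time: 123456, $insert_id: "789-654-321", "lucky number": 42 } },
];

const stored = dedupe(ingested);
console.log(stored.length, stored[0].p["lucky number"]); // 1 42
```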

USER PROPERTIES

AMPLITUDE (props can be used only after they have been set)

for user QUX, given:

{ e: 1 }
{ e: 2 }
{ e: 3 }
{ $set: { NPS: 8}}
{ e: 4 }

can't analyze 1, 2, or 3 by NPS; NPS is only available to 4 for user QUX

MIXPANEL (user props enrich event stream historically and are available to all events)

for user QUX, given:

{ e: 1 }
{ e: 2 }
{ e: 3 }
{ $set: { NPS: 8}}
{ e: 4 }

NPS score is available to all events for QUX (1, 2, 3, and 4)
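
The difference between the two models comes down to when the user property is joined to the event. The sketch below contrasts them on the QUX timeline. Both functions (`enrichAtEventTime`, `enrichAtQueryTime`) are hypothetical names for simplified models, not either vendor's implementation.

```javascript
// Two simplified ways to attach a user property to events.
const timeline = [
  { e: 1 },
  { e: 2 },
  { e: 3 },
  { $set: { NPS: 8 } },
  { e: 4 },
];

// "Set-time" model: an event only sees properties that were set before it.
function enrichAtEventTime(items) {
  let props = {};
  const out = [];
  for (const item of items) {
    if (item.$set) props = { ...props, ...item.$set };
    else out.push({ ...item, user: { ...props } });
  }
  return out;
}

// "Query-time" model: every event is joined to the latest user profile.
function enrichAtQueryTime(items) {
  const sets = items.filter((i) => i.$set).map((i) => i.$set);
  const profile = Object.assign({}, ...sets);
  return items.filter((i) => !i.$set).map((i) => ({ ...i, user: profile }));
}

const hasNps = (events) => events.filter((ev) => "NPS" in ev.user).map((ev) => ev.e);
console.log(hasNps(enrichAtEventTime(timeline))); // [ 4 ]
console.log(hasNps(enrichAtQueryTime(timeline))); // [ 1, 2, 3, 4 ]
```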

LOOKUP TABLES

MIXPANEL (fully supported to enrich events, users, and groups 👍)

bring a CSV that shares a key with event props and do enrichment on the fly (or programmatically populate a lookups table over the API)

{
  e: "watch video",
  p: {
    video_id: "123-456-789" // ➡️ primary key for a lookup table
  }
}

(later add)

video_id,length,premium,publisher
"123-456-789",420,true,"funimation"
"987-654-321",42,false,"crunchyroll"

map the two together in the UI:

(screenshot: mapping CSV to event data)

you can now analyze video_id by length, premium, publisher or any dimensions in the lookup table 🥳
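
Conceptually, the lookup table is a join on a shared key at query time. The sketch below shows that join for the video example. `parseLookup` and `enrich` are hypothetical helpers, the CSV parsing is deliberately naive (unquoted fields, no embedded commas), and none of this is Mixpanel's API.

```javascript
// Simplified model of lookup-table enrichment: join event properties to
// CSV rows on a shared key.
const csv = `video_id,length,premium,publisher
123-456-789,420,true,funimation
987-654-321,42,false,crunchyroll`;

// Naive CSV parsing for illustration; all values stay strings.
function parseLookup(text) {
  const [header, ...rows] = text.trim().split("\n");
  const columns = header.split(",");
  const table = new Map();
  for (const row of rows) {
    const values = row.split(",");
    const record = Object.fromEntries(columns.map((c, i) => [c, values[i]]));
    table.set(record[columns[0]], record); // first column is the key
  }
  return table;
}

// Copy the matching row's columns onto the event's properties.
function enrich(event, table, key) {
  const match = table.get(event.p[key]) ?? {};
  return { ...event, p: { ...event.p, ...match } };
}

const lookup = parseLookup(csv);
const watched = { e: "watch video", p: { video_id: "123-456-789" } };
console.log(enrich(watched, lookup, "video_id").p.publisher); // funimation
```

The event itself only ever carries `video_id`, so rows can be added or corrected in the table later without re-sending events.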

AMPLITUDE (beta)

similar functionality; limited to 1M rows

CUSTOM/CALCULATED PROPERTIES

AMPLITUDE (has "computations" and "formulas" but no logical operators like IF)

MIXPANEL ("custom properties" use excel-inspired syntax with logical operators; "formulas" support simple arithmetic calculations; complex aggregators are part of the query builders)

Example: ensure premium+ and Premium+ are the same value, by fixing casing issues:

(screenshot: fix issues in the data to unblock analysis)

Example: make a URL property useful by only showing the path and truncating the domain and querystring params:

(screenshot: regex operation on URLs to make them readable)
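
For readers without the screenshots, the two examples above amount to the following transformations. This is plain JavaScript written for illustration, not Mixpanel's custom-property syntax. The function names and the sample URL are hypothetical.

```javascript
// Plain-JavaScript equivalents of the two custom properties described above.

// Normalize casing so "premium+" and "Premium+" become one value.
const normalizePlan = (plan) => plan.toLowerCase();

// Keep only the path of a URL: strip the scheme and domain,
// then anything from the first "?" or "#" onward.
const pathOnly = (url) =>
  url.replace(/^https?:\/\/[^/]+/, "").replace(/[?#].*$/, "");

console.log(normalizePlan("Premium+") === normalizePlan("premium+")); // true
console.log(pathOnly("https://example.com/pricing/plans?utm_source=ad")); // /pricing/plans
```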

COHORTS

AMPLITUDE (poor logical operators, confusing/inflexible time domains, not able to be used in all reports)

this behavioral definition is NOT properly related


MIXPANEL (correct logical operators, easy to model, can be used in every analysis report)

these behaviors are properly related in sequence (THEN, DID, DID NOT)
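
As a rough illustration of what "related in sequence" means, the sketch below checks one user's events against a DID / THEN / DID NOT definition. It is a simplified model, not Mixpanel's cohort engine. `inCohort`, the `time` field, and the event names are assumptions made for the example.

```javascript
// Simplified model of a sequenced behavioral cohort: users who DID `first`,
// THEN did `next` afterwards, and DID NOT do `never` at any point.
function inCohort(userEvents, { first, next, never }) {
  const sorted = [...userEvents].sort((a, b) => a.time - b.time);
  const firstIdx = sorted.findIndex((ev) => ev.e === first);
  if (firstIdx === -1) return false;

  // `next` only counts if it happens after the first occurrence of `first`.
  const didNext = sorted.slice(firstIdx + 1).some((ev) => ev.e === next);
  const didNever = sorted.some((ev) => ev.e === never);
  return didNext && !didNever;
}

const definition = { first: "sign up", next: "watch video", never: "cancel" };

const inOrder = [
  { e: "sign up", time: 1 },
  { e: "watch video", time: 2 },
];
const outOfOrder = [
  { e: "watch video", time: 1 },
  { e: "sign up", time: 2 },
];

console.log(inCohort(inOrder, definition)); // true
console.log(inCohort(outOfOrder, definition)); // false
```

The second user did both events but in the wrong order, which is the case a definition without sequence awareness cannot tell apart from the first.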

