@dzeber
Last active April 28, 2017 23:05

Data Collection for Unified Search v2

The goal for data collection is to submit all relevant data via the Shield pings, so that the analysis does not have to work with the raw Unified Telemetry (UT) pings.

Note: In the future, this should be unnecessary, since the Experiment CEP output plugin will redirect UT pings from profiles participating in any experiment (TxP, Shield, etc.) to a separate bucket.

Ping types

The v2 Unified Search experiment submits pings triggered by certain events, some of which are a standard part of Shield studies. All of these have the shield-study docType. The event which triggered the ping is listed in the payload.study_state field.

| payload.study_state | When is this ping sent? |
| --- | --- |
| install | when the user enters the study (installing the add-on) |
| end-of-study | when the study expires and the user has not opted out (add-on is uninstalled, previous browser state restored) |
| user-ended-study | when the user opts out of the study (add-on is uninstalled or disabled, previous browser state restored) |
| ineligible | when it is determined that the user does not match the eligibility criteria required for the study; usually gets sent either instead of the install ping or right after |
| running | at the start of a session or when the study is installed |
| shutdown | at the end of a session |
| daily | once per active day, at the time of the daily subsession split, but not sent at the end of a session (shutdown) |
| daily-shutdown | at the end of a session |
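
As a rough illustration of working with these states, the sketch below tallies a profile's pings by study_state and reports which exit state, if any, was seen. It assumes each ping's payload has already been parsed into a Python dict; the function name and the grouping of three states as "exit" states are illustrative choices, not part of the ping format.

```python
from collections import Counter

# States taken from the table above.
KNOWN_STATES = {
    "install", "end-of-study", "user-ended-study", "ineligible",
    "running", "shutdown", "daily", "daily-shutdown",
}
# Assumption for illustration: these states mark the end of participation.
EXIT_STATES = {"end-of-study", "user-ended-study", "ineligible"}


def summarize_states(payloads):
    """Count pings per study_state and list any exit states that occurred."""
    counts = Counter()
    unknown = []
    for payload in payloads:
        state = payload.get("study_state")
        if state in KNOWN_STATES:
            counts[state] += 1
        else:
            # Keep unexpected values so they can be inspected later.
            unknown.append(state)
    exits = sorted(s for s in counts if s in EXIT_STATES)
    return {"counts": dict(counts), "exit_states": exits, "unknown": unknown}
```

Collecting unrecognized states separately, rather than raising, is a design choice that keeps a single malformed ping from stopping the tally.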

All of these pings follow the Common ping format, which includes, among other fields, the UT environment block. The payload object contains some standard Shield fields, such as study name, version and state, as well as any custom fields.

Standard Telemetry data

In order to avoid having to extract data from UT main pings, necessary portions of standard Telemetry are injected into certain Shield pings. This way, the Shield pings form the sole data source for the study analysis (aside from possibly looking up pre-period measurements in the main_summary derived dataset). Note that participants get Extended Telemetry enabled upon entering the study.

Shield pings that include this UT data contain the following fields in their payload (listed together with their most relevant contents), which are also fields in the standard main ping payload:

  • info: session IDs and counters, subsession start date and length
  • histograms: in particular, histograms relating to Awesomebar usage like FX_URLBAR_SELECTED_RESULT_TYPE
  • keyedHistograms: SEARCH_COUNTS
  • simpleMeasurements: activeTicks and UITelemetry, which contains counts of different types of search, including in-content search
  • processes: scalars, as well as subprocess histograms (probably not relevant here)
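
A minimal sketch of how an analysis might check for these fields and pull out one of them, assuming the payload is available as a Python dict. The helper names are hypothetical; only the field and histogram names come from the list above.

```python
# Payload fields listed above that carry the injected Telemetry data.
TELEMETRY_FIELDS = (
    "info",
    "histograms",
    "keyedHistograms",
    "simpleMeasurements",
    "processes",
)


def missing_telemetry_fields(payload):
    """Return the names of the injected Telemetry fields absent from a payload."""
    return [name for name in TELEMETRY_FIELDS if name not in payload]


def search_counts(payload):
    """Return the SEARCH_COUNTS keyed histogram, or an empty dict if absent."""
    return payload.get("keyedHistograms", {}).get("SEARCH_COUNTS", {})
```

An empty list from missing_telemetry_fields indicates that the payload carries all five fields.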

Reporting UT in Shield pings

The above data is injected into the Shield pings in the following way:

  • Pings with study_state == "daily" are generated when a daily subsession split occurs, and contain Telemetry data from the previous subsession. These pings have payload.info.reason == "daily".
  • Pings with study_state == "daily-shutdown" are sent at session end, and contain the Telemetry data from the last subsession in that session. These pings have payload.info.reason == "shutdown".
  • Pings with study_state == "shutdown" are also sent at session end, and contain the saved-session Telemetry data, which is aggregated over the entire session. These pings have payload.info.reason == "gather-payload".
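
The correspondence between study_state and payload.info.reason described above can be used as a sanity check on incoming pings. The following is a sketch under the assumption that the payload is a Python dict; the function name is hypothetical.

```python
# Mapping taken from the three cases described above.
EXPECTED_REASON = {
    "daily": "daily",
    "daily-shutdown": "shutdown",
    "shutdown": "gather-payload",
}


def reason_is_consistent(payload):
    """Check payload.info.reason against payload.study_state.

    Returns None for study states that carry no Telemetry data,
    otherwise True or False.
    """
    expected = EXPECTED_REASON.get(payload.get("study_state"))
    if expected is None:
        return None
    return payload.get("info", {}).get("reason") == expected
```

Returning None for the other study states keeps "nothing to check" distinct from "checked and failed".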

Caveats

  • Some subsession data will not get reported at all under this scheme. If a subsession split occurs with a reason other than those listed above (e.g. environment-change or aborted-session), we will not see data from that subsession in the Shield pings, except as part of the session aggregate in the shutdown pings.
  • Some measurements reset in each subsession, and some don't. Histograms reset in each subsession, and main ping histograms for a given session should add up to the saved-session value. However, the search counts in UITelemetry don't reset, and keep accumulating across the entire session.
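
Because the UITelemetry search counts accumulate over the session, per-subsession values have to be recovered by differencing. The sketch below assumes the cumulative counts for one session have already been extracted and ordered by subsession; how they are pulled out of UITelemetry is left out, and the function name is hypothetical.

```python
def per_subsession_counts(cumulative):
    """Convert session-cumulative counts into per-subsession increments.

    `cumulative` is a list of counts from ONE session, ordered by subsession.
    """
    increments = []
    previous = 0
    for value in cumulative:
        increments.append(value - previous)
        previous = value
    return increments


# Cumulative counts of 3, 5, 5, 9 correspond to increments of 3, 2, 0, 4.
print(per_subsession_counts([3, 5, 5, 9]))
```

If a subsession went unreported (the first caveat), the next increment also absorbs the activity from the missing subsession, so increments are only attributable to a single subsession when the sequence has no gaps.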

Study data

As well as the standard UT and Shield data, the payload contains a number of other custom fields describing the user's experience during the study:

| Field | Description | Details |
| --- | --- | --- |
| changesApplied | has a treatment been applied? | always false on the control branch; always true on treatment branches in the one-phase design; true on treatment branches in the two-phase design during the second phase |
| diagnostics.allWindowsClosed | were all windows closed on shutdown? | should only be true for "shutdown" or "daily-shutdown" study states |
| diagnostics.searchBarRemovedManually | was the searchbar removed by the user after the experiment started? | can happen on any branch, but doesn't invalidate the unified or minimal branches |
| diagnostics.searchBarRemovedByExperiment | was the searchbar actually removed as a part of applying the treatment? | should be true iff the branch is unified or minimal |
| diagnostics.searchBarWidth | width of the searchbar in pixels | should only be non-zero for control and oneoff |
| diagnostics.oneoff | value of browser.urlbar.oneOffSearches | starts off true for everything but control, but may be changed by the user after the experiment has started |
| diagnostics.suggestions | value of browser.urlbar.suggest.searches | starts off true for everything but control, but may be changed by the user after the experiment has started |
| diagnostics.maxRichResults | value of browser.urlbar.maxRichResults | starts off as 6 for minimal and 10 for everything else, but may be changed by the user or Sync after the experiment has started |
| firstrunRevision | the first installed version of the add-on | if this is 1, it's a direct update from the v1 study; many preferences may have been left in an unexpected state |
| onboardingBranch | did this user see the onboarding message about search suggestions in the URLbar? | may be true or false on any branch |
| revision | add-on version | has less granularity than study_version, may be redundant |
| testing | is this a test profile? | should be set to true manually when testing |
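
Several of the "Details" entries above are invariants that can be checked mechanically. The following sketch flags payloads that violate them, assuming the payload is a Python dict and that branch names are "control", "oneoff", "unified" and "minimal". The function name and warning strings are hypothetical.

```python
def diagnostics_warnings(payload):
    """Return a list of human-readable warnings for unexpected field values."""
    branch = payload.get("branch")
    state = payload.get("study_state")
    diag = payload.get("diagnostics", {})
    warnings = []

    # changesApplied should always be false on the control branch.
    if branch == "control" and payload.get("changesApplied"):
        warnings.append("changesApplied is true on control")

    # allWindowsClosed should only be true for the session-end states.
    if diag.get("allWindowsClosed") and state not in ("shutdown", "daily-shutdown"):
        warnings.append("allWindowsClosed is true outside a shutdown state")

    # searchBarRemovedByExperiment should be true iff branch is unified/minimal.
    if "searchBarRemovedByExperiment" in diag:
        expected = branch in ("unified", "minimal")
        if bool(diag["searchBarRemovedByExperiment"]) != expected:
            warnings.append("searchBarRemovedByExperiment does not match branch")

    # searchBarWidth should only be non-zero for control and oneoff.
    if diag.get("searchBarWidth", 0) != 0 and branch not in ("control", "oneoff"):
        warnings.append("searchBarWidth is non-zero on a branch without a searchbar")

    # firstrunRevision == 1 means a direct update from the v1 study.
    if payload.get("firstrunRevision") == 1:
        warnings.append("direct update from v1; preferences may be unexpected")

    return warnings
```

These checks only report anomalies; whether a flagged profile should be excluded from analysis is a separate decision.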

Sample payload

{
  "about": {
    "_src": "shield",
    "_v": 2
  },
  "branch": "unified", // "control", "oneoff", "unified", "minimal" 
  "changesApplied": true,
  "diagnostics": {
    "allWindowsClosed": false,
    "searchBarRemovedManually": false,
    "searchBarRemovedByExperiment": true,
    "searchBarWidth": 0,
    "oneoff": true,
    "suggestions": true,
    "maxRichResults": 10
  },
  "firstrunRevision": 2,
  "onboardingBranch": true,
  "revision": 2,
  "study_name": "@unified-urlbar-shield-study",
  "study_state": "running",  // "install", "running", "daily", "daily-shutdown", "shutdown", "user-ended-study", "end-of-study", "ineligible"
  "study_version": "2.2.0",
  "testing": true,

  // The following are only included if the study_state is "daily", "daily-shutdown", or "shutdown".
  "info": {...},
  "histograms": {...},
  "keyedHistograms": {...},
  "processes": {
    "content": {...},
    "parent": {
      "scalars": {...},
      ...
    }
  },
  "simpleMeasurements": {
    "UITelemetry": {
      "toolbars": {
        "countableEvents": {
          "__DEFAULT__": {
            "search": {
              ...
            },
            "search-oneoff": {
              ...
            },
            ...
          },
          ...
        },
        ...
      },
      ...
    },
    ...
  }
}