@dreftymac
Forked from ctesta01/qsf_explanation.md
Created May 25, 2017 20:29
How does a Qualtrics Survey File work?

Quickstart Guide to understanding the Qualtrics Survey File

This information is likely to quickly become outdated when Qualtrics next changes the formatting of the QSF file. This guide was started February 2017. I hope that it is a useful introduction to understanding the contents of the QSF file that one can download from Qualtrics.

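This document includes:

  • Introduction to JSON
  • Anatomy of a Qualtrics Survey File
  • Breakdown of Question Structure
  • Recognizing Question Types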

Introduction to JSON

Qualtrics surveys have two components which can be exported -- their survey file and their response data. The survey file contains the survey's details, its contents, and all of the information necessary to reconstruct the survey. The ".qsf" filetype stands for "Qualtrics Survey File", but the contents of the file are JSON -- JavaScript Object Notation.

JSON Introduction on w3m

The point of JSON is that it is easily processed by machines -- and less easily read by humans. If you would like to take a look at a QSF or JSON file, I recommend uploading it to a JSON viewer or using a "pretty printer" to make it more readable. The QSF file rarely contains confidential information, but please be careful not to upload sensitive information to an online JSON viewer.

Raw QSF Data:

JSON using a JSON Viewer

Some packages to handle JSON data:

Anatomy of a Qualtrics Survey File

A QSF contains two top level objects: SurveyEntry and SurveyElements.

SurveyEntry holds metadata about the survey. It contains the name of the survey, the creation date, start date, and end date, along with many internal ID values that Qualtrics associates with a survey, such as the SurveyID, SurveyOwnerID, and more.

```json
"SurveyEntry": {
  "SurveyID": "SV_4I2LK0XhHYRhhrL",
  "SurveyName": "Sample Survey",
  "SurveyDescription": null,
  "SurveyOwnerID": "UR_8kzO7vEaBPWnAB7",
  "SurveyBrandID": "tufts",
  "DivisionID": null,
  "SurveyLanguage": "EN",
  "SurveyActiveResponseSet": "RS_bjfzpALWe8AMZOR",
  "SurveyStatus": "Inactive",
  "SurveyStartDate": "0000-00-00 00:00:00",
  "SurveyExpirationDate": "0000-00-00 00:00:00",
  "SurveyCreationDate": "2017-02-10 08:53:57",
  "CreatorID": "UR_8kzO7vEaBPWnAB7",
  "LastModified": "2017-02-13 14:23:41",
  "LastAccessed": "0000-00-00 00:00:00",
  "LastActivated": "0000-00-00 00:00:00",
  "Deleted": null
}
```
```r
# When you run get_setup() one of the objects returned to the global environment is survey.
# To access SurveyEntry, access the list element survey[['SurveyEntry']].
> survey[['SurveyEntry']]

$SurveyID
[1] "SV_eFjHq9W5eLCeJI9"

$SurveyName
[1] "Long & Exhaustive Sample Survey"
...
```
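
As a rough sketch of the same idea without QualtricsTools, assume the QSF has already been parsed from JSON into a nested R list named survey. The toy list below is hand-built and only mimics a few fields of the sample SurveyEntry; entry_summary is a hypothetical helper, not a QualtricsTools function.

```r
# Hand-built stand-in for a parsed QSF (assumption: a real file, once parsed
# from JSON, would be a nested list with this same shape).
survey <- list(
  SurveyEntry = list(
    SurveyID = "SV_4I2LK0XhHYRhhrL",
    SurveyName = "Sample Survey",
    SurveyStatus = "Inactive",
    SurveyCreationDate = "2017-02-10 08:53:57"
  ),
  SurveyElements = list()
)

# Hypothetical helper: collect selected metadata fields into a named
# character vector, using NA for fields that are missing or null.
entry_summary <- function(survey, fields = c("SurveyID", "SurveyName", "SurveyStatus")) {
  entry <- survey[["SurveyEntry"]]
  vapply(fields, function(f) {
    value <- entry[[f]]
    if (is.null(value)) NA_character_ else as.character(value)
  }, character(1))
}

summary_fields <- entry_summary(survey)
print(summary_fields)
```

Returning NA for absent fields keeps the helper usable on entries such as "Deleted", which is null in the sample above.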

The other top level entry, SurveyElements, is a numbered list that contains the major components of a survey. These roughly break down into the following different kinds of components:

  • Survey Blocks
  • Survey Flow
  • Notes
  • Survey Options
  • Scoring
  • Survey Statistics
  • Question Count
  • Survey Questions
  • Response Set
```r
# Similarly, to access these, access the list element survey[['SurveyElements']].
# However, I would caution against doing this directly, since it will print
# the whole survey to the command line. One can use functions like head and str
# to read the data a little more clearly.
> head(survey[['SurveyElements']], n=1)

[[1]]
[[1]]$SurveyID
[1] "SV_eFjHq9W5eLCeJI9"

[[1]]$Element
[1] "BL"

[[1]]$PrimaryAttribute
[1] "Survey Blocks"
...

> str(survey[['SurveyElements']], max.level = 2)

List of 55
 $ :List of 6
  ..$ SurveyID          : chr "SV_eFjHq9W5eLCeJI9"
  ..$ Element           : chr "BL"
  ..$ PrimaryAttribute  : chr "Survey Blocks"
  ..$ SecondaryAttribute: NULL
  ..$ TertiaryAttribute : NULL
  ..$ Payload           :List of 8
  ...
```
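
Another way to get an overview without printing everything is to tabulate the Element codes. The sketch below assumes a toy survey_elements list shaped like the output above; only the "BL" and "SQ" codes are used because those are the two this guide confirms, and the variable names are illustrative.

```r
# Toy SurveyElements list (assumption: each element carries an Element code
# and a PrimaryAttribute, as in the str() output above).
survey_elements <- list(
  list(Element = "BL", PrimaryAttribute = "Survey Blocks"),
  list(Element = "SQ", PrimaryAttribute = "QID1"),
  list(Element = "SQ", PrimaryAttribute = "QID2"),
  list(Element = "SQ", PrimaryAttribute = "QID3")
)

# Pull out the Element code of every entry and count each kind.
element_codes <- vapply(survey_elements, function(e) e[["Element"]], character(1))
element_counts <- table(element_codes)
print(element_counts)

# Keep only the survey questions and list their question IDs.
question_elements <- survey_elements[element_codes == "SQ"]
question_ids <- vapply(question_elements, function(q) q[["PrimaryAttribute"]], character(1))
print(question_ids)
```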

This is what the SurveyElements can look like when the QSF is uploaded to a JSON viewer.

Note that some of these components will only ever appear once in a given survey's list of SurveyElements, while others may appear many times. I will describe only those that matter most to the QualtricsTools project.

Here is what I know about each of these components:

  • Survey Blocks appears only once in the list of SurveyElements, as an element with PrimaryAttribute: "Survey Blocks" and Element: "BL". It contains a list of blocks, and for each block it includes the block name, block ID, and a list of the export tags of the questions contained in that block.
  • Survey Flow always appears once in the QSF. Its Payload contains a list of blocks ordered according to the flow of the survey.
  • A Notes survey element contains the notes for a particular question. In a note's Payload is a "ParentID", which contains the Data Export Tag of the question the notes correspond to. Also in the Payload, under "Notes", there is an element for each note on the question, and its "Message" contains the contents of the note.
  • Survey Options detail what options respondents have when taking the survey, such as whether they can use the browser's back button, whether a previous question button appears, etc.
  • Scoring: I've never used this element.
  • Survey Statistics: I've never used this element.
  • The Question Count is the number of questions in the QSF.
  • Survey Questions are each stored individually as their own survey element. Each question contains a PrimaryAttribute, which is its Question ID, a SecondaryAttribute, which is the first 99 characters of the question text, and a Payload with the details of the question. See below for a further breakdown of the QSF's survey question structure.
  • The Response Set contains the ID of the set of respondents intended to receive the survey or who have already received it.

Breakdown of Question Structure

In the QSF under the SurveyElements there will be one element for each question. Questions have many different components, the most important of which are the following:

  • SurveyID
  • Element
  • PrimaryAttribute
  • SecondaryAttribute
  • TertiaryAttribute
  • Payload
    • QuestionText
    • DataExportTag
    • QuestionType
    • Selector
    • SubSelector
    • Choices
    • RecodeValues
    • ChoiceDataExportTags
    • ChoiceOrder
    • Answers
    • AnswerOrder
    • QuestionID

If we're in R and have imported a survey, we can get a question and take a look at where these properties live within it.

If you import a QSF and a CSV with the get_setup function in QualtricsTools, the function constructs the questions list, which you can then access in the global environment. Questions are retrieved from this list by index, and QualtricsTools has some functions to help you find a given question:

```r
> questions[[1]]
$SurveyID
[1] "SV_4ILRhqlGA79u2Md"
$Element
[1] "SQ"
$PrimaryAttribute
[1] "QID3"
...

> find_question_index(questions, "q3_volunteer")
[1] 6

> find_question_index_by_qid(questions, "QID5")
[1] 6

> questions[[6]][['Payload']][['DataExportTag']]
[1] "q3_volunteer"

> questions[[6]][['PrimaryAttribute']]
[1] "QID5"
```
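
To make the idea behind these lookups concrete, here is a minimal sketch of how such a search could work. It is not the QualtricsTools source: lookup_by_export_tag and lookup_by_qid are hypothetical names, and the two-question list is toy data.

```r
# Toy questions list (assumption: each question has a PrimaryAttribute
# holding its QID and a Payload holding its DataExportTag).
questions <- list(
  list(PrimaryAttribute = "QID3", Payload = list(DataExportTag = "q1_intro")),
  list(PrimaryAttribute = "QID5", Payload = list(DataExportTag = "q3_volunteer"))
)

# Hypothetical lookup: positions of questions whose DataExportTag matches.
lookup_by_export_tag <- function(questions, tag) {
  tags <- vapply(questions, function(q) q[["Payload"]][["DataExportTag"]], character(1))
  which(tags == tag)
}

# Hypothetical lookup: positions of questions whose QID matches.
lookup_by_qid <- function(questions, qid) {
  qids <- vapply(questions, function(q) q[["PrimaryAttribute"]], character(1))
  which(qids == qid)
}

tag_index <- lookup_by_export_tag(questions, "q3_volunteer")
qid_index <- lookup_by_qid(questions, "QID5")
print(c(tag_index, qid_index))
```

Both lookups return an empty integer vector when nothing matches, so callers can test length() before indexing into the list.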

Now I will explain many of the important components of a question as stored in the QSF data:

<table>
<tr><td>SurveyID</td><td>The SurveyID indicates the survey to which a question belongs.</td></tr>
<tr><td>Element</td><td>A question starts out as an element of the SurveyElements list. The Element tag denotes what kind of survey element this is, so for a question the Element item will always be "SQ", for survey question. That is, for any question in questions, <code>questions[[i]][['Element']] == "SQ"</code>.</td></tr>
<tr><td>PrimaryAttribute</td><td>The PrimaryAttribute is the same as the QuestionID for questions. It is always "QID" followed by some numbers. The numbers usually follow the order in which the questions were created as the survey was made. These QIDs are distinct from the DataExportTags, which users can customize in the Qualtrics interface. QIDs cannot be changed.</td></tr>
<tr><td>SecondaryAttribute</td><td>This list item contains up to a certain number of characters of the question text. If the question is particularly long, the SecondaryAttribute does not contain the full question text. For that, look to the <code>question[['Payload']][['QuestionText']]</code> element.</td></tr>
<tr><td>TertiaryAttribute</td><td>I have never seen a question that had a non-NULL TertiaryAttribute.</td></tr>
<tr><td>Payload</td><td>The Payload contains the remaining elements which describe a given question.</td></tr>
<tr><td>QuestionText</td><td>The QuestionText element contains the question's text.</td></tr>
<tr><td>DataExportTag</td><td>The DataExportTag is the tag for a given question, which users can customize in the Qualtrics interface.</td></tr>
<tr><td>QuestionType</td><td>The QuestionType contains a short code which describes which of the basic categories of questions this question falls into. These can be "MC", "TE", "DB", "Matrix", "SBS", "DD", and others. It is my guess that these stand for, respectively, "Multiple Choice", "Text Entry", "Descriptive Box", "Matrix", "Side-by-Side", "Drop-Down", etc.</td></tr>
<tr><td>Selector</td><td>The Selector describes what interface the respondent uses to respond to the question. Values include "MAVR", "TE", "Likert", "SAVR", "TB", and more. These abbreviations stand for "Multiple Answer Vertical", "Text Entry", "Likert", "Single Answer Vertical", "Text-Box", etc.</td></tr>
<tr><td>SubSelector</td><td>Only some questions have SubSelector elements within their Payload. The SubSelector may contain information like "SingleAnswer" or "MultipleAnswer" for a Matrix Likert question, and for other questions it contains strings like "TX" and "Long".</td></tr>
<tr><td>Choices</td><td>These are the choices which were presented to a respondent. They are indexed by internal labeling, and if you have renamed them in the Qualtrics interface, then their naming convention will show up in the RecodeValues. If the question's choices are ordered in a particular way, then this will show up in ChoiceOrder.</td></tr>
<tr><td>RecodeValues</td><td>The RecodeValues of a question record how the names of the possible choices were customized. They are indexed in the same order as the Choices list, and each entry is the customized label for the ith choice, i.e. <code>question[['Choices']][[i]]</code> is labeled <code>question[['RecodeValues']][[i]]</code> if RecodeValues are being used. If a question's choices were not recoded, this list element does not appear. See the <a href='http://i.imgur.com/80iKEqS.png'>Recode Values menu option</a> and the <a href='http://i.imgur.com/2uczrrW.png'>recode values interface</a>.</td></tr>
<tr><td>ChoiceDataExportTags</td><td>In the context of a matrix question, the question's Choices are considered to be the horizontal elements of the matrix (its rows). These can not only be recoded in the interface, but can also be given alphanumeric names. ChoiceDataExportTags is a list, indexed with the same indices as the Choices list, which contains these alphanumeric names. The ChoiceDataExportTags are listed in the Qualtrics interface as <a href='http://i.imgur.com/XxYDWFn.png'>Question Export Tags</a>.</td></tr>
<tr><td>ChoiceOrder</td><td>This ordering dictates the order in which the choices are listed on the screen.</td></tr>
<tr><td>Answers</td><td>For Matrix questions, the vertical options (its columns) are denoted as "Answers" in the question's structure. These answers can also be renamed, and if they are, this information is included under RecodeValues.</td></tr>
<tr><td>AnswerOrder</td><td>The AnswerOrder list describes the order in which the vertical options of a matrix question are listed.</td></tr>
<tr><td>QuestionID</td><td>The QuestionID is the ID that is automatically assigned to a question. Unlike the DataExportTag, it is not editable.</td></tr>
</table>
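
The relationship between Choices, ChoiceOrder, and RecodeValues can be sketched as follows. This is only an illustration under stated assumptions: the toy Payload assumes each entry in Choices is a list with a Display field keyed by its internal choice ID, and choice_table is a hypothetical helper rather than anything from QualtricsTools.

```r
# Toy Payload for a multiple choice question (assumption: Choices and
# RecodeValues are both keyed by the internal choice ID).
payload <- list(
  QuestionType = "MC",
  Selector = "SAVR",
  Choices = list(
    "1" = list(Display = "Yes"),
    "2" = list(Display = "No"),
    "3" = list(Display = "Not sure")
  ),
  ChoiceOrder = c("2", "1", "3"),
  RecodeValues = list("1" = "1", "2" = "0", "3" = "-99")
)

# Hypothetical helper: one row per choice, in on-screen order, pairing the
# displayed label with its recoded value.
choice_table <- function(payload) {
  ids <- as.character(payload[["ChoiceOrder"]])
  labels <- vapply(ids, function(id) payload[["Choices"]][[id]][["Display"]], character(1))
  recodes <- payload[["RecodeValues"]]
  # Fall back to the internal ID when the question was never recoded.
  values <- if (is.null(recodes)) {
    ids
  } else {
    vapply(ids, function(id) as.character(recodes[[id]]), character(1))
  }
  data.frame(id = ids, label = unname(labels), value = unname(values),
             stringsAsFactors = FALSE)
}

choices <- choice_table(payload)
print(choices)
```

Because RecodeValues is absent when nothing was recoded, the helper checks for NULL first instead of assuming the element exists.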

Recognizing Question Types

To decide how a question's results should be processed, we need to be able to quickly determine what kind of question it is. To do so, I have written the functions contained in the question_type_checking.R file.

<table>
<tr><td>MCSA</td><td>Multiple Choice Single Answer questions are characterized by having <code>question[['Payload']][['QuestionType']] == "MC"</code>, and <code>question[['Payload']][['Selector']]</code> is one of "SAVR", "SAHR", "SACOL", "DL", or "SB".</td></tr>
<tr><td>MCMA</td><td>Multiple Choice Multiple Answer questions are characterized by having <code>question[['Payload']][['QuestionType']] == "MC"</code>, and <code>question[['Payload']][['Selector']]</code> is one of "MAVR", "MAHR", "MACOL", or "MSB".</td></tr>
<tr><td>TE</td><td>Text Entry questions have <code>question[['Payload']][['QuestionType']] == "TE"</code>.</td></tr>
<tr><td>Likert SA</td><td>A Matrix Single Answer question has <code>question[['Payload']][['QuestionType']] == "Matrix"</code> and has <code>question[['Payload']][['SubSelector']]</code> as one of "DL" or "SingleAnswer". A Matrix question has both Choices and Answers, and may have ChoiceDataExportTags and RecodeValues to label these.</td></tr>
<tr><td>Likert MA</td><td>A Matrix Multiple Answer question has <code>question[['Payload']][['QuestionType']] == "Matrix"</code> and has <code>question[['Payload']][['SubSelector']]</code> as "MultipleAnswer". A Matrix question has both Choices and Answers, and may have ChoiceDataExportTags and RecodeValues to label these.</td></tr>
<tr><td>SBS</td><td>Side By Side questions have <code>question[['Payload']][['QuestionType']] == "SBS"</code>. The QualtricsTools application automatically breaks them down into smaller questions and reinserts those as independent questions. The contained questions are kept within the question's <code>[['Payload']][['AdditionalQuestions']]</code> list.</td></tr>
</table>
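
The rules above can be sketched as a handful of small predicates. The function names below are hypothetical and are not taken from question_type_checking.R; the example questions are toy data containing only the Payload fields the rules inspect.

```r
# Read one Payload field, treating a missing field as an empty string.
payload_field <- function(question, field) {
  value <- question[["Payload"]][[field]]
  if (is.null(value)) "" else value
}

# Hypothetical predicates mirroring the rules described above.
is_mc_single <- function(question) {
  payload_field(question, "QuestionType") == "MC" &&
    payload_field(question, "Selector") %in% c("SAVR", "SAHR", "SACOL", "DL", "SB")
}
is_mc_multiple <- function(question) {
  payload_field(question, "QuestionType") == "MC" &&
    payload_field(question, "Selector") %in% c("MAVR", "MAHR", "MACOL", "MSB")
}
is_matrix_single <- function(question) {
  payload_field(question, "QuestionType") == "Matrix" &&
    payload_field(question, "SubSelector") %in% c("DL", "SingleAnswer")
}
is_matrix_multiple <- function(question) {
  payload_field(question, "QuestionType") == "Matrix" &&
    payload_field(question, "SubSelector") == "MultipleAnswer"
}

# Map a question to one of the labels used in this section.
classify_question <- function(question) {
  type <- payload_field(question, "QuestionType")
  if (is_mc_single(question)) return("MCSA")
  if (is_mc_multiple(question)) return("MCMA")
  if (type == "TE") return("TE")
  if (is_matrix_single(question)) return("Likert SA")
  if (is_matrix_multiple(question)) return("Likert MA")
  if (type == "SBS") return("SBS")
  "other"
}

examples <- list(
  list(Payload = list(QuestionType = "MC", Selector = "SAVR")),
  list(Payload = list(QuestionType = "MC", Selector = "MAVR")),
  list(Payload = list(QuestionType = "Matrix", Selector = "Likert", SubSelector = "SingleAnswer")),
  list(Payload = list(QuestionType = "DB", Selector = "TB"))
)
question_labels <- vapply(examples, classify_question, character(1))
print(question_labels)
```

Treating a missing SubSelector as an empty string means questions without one simply fail the Matrix checks instead of raising an error.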