Created
October 30, 2020 03:21
\documentclass[man, biblatex]{apa7}
\usepackage{hyperref}
\hypersetup{pdfborderstyle={/S/U/W 1},
  linktocpage}
\DeclareLanguageMapping{american}{american-apa}
\usepackage{csquotes}
\usepackage{listings}
\usepackage{graphicx}
\usepackage{soul}
\usepackage{setspace}
\usepackage{wrapfig}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage{charter}
\usepackage{environ}
\usepackage{tikz}
\usetikzlibrary{calc,matrix}
\title{Enhanced User and Userbase Understanding through Data Visualization}
\author{Adam D Lee}
\affiliation{\textit{ADdroitLee Consulting Inc.}}
\shorttitle{Data Visualization Enhancement}
\begin{filecontents}[overwrite]{general.bib}
@online{vega,
  title={Circular Plot Examples},
  editor={Vega},
  url={https://vega.github.io/vega-lite/examples/#circular-plots}}
\end{filecontents}
\addbibresource{general.bib}
%BEGIN timeline code from SO
\makeatletter
\let\matamp=&
\catcode`\&=13
\makeatletter
\def&{\iftikz@is@matrix
  \pgfmatrixnextcell
  \else
  \matamp
  \fi}
\makeatother
\newcounter{lines}
\def\endlr{\stepcounter{lines}\\}
\newcounter{vtml}
\setcounter{vtml}{0}
\newif\ifvtimelinetitle
\newif\ifvtimebottomline
\tikzset{description/.style={
    column 2/.append style={#1}
  },
  timeline color/.store in=\vtmlcolor,
  timeline color=red!80!black,
  timeline color st/.style={fill=\vtmlcolor,draw=\vtmlcolor},
  use timeline header/.is if=vtimelinetitle,
  use timeline header=false,
  add bottom line/.is if=vtimebottomline,
  add bottom line=false,
  timeline title/.store in=\vtimelinetitle,
  timeline title={},
  line offset/.store in=\lineoffset,
  line offset=4pt,
}
\NewEnviron{vtimeline}[1][]{%
  \setcounter{lines}{1}%
  \stepcounter{vtml}%
  \begin{tikzpicture}[column 1/.style={anchor=east},
    column 2/.style={anchor=west},
    text depth=0pt,text height=1ex,
    row sep=1ex,
    column sep=1em,
    #1
    ]
    \matrix(vtimeline\thevtml)[matrix of nodes]{\BODY};
    \pgfmathtruncatemacro\endmtx{\thelines-1}
    \path[timeline color st]
    ($(vtimeline\thevtml-1-1.north east)!0.5!(vtimeline\thevtml-1-2.north west)$)--
    ($(vtimeline\thevtml-\endmtx-1.south east)!0.5!(vtimeline\thevtml-\endmtx-2.south west)$);
    \foreach \x in {1,...,\endmtx}{
      \node[circle,timeline color st, inner sep=0.15pt, draw=white, thick]
      (vtimeline\thevtml-c-\x) at
      ($(vtimeline\thevtml-\x-1.east)!0.5!(vtimeline\thevtml-\x-2.west)$){};
      \draw[timeline color st](vtimeline\thevtml-c-\x.west)--++(-3pt,0);
    }
    \ifvtimelinetitle%
    \draw[timeline color st]([yshift=\lineoffset]vtimeline\thevtml.north west)--
    ([yshift=\lineoffset]vtimeline\thevtml.north east);
    \node[anchor=west,yshift=16pt,font=\large]
    at (vtimeline\thevtml-1-1.north west)
    {\textit{\vtimelinetitle}};
    \else%
    \relax%
    \fi%
    \ifvtimebottomline%
    \draw[timeline color st]([yshift=-\lineoffset]vtimeline\thevtml.south west)--
    ([yshift=-\lineoffset]vtimeline\thevtml.south east);
    \else%
    \relax%
    \fi%
  \end{tikzpicture}
}
%END timeline code from SO
\begin{document}
\maketitle
\tableofcontents
\newpage
\section{Abstract}
\subsection{The Problem Identified}
Seamus Company (hereinafter referred to as SC) lacks a method for visualizing its users' data, both as a tool for internal use and as a value offer to its customers. Currently, SC's interactive online products are not being used to their full potential: although they test users' understanding and accumulate the results of those interactions, no tool has been written to analyze the results or to display progress over time to SC's end users.
\subsection{Solution Offered and its Benefits}
I propose developing a web app to display this information in an aesthetically pleasing fashion to your end users, using a mature and elegant visualization engine called \textit{Vega}.\\
This will benefit SC by turning the previously impenetrable data you own on your userbase into actionable information about which of your products are used most frequently, which are most effective in developing understanding in your users, which are most effective for older or younger users, and so on. It will also benefit your users with similar information personalized to their own performance history.
\subsection{Outcome}
Upon completion of the proposed contract, SC would own a web app that leverages its existing online presence and collected data and exposes multiple facets of a user's activity in a digestible form to each user. A second app with similar functionality would expose aggregate customer data in multiple ways to provide insight, from the company's perspective, into product performance.
\subsection{Cost}
The proposed project would require approximately a month to develop, involving about 160 hours of developer time and totalling \$32,000 in consulting fees (an effective rate of \$200 per hour).
\subsection{Expertise}
For over seven years I have delivered quality visualization software that adds value for my clients. Prior to my work as a private consultant, I spent more than a decade working at \textit{Large Company Inc}. Most of that time was spent in \textit{Large Company's} analytics department, where I developed my passion for extracting tailored, actionable information from previously unused data sets.
\newpage
\section{Proposal}
\subsection{Proposed Solution}
Currently, SC has an excellent library of interactive online activities for developing grammar proficiency in individuals from elementary level through high school. Notably lacking in SC's platform is the ability to analyze its substantial user data to better inform business decisions. Similarly lacking are visual elements and interfaces giving SC's users visibility into their progression on the platform.\\
As a remedy, this proposal would integrate visual elements into various parts of SC's online interface, create a more in-depth visualization of user progress in a dedicated user-accessible portal, and add an analogous portal, accessible only to SC staff, with aggregate analysis of user behavior.\\
These aesthetically pleasing visual elements would be created with a visualization engine called \textit{Vega} and integrated into SC's existing online interface. \textit{Vega} and its sister tool \textit{Vega-Lite} offer a rich library of visualizations that consume properly formatted JSON data. For example, the following specification
\begin{lstlisting}[language=C, basicstyle=\tiny]
{
  "data": {
    "values": [
      ...
      {"milliseconds": 733.3333333333335, "speed": "high speed", "trial": 6},
      {"milliseconds": 766.6666666666669, "speed": "high speed", "trial": 7},
      {"milliseconds": 766.6666666666669, "speed": "high speed", "trial": 8},
      {"milliseconds": 766.6666666666669, "speed": "high speed", "trial": 9},
      ...
    ]
  },
  "encoding": {
    "x": {"field": "trial", "type": "quantitative"},
    "y": {"field": "milliseconds", "type": "quantitative"},
    "color": {"field": "speed", "type": "nominal"}
  },
  "mark": "point"
}
\end{lstlisting}
produces the following visualization:
\includegraphics[width=0.5\textwidth]{chart}\\
The data for this project would be extracted using SQL queries from SC's existing database of user data and transformed and annotated into a form consumable by \textit{Vega}. The transformation layer would be written in a back-end language already in use at SC so that SC's developers are familiar with it.\\
In addition, thorough documentation will be written along with the source code to maximize readability. To add further value, two technicians from SC will be walked through the process of adding a visualization. The purpose of familiarizing technical representatives from SC with this process is to allow for future in-house enhancement of the developed functionality without resorting to further consultation, except in the case of large-scale improvements. This would leave SC in a position to organically grow its collection of visualizations alongside its developing education platform.
\newpage
\subsection{Related Works}
\subsubsection{Examples of Vega Visualizations}
Many of the types of visualization in this proposal can be found in Vega's example library \parencite{vega}. This library includes examples of circular plots, which can be layered on top of each other to form the multi-level pie chart referenced in \underline{\hyperref[objective 1]{Objective 1}}.
\newpage
\subsection{Goals, Objectives, and Deliverables}
The goal of this proposal is to leverage previously unused data accumulated by SC, using data visualization tools built on emerging web technologies, to give users and performance analysts a clear view of past user performance. This would include three parts: an integration of SC's existing online UI with relatively small visualizations of general user progress and statistics, placed strategically throughout the user experience to provide feedback; a separate UI giving users a finer-grained level of performance analysis; and company tools providing insight into the platform across variables such as user age and difficulty of topic.
\subsubsection{Objective 1: Integrated Performance Metrics}
\label{objective 1}
Even when a user isn't interested in taking the time to look closely at their metrics in the separate web app, they will still benefit from constant feedback in the form of one or more charts of their progress placed between activities. Further, these charts will be even more engaging if tailored to the audience by age and interest. For example, an elementary school student is much more likely to appreciate a pie chart representing their progress through a set of tasks if it is made to look like a literal pie being eaten. Similarly, a college student is more likely to appreciate more detailed information in a multi-level pie chart showing ratios of progress in different topics over two or more periods.
\paragraph{Deliverable A: Bar Chart of Activity Set Progress}
While moving between activities, SC's current interface shows a list of activities with an indication of completion or non-completion for each activity, but it does not currently show a progress bar indicating overall progress within the activity set. This proposal would integrate such a bar at the bottom of the interface, taking care to match the existing style of the interface for the sake of UX continuity. For activities targeting high school students and higher grades the bar will be relatively subtle; for younger students the bar will resemble a cartoon Formula One race car making progress along a racetrack.
\paragraph{Deliverable B: Pie Charts for Progress Indication on Home Page}
When a user logs into SC's online interface, their landing page is currently identical whether the user is new to the platform or has used it regularly for a year. This proposal will add a tasteful pie chart to the right of the user's profile picture indicating progress through various activity sets. It is important that this be an indicator of overall progress, which only accrues over time. It should not be an indicator of performance over time (which will be available to the user in the more general web app), because a user will rarely be at peak performance and would be subconsciously dissuaded from logging in, knowing that they would be confronted with their prior success and their relative decrease in performance by comparison. For high school students and older, the chart will have three concentric layers: the outermost representing overall progress, the middle representing progress over the last month, and the innermost representing progress over the last week, all color-coded to indicate which portions of progress apply to which enrolled activity set. For younger students the chart will only track their overall progress on the most recent activity set engaged in; it will resemble a steaming apple pie, and the child's preferred avatar (Petra Perfect, for example) will be pictured next to the remaining pie, encouraging the child to help them consume the delicacy.
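As a minimal sketch of the three-layer chart for older students, a Vega-Lite specification along the following lines could be used. The field names \texttt{period}, \texttt{set}, and \texttt{completed}, the radii, and the counts are hypothetical placeholders rather than SC's actual data model, and the \texttt{normalize} stacking option is an assumed design choice that scales each ring to a full circle so that the rings compare proportions rather than totals.
\begin{lstlisting}[language=C, basicstyle=\tiny]
{
  "data": {
    "values": [
      {"period": "overall", "set": "prepositions", "completed": 40},
      {"period": "overall", "set": "clauses", "completed": 25},
      {"period": "month", "set": "prepositions", "completed": 12},
      {"period": "month", "set": "clauses", "completed": 8},
      {"period": "week", "set": "prepositions", "completed": 2},
      {"period": "week", "set": "clauses", "completed": 5}
    ]
  },
  "encoding": {
    "theta": {"field": "completed", "type": "quantitative", "stack": "normalize"},
    "color": {"field": "set", "type": "nominal"}
  },
  "layer": [
    {
      "transform": [{"filter": "datum.period == 'overall'"}],
      "mark": {"type": "arc", "innerRadius": 70, "outerRadius": 90}
    },
    {
      "transform": [{"filter": "datum.period == 'month'"}],
      "mark": {"type": "arc", "innerRadius": 45, "outerRadius": 65}
    },
    {
      "transform": [{"filter": "datum.period == 'week'"}],
      "mark": {"type": "arc", "innerRadius": 20, "outerRadius": 40}
    }
  ]
}
\end{lstlisting}
In a layout of this kind the back end would only need to supply the rows under \texttt{values}; the apple-pie styling for younger students would be handled separately.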
\paragraph{Deliverable C: In-Activity Performance Indicator}
Currently, while using one of SC's activities, the user learns after submitting their response to each question (e.g., \textit{Which of the following are correct: •...we love the USA and its allies..., •...we love the U.S.A and its allies, •...we love the U.S. and its allies..., •...we love the usa and its allies...}) whether or not that response was correct, but they do not learn their overall success rate until completing the activity. This proposal will add a small ring at the top of the interface, near the activity's timer, to indicate the user's running score, with an integer inside the ring representing the same information as a percentage.
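A ring of this kind could be sketched in Vega-Lite as an arc layer with a text layer on top. The specification below is illustrative only: the field names \texttt{outcome}, \texttt{count}, and \texttt{percent} and the sample values (7 correct answers out of 10, hence 70) are hypothetical, and it assumes the back end supplies the percentage already computed. The text layer has no position encoding, so it is expected to render at the center of the view, which coincides with the center of the ring.
\begin{lstlisting}[language=C, basicstyle=\tiny]
{
  "layer": [
    {
      "data": {
        "values": [
          {"outcome": "correct", "count": 7},
          {"outcome": "incorrect", "count": 3}
        ]
      },
      "mark": {"type": "arc", "innerRadius": 30, "outerRadius": 40},
      "encoding": {
        "theta": {"field": "count", "type": "quantitative"},
        "color": {"field": "outcome", "type": "nominal", "legend": null}
      }
    },
    {
      "data": {"values": [{"percent": 70}]},
      "mark": {"type": "text", "fontSize": 16},
      "encoding": {
        "text": {"field": "percent", "type": "quantitative"}
      }
    }
  ]
}
\end{lstlisting}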
\subsubsection{Objective 2: User-Facing Analytics Page}
For more in-depth information, a user-facing interface will be created with controls for fine-tuning the analysis of user performance. The variables the user will be permitted to control for will include attributes such as difficulty of activity, number of attempts, type of activity, and success rate.
\paragraph{Deliverable A: Bar Chart for Overall Progress}
SC has activity sets that cover specific parts of English grammar (for example prepositions, clauses, or sentence diagrams). Users can be enrolled in any number of these sets of activities and will benefit from having a bar chart on the web app showing overall progress in each enrolled activity set.
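For illustration, such a chart might be expressed as the following Vega-Lite specification. The field names \texttt{activitySet} and \texttt{percentComplete} and the values shown are hypothetical placeholders, not figures drawn from SC's data.
\begin{lstlisting}[language=C, basicstyle=\tiny]
{
  "data": {
    "values": [
      {"activitySet": "Prepositions", "percentComplete": 80},
      {"activitySet": "Clauses", "percentComplete": 45},
      {"activitySet": "Sentence diagrams", "percentComplete": 10}
    ]
  },
  "mark": "bar",
  "encoding": {
    "y": {"field": "activitySet", "type": "nominal", "title": "Activity set"},
    "x": {
      "field": "percentComplete",
      "type": "quantitative",
      "scale": {"domain": [0, 100]},
      "title": "Progress (%)"
    }
  }
}
\end{lstlisting}
Fixing the horizontal scale to the range 0 to 100 keeps bars comparable between visits, so a set at 45 percent always appears less than half full.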
\paragraph{Deliverable B: Graph of Performance Over Time}
Users will benefit from a graph plotting the number of activities attempted or completed per day over a selectable period of time. There will be two dropdown menus integrated into this visual element: one to select a specific topic (or all topics) to display progress from, and another to specify whether to display progress from all time or from the previous year, month, or week.
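The topic dropdown could be sketched with a bound parameter, assuming the parameter syntax of Vega-Lite version 5 or later. In the specification below the parameter name \texttt{topicChoice}, the field names, and the sample rows are all hypothetical; only the topic menu is shown, and the time-window menu could be added as a second parameter in the same way.
\begin{lstlisting}[language=C, basicstyle=\tiny]
{
  "data": {
    "values": [
      {"date": "2020-10-01", "topic": "prepositions", "completed": 3},
      {"date": "2020-10-01", "topic": "clauses", "completed": 1},
      {"date": "2020-10-02", "topic": "prepositions", "completed": 2},
      {"date": "2020-10-02", "topic": "clauses", "completed": 4},
      {"date": "2020-10-03", "topic": "prepositions", "completed": 5},
      {"date": "2020-10-03", "topic": "clauses", "completed": 2}
    ]
  },
  "params": [
    {
      "name": "topicChoice",
      "value": "all",
      "bind": {
        "input": "select",
        "options": ["all", "prepositions", "clauses"],
        "name": "Topic: "
      }
    }
  ],
  "transform": [
    {"filter": "topicChoice == 'all' || datum.topic == topicChoice"}
  ],
  "mark": "line",
  "encoding": {
    "x": {"field": "date", "type": "temporal", "title": "Day"},
    "y": {
      "aggregate": "sum",
      "field": "completed",
      "type": "quantitative",
      "title": "Activities completed"
    }
  }
}
\end{lstlisting}
Summing \texttt{completed} per day means the same encoding serves both the single-topic view and the all-topics view.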
\paragraph{Deliverable C: Pie Chart of Topical Interests}
A pie chart will be added to the analytics page showing a breakdown of the user's interest in one topic versus another. A dropdown menu will be included to allow the user to choose between interest as measured by activities completed and interest as measured by time spent.
\subsubsection{Objective 3: Company-Facing Analytics Page}
The type of information useful to SC as the developer of these activities is distinct from the type of information useful to the consumer of said activities, primarily because it will generally be more aggregate in nature. A company-facing interface will be developed, similar to the user-facing version but tailored to providing SC with actionable data on variables such as total user engagement by activity, average user success rate by activity, average user engagement by age, average time of engagement, and so on.
\paragraph{Deliverable A: User Engagement per Activity/Topic}
In the company-facing analytics page, one of the most useful pieces of information will be a pie chart showing the relative popularity of one activity set versus another. It will include a toggle button to show similar information broken down by overall topic instead.
\paragraph{Deliverable B: Revenue per Activity/Topic}
Because SC offers different subscription tiers for its service (free-trial, monthly, or annual membership; individual or family subscription), a similar pie chart will break down activities and topics not simply by time spent on each topic overall, but by that time prorated by the amount that the relevant users have invested in SC's platform. This will provide insight into which parts of the platform long-term users find most rewarding, which might otherwise be obscured by the parts that users simply exploring the platform during their free trial gravitate toward.
\paragraph{Deliverable C: Active Userbase Over Time}
A graph will be integrated showing a plot of the number of active users over time. Two date pickers will allow customization of the time window. To the right of the date pickers there will be an interface for specifying the number of times within a week, month, etc. that a user needs to have logged in to be considered active.
\paragraph{Deliverable D: Filters}
At the top of this entire analytics page, a filtering system will be implemented to restrict all analyzed data based on several factors, including age of user, start date, end date, activity or activities (matched by regular expression), and topic or topics (matched by regular expression).
\subsubsection{Objective 4: Middleware}
\label{objective 4}
The visualization interfaces can be aesthetically pleasing, intuitive, and enlightening, but they will serve no purpose whatsoever without good data integration. Back-end code will be written to form appropriate queries to SC's databases, transform the data into a form digestible by the UI, and securely transport the data to the appropriate UI endpoint.
\paragraph{Deliverable A: SQL Queries}
Queries will be written specific to SC's existing data model that correspond to each of the planned UI elements. Many of these queries will be written as templates to support the customization offered by the UI elements.
\paragraph{Deliverable B: Data Manipulation}
SC currently uses a mixture of Java and Clojure code for its back-end software. Because I have extensive experience writing Clojure code, and because Clojure is extremely well suited to data manipulation, all of the transformations of SQL query results into JSON data consumable by the front end will be written in Clojure. Most of the front-end visualizations will be rendered with the help of the Vega and Vega-Lite APIs; consequently, the JSON produced by the back-end data transformation will, in most cases, be Vega (or Vega-Lite) specifications.
\subsubsection{Objective 5: Documentation and Training}
\textit{ADroitLee Consulting} will always be readily available to consider further development of additional functionality in the future, but SC will benefit significantly from a well-written codebase, thorough documentation, and trained in-house developers who can efficiently integrate emerging data from future additions to SC's online activity stack into this visualization system, on both the data-extraction and UI-integration ends. Both the front-end and back-end code will be well written, with clear documentation aimed at future enhancement of the integrations. Training will be provided to at least two of SC's in-house technicians, leaving them with the ability to elegantly integrate future additions to the online activity stack into both the user-facing and company-facing interfaces.
\paragraph{Deliverable A: Documentation}
\begin{wraptable}{r}{8cm}
\begin{tabular}{|p{0.5\textwidth}}
\textit{Literate Programming is a format that combines regular prose and computer code in a single formatted document for more detailed explanation than is commonly practiced using in-code comments.}
\end{tabular}
\end{wraptable}
The written code will include thorough documentation; well-written comments will be included throughout the added code to clarify its intention, flow, and organization. In addition, two \textit{literate programs} will be written, one as a walkthrough of how to transform SQL data into Vega-Lite compatible JSON and the other as a reference for integrating said JSON into a page as a Vega visualization.
\paragraph{Deliverable B: Training}
As part of the proposal, training will be provided to two of your in-house technicians, leaving them capable of integrating some future data into the system without resorting to outside consultation. They will be walked through the process of transforming data from two SQL queries into Vega-Lite format and writing two new Vega-Lite UI integrations, one in the company-facing UI and one in the user-facing UI.
\newpage
\subsection{Timeline}
Counterintuitively, I propose completing \underline{\hyperref[objective 4]{Objective 4}} before the other parts of the project. This is meant to prevent wasted time that would otherwise be spent \textit{mocking} out input to the front-end visualizations, and it will prevent unanticipated logistical issues with the back-end to front-end pipeline.\\
\vspace{1cm}
\begin{center}
\begin{vtimeline}[description={text width=8cm},
row sep=4ex,
use timeline header,
timeline title={Implementation Timeline}]
Oct 1 & Project started\endlr
Oct 8 & Deliverable 4.A completed\endlr
Oct 12 & Deliverable 4.B completed\endlr
Oct 14 & Deliverable 1.A completed\endlr
Oct 16 & Deliverables 1.B and 1.C completed\endlr
Oct 18 & Deliverables 2.A and 2.B completed\endlr
Oct 20 & Deliverable 2.C completed\endlr
Oct 22 & Deliverables 3.A, 3.B, and 5.B completed\endlr
Oct 23 & Deliverables 3.C and 3.D completed\endlr
Oct 25 & Deliverable 5.A completed\endlr
\end{vtimeline}
\end{center}
\newpage
\subsection{Resources Required}
\newpage
\subsection{Evaluation Framework}
\newpage
\section{Summary}
\newpage
\nocite{*}
\printbibliography
\end{document}