A device that can observe the environment.
E.g.: microphone, camera, accelerometer, gyroscope, barometer,
proximity sensor, ambient light sensor, hall sensor, touch sensor
on the display, all antennas (Wi-Fi, NFC, GPS,
Digital compass, ...).
A device that can change the environment.
E.g.: speaker, IR laser, flash, display.
Choice depends on the application. It is important to understand what the sensor measures, how it does it, and what it transmits as the resulting signal.
What can be sensed? Location, lighting conditions, air quality, density of people around, noise, ...
Why use mobile phones as sensing devices?
- A huge quantity of sensing points
- No deployment costs
- High availability
- Personal data
- Higher level information by combining and processing raw sensor data
When getting a new sensor:
- understand what it measures, how it does it (accuracy and range, limitations), understand your options
- use data sheets, notes, documentation, ...
Location specified in lat/long
Different types: cell-tower based, GPS, Wi-Fi based.
Usually used in combination for higher accuracy.
Characteristics:
- absolute vs. relative (distance to others)
- points vs. intervals (areas)
- physical vs. symbolic (e.g. Empire State Building, ...)
- continuous vs. on demand
Range based vs. range free localization:
- range based: uses estimates of distances and similar quantities. E.g.: GPS, cell-tower based.
- range free: uses other information rather than distance (e.g. connectivity). E.g.: Wi-Fi based.
Knowing position of cell towers and distances of mobile device
to cell towers. Best accuracy with three towers.
In reality distance from cell tower has error range that can be
10-100 meters, so we get an area of possible locations.
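With known tower positions and distance estimates, the position can be recovered by intersecting the three circles; a minimal trilateration sketch (coordinates and function name are illustrative):

```python
def trilaterate(towers, dists):
    """Locate a point from three (x, y) tower positions and distances.

    Linearize by subtracting the first circle equation from the other
    two, leaving a 2x2 linear system solved via Cramer's rule.
    """
    (x1, y1), (x2, y2), (x3, y3) = towers
    d1, d2, d3 = dists
    # 2*(xi - x1)*x + 2*(yi - y1)*y = d1^2 - di^2 + xi^2 - x1^2 + yi^2 - y1^2
    a1, b1 = 2 * (x2 - x1), 2 * (y2 - y1)
    c1 = d1**2 - d2**2 + x2**2 - x1**2 + y2**2 - y1**2
    a2, b2 = 2 * (x3 - x1), 2 * (y3 - y1)
    c2 = d1**2 - d3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a1 * b2 - a2 * b1
    return ((c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det)
```

With noisy distances the circles do not meet in a point, which is exactly the 10-100 m error area mentioned above.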
How to get distance between cell tower and mobile device?
Using radio-frequency signals.
- Time of arrival (ToA): simple
d = v * (t_rx - t_tx)
Needs sync between tx and rx within nanoseconds, which is difficult.
- 2-way ToA:
d = v * [(t4 - t1) - (t3 - t2)]/2
with t1-t2 being the first transmission and t3-t4 the transmission back. In practice unknown delays might occur, so large errors can happen.
- Time difference of arrival (TDoA):
d = [(v_a * v_b)/(v_a - v_b)] * (t4 - t2 - t_wait)
with v_a and v_b being the speeds of two transmissions, both in the same direction, happening at t1-t2 and t3-t4, with t_wait = t3 - t1.
Requires two different types of transceivers (with different speeds) on both transmitter and receiver. Drawbacks: complexity, cost.
- Received Signal Strength Indicator (RSSI):
Power of received signal: P_rx(d) ~= beta * P_tx / d^alpha
where alpha is the path loss exponent (depends on the medium) and beta is a factor capturing other attenuation effects (e.g. multipath).
RSSI = 10 * log_10(P_rx / 1mW)
Also, P_rx = 10^(RSSI/10) mW.
The closer the transmitter, the higher the RSSI.
A factor of 2 in received mW translates to a ~3 dB difference in RSSI.
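The formulas above can be sketched directly; a minimal example (the timing values and default parameters are made up for illustration):

```python
V = 3.0e8  # propagation speed of RF signals, m/s

def dist_two_way_toa(t1, t2, t3, t4, v=V):
    # d = v * [(t4 - t1) - (t3 - t2)] / 2: round-trip time minus the
    # responder's processing time (t3 - t2), halved
    return v * ((t4 - t1) - (t3 - t2)) / 2

def rssi_to_mw(rssi_dbm):
    # P_rx = 10^(RSSI/10) mW
    return 10 ** (rssi_dbm / 10)

def dist_from_rssi(rssi_dbm, p_tx_mw, alpha=2.0, beta=1.0):
    # invert P_rx ~= beta * P_tx / d^alpha to estimate distance
    return (beta * p_tx_mw / rssi_to_mw(rssi_dbm)) ** (1 / alpha)
```

E.g. a signal sent at t1 = 0 s, received at t2 = 1 us, answered at t3 = 3 us and received back at t4 = 4 us gives d = 300 m.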
Since 1995, freely accessible by any GPS receiver, maintained by
US gov.
Supported by 24 satellites around Earth.
Accuracy depends on "tier".
- SPS (Standard Positioning Service): ~3m
- PPS (Precise PS): highly accurate and secure
Also, GPS is precisely synced, to nanosecond precision.
Steps:
- Mobile device records visible APs: by requesting their information from the APs (this is called probing), or the information is sent by the APs when beaconing (usually every 100 TU, TU = 1024 μs).
- Send the list of AP IDs to a Wi-Fi localization service server. The collected and sent information is called a fingerprint and includes MAC addr, SSID, RSSI, possibly encryption type, channel. The fingerprint can differ between locations where the AP is visible.
- Server processes and returns the location to the mobile device: there are databases of millions of APs and their fingerprints, collected through wardriving.
How to avoid fingerprinting my own WLAN? Add _nomap
at the end
of your SSID.
Tech | Pros | Cons |
---|---|---|
GPS | Very accurate, free, always available, requires just receiver | Inaccurate indoors and cities, power hungry |
Cell | No additional HW | Inaccurate, requires 3rd party services |
Wi-Fi | Quite accurate, works also indoors and cities | Requires 3rd party services, partial coverage |
Mobility data used for understanding consumer behaviour.
Why prediction? Traffic info, home automation (based on when the
user is expected to come back), bandwidth allocation, smart
reminders, location based search, ...
Next-place prediction: predict next relevant place (not when).
Place: where a user spends substantial amount of time.
Mobility trace: sequence of relevant places visited in given time
period (each with arrival timestamp).
Derived from raw data: CDR (call details records), GPS traces,
Wi-Fi scans/logins.
Identification: is this a relevant place?
Recognition: which relevant place is being visited?
Two main approaches: geometry based and fingerprint based.
Approach | Pros | Cons |
---|---|---|
Geometry | Places very connected to real locations | Depends on accuracy and availability of positioning system |
Fingerprint | Does not depend on accuracy and availability of a positioning system | Radio env might change over time |
Define places in terms of locations, e.g. workplace is a rectangle around the office.
Problems with existing clustering algos: resulting clusters are too large, the algos are expensive, and they take the number of clusters as input (which we don't know).
Idea: use time-clustering algo to extract relevant places.
Algo has two parameters, a distance threshold d and a time threshold t:
- higher d -> fewer, larger, less precise places
- lower d -> smaller, more precise places, but might miss or fragment places
- higher t -> might miss relevant places
- lower t -> creates a high number of relevant places
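A minimal sketch of such a time-based clustering (function names and the exact merge rule are my own simplification): grow a cluster while points stay within d of its centroid, and emit a place when the stay lasted at least t.

```python
def extract_places(trace, d_max, t_min):
    """trace: list of (timestamp, x, y); returns centroids of places
    where the user stayed within d_max of the centroid for >= t_min."""
    def centroid(pts):
        return (sum(p[1] for p in pts) / len(pts),
                sum(p[2] for p in pts) / len(pts))

    places, cluster = [], [trace[0]]
    for p in trace[1:] + [None]:  # sentinel flushes the last cluster
        if p is not None:
            cx, cy = centroid(cluster)
            if ((p[1] - cx) ** 2 + (p[2] - cy) ** 2) ** 0.5 <= d_max:
                cluster.append(p)
                continue
        if cluster[-1][0] - cluster[0][0] >= t_min:  # stayed long enough
            places.append(centroid(cluster))
        if p is not None:
            cluster = [p]
    return places
```

Unlike generic clustering, this needs no cluster count up front and a brief pass through a location never forms a place.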
Idea: leverage and exploit pervasive RF-beacons (Wi-Fi).
Architecture (of PlaceSense):
- place signature (fingerprints) DB
- place service
- visit history DB
Steps: entrance (radio env is stabilizing) and departure
(changing).
Stable radio env? Concepts of familiar (the previous scan window contained it) and new (no previous window contained it) beacon:
- stable env: the scan window does not contain any new beacons.
- scan windows are accumulated until entrance is determined or a new beacon is found.
- s_max: number of stable scan windows before entrance is detected.
- to detect departure: new or missing familiar beacons.
- problems? Infrequent and "late" beacons. Solved by filtering out infrequent beacons, i.e. focusing on beacons with a high response rate.
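A much-simplified sketch of the entrance rule (PlaceSense's actual accumulation of familiar beacons and response-rate filtering is more involved; names here are mine):

```python
def detect_entrance(windows, s_max):
    """windows: list of sets of beacon IDs, one set per scan window.
    Return the index of the window at which entrance is detected,
    i.e. after s_max consecutive windows with no new beacons."""
    stable, prev = 0, None
    for i, w in enumerate(windows):
        # a window is stable if it introduces no beacon that was
        # absent from the immediately preceding window
        if prev is not None and not (w - prev):
            stable += 1
        else:
            stable = 0
        if stable >= s_max:
            return i
        prev = w
    return None
```

Departure detection would mirror this: start counting again once familiar beacons go missing or new ones appear.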
Data collection campaign: Nokia N95 phones (GPS and Wi-Fi), traces collected every 10 seconds, uploaded to a server every night. Three data collectors (users).
Precision: fraction of instances classified as relevant that are
actually relevant.
Recall: fraction of instances classified as relevant with respect
to all relevant instances.
PlaceSense performs clearly better than previous approaches.
Entrance/departure times are very precise.
PlaceSense is accurate at discovering places visited for short
duration (< 30 mins) or places where the device remains mobile.
Slotted mobility trace: trace that specifies the current place every s minutes (e.g. 15 mins). If more places are visited during a slot, the "longer" place is kept.
Next-place prediction: simple algo -> predict most often visited
place as next.
Residence time prediction: predict time spent in current place.
Next-slot place prediction: aims to predict relevant place in next
slot of slotted mobility trace.
Transition or self-transition: move or stay at same place in next
slot (again for slotted MT).
Next-slot transition prediction: predict if next is transition or
self-transition.
3.5 Markov Models (MM) and Hidden MM
Relevant places modelled as states.
Trace modelled as graph with transition probabilities.
Create transition matrix.
Prediction: best way -> predict always most probable next place.
Types:
- 1st order: conditional probability distribution of next relevant place depends only on the current place.
- 2nd order: ^^^ depends on current and previous place.
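A first-order model over a mobility trace can be sketched as follows (the trace values are made up):

```python
from collections import Counter, defaultdict

def fit_first_order(trace):
    """Count transitions place -> next place from a mobility trace."""
    counts = defaultdict(Counter)
    for cur, nxt in zip(trace, trace[1:]):
        counts[cur][nxt] += 1
    return counts

def predict_next(counts, current):
    """Predict the most probable next place given only the current one."""
    return counts[current].most_common(1)[0][0]
```

A second-order model would key the counters on the (previous, current) pair instead of the current place alone.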
Methodology:
- Dataset from Nokia Lausanne Data Collection Campaign (185 participants, 18 months), again with Nokia N95 phones (GPS, Wi-Fi, GSM, BT, motion, communication, user interactions)
- Extract mobility traces
- Derive Markov model and transition probs based on features
- Next-slot place prediction
Combinations of features.
- Spatial features: current location (P1) and current and prev (P2)
- Temporal features: day of week (D), hour of day (H), weekday or weekend (W)
Given slotted mobility trace: based on selected features (e.g. WH) filter away unrelated historic data (e.g. not weekend and not 6PM) then predict most probable next place based on probabilities for selected subset of data.
Remember precision and recall? Add also accuracy.
Accuracy: fraction of correctly classified instances over the
total number of classified instances.
Precision: fraction of instances classified as relevant that are
actually relevant.
Recall: fraction of instances classified as relevant with respect
to all relevant instances.
Harmonic mean (F1): (2 * precision * recall) / (precision + recall)
Summary: high next slot place prediction accuracy does not indicate good ability to capture transitions.
Predict if closed environment will be occupied to preheat it.
To save energy.
Main goal: remove need to manually program thermostat and improve
efficiency of home heating.
MissTime: time in which one is at home with a cold house.
PreHeat marks occupancy with a sequence of binary digits
indicating house occupancy (one bit per 15 mins time slot).
When house is empty it tries to predict when it will be occupied
again.
How? Search in past days for K-similar ones (Hamming distance),
then compute occupancy probability as a mean of the corresponding
slots of the K-similar past days. If probability is higher than
threshold, predict occupied.
Problems and -> solutions:
- human mobility differs between weekdays and weekends -> distinguish weekends and weekdays, use only the history of the matching category.
- early in the day there are only few data points (for computing K-similarity) -> add four hours of data from previous days.
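The K-similar-days step can be sketched as follows (slot length, K and the threshold come from the text above; the helper names and data are mine):

```python
def predict_occupied(history, today, slot, k=5, thresh=0.5):
    """history: past days as lists of 0/1 occupancy bits (one bit per
    15-min slot); today: bits observed so far (slots 0..slot-1).
    Predict occupancy of `slot` from the K most similar past days."""
    def hamming(day):
        # disagreement with today's partial occupancy sequence
        return sum(a != b for a, b in zip(day[:slot], today))
    nearest = sorted(history, key=hamming)[:k]
    # occupancy probability = mean of the corresponding slot
    p = sum(day[slot] for day in nearest) / len(nearest)
    return p >= thresh
```

In PreHeat the weekday/weekend split above simply restricts `history` to days of the matching category.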
Results: PreHeat
- is able to improve prediction accuracy in comparison to thermostats' schedule
- reduces MissTime by keeping gas consumption on same or lower level in comparison to thermostats' schedule
Two main players: Android and iOS.
Android is the most popular, open source, based on Java.
Android: open source, based on the Linux kernel, allows devs to write Java apps,
owned by the Open Handset Alliance.
Android Inc. co-founded by Andy Rubin, bought by Google in 2005
for $50M.
Late 2007: Android Developer Challenge. $10M challenge to develop Android apps.
API levels: integer value that uniquely identifies the framework
API revision offered by a version of the Android platform.
Apps must declare target API level (in the app manifest).
Components:
- Application framework
- Binder IPC (Inter-Process Communication): enables high level framework APIs to interact with Android system services.
- System services: modular, with focused components (e.g. window manager, search service, notification manager). Two groups: system and media.
- HAL (Hardware abstraction layer)
- Linux kernel (why Linux? Portability, features, security)
Security: multi-user system where each app is a different user.
Each app has its own VM, Linux process and unique user ID.
Android Runtime (ART): from 5.0, before Dalvik VM.
Dalvik: .class -[DEX compiler]-> .dex -[dexopt]-> .odex that
runs on the DVM.
Dalvik explicitly addressed Android's battery-powered devices with limited memory and processing power.
At the time there was no open-source Java VM.
ART: ... -> .dex -[dex2oat]-> .oat that runs on ART.
The executable .oat
is generated specifically for the target
device. ART is an ahead-of-time compiler. Compilation happens on
install, longer install times compared to Dalvik (which was JIT).
- Activities: a single screen of an app, has lifecycle (controlled by Activity Manager). States: starting, running, paused, stopped, destroyed.
- Services: run in background, no UI. Lifecycle states: starting, running, destroyed.
- Content providers: interfaces for sharing data between apps. To overcome fact that apps are sandboxed. Used for large data exchanges (otherwise use intents). Uses CRUD.
- Broadcast receivers: dormant code that gets activated when a specific event occurs. No visual representation. When triggered, they run code. E.g. SMS received, battery low.
- Intents: messages sent among building blocks. Explicit (receiver is specified) or implicit (in case of multiple, OS asks).
Features: Max-min, mean, variance, std dev, ...
Shares inferred data to social networks, e.g. "Walking to the office and talking to Bob".
Data: GPS, microphone (voice/noise/silence), accelerometer (activity recognition), BT (nearby friends).
Phone Audio Classifier:
- runs on the phone itself
- use DFT (Discrete Fourier Transform) of raw audio data
- extract DFT samples from freq range
- compute features and create feature vector
<mean, std_dev>
- linear classifier to decide talking/not talking
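The feature extraction can be sketched as below (band limits and frame size are made-up illustration values; a real implementation would use an FFT instead of a naive DFT):

```python
import cmath, math

def talking_features(samples, rate, f_lo=250.0, f_hi=2500.0):
    """Naive DFT over one audio frame; returns <mean, std_dev> of the
    spectral magnitudes inside the [f_lo, f_hi] band."""
    n = len(samples)
    mags = []
    for k in range(n // 2):
        if f_lo <= k * rate / n <= f_hi:  # keep bins in the voice band
            s = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
            mags.append(abs(s))
    mean = sum(mags) / len(mags)
    var = sum((m - mean) ** 2 for m in mags) / len(mags)
    return mean, math.sqrt(var)
```

The resulting <mean, std_dev> vector is then fed to the linear talking/not-talking classifier.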
Conversation classifier:
- runs on CenceMe server
- input audio primitives (talking/not talking)
- output: fact (conversation/not conversation)
- how? rolling window on primitives:
  - if 2/5 are talking -> conversation
  - if 4/5 are not talking -> not conversation
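The rolling-window rule can be sketched as below; reading the two thresholds as hysteresis (enter on 2/5 talking, leave on 4/5 not talking) is my interpretation of the slide:

```python
def classify_conversation(primitives):
    """primitives: list of booleans, True = a 'talking' frame.
    Returns the conversation/not-conversation state per frame."""
    state, out = "not conversation", []
    for i in range(len(primitives)):
        window = primitives[max(0, i - 4):i + 1]  # last 5 primitives
        talking = sum(window)
        if talking >= 2:
            state = "conversation"
        elif len(window) - talking >= 4:
            state = "not conversation"
        out.append(state)
    return out
```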
- intraclass variability: same activity but different input data (e.g. walking style)
- interclass variability: different activity but similar input data
- null class problem: activities that are irrelevant to the application
- class imbalance: typical in HAR; few classes occur often
- ground-truth annotation: time consuming, tedious and expensive
- data collection and experiment design: lack of standard data sets, different requirements
- variability in sensor characteristics: hardware errors, failures, changes in operating temperature, calibration issues, sensor positions (on different handsets)
- tradeoffs in accuracy, latency, processing power
- Raw data from several sensors
- Preprocessing (remove artifacts, filter)
- Segmentation (features are computed on segments)
- Feature extraction
- Classification
Annotation techniques: time diary, self recall, experience sampling (request user via notification), games (with a purpose).
- Linear discriminant: simple.
- Nearest neighbour: based on distance from clusters (often Euclidean distance).
- K-NN: consider the K nearest neighbours, classify as the majority among them.
- Weighted K-NN: add weights to influence of K-NN, e.g. inverse of distance.
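Weighted K-NN can be sketched as follows (the 2D points and inverse-distance weighting are illustrative):

```python
def weighted_knn(train, x, k=3):
    """train: list of ((x, y) point, label) pairs; classify x by
    summing inverse-distance weights of the k nearest neighbours."""
    def dist(p):
        return ((p[0] - x[0]) ** 2 + (p[1] - x[1]) ** 2) ** 0.5
    nearest = sorted(train, key=lambda pl: dist(pl[0]))[:k]
    scores = {}
    for p, label in nearest:
        # closer neighbours get more influence
        scores[label] = scores.get(label, 0.0) + 1.0 / (dist(p) + 1e-9)
    return max(scores, key=scores.get)
```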
Confusion matrix: columns are recognized activity, rows are ground truth. Normalize data, then show as coloured gradient. Best result would be white diagonal and all the rest black. Compute:
- accuracy: diagonal over total. Drawback: dominant class is often the null class.
- precision (focus on one class): true positive cell over whole column (true positive + false positives)
- recall (focus on one class): true positive cell over whole row (true positives + false negatives)
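These three metrics fall straight out of the confusion matrix (rows = ground truth, columns = recognized activity, as above):

```python
def cm_metrics(cm, cls):
    """Accuracy over the whole matrix; precision/recall for class `cls`."""
    n = sum(sum(row) for row in cm)
    accuracy = sum(cm[i][i] for i in range(len(cm))) / n
    precision = cm[cls][cls] / sum(row[cls] for row in cm)  # column sum
    recall = cm[cls][cls] / sum(cm[cls])                    # row sum
    return accuracy, precision, recall
```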
Training vs. test data (both are raw data with ground truth labels).
Training data: used to train the classifier.
Test data: used to evaluate the classifier.
We cannot use training data as test data.
Usually not much labelled data available.
Therefore: Cross-validation (leave-one-out).
- split data in subsets (folds)
- each fold is used for either training or test
- K-fold cross validation: in each run: k-1 folds used for training, 1 for testing.
Leave-one-x-out, where x =
- person: assess generalization to an unseen user (for user-independent systems)
- run: assess a user-specific system
- day: assess robustness over time
Comparing to other classifiers: baseline classifiers.
- random: pick a class at random; accuracy 1/n
- biased random: pick class i with probability p_i, computed as the prior probability of the input belonging to class i; accuracy sum_{i=1}^{n} p_i^2
- zero-rule (0R): always predict the class with the highest prior probability; accuracy max_i p_i
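The baseline accuracies follow directly from the class priors:

```python
def baseline_accuracies(priors):
    """priors: p_i for each of the n classes (must sum to 1)."""
    rand = 1.0 / len(priors)             # uniform random guess
    biased = sum(p * p for p in priors)  # guess class i with prob p_i
    zero_rule = max(priors)              # always the majority class
    return rand, biased, zero_rule
```

Note that the zero-rule baseline always dominates biased random, since max_i p_i >= sum_i p_i^2.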
Curse of dimensionality: more sensors -> more features.
Computational and memory complexity increases, classification
accuracy decreases after a certain level.
Census, Market research, ...
Methods: observation, interviews, questionnaires, databases, online surveys.
- Reality Mining data set: 100 participants, 9 months, Nokia
- Collected: call logs, BT proximity, cell tower IDs, app usage, phone status (e.g. charging or idle).
- Lausanne data collection campaign: 170+ participants, 4 years, Nokia N95. Collected: GPS, calls/sms, pictures/video count, BT scans (unique visible devices), WLAN scans (APs, cell towers), accelerometer, audio, app events, unique calendar entries, unique phone book entries.
- StudentLife data set: 83 participants, 10-week term. To assess mental health, academic performance, behavioural trends. Android and iPhone apps, plus Microsoft Band 2. Collected: physical activity, sleep, sociability, audio amplitude, location, phone lock/unlock events, biological data (e.g. HR).
Goals:
- Utility: how well are the desired functions performed?
- Usability: how easy is it for users to reach the desired functions?
- Likeability: how enjoyable is the use of the product?
Principles:
- Visibility: can the user tell by looking what the state of the device is and what the alternatives for action are?
- Mapping: is it possible to determine the relationships between actions and results, between controls and effects?
- Consistency: is the user able to predict how actions affect the system?
- Feedback: does the user receive full and continuous feedback about the results of actions?
Constraints: mobile, small, handheld, touch, few physical buttons.
Mobile mindset:
- microtasks
- location awareness
- remedy to boredom: not only games but explorable and fun apps
- Physical feel: mimic familiar rules of objects (inertia, elasticity, friction). E.g. haptics.
- Rule of thumb: organize the UI for the thumb's comfort zone. Lesser used controls can be outside of the zone.
- Size matters: big buttons win, minimum tap area (48dp square, spacing 8dp).
- Guidelines: important information at top of screen. Be scroll skeptic, use nesting when needed.
Fundamental technique to improve app performance.
Application prefetching vs. content prefetching.
Why? Latency, energy, cost.
Apps prefetching:
- FALCON: most recently used apps, long training period
- Xu et al: user context and shared apps usage patterns
- PREPP: partial matching
Content prefetching:
- Informed Mobile Prefetching (IMP): application informs IMP, cost-benefit function used to prioritize content prefetch
- EarlyBird: prefetching of links and feeds, twitter Android client
Problem: apps increasingly depend on dynamic content, latency increases, launching or refreshing apps can take tens of seconds.
Solution: app usage prediction algo, time till usage temporal model, parallel prefetch on screen unlock events.
Three parts:
- App prediction algo: given the pattern of app usage, predict the most probable next app. Similar to text compression algos, uses prediction by partial matching.
- Temporal modeling: tradeoff between freshness and cost. Uses the app usage history to calculate the CDF of the time till usage, P(TTU < delta_t).
- Decision engine: executed at each app change event. Determines the next likely app e and constructs the temporal model for e. Output: time delta_t to let elapse from the last app change event before prefetching application e. Optimization goal: maximize freshness while minimizing cost.
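One way to read the decision step: pick the longest wait delta_t whose risk of the user launching the app before the prefetch, estimated from the empirical TTU CDF, stays within a budget. This is my simplification for illustration, not PREPP's actual cost-benefit optimization:

```python
def choose_prefetch_delay(ttu_samples, risk=0.1):
    """ttu_samples: historical time-till-usage values (seconds) for the
    predicted app. Return the largest delay dt such that the empirical
    P(TTU <= dt) stays within `risk`."""
    xs = sorted(ttu_samples)
    cut = int(risk * len(xs))  # number of launches we may arrive late for
    return xs[cut - 1] if cut > 0 else 0.0
```

Waiting longer makes the content fresher at use time (and cheaper, fewer wasted prefetches) but raises the chance the app is opened before the prefetch completes.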
Median freshness: 3 minutes.
Problem: content retrieval for mobile social networks is often too slow, too energy-hungry or too expensive.
Solution: inference-based prefetching of mobile content. Minimize delay considering constraints (e.g. energy).
EarlyBird:
- decides judiciously what to prefetch based on users' behaviour (they do not value all subscriptions equally)
- considers access patterns to determine what and when to prefetch
Two complementary prefetching components: at launch and on Wi-Fi.
Components adjusted to fit user-specific cellular data usage and
battery consumption budgets.
Evaluation: 6000 users, 30 million tweets.
Results: Delay reduction 62%, at the cost of 3% battery and no more
than 40MB/month.
Problem: recruiting participants for studies often cumbersome and
time consuming and costly.
Solution: leverage online ads to recruit and engage with participants. Once the ad is clicked, the experiment runs in the ad itself, in the browser, or in a native mobile app.
Results: low click rates in general; click rates for experiments requiring native apps were very low.
Proxy based url rewriting to decrease mobile browsing bandwidth.
Enable mobile devices to capture images also in the darkness or across obstacles.
- Consider ethical implications of experiment.
- Get informed consent from all subjects.
- Protect the privacy of your subjects.