Mobile Computing 2018

1. Mobile devices, sensors and actuators

1.1 Sensors

A device that can observe the environment.
E.g.: microphone, camera, accelerometer, gyroscope, barometer, proximity sensor, ambient light sensor, hall sensor, touch sensor on the display, all antennas (Wi-Fi, NFC, GPS, Digital compass, ...).

1.2 Actuators

A device that can change the environment.
E.g.: speaker, IR laser, flash, display.

1.3 Sensors and actuators in mobile computing

Choice depends on the application. It is important to understand what the sensor measures, how it does it, and what it transmits as a resulting signal.

What can be sensed? Location, lighting conditions, air quality, density of people around, noise, ...

Why use mobile phones as sensing devices?

  • A huge quantity of sensing points
  • No deployment costs
  • High availability
  • Personal data
  • Higher level information by combining and processing raw sensor data

When getting a new sensor:

  • understand what it measures, how it does it (accuracy and range, limitations), understand your options
  • use data sheets, notes, documentation, ...

2. Localization

Location is specified in lat/long. Different types: cell-tower based, GPS, Wi-Fi based.
Usually used in combination for higher accuracy.

Characteristics:

  • absolute vs. relative (distance to others)
  • points vs. intervals (areas)
  • physical vs. symbolic (e.g. Empire State Building, ...)
  • continuous vs. on demand

Range based vs. range free localization:

  • range based: uses estimates of distances. E.g.: GPS, cell-tower based.
  • range free: uses other information rather than distance (e.g. connectivity). E.g.: Wi-Fi based.

2.1 Cell-tower based

Knowing the positions of the cell towers and the distances from the mobile device to them, the position can be computed. Best accuracy with three towers.
In reality the distance estimate from a cell tower has an error range of 10-100 meters, so we get an area of possible locations.

2.1.1 Ranging techniques

How to get distance between cell tower and mobile device?
Using radio-frequency signals.

  • Time of arrival (ToA): simply d = v * (t_rx - t_tx).
    Needs sync between tx and rx within nanoseconds, which is difficult.
  • 2-way ToA: d = v * [(t4 - t1) - (t3 - t2)]/2, where t1-t2 is the first transmission and t3-t4 the transmission back. In practice unknown delays might occur, so large errors can happen.
  • Time difference of arrival (TDoA): d = [(v_a * v_b)/(v_a - v_b)] * (t4 - t2 - t_wait), with v_a and v_b being the speeds of two transmissions, both in the same direction, happening at t1-t2 and t3-t4, and t_wait = t3 - t1.
    Requires two different types of transceivers (with different speeds) on both transmitter and receiver. Drawbacks: complexity, cost.
  • Received Signal Strength Indicator (RSSI):
    Power of received signal: P_rx(d) ~= beta * P_tx * 1/(d^alpha), where alpha is the path loss exponent (depends on the medium) and beta is a factor that captures other attenuation effects (e.g. multipath).
    RSSI = 10 * log_10 (P_rx / 1mW)
    Also, P_rx = 10^(RSSI/10) mW.
    The closer the transmitter, the higher the RSSI.
    A factor of 2 in received power (mW) translates to a 3 dB difference in RSSI (see the sketch after this list).
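
A minimal sketch of these conversions and of log-distance ranging, assuming the path-loss model above (function names and example values are my own):

```python
import math

def rssi_from_power(p_rx_mw):
    """RSSI in dBm from received power in mW: RSSI = 10 * log10(P_rx / 1 mW)."""
    return 10 * math.log10(p_rx_mw)

def power_from_rssi(rssi_dbm):
    """Received power in mW from RSSI in dBm: P_rx = 10^(RSSI/10) mW."""
    return 10 ** (rssi_dbm / 10)

def distance_from_rssi(rssi_dbm, p_tx_mw, alpha=2.0, beta=1.0):
    """Invert P_rx ~= beta * P_tx / d^alpha to estimate the distance d."""
    p_rx_mw = power_from_rssi(rssi_dbm)
    return (beta * p_tx_mw / p_rx_mw) ** (1 / alpha)

# Halving the received power lowers the RSSI by ~3 dB:
print(rssi_from_power(1.0) - rssi_from_power(0.5))  # ~3.01
# Example ranging: p_tx = 1 mW, RSSI = -40 dBm, free-space-like alpha = 2:
print(distance_from_rssi(-40, p_tx_mw=1.0))         # 100.0 (meters, idealized)
```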

2.2 GPS (Global Positioning System)

Since 1995, freely accessible by any GPS receiver, maintained by the US government.
Supported by 24 satellites around Earth.

Accuracy depends on "tier".

  • SPS (Standard Positioning Service): ~3m
  • PPS (Precise PS): highly accurate and secure

Also, GPS satellites are precisely time-synchronized, to nanosecond precision.

2.3 Wi-Fi based

Steps:

  • Mobile device records visible APs: by requesting their information from the APs (this is called probing), or the information is sent by the APs when beaconing (usually every 100 TU, 1 TU = 1024 µs)
  • Send the list of AP IDs to a Wi-Fi localization service server. The collected and sent information is called a fingerprint and includes MAC address, SSID, RSSI, and possibly encryption type and channel.
    The fingerprint can differ between the different locations where an AP is visible.
  • Server processes the fingerprint and returns the location to the mobile device: there are databases of millions of APs and their fingerprints, collected through wardriving (a toy lookup is sketched below).
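
A toy sketch of a fingerprint record and a naive server-side lookup: estimate the position as the RSSI-weighted centroid of the known AP positions. The record fields and AP database are invented, not any real service's API:

```python
from dataclasses import dataclass

@dataclass
class APObservation:
    mac: str    # BSSID of the AP
    ssid: str
    rssi: int   # dBm

# Server-side database: AP MAC -> known (lat, lon) of that AP.
AP_DB = {
    "aa:bb:cc:dd:ee:01": (46.0037, 8.9511),
    "aa:bb:cc:dd:ee:02": (46.0041, 8.9520),
}

def locate(scan):
    """scan: list of APObservation; returns (lat, lon) or None."""
    known = [(AP_DB[o.mac], o.rssi) for o in scan if o.mac in AP_DB]
    if not known:
        return None
    # Convert dBm to mW so that stronger (likely closer) APs weigh more.
    weights = [10 ** (rssi / 10) for _, rssi in known]
    total = sum(weights)
    lat = sum(w * pos[0] for (pos, _), w in zip(known, weights)) / total
    lon = sum(w * pos[1] for (pos, _), w in zip(known, weights)) / total
    return lat, lon
```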

How to avoid fingerprinting my own WLAN? Add _nomap at the end of your SSID.

2.4 Comparison of different localization technologies

| Tech | Pros | Cons |
| --- | --- | --- |
| GPS | Very accurate, free, always available, requires just a receiver | Inaccurate indoors and in cities, power hungry |
| Cell | No additional HW | Inaccurate, requires 3rd-party services |
| Wi-Fi | Quite accurate, works also indoors and in cities | Requires 3rd-party services, partial coverage |

3. Human mobility prediction

Mobility data used for understanding consumer behaviour.
Why prediction? Traffic info, home automation (based on when the user is expected to come back), bandwidth allocation, smart reminders, location based search, ...

3.1 Basics

Next-place prediction: predict next relevant place (not when).
Place: where a user spends substantial amount of time.

Mobility trace: sequence of relevant places visited in given time period (each with arrival timestamp).
Derived from raw data: CDR (call detail records), GPS traces, Wi-Fi scans/logins.

3.1.1 Identification and recognition of relevant places

Identification: is this a relevant place?
Recognition: which relevant place is being visited?

Two main approaches: geometry based and fingerprint based.

| Approach | Pros | Cons |
| --- | --- | --- |
| Geometry | Places closely tied to real locations | Depends on accuracy and availability of the positioning system |
| Fingerprint | Does not depend on accuracy and availability of a positioning system | Radio environment might change over time |

3.2 Geometry based approach (Kang 2005)

Define places in terms of locations, e.g. workplace is a rectangle around the office.

Problems with existing clustering algorithms: resulting clusters are too large, the algorithms are expensive, and they take the number of clusters as input (which we don't know).

Idea: use a time-based clustering algorithm to extract relevant places (a sketch follows the parameter list below).
The algorithm has two parameters, a distance threshold d and a time threshold t:

  • higher d -> fewer, larger, less precise places
  • lower d -> smaller, more precise places, but might miss or fragment places
  • higher t -> might miss relevant places
  • lower t -> create high number of relevant places
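
A simplified sketch under these definitions, assuming GPS fixes as (timestamp_s, x, y) tuples (my reading of the approach, not Kang's exact pseudocode):

```python
# Grow a cluster while consecutive fixes stay within distance d of its
# centroid; emit it as a relevant place if the stay lasted at least time t.
def time_cluster(fixes, d, t):
    """fixes: list of (timestamp_s, x, y); returns list of (x, y) places."""
    def centroid(c):
        return (sum(p[1] for p in c) / len(c), sum(p[2] for p in c) / len(c))

    def close(fix, c):
        cx, cy = centroid(c)
        return ((fix[1] - cx) ** 2 + (fix[2] - cy) ** 2) ** 0.5 <= d

    places, cluster = [], []
    for fix in fixes:
        if not cluster or close(fix, cluster):
            cluster.append(fix)
        else:
            # Moved away: keep the cluster only if we stayed long enough.
            if cluster[-1][0] - cluster[0][0] >= t:
                places.append(centroid(cluster))
            cluster = [fix]
    if cluster and cluster[-1][0] - cluster[0][0] >= t:
        places.append(centroid(cluster))
    return places
```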

3.3 Fingerprint based approach: PlaceSense (Kim 2009)

Idea: leverage pervasive RF beacons (Wi-Fi).
Architecture (of PlaceSense):

  • place signature (fingerprints) DB
  • place service
  • visit history DB

Two detection steps: entrance (the radio environment is stabilizing) and departure (it is changing).
When is the radio environment stable? Uses the concepts of familiar beacon (the previous scan window contained it) and new beacon (no previous window contained it):

  • stable environment: the scan window does not contain any new beacons.
  • scan windows are accumulated until an entrance is determined or a new beacon is found.
  • s_max: number of stable scan windows required before an entrance is detected.
  • to detect departure: new or missing familiar beacons.
  • problems? Infrequent and "late" beacons. Solved by filtering infrequent beacons out, i.e. focusing on beacons with a high response rate (entrance detection is sketched below).
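
A minimal sketch of entrance detection under these concepts, treating "new" as "never seen in any previous window" (response-rate filtering and departure detection are omitted):

```python
def detect_entrance(scan_windows, s_max=3):
    """scan_windows: iterable of sets of beacon IDs.
    Returns the index of the window at which an entrance is detected,
    or None if no entrance is found."""
    seen = set()    # beacons observed in any previous window
    stable = 0      # consecutive windows without new beacons
    for i, window in enumerate(scan_windows):
        if window - seen:   # new beacons -> radio environment still changing
            stable = 0
        else:
            stable += 1
            if stable >= s_max:
                return i
        seen |= window
    return None

# Entrance once the environment stops introducing new beacons:
windows = [{"a"}, {"a", "b"}, {"a", "b"}, {"b"}, {"a", "b"}]
print(detect_entrance(windows, s_max=3))  # 4
```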

Data collection campaign: Nokia N95 phones (GPS and Wi-Fi), traces collected every 10 seconds, uploaded to a server every night. Three data collectors (users).

Precision: fraction of instances classified as relevant that are actually relevant.
Recall: fraction of instances classified as relevant with respect to all relevant instances.

PlaceSense performs clearly better than previous approaches.
Entrance/departure times are very precise.
PlaceSense is accurate at discovering places visited for short duration (< 30 mins) or places where the device remains mobile.

3.4 Mobility prediction tasks

Slotted mobility trace: a trace that specifies the current place every s minutes (e.g. 15 mins). If multiple places are visited during a slot, the place with the longer stay is kept.

Next-place prediction: simple algorithm -> predict the most often visited place as the next one.
Residence time prediction: predict time spent in current place.
Next-slot place prediction: aims to predict relevant place in next slot of slotted mobility trace.
Transition or self-transition: move or stay at same place in next slot (again for slotted MT).
Next-slot transition prediction: predict if next is transition or self-transition.

3.5 Markov Models (MM) and Hidden MM

Relevant places modelled as states.
Trace modelled as graph with transition probabilities.
Create transition matrix.

Prediction: always predict the most probable next place.

Types:

  • 1st order: the conditional probability distribution of the next relevant place depends only on the current place (a minimal 1st-order sketch follows this list).
  • 2nd order: the distribution depends on the current and the previous place.
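
A minimal sketch of a 1st-order Markov model: count transitions in a trace and always predict the most frequent successor of the current place (place names invented):

```python
from collections import Counter, defaultdict

def fit_transitions(trace):
    """trace: sequence of place IDs; returns {place: Counter of next places}."""
    counts = defaultdict(Counter)
    for cur, nxt in zip(trace, trace[1:]):
        counts[cur][nxt] += 1
    return counts

def predict_next(counts, current):
    """Most probable next place after `current`, or None if unseen."""
    return counts[current].most_common(1)[0][0] if counts[current] else None

trace = ["home", "work", "gym", "home", "work", "home", "work", "gym"]
model = fit_transitions(trace)
print(predict_next(model, "work"))  # 'gym' (2 of the 3 transitions from work)
```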

3.6 Influence of temporal and spatial features on human mobility (Baumann 2013)

Methodology:

  • Dataset from the Nokia Lausanne Data Collection Campaign (185 participants, 18 months), again with Nokia N95 phones (GPS, Wi-Fi, GSM, BT, motion, communication, user interactions)
  • Extract mobility traces
  • Derive Markov model and transition probs based on features
  • Next-slot place prediction

Combinations of features:

  • Spatial features: current location (P1) and current and prev (P2)
  • Temporal features: day of week (D), hour of day (H), weekday or weekend (W)

Given a slotted mobility trace: based on the selected features (e.g. WH), filter away unrelated historic data (e.g. not weekend or not 6 PM), then predict the most probable next place based on the probabilities for the selected subset of data. A minimal sketch follows.
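
A minimal sketch of this filtering, assuming each historical slot is stored with its feature values and the place observed in the following slot (the field names are invented):

```python
from collections import Counter

def predict_next_slot(history, context, features):
    """history: list of dicts, e.g.
         {"W": "weekend", "H": 18, "D": "Sat", "P1": "home", "next": "gym"}
       context: dict with the current values of the same features
       features: the features to condition on, e.g. ("W", "H")"""
    matching = [h["next"] for h in history
                if all(h[f] == context[f] for f in features)]
    return Counter(matching).most_common(1)[0][0] if matching else None
```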

Remember precision and recall? Add also accuracy.
Accuracy: fraction of correctly classified instances over the total number of classified instances.
Precision: fraction of instances classified as relevant that are actually relevant.
Recall: fraction of instances classified as relevant with respect to all relevant instances.
Harmonic mean (F1): (2 * precision * recall) / (precision + recall)

Summary: high next slot place prediction accuracy does not indicate good ability to capture transitions.

3.7 Occupancy prediction: PreHeat (Scott 2011)

Predict if closed environment will be occupied to preheat it.
To save energy.
Main goal: remove need to manually program thermostat and improve efficiency of home heating.
MissTime: time in which one is at home with a cold house.

PreHeat marks occupancy with a sequence of binary digits (one bit per 15-minute time slot).
When the house is empty it tries to predict when it will be occupied again.
How? Search the past days for the K most similar ones (by Hamming distance on the occupancy bits), then compute the occupancy probability as the mean of the corresponding slots of the K similar past days. If the probability is higher than a threshold, predict occupied (see the sketch below).
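
A sketch of this core idea, assuming full binary occupancy vectors for past days and today's bits up to the current slot (the parameter defaults are my own):

```python
def predict_occupied(past_days, today_so_far, slot, k=5, threshold=0.5):
    """past_days: list of full binary occupancy vectors (one bit per 15 min)
       today_so_far: today's occupancy bits up to the current slot
       slot: index of the future slot to predict"""
    n = len(today_so_far)
    def hamming(day):
        # Compare each past day with today on the slots seen so far.
        return sum(a != b for a, b in zip(day[:n], today_so_far))
    k_similar = sorted(past_days, key=hamming)[:k]
    # Occupancy probability: mean of the corresponding slot in similar days.
    prob = sum(day[slot] for day in k_similar) / len(k_similar)
    return prob >= threshold
```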

Problems and -> solutions:

  • human mobility differs between weekdays and weekends -> distinguish weekends and weekdays, use only the history of the matching category.
  • early in the day there are only few data points (for computing K-similarity) -> add four hours of data from the previous days.

Results: PreHeat

  • improves prediction accuracy in comparison to a programmed thermostat schedule
  • reduces MissTime while keeping gas consumption at the same or a lower level in comparison to a programmed thermostat schedule

4. Mobile phone programming

Two main players: Android and iOS.
Android is the most popular; it is open source and based on Java.

4.1 History of Android

Android: open source, based on the Linux kernel, allows developers to write Java apps; stewarded by the Open Handset Alliance.
Android Inc. co-founded by Andy Rubin, bought by Google in 2005 for $50M.

Late 2007: Android Developer Challenge. $10M challenge to develop Android apps.

4.2 Android basics

API levels: integer value that uniquely identifies the framework API revision offered by a version of the Android platform.
Apps must declare target API level (in the app manifest).

4.3 Android system architecture

Components:

  • Application framework
  • Binder IPC (Inter-Process Communication): enables high level framework APIs to interact with Android system services.
  • System services: modular, with focused components (e.g. window manager, search service, notification manager). Two groups: system and media.
  • HAL (Hardware abstraction layer)
  • Linux kernel (why Linux? Portability, features, security)

Security: multi-user system where each app is a different user.
Each app has its own VM, Linux process and unique user ID.

Android Runtime (ART): used from Android 5.0; before that, the Dalvik VM.

Dalvik: .class -[DEX compiler]-> .dex -[dexopt]-> .odex, which runs on the DVM.
Dalvik explicitly addresses the battery-powered operation and the limited memory and processing power of Android devices.
At the time there was no open source Java VM.

ART: ... -> .dex -[dex2oat]-> .oat, which runs on ART.
The executable .oat is generated specifically for the target device. ART is an ahead-of-time compiler: compilation happens at install time, so install times are longer compared to Dalvik (which was JIT).

4.4 Main building blocks of Android

  • Activities: a single screen of an app, has lifecycle (controlled by Activity Manager). States: starting, running, paused, stopped, destroyed.
  • Services: run in background, no UI. Lifecycle states: starting, running, destroyed.
  • Content providers: interfaces for sharing data between apps. To overcome fact that apps are sandboxed. Used for large data exchanges (otherwise use intents). Uses CRUD.
  • Broadcast receivers: dormant code that gets activated when a specific event occurs. No visual representation. When triggered, they run code. E.g. SMS received, battery low.
  • Intents: messages sent among building blocks. Explicit (receiver is specified) or implicit (in case of multiple, OS asks).

5. Human Activity recognition (HAR)

Features: Max-min, mean, variance, std dev, ...

5.1 CenceMe

Shares inferred data to social networks, e.g. "Walking to the office and talking to Bob".

Data: GPS, microphone (voice/noise/silence), accelerometer (activity recognition), BT (nearby friends).

Phone Audio Classifier:

  • runs on the phone itself
  • uses the DFT (Discrete Fourier Transform) of the raw audio data
  • extracts DFT samples from a frequency range of interest
  • computes features and creates the feature vector <mean, std_dev>
  • a linear classifier decides talking/not talking (toy sketch below)
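
A toy sketch of this pipeline: DFT of a raw frame, magnitudes in a frequency band of interest, a <mean, std_dev> feature vector, and a linear decision rule. The band, weights and bias are placeholders, not CenceMe's actual values:

```python
import numpy as np

def talking_features(frame, rate=8000, band=(250, 600)):
    """frame: 1-D array of audio samples; returns (mean, std) of the DFT
    magnitudes inside the chosen frequency band."""
    spectrum = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / rate)
    mags = spectrum[(freqs >= band[0]) & (freqs <= band[1])]
    return mags.mean(), mags.std()

def is_talking(frame, w=(1.0, -0.5), bias=-10.0):
    """Linear classifier: sign(w . features + bias); weights are placeholders."""
    mean, std = talking_features(frame)
    return w[0] * mean + w[1] * std + bias > 0

rng = np.random.default_rng(0)
frame = rng.normal(size=1024)   # stand-in for a raw audio frame
print(is_talking(frame))        # depends on the frame's spectral shape
```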

Conversation classifier:

  • runs on CenceMe server
  • input audio primitives (talking/not talking)
  • output: fact (conversation/not conversation)
  • how? a rolling window over the primitives (sketched below):
    if 2/5 are talking -> conversation; if 4/5 are not talking -> not conversation
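
A sketch of the rolling-window rule on the audio primitives, using the 2/5 and 4/5 thresholds above:

```python
def classify_conversation(primitives, window=5):
    """primitives: list of booleans (True = talking).
    Returns a list with the inferred state after each primitive."""
    states, in_conversation = [], False
    for i in range(len(primitives)):
        recent = primitives[max(0, i - window + 1):i + 1]
        talking = sum(recent)
        if talking >= 2:                  # 2/5 talking -> conversation
            in_conversation = True
        elif len(recent) - talking >= 4:  # 4/5 not talking -> not conversation
            in_conversation = False
        states.append(in_conversation)
    return states
```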

5.2 Challenges in HAR

  • intraclass variability: same activity but different input data (e.g. walking style)
  • interclass variability: different activity but similar input data
  • null class problem: activities that are irrelevant to the application
  • class imbalance: typical in HAR; a few classes occur very often
  • ground-truth annotation: time consuming, tedious and expensive
  • data collection and experiment design: lack of standard data sets, different requirements
  • variability in sensor characteristics: hardware errors, failures, changes in operating temperature, calibration issues, sensor positions (on different handsets)
  • tradeoffs in accuracy, latency, processing power

5.3 Activity recognition chain

  • Raw data from several sensors
  • Preprocessing (remove artifacts, filter)
  • Segmentation (features are computed on segments)
  • Feature extraction
  • Classification

Annotation techniques: time diary, self-recall, experience sampling (request the user via notification), games with a purpose.

5.4 Classification methods

  • Linear discriminant: simple.
  • Nearest neighbour: based on distance from clusters (often Euclidean distance).
  • K-NN: consider the K nearest neighbours, classify as the majority class among them.
  • Weighted K-NN: weight each neighbour's influence, e.g. by the inverse of its distance (see the sketch below).
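
A minimal sketch of (weighted) K-NN on feature vectors, using Euclidean distance and inverse-distance weights:

```python
import math
from collections import defaultdict

def knn_classify(train, x, k=3, weighted=True):
    """train: list of (feature_vector, label) pairs; x: feature vector."""
    def dist(a, b):
        return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))
    # Take the k nearest training samples.
    neighbours = sorted(train, key=lambda sample: dist(sample[0], x))[:k]
    votes = defaultdict(float)
    for fv, label in neighbours:
        # Weighted: closer neighbours count more; unweighted: one vote each.
        votes[label] += 1 / (dist(fv, x) + 1e-9) if weighted else 1.0
    return max(votes, key=votes.get)
```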

5.5 Evaluation of classifiers

Confusion matrix: columns are the recognized activity, rows are the ground truth. Normalize the data, then show as a colour gradient. The best result would be a white diagonal and all the rest black. Compute (a sketch follows the list):

  • accuracy: diagonal over total. Drawback: the dominant class is often the null class.
  • precision (focus on one class): true positive cell over the whole column (true positives + false positives)
  • recall (focus on one class): true positive cell over the whole row (true positives + false negatives)
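
A sketch of these metrics computed from a raw-count confusion matrix (rows = ground truth, columns = recognized), including the F1 score defined in section 3.6; the example matrix is made up:

```python
def metrics(cm, cls):
    """cm: square list-of-lists of counts; cls: index of the class of interest."""
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[i][i] for i in range(len(cm))) / total
    col = sum(cm[r][cls] for r in range(len(cm)))   # TP + FP (whole column)
    row = sum(cm[cls])                              # TP + FN (whole row)
    precision = cm[cls][cls] / col if col else 0.0
    recall = cm[cls][cls] / row if row else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, precision, recall, f1

cm = [[50, 10],   # ground-truth class 0
      [ 5, 35]]   # ground-truth class 1
print(metrics(cm, cls=1))  # approx. (0.85, 0.778, 0.875, 0.824)
```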

Training vs. test data (both are raw data with ground truth labels).
Training data: used to train the classifier.
Test data: used to evaluate the classifier.
We cannot use training data as test data.

Usually not much labelled data available.
Therefore: Cross-validation (leave-one-out).

  • split the data into subsets (folds)
  • each fold is used for either training or test
  • K-fold cross validation: in each run, k-1 folds are used for training, 1 for testing (see the sketch below).
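
A minimal K-fold cross-validation skeleton, assuming caller-supplied `train_fn` and `eval_fn` (both hypothetical):

```python
def k_fold_cv(data, k, train_fn, eval_fn):
    """data: list of labelled samples; train_fn(train) -> model;
       eval_fn(model, test) -> accuracy. Returns the mean accuracy."""
    folds = [data[i::k] for i in range(k)]   # simple interleaved split
    scores = []
    for i in range(k):
        test = folds[i]                      # each fold is the test set once
        train = [s for j, f in enumerate(folds) if j != i for s in f]
        scores.append(eval_fn(train_fn(train), test))
    return sum(scores) / k
```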

Leave-one-x-out, with x =

  • person: assess generalization to an unseen user (for user-independent system)
  • run: assess user-specific system
  • day: assess robustness over time

Comparing to other classifiers: baseline classifiers (a worked example follows the list).

  • random: pick a class at random; accuracy 1/n
  • biased random: pick class i with probability p_i, computed as the prior probability of the input belonging to class i; accuracy sum_{i=1}^{n} p_i^2
  • zero-rule (0R): always predict the class with the highest prior probability; accuracy max_i p_i
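
A worked check of the three baseline accuracies for made-up class priors:

```python
priors = [0.6, 0.3, 0.1]   # prior probability p_i of each class i

random_acc = 1 / len(priors)                     # 1/n          = 0.333...
biased_random_acc = sum(p * p for p in priors)   # sum_i p_i^2  = 0.46
zero_rule_acc = max(priors)                      # max_i p_i    = 0.6

print(random_acc, biased_random_acc, zero_rule_acc)
```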

5.6 Multi-sensor systems

Curse of dimensionality: more sensors -> more features.
Computational and memory complexity increase, and classification accuracy decreases beyond a certain number of features.

6. Data collection

Census, Market research, ...

Methods: observation, interviews, questionnaires, databases, online surveys.

6.1 Examples of data collection campaigns

  • Reality Mining data set: 100 participants, 9 months, Nokia phones. Collected: call logs, BT proximity, cell tower IDs, app usage, phone status (e.g. charging or idle).
  • Lausanne data collection campaign: 170+ participants, 4 years, Nokia N95. Collected: GPS, calls/SMS, picture/video counts, BT scans (unique visible devices), WLAN scans (APs, cell towers), accelerometer, audio, app events, unique calendar entries, unique phone book entries.
  • StudentLife data set: 83 participants, one 10-week term. To assess mental health, academic performance, behavioural trends. Android and iPhone apps, plus Microsoft Band 2. Collected: physical activity, sleep, sociability, audio amplitude, location, phone lock/unlock events, biological data (e.g. HR).

7. User interface

7.1 Design of UIs: goals and principles

Goals:

  • Utility: how well are the desired functions performed?
  • Usability: how easy is it for users to reach the desired functions?
  • Likeability: how enjoyable is the use of the product?

Principles:

  • Visibility: can the user tell by looking what the state of the device is and what the alternatives for action are?
  • Mapping: is it possible to determine the relationships between actions and results, between controls and effects?
  • Consistency: is the user able to predict how actions affect the system?
  • Feedback: does the user receive full and continuous feedback about the results of actions?

7.2 Mobile UIs

Constraints: mobile, small, handheld, touch, few physical buttons.

Mobile mindset:

  • Microtasks
  • Location awareness
  • Remedy to boredom: not only games but explorable and fun apps
  • Physical feel: mimic familiar rules of objects (inertia, elasticity, friction). E.g. haptics.
  • Rule of thumb: organize the UI for the thumb's comfort zone. Lesser-used controls can be outside the zone.
  • Size matters: big buttons win; minimum tap area 48dp square, spacing 8dp.
  • Guidelines: important information at the top of the screen. Be a scroll skeptic; use nesting when needed.

8. Mobile data prefetching

8.1 Basics

Fundamental technique to improve app performance.
Application prefetching vs. content prefetching.
Why? Latency, energy, cost.

App prefetching:

  • FALCON: most recently used apps, long training period
  • Xu et al.: user context and shared app usage patterns
  • PREPP: prediction by partial matching

Content prefetching:

  • Informed Mobile Prefetching (IMP): application informs IMP, cost-benefit function used to prioritize content prefetch
  • EarlyBird: prefetching of links and feeds, Twitter Android client

8.2 Example: PREPP

Problem: apps increasingly depend on dynamic content; latency increases; launching or refreshing apps can take tens of seconds.

Solution: an app usage prediction algorithm, a time-till-usage (TTU) temporal model, and parallel prefetching on screen unlock events.

Three parts:

  • App prediction algorithm: given the pattern of app usage, predict the most probable next app. Similar to text compression algorithms: uses prediction by partial matching.
  • Temporal modeling: tradeoff between freshness and cost. Uses the app usage history to estimate the CDF P(TTU < delta_t).
  • Decision engine: executed at each app change event. Determines the next likely app e and constructs the temporal model for e. Output: the time delta_t to let elapse from the last app change event before prefetching application e. Optimization goal: maximize freshness while minimizing cost (see the sketch after this list).
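
A hedged sketch of the temporal-model idea: from past time-till-usage samples for app e, pick the delta_t that balances staleness at usage against wasted prefetches. The objective below is a toy stand-in for PREPP's actual cost-benefit function:

```python
def choose_delay(ttu_samples, candidate_delays, waste_cost=5.0):
    """ttu_samples: past time-till-usage values for app e (e.g. minutes).
    Returns the delta_t minimizing expected staleness at usage plus a
    penalty for prefetching after the app was already used."""
    n = len(ttu_samples)
    def expected_cost(dt):
        # Staleness when the app is used only after the prefetch at dt ...
        staleness = sum(max(0.0, t - dt) for t in ttu_samples) / n
        # ... plus a penalty weighted by the empirical CDF P(TTU < dt),
        # i.e. the chance that the prefetch comes too late.
        too_late = sum(1 for t in ttu_samples if t < dt) / n
        return staleness + waste_cost * too_late
    return min(candidate_delays, key=expected_cost)

print(choose_delay([2, 5, 30, 45], candidate_delays=[0, 1, 5, 15, 30]))  # 30
```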

Median freshness: 3 minutes.

8.3 Example: EarlyBird

Problem: content retrieval for mobile social networks is often too slow, too energy-hungry or too expensive.

Solution: inference-based prefetching of mobile content. Minimize delay considering constraints (e.g. energy).

EarlyBird:

  • decides judiciously what to prefetch based on user behaviour (users do not value all subscriptions equally)
  • considers access patterns to determine what and when to prefetch

Two complementary prefetching components: at launch and on Wi-Fi.
Components adjusted to fit user-specific cellular data usage and battery consumption budgets.

Evaluation: 6000 users, 30 million tweets.
Results: Delay reduction 62%, at the cost of 3% battery and no more than 40MB/month.

9. Selected topics in MC I

9.2 Coroner 2018

Problem: recruiting participants for studies is often cumbersome, time consuming, and costly.
Solution: leverage online ads to recruit and engage participants. Once an ad is clicked, the experiment runs in the ad itself, in the browser, or in a native mobile app.

Results: click rates are low in general, and very low for experiments requiring native apps.

9.3 Netravali 2018

Proxy-based URL rewriting to decrease mobile browsing bandwidth.

9.4 Mao 2018

Enable mobile devices to capture images in darkness or through obstacles.

9.5 Bao 2013

Empty slides, thanks.

Appendix A: Data analysis

Goddammit.

Appendix B: Ethics in human subject studies

  • Consider ethical implications of experiment.
  • Get informed consent from all subjects.
  • Protect the privacy of your subjects.