-
visualization - use of computer-supported and interactive visual representations of data to amplify cognition
-
significant chunk of visual information processing occurs at the pre-attentive level (ex. popout)
-
visualization pipeline
- data
- enrichment - interpolating/approximating raw data, thereby creating a model
- interpolation or approximation
- filtering - choosing portion of data we want to analyze
- remove irrelevant data and outliers (portion caused by measurement error), smooth data
- mapping - data onto visual parameters
- arrows, glyphs, colors, trees, ...
- rendering - creating an image
- 2D/3D, problem with interactivity
-
interactivity is needed to overcome the limitations of computers, humans and displays
-
kinds of visualization
- scientific visualization
- visualization of data with spatial attributes (coordinates)
- ex. map plots
- information visualization
- visualization of abstract data structures
-
dataset types
- spatial data
- spatial fields
- grid (vertices, edges, cells between edges), attributes (vertices and cells can both contain several values)
- grid tells us from which samples to interpolate
- geometry
- vertices, edges (contain data attributes), faces
- abstract data
- tabular data (items, attributes, cells containing value of attribute for item)
- relational data (nodes, relations, attributes)
- ex. relational database
- text
- interpolation and other enrichment techniques do not work!
- mix of both
- geographical data
- geometry + abstract data (ex. population)
- we can combine both in our queries
-
ordering direction
- sequential (height)
- diverging (temperature)
- cyclic (hours)
-
marks
- points, lines, areas
-
visual channels (position rocks because it can be used with all types of abstract data)
- nominal
- spatial region
- shape
- ordered
- color hue / saturation / lightness
- length / area / volume
- angle
- grouping
- gestalt grouping
- containment
- similarity
- connection
- proximity
-
some visual channels are not completely independent from one another (width x height, hue x saturation, shape x size)
-
popout
- is pre-attentive
- reduces cognitive load, no need for focus, fast (~ 200ms)
-
size
- length is perceived accurately, area somewhat accurately, volume inaccurately
-
orientation
- can be used for ordered attributes
- accuracy of perception isn't uniform (acute angles)
-
shape
- high discriminability
- can only be used for categorical attributes
-
color
- use hue for discriminability, saturation and luminance for ordering
- perception is relative (it depends on surrounding colors)
- color blindness
-
grouping (ascending order of magnitude)
- similarity
- connectedness
- containment
-
3D challenges
- depth perception is poor (stereoscopic 3D / VR as a solution?)
- occlusion (interaction is required)
- perspective distortion
- shading interferes with color channel
-
methods of interaction
- changing/transforming data
- changing visualization technique
- changing data enrichment
- modifying the filter
- changing mapping to graphical elements
-
brushing
- selecting subset of data items with input device to emphasize them
-
linking
- highlight brushed data items in different views or parts of the visualization
-
rearrangement and sorting (ex. parallel coordinates)
-
navigation
- overview / detail (ex. minimap)
- one detailed view, one complete view
- two separate views, spatial separation
- pan and zoom
- infinite plane, allow the user to move and pan
- temporal separation - the user needs to remember certain information
- focus+context (ex. fish eye)
- display most data with less details and small portion of data with a lot of detail in same view
- deformation
-
reduction of data
- filtering
- dynamic queries
- deliver continuous updates
- low latency
- easy to use, visualizes bounds
- allow the user to change query by moving sliders and other basic UI elements
- aggregation
- binning
- divide range of attributes into bins
- count number of items in each bin and map the number to a channel
- clustering
- group data items based on similarity
- calculate average and map the result onto channels
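A minimal sketch of the binning step above, assuming equal-width bins over a known range. The name `bin_counts` and its parameters are made up for illustration. The count per bin is the number that then gets mapped to a channel (here: a normalized bar length).

```python
def bin_counts(values, lo, hi, n_bins):
    """Count how many values fall into each of n_bins equal-width bins on [lo, hi]."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        if lo <= v <= hi:
            # a value equal to hi is folded into the last bin
            idx = min(int((v - lo) / width), n_bins - 1)
            counts[idx] += 1
    return counts


counts = bin_counts([1, 2, 2, 3, 7, 9, 10], lo=0, hi=10, n_bins=5)

# map each count to a channel, e.g. bar length relative to the fullest bin
bar_lengths = [c / max(counts) for c in counts]
```

Values outside [lo, hi] are simply ignored here, which is one possible choice, not the only one.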
-
reduction of attributes
- filtering - remove the attribute altogether
- aggregation - via dimension reduction
-
placement of multiple views
- juxtaposition
- side by side
- requires brushing/linking or coordination
- large number of views - small size
- superimposition
- position views on top of one another
- embedding
- embed one view into another (ex. focus+context)
-
colormap
- changes in value are perceived uniformly across the colormap
- map implies correct ordering
- it should work in grayscale and for color blind people
- colors should be selected intuitively (water - blue, terrain - green,...)
- allow for inversion of mapping
-
contour line
- all points in a dataset that have the same scalar value
- boundaries between regions
- represented by curves in 2D (isolines), surfaces in 3D (isosurfaces)
- contours cannot intersect
- distance between two contours indicates magnitude of gradient (speed of change in data)
-
marching squares/cubes
- 2^|V| ways to divide the cell, where V is the set of cell vertices (2^4 = 16 for squares, 2^8 = 256 for cubes)
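Each cell corner is either below or at/above the iso-value, so a square cell has 2^4 = 16 possible configurations and a cube 2^8 = 256. A small sketch of how the case index of one square cell could be computed. The corner order and the name `cell_case` are assumptions for illustration, and the lookup table that turns the index into line segments is left out.

```python
def cell_case(corner_values, iso):
    """Return the marching-squares case index (0..15) of one cell.

    corner_values: four scalar samples in a fixed corner order
    (assumed here: bottom-left, bottom-right, top-right, top-left).
    """
    index = 0
    for bit, value in enumerate(corner_values):
        if value >= iso:
            # corner is at or above the iso-value -> set its bit
            index |= 1 << bit
    return index


# only the two right-hand corners are above the iso-value 0.5
case = cell_case([0.0, 1.0, 1.0, 0.0], iso=0.5)
```

Cases 0 and 15 mean the contour does not pass through the cell at all.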
-
volumetric data
- spatial field
- grid is in 3D
-
texture based volume rendering
- use planes with 2D textures
- slicing plane switching needed when changing viewpoints
-
data enrichment via bilinear/trilinear interpolation
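A sketch of bilinear interpolation inside one grid cell, assuming the four corner samples are known and the query point is given in local cell coordinates `tx`, `ty` in [0, 1] (the names are illustrative). Trilinear interpolation repeats the same idea along a third axis.

```python
def lerp(a, b, t):
    """Linear interpolation between a and b for t in [0, 1]."""
    return a + (b - a) * t


def bilinear(f00, f10, f01, f11, tx, ty):
    """Interpolate inside a cell with corner values
    f00 at (0,0), f10 at (1,0), f01 at (0,1), f11 at (1,1)."""
    bottom = lerp(f00, f10, tx)   # along x on the lower edge
    top = lerp(f01, f11, tx)      # along x on the upper edge
    return lerp(bottom, top, ty)  # then along y between the two edges


center = bilinear(0.0, 10.0, 20.0, 30.0, tx=0.5, ty=0.5)
```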
-
glyphs
- displayed at sampling points
- direction mapped on orientation of arrow
- magnitude mapped on length / color
- challenges
- overlapping
- in 3D occlusion, direction interpretation ambiguity
- shading, more complex objects
- we as humans suck at interpolating glyphs
-
alleviate occlusion by subsampling, opacity
-
stream objects
- choose seed points and trace them in field for a number of steps
- visualize trajectories using vectors
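A rough sketch of tracing one seed point through a 2D vector field. It assumes the field is available as a function `(x, y) -> (vx, vy)` and uses plain Euler steps. The notes do not say which integrator is used, so that choice (and all names) are assumptions.

```python
def trace_streamline(field, seed, step=0.1, n_steps=100):
    """Follow the vector field from the seed point for n_steps Euler steps.

    Returns the list of visited points (seed included).
    """
    x, y = seed
    path = [(x, y)]
    for _ in range(n_steps):
        vx, vy = field(x, y)
        # move a small step in the direction of the local vector
        x, y = x + step * vx, y + step * vy
        path.append((x, y))
    return path


def rotation(x, y):
    """Example field: counter-clockwise rotation around the origin."""
    return (-y, x)


path = trace_streamline(rotation, seed=(1.0, 0.0), step=0.01, n_steps=10)
```

Euler integration drifts away from the true trajectory for larger steps, so a higher-order scheme is usually a better fit for real data.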
-
stream ribbon
- color mapping vorticity (tendency of something to rotate, local spinning motion)
-
tabular data
- rows - items
- columns - attributes
- cells - scalar values
-
attribute types
- nominal (categories)
- ordinal (S, M, L) - not measurable intervals
- quantitative - we can do arithmetic
- discrete, continuous
-
abstract data
- no spatial coordinates at which data was measured
- impossible to do data enrichment, no relation between data
-
axis layout
- orthogonal
- non-orthogonal ("basis" vectors are not lin. independent -> hard to interpret imo)
-
glyphs
- geometry that changes shape with data
-
identification tasks
- identify attribute (range, distribution, outliers, value for given item)
- identify item (for given attribute)
- identify attributes (is there correlation between the attributes? clustering)
-
techniques
- 2 attributes - scatterplot
- 3 attributes - 3D scatterplot
- stereoscopy, VR, rotation
- 4-5 attributes - colour, shapes in addition to 3 spatial dimensions
-
interaction
- data manipulation
- selection, view transformation
- data reduction
- filtering, clustering
- view organization
- juxtaposition, brushing, linking
-
faceting
- visualising every combination of attributes
- allows us to spot correlation between attributes
- cluster identification
- brushing - selection of a subset of data using input devices (emphasising or de-emphasising it)
- linking - selecting an item across multiple plots highlighting each of the item's attributes
-
parallel coordinates
- place the axes in parallel and join the dots
- can be used to identify correlation between neighbouring attributes
- hierarchical approach - organize data into clusters and ignore outliers -> visual clarity on upper layers
- pipeline: data -> binning -> outlier detection -> trend mapping (ignore outliers) -> graph (combined with interaction and outliers)
-
star glyphs
- similar to parallel coords but mapped onto a polyline
- attributes are spaced out at equal angles around a circle
- saves space
- less screen space for items closer to centre
- can be projected into scatterplot to map 5+-dimensional data
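A sketch of how the vertices of one star glyph could be computed, assuming every attribute is normalized by a known maximum and the axes are spaced at equal angles. `star_glyph_vertices` and its parameters are illustrative names. Joining the returned points in order (and closing the loop) gives the polyline of the glyph.

```python
import math


def star_glyph_vertices(values, maxima, radius=1.0):
    """Return one (x, y) vertex per attribute of a single data item.

    Attribute i is drawn on an axis at angle 2*pi*i/n,
    at a distance proportional to value / maximum.
    """
    n = len(values)
    vertices = []
    for i, (value, vmax) in enumerate(zip(values, maxima)):
        angle = 2 * math.pi * i / n
        r = radius * value / vmax
        vertices.append((r * math.cos(angle), r * math.sin(angle)))
    return vertices


# an item that hits the maximum on all four attributes
vertices = star_glyph_vertices([1, 2, 3, 4], maxima=[1, 2, 3, 4])
```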
-
star coordinates
- distribute vectors evenly on unit circle
- "base" vectors are not lin. independent, but we still project into 2D space -> fuck linear algebra -> leads to ambiguity
- create points via linear combination of attributes
- apparently letting the user decide on the orientation of "base" vectors can help with interpreting such abomination
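A small sketch of the projection described above, assuming the default axes are unit vectors spread evenly on the circle and the point is the sum of attribute value times axis vector. Names are illustrative. The example at the end shows the ambiguity: two different items land on (almost exactly) the same 2D point.

```python
import math


def star_coordinates(item, axes=None):
    """Project an n-dimensional item to 2D as a linear combination of axis vectors."""
    n = len(item)
    if axes is None:
        # default: n unit vectors distributed evenly on the unit circle
        axes = [(math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n))
                for i in range(n)]
    x = sum(value * ax for value, (ax, ay) in zip(item, axes))
    y = sum(value * ay for value, (ax, ay) in zip(item, axes))
    return (x, y)


# with 4 attributes, axes 0 and 2 point in opposite directions and cancel out
a = star_coordinates([1, 0, 1, 0])
b = star_coordinates([0, 0, 0, 0])
```

Passing custom `axes` corresponds to letting the user rotate or rescale the axis vectors.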
-
bargrams
- works with nominal, ordinal attributes
- proportion of each category is mapped onto length of line
- parallel set
- enhancement of bargrams
- visualizes relations between attributes
- interaction
- reordering categories, brushing
- bundled layout
- we only connect neighbouring categories
-
we introduce relations between tables
- relation - subset of cartesian product, can be unary or binary
-
attributes can be stored in nodes as well as links (we do not have to do M:N decomposition as in rel. DB)
-
data is typically abstract preventing us from doing data enrichment
-
encoding attributes
- of nodes - shape, color, size
- of relations - width, color
-
visualization tasks
- all the tasks discussed prior
- node incidence
- shortest path
-
treemap
- using containment to encode hierarchical relations
- makes use of only one attribute (size of files)
- recursive space dividing technique that alternates axes based on tree depth
- we also project depth into color of squares
- other examples - stock market divided into industries and then companies, tasks
- treemap gives us an overview of the entire hierarchy
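A sketch of the recursive space division with alternating axes (often called slice-and-dice). It assumes each node is a tuple `(name, size, children)` and that an inner node's rectangle is shared among its children in proportion to their sizes. The tuple layout and function name are made up for illustration.

```python
def slice_and_dice(node, x, y, w, h, depth=0, out=None):
    """Lay out a tree as nested rectangles.

    Returns a list of (name, x, y, w, h, depth); depth can be mapped to color.
    """
    if out is None:
        out = []
    name, size, children = node
    out.append((name, x, y, w, h, depth))
    total = sum(child[1] for child in children)
    offset = 0.0  # fraction of the parent already handed out
    for child in children:
        share = child[1] / total
        if depth % 2 == 0:
            # even depth: split along the x axis
            slice_and_dice(child, x + offset * w, y, share * w, h, depth + 1, out)
        else:
            # odd depth: split along the y axis
            slice_and_dice(child, x, y + offset * h, w, share * h, depth + 1, out)
        offset += share
    return out


tree = ("root", 8, [("a", 2, []), ("b", 6, [("c", 3, []), ("d", 3, [])])])
rects = {r[0]: r[1:] for r in slice_and_dice(tree, 0.0, 0.0, 8.0, 4.0)}
```

Slice-and-dice tends to produce long thin rectangles; other treemap layouts trade the simple alternation for better aspect ratios.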
-
A E S T H E T I C S
- minimize number of crossings
- minimize area
- minimize aspect ratio
- angular resolution between edges incident to a node
- edge length (total, maximum, uniform)
- bends
- symmetry
-
big data
- high velocity, volume, variety
- ex. NSA
- 300m US citizens
- metadata, calls, texts, surveillance images,...
- veracity - possibility of including shit data (fake profiles on FB)
-
visual analytics
- combination of automated analysis techniques with interactive visualization
- to make sense of large and complex datasets
- visualization + data mining + interaction
- making use of full perceptual and cognitive abilities during analysis
- quick informed decisions from people who aren't experts on data mining / visualization
-
conceptual challenges
- we cannot use standard visualization techniques
- heavy use of binning and clustering
- ex. earthquakes, density visualization (M25 traffic accidents)
-
clustering
- grouping a set of objects in such a way that objects in these groups are more similar to one another than to objects outside
-
data mining
- process of extraction of interesting patterns from huge amounts of data
- regression for data enrichment
- clustering for simplification
- box plots for statistic analysis
- outlier detection for detecting anomalies
- classification using ANN
-
issues
- subtle
- abstract
- ambiguity of meaning
- context
-
analysis levels
- lexical - strings
- syntactic - word types
- semantic - meanings
-
visualizing text
- understanding
- grouping for future classification
- comparison (git diffs)
- correlation (detecting plagiarism)
-
word clouds
- frequency analysis
-
judging relative word importance in document
- tf * log(N / df)
- tf - term frequency
- df - documents including the word
- N - total number of documents
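The weight above, written out as a sketch. It assumes documents are already split into lists of tokens and uses the natural logarithm, since the notes do not fix a base. `tf_idf` is an illustrative name.

```python
import math


def tf_idf(term, document, corpus):
    """tf * log(N / df) for one term in one document of the corpus."""
    tf = document.count(term)                         # term frequency in this document
    df = sum(1 for doc in corpus if term in doc)      # documents containing the term
    if df == 0:
        return 0.0                                    # term never occurs anywhere
    return tf * math.log(len(corpus) / df)


corpus = [
    ["the", "cat", "sat"],
    ["the", "dog", "barked"],
    ["the", "cat", "cat", "purred"],
]
common = tf_idf("the", corpus[0], corpus)  # appears in every document
rare = tf_idf("cat", corpus[2], corpus)    # appears twice here, in 2 of 3 documents
```

A word that occurs in every document gets weight 0, which is why stop words drop out of the ranking.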
-
geographical data structure
- geometry stored as vector or raster representation
- non-spatial attributes
-
4D - 3 spatial dimensions + 1 temporal dimension
-
time is unidirectional (we cannot go back)
-
time-oriented data
- temporal aspect is of interest to us
- ordinal / discrete / continuous
- problem with granularity
- ex. gantt chart
-
arrangements
- linear or cyclic
-
point in time vs interval
-
mapping of time
- static - map onto spatial dimension
- ThemeRiver
- dynamic - create an animation (series of views)