Visualization Analysis & Design

Tamara Munzner
Department of Computer Science
University of British Columbia
(minor edits: Helwig Hauser)
D3 Unconference Keynote
November 21 2015, San Francisco CA
http://www.cs.ubc.ca/~tmm/talks.html#vad15d3

Defining visualization (vis)
Computer-based visualization systems provide visual representations of datasets designed to help people carry out tasks more effectively.

Why have a human in the loop?
Computer-based visualization systems provide visual representations of datasets designed to help people carry out tasks more effectively. Visualization is suitable when there is a need to augment human capabilities rather than replace people with computational decision-making methods.
• don't need vis when fully automatic solution exists and is trusted
• many analysis problems ill-specified
  – don't know exactly what questions to ask in advance
• possibilities
  – long-term use for end users (e.g. exploratory analysis of scientific data)
  – presentation of known results
  – stepping stone to better understanding of requirements before developing models
  – help developers of automatic solution refine/debug, determine parameters
  – help end users of automatic solutions verify, build trust

Why use an external representation?
• external representation: replace cognition with perception
[Cerebral: Visualizing Multiple Experimental Conditions on a Graph with Biological Context. Barsky, Munzner, Gardy, and Kincaid. IEEE TVCG (Proc. InfoVis) 14(6):1253-1260, 2008.]

Why represent all the data?
• summaries lose information, details matter
  – confirm expected and find unexpected patterns
  – assess validity of statistical model
Anscombe's Quartet: identical statistics (shown rounded to integers on the slide)
  x mean 9, x variance 10, y mean 8, y variance 4, x/y correlation 1
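The point about summaries can be checked numerically. The sketch below is not from the talk: it hard-codes the published Anscombe (1973) values and computes the same summary statistics for each of the four datasets. The function names are illustrative, and population variance is an assumption, chosen because it gives the x variance of 10 quoted above (the sample variance is 11). Unrounded, the y mean is about 7.5 and the correlation about 0.816.

```python
from math import sqrt
from statistics import mean, pvariance

# Anscombe's quartet: four datasets of 11 (x, y) pairs each.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson(xs, ys):
    """Pearson correlation coefficient, computed directly from its definition."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def summarize(xs, ys):
    """Summary statistics of the kind listed on the slide (population variance assumed)."""
    return {
        "x_mean": mean(xs),
        "x_var": pvariance(xs),
        "y_mean": mean(ys),
        "y_var": pvariance(ys),
        "corr": pearson(xs, ys),
    }


SUMMARIES = {name: summarize(xs, ys) for name, (xs, ys) in QUARTET.items()}

if __name__ == "__main__":
    for name, stats in SUMMARIES.items():
        print(name, {key: round(value, 3) for key, value in stats.items()})
```

The four rows print nearly identical numbers, although the underlying scatterplots look nothing alike, which is the argument for drawing all the data.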

Analysis framework: Four levels, three questions
• domain situation
  – who are the target users?
• abstraction
  – translate from specifics of domain to vocabulary of vis
  – what is shown? data abstraction
    often don't just draw what you're given: transform to new form
  – why is the user looking at it? task abstraction
• idiom
  – how is it shown?
    visual encoding idiom: how to draw
    interaction idiom: how to manipulate
• algorithm
  – efficient computation
[A Nested Model of Visualization Design and Validation. Munzner. IEEE TVCG 15(6):921-928, 2009 (Proc. InfoVis 2009).]
[A Multi-Level Typology of Abstract Visualization Tasks. Brehmer and Munzner. IEEE TVCG 19(12):2376-2385, 2013 (Proc. InfoVis 2013).]

Why is validation difficult?
• different ways to get it wrong at each level
  – Domain situation: you misunderstood their needs
  – Data/task abstraction: you're showing them the wrong thing
  – Visual encoding/interaction idiom: the way you show it doesn't work
  – Algorithm: your code is too slow

Why is validation difficult?
• solution: use methods from different fields at each level
  – Domain situation: observe target users using existing tools (anthropology/ethnography; problem-driven work)
  – Data/task abstraction
  – Visual encoding/interaction idiom: justify design with respect to alternatives
  – Algorithm: measure system time/memory; analyze computational complexity (technique-driven work)
  – Analyze results qualitatively
  – Measure human time with lab experiment (lab study)
  – Observe target users after deployment (anthropology/ethnography)
  – Measure adoption
[A Nested Model of Visualization Design and Validation. Munzner. IEEE TVCG 15(6):921-928, 2009 (Proc. InfoVis 2009).]

Why analyze?
• imposes a structure on huge design space
  – scaffold to help you think systematically about choices
  – analyzing existing as stepping stone to designing new
[Figure: SpaceTree and TreeJuxtaposer analyzed side by side. Recoverable labels: targets: path between two nodes; how: Encode, Select, Arrange, Aggregate.]
[SpaceTree: Supporting Exploration in Large Node Link Tree, Design Evolution and Empirical Evaluation. Grosjean, Plaisant, and Bederson. Proc. InfoVis 2002, p 57–64.]
[TreeJuxtaposer: Scalable Tree Comparison Using Focus+Context With Guaranteed Visibility. ACM Trans. on Graphics (Proc. SIGGRAPH) 22:453–462, ...]

What?
[Overview diagram of the What? part of the What? / Why? / How? framework. Recoverable labels:
  Data types: items, attributes, links, positions, grids
  Dataset types: tables (items as rows, attributes as columns, value in cell; multidimensional table), networks & trees (node as item, link), fields (continuous; grid of positions, cell containing value), geometry (spatial; position), clusters, sets, lists
  Attribute types: categorical; ordered (ordinal, quantitative)
  Ordering direction: sequential, diverging, cyclic
  Dataset availability: static, dynamic]

Types: Datasets and data
• Dataset types
  – Tables: items (rows), attributes (columns), cell containing value
  – Networks: node (item), link
  – Spatial: fields (continuous): grid of positions, cell with attributes (columns) and value in cell; geometry (spatial): position
• Attribute types
  – Categorical
  – Ordered: ordinal, quantitative

Why?
[Overview diagram of actions and targets; the individual actions and targets are listed on the slides that follow.]
• {action, target} pairs
  – discover distribution
  – compare trends
  – locate outliers
  – browse topology

Actions I: Analyze
• consume
  – discover vs present: classic split, aka explore vs explain
  – enjoy: newcomer, aka casual, social
• produce
  – annotate, record
  – derive: crucial design choice

Derive
• don't just draw what you're given!
  – decide what the right thing to show is
  – create it with a series of transformations from the original dataset
  – draw that
• one of the four major strategies for handling complexity
Example: original data: exports, imports; derived data: trade balance = exports - imports
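As a minimal sketch of the trade-balance example, the snippet below derives one new attribute from two original ones. The record type, function name, and figures are invented for illustration and are not from the talk.

```python
from dataclasses import dataclass


@dataclass
class TradeRecord:
    """One row of the original dataset (hypothetical layout)."""
    year: int
    exports: float
    imports: float


def derive_trade_balance(records):
    """Return a derived dataset with one attribute: trade balance = exports - imports."""
    return [
        {"year": r.year, "trade_balance": r.exports - r.imports}
        for r in records
    ]


# Made-up numbers, only to show the transformation.
ORIGINAL = [
    TradeRecord(2010, 120.0, 100.0),
    TradeRecord(2011, 95.0, 110.0),
    TradeRecord(2012, 130.0, 130.0),
]
DERIVED = derive_trade_balance(ORIGINAL)

if __name__ == "__main__":
    for row in DERIVED:
        print(row)
```

Drawing the single derived column makes surplus and deficit years directly readable, where the original two columns would require the viewer to compare bar or line heights by eye.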

Analysis example: Derive one attribute
• Strahler number
  – centrality metric for trees/networks
  – derived quantitative attribute
  – draw top 5K of 500K for good skeleton
[Using Strahler numbers for real time visual exploration of huge graphs. Auber. Proc. Intl. Conf. Computer Vision and Graphics, pp. 56–69, 2002.]
Task 1: What? In: tree; Out: quantitative attribute on nodes.
Task 2: What? In: tree, quantitative attribute on nodes; Out: filtered tree (removed unimportant parts). Why? Summarize topology. How? Reduce: filter.
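The two tasks can be sketched in a few lines. This is an illustration under stated assumptions, not the method of the cited paper: it uses the classic Strahler definition for rooted trees only (the paper extends the idea to general graphs), and the dictionary-of-children representation, function names, and threshold filter are all hypothetical choices.

```python
def strahler_numbers(children, root):
    """Task 1: derive a quantitative attribute (Strahler number) for every node.

    Classic tree definition: a leaf gets 1; an inner node gets the maximum of
    its children's values, plus 1 if that maximum occurs at least twice.
    Recursive, so only suitable for trees of modest depth.
    """
    numbers = {}

    def visit(node):
        kids = children.get(node, [])
        if not kids:
            numbers[node] = 1
            return 1
        values = sorted((visit(kid) for kid in kids), reverse=True)
        result = values[0]
        if len(values) > 1 and values[1] == result:
            result += 1
        numbers[node] = result
        return result

    visit(root)
    return numbers


def filter_tree(children, numbers, min_value):
    """Task 2: keep only nodes whose derived attribute is at least min_value.

    A parent's Strahler number is never smaller than a child's, so the kept
    nodes still form a connected tree containing the root (if any are kept).
    """
    return {
        node: [kid for kid in kids if numbers[kid] >= min_value]
        for node, kids in children.items()
        if numbers[node] >= min_value
    }


# Tiny made-up tree: node -> list of children.
TREE = {
    "root": ["a", "b"],
    "a": ["a1", "a2"],
    "b": ["b1"],
    "a1": [], "a2": [], "b1": [],
}
NUMBERS = strahler_numbers(TREE, "root")
SKELETON = filter_tree(TREE, NUMBERS, min_value=2)

if __name__ == "__main__":
    print(NUMBERS)
    print(SKELETON)
```

A threshold is used here instead of a "top 5K" cut because it cannot disconnect the result; a top-k cut would need a tie-breaking rule to give the same guarantee.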

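Actions II: Search
• what does user know?
  – target, location
[Diagram: search actions by what is known.
  Location known: lookup (target known), browse (target unknown)
  Location unknown: locate (target known), explore (target unknown)]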

Actions III: Query
• what does user know?
  – target, location
• how much of the data matters?
  – one, some, all
  – query: identify, compare, summarize
• analyze, search, query
  – independent choices for each

Targets
[Diagram of targets. Recoverable labels:
  All data: trends
  Attributes: distribution, extremes; many: dependency, correlation, similarity
  Network data
  Spatial data: shape]

How?
[Overview diagram of design choices. Recoverable labels:
  Arrange: Align, Use
  Map from categorical and ordered attributes: Hue, Saturation, Luminance; Size, Angle, Curvature, ...; Shape; Motion: Direction, Rate, Frequency, ...
  Manipulate]

How to encode: Arrange space, map channels
[Diagram of the Encode choices. Recoverable labels:
  Arrange: Express, Separate, Use
  Map from categorical and ordered attributes: Size, Angle, Curvature, ...; Shape; Motion: Direction, Rate, Frequency, ...]

Encoding visually
• analyze idiom structure

Definitions: Marks and channels
• marks
  – geometric primitives: points, lines, areas
• channels
  – control appearance of marks
  – position (horizontal, vertical, both), color, shape, tilt, size (length, area, volume)

Encoding visually with marks and channels
• analyze idiom structure
  – as combination of marks and channels
  1: vertical position; mark: line
  2: vertical position, horizontal position; mark: point
  3: vertical position, horizontal position, color hue; mark: point
  4: vertical position, horizontal position, color hue, size (area); mark: point

Channels: Expressiveness types and effectiveness rankings
Magnitude channels (ordered attributes), most to least effective:
  1. Position on common scale
  2. Position on unaligned scale
  3. Length (1D size)
  4. Tilt/angle
  5. Area (2D size)
  6. Depth (3D position)
  7. Color luminance
  8. Color saturation
  9. Curvature
  10. Volume (3D size)
Identity channels (categorical attributes), most to least effective:
  1. Spatial region
  2. Color hue
  3. Motion
  4. Shape

Channels: Matching Types
(same magnitude and identity channel lists as on the previous slide)
• expressiveness principle
  – match channel and data characteristics

Channels: Rankings
(same magnitude and identity channel lists as on the previous slides)
• expressiveness principle
  – match channel and data characteristics
• effectiveness principle
  – encode most important attributes with highest ranked channels
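The two principles can be read as a simple assignment procedure. The sketch below is an illustration, not a method from the talk. It takes the channel rankings from the slides, assumes each attribute arrives labelled "ordered" or "categorical" and sorted by importance, and hands out the best remaining channel of the matching kind. The function name is hypothetical, and real designs must also weigh channel interactions, which this ignores.

```python
# Rankings as listed on the slides, most effective first.
MAGNITUDE_CHANNELS = [
    "position on common scale", "position on unaligned scale",
    "length (1D size)", "tilt/angle", "area (2D size)", "depth (3D position)",
    "color luminance", "color saturation", "curvature", "volume (3D size)",
]
IDENTITY_CHANNELS = ["spatial region", "color hue", "motion", "shape"]


def assign_channels(attributes):
    """Greedy sketch of the expressiveness and effectiveness principles.

    attributes: list of (name, kind) pairs, most important first, where kind
    is "ordered" or "categorical". Expressiveness: ordered attributes only get
    magnitude channels, categorical ones only identity channels. Effectiveness:
    earlier (more important) attributes get higher ranked channels.
    """
    pools = {
        "ordered": list(MAGNITUDE_CHANNELS),
        "categorical": list(IDENTITY_CHANNELS),
    }
    assignment = {}
    for name, kind in attributes:
        pool = pools[kind]
        if not pool:
            raise ValueError(f"no {kind} channel left for attribute {name!r}")
        assignment[name] = pool.pop(0)
    return assignment


# Hypothetical attributes, most important first.
ENCODING = assign_channels([
    ("price", "ordered"),
    ("region", "categorical"),
    ("volume", "ordered"),
])

if __name__ == "__main__":
    for attribute, channel in ENCODING.items():
        print(f"{attribute}: {channel}")
```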

How to handle complexity: 3 more strategies + 1 previous
• Manipulate (Change, Select, Navigate): change view over time
• Facet (Juxtapose, Partition, Superimpose): facet across multiple views
• Reduce (Filter, Aggregate, Embed): reduce items/attributes within single view
• Derive (previous): derive new data to show within view

How to handle complexity: 3 more strategies + 1 previous
• Manipulate: change over time
  – most obvious & flexible of the 4 strategies

Idiom: Animated transitions
• smooth transition from one state to another
  – alternative to jump cuts
  – support for item tracking when amount of change is limited
• example: multilevel matrix views
  – scope of what is shown narrows down
    middle block stretches to fill space, additional structure appears within
    other blocks squish down to increasingly aggregated representations
[Using Multilevel Call Matrices in Large Software Projects. van Ham. Proc. IEEE Symp. Information Visualization (InfoVis), pp. 227–232, 2003.]

How to handle complexity: 3 more strategies + 1 previous
• Facet: facet data across multiple views

Facet
• Juxtapose: coordinate multiple side-by-side views
  – share encoding: same/different (linked highlighting)
  – share data: all/subset/none
  – share navigation
• Partition
• Superimpose

Idiom: Linked highlighting
System: EDV
• see how regions contiguous in one view are distributed within another
  – powerful and pervasive interaction idiom
• encoding: different
  – multiform
• data: all shared
[Visual Exploration of Large Structured Datasets. Wills. Proc. New Techniques and Trends in Statistics (NTTS), pp. 237–246. IOS Press, 1995.]

Idiom: bird's-eye maps
System: Google Maps
• encoding: same
• data: subset shared
• navigation: shared
  – bidirectional linking
• differences
  – viewpoint
  – (size)
• overview-detail
[A Review of Overview+Detail, Zooming, and Focus+Context Interfaces. Cockburn, Karlson, and Bederson. ACM Computing Surveys 41:1 (2008), 1–31.]

Idiom: Small multiples
System: Cerebral
• encoding: same
• data: none shared
  – different attributes for node colors
  – (same network layout)
• navigation: shared
[Cerebral: Visualizing Multiple Experimental Conditions on a Graph with Biological Context. Barsky, Munzner, Gardy, and Kincaid. IEEE Trans. Visualization and Computer Graphics (Proc. InfoVis 2008) 14:6 (2008), 1253–1260.]

Coordinate views: Design choice
[Diagram of combinations of shared encoding and shared data. Recoverable labels: Small Multiples; Multiform; Multiform, Overview/Detail; No Linkage.]
• why juxtapose views?
  – benefits: eyes vs memory
    lower cognitive load to move eyes between 2 views than remembering previous state with single changing view
  – costs: display area
    2 views side by side each have only half the area of one view

Partition into views
• how to divide data between views (partition into side-by-side views)
  – encodes association between items using spatial proximity
  – major implications for what patterns are visible
  – split according to attributes
• design choices
  – how many splits
    all the way down: one mark per region?
    stop earlier, for more complex structure within region?
  – order in which attribs used to split
  – how many views
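The design choices above (how many splits, and in which order) can be sketched as a recursive grouping function. This is an illustrative sketch only: the `partition` name, the row layout, and the counts are made up, and each innermost list stands in for the marks drawn within one region.

```python
def partition(items, keys):
    """Recursively split items by the given attributes, in the given order.

    Returns nested dicts, one level per split attribute; the innermost value
    is the list of items that land in that region. Stopping early (fewer
    keys) leaves more items, and so more structure to draw, in each region.
    """
    if not keys:
        return items
    first, rest = keys[0], keys[1:]
    groups = {}
    for item in items:
        groups.setdefault(item[first], []).append(item)
    return {value: partition(group, rest) for value, group in groups.items()}


# Made-up counts; only the shape of the data matters here.
ROWS = [
    {"state": "CA", "age": "Under 5 Years", "count": 10},
    {"state": "CA", "age": "65 Years and Over", "count": 20},
    {"state": "TX", "age": "Under 5 Years", "count": 30},
    {"state": "TX", "age": "65 Years and Over", "count": 40},
]

# Same data, two split orders: regions per state, or regions per age group.
BY_STATE = partition(ROWS, ["state", "age"])
BY_AGE = partition(ROWS, ["age", "state"])

if __name__ == "__main__":
    print(list(BY_STATE))  # one region per state
    print(list(BY_AGE))    # one region per age group
```

Swapping the key order changes which comparison is local to a region: with state first, the ages of one state sit together; with age first, the states of one age group do.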

Partitioning: List alignment
• single bar chart with grouped bars
  – split by state into regions
    complex glyph within each region showing all ages
  – compare: easy within state, hard across ages
• small-multiple bar charts
  – split by age into regions
    one chart per region
  – compare: easy within age, harder across states
[Figures: population by age group (65 Years and Over, 45 to 64 Years, 25 to 44 Years, 18 to 24 Years, 14 to 17 Years, 5 to 13 Years, Under 5 Years) for states CA, TX, NY, FL, IL, PA.]
http://bl.ocks.org/mbostock/3887051
http://bl.ocks.org/mbostock/4679202

Partitioning: Recursive subdivision
System: HIVE
• split by neighborhood
• then by type
• then time
  – years as rows
  – months as columns
• color by price
• neighborhood patterns
  – where it's expensive
  – where you pay much more for detached type
[Configuring Hierarchical Layouts to Address Research Questions. Slingsby, Dykes, and Wood. IEEE Transactions on Visualization and Computer Graphics (Proc. InfoVis 2009) 15:6 (2009), 977–984.]

Partitioning: Recursive subdivision
System: HIVE
• switch order of splits
  – type then neighborhood
• switch color
  – by price variation
• type patterns
  – within specific type, which neighborhoods inconsistent
[Configuring Hierarchical Layouts to Address Research Questions. Slingsby, Dykes, and Wood. IEEE Transactions on Visualization and Computer Graphics (Proc. InfoVis 2009) 15:6 (2009), 977–984.]

Partitioning: Recursive subdivision
System: HIVE
• different encoding for second-level regions
  – choropleth maps
[Configuring Hierarchical Layouts to Address Research Questions. Slingsby, Dykes, and Wood. IEEE Transactions on Visualization and Computer Graphics (Proc. InfoVis 2009) 15:6 (2009), 977–984.]

How to handle complexity: 3 more strategies + 1 previous
• Reduce: reduce what is shown within single view

Reduce items and attributes
• reduce/increase: inverses
• filter
  – pro: straightforward and intuitive to understand and compute
  – con: out of sight, out of mind
• aggregation
  – pro: inform about whole set
  – con: difficult to avoid losing signal
• not mutually exclusive
  – combine filter, aggregate
  – combine reduce, facet, change, derive
[Diagram: Reduce: Filter (items, attributes), Aggregate (items, attributes), Embed.]
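A small sketch of the trade-off between the two reduction strategies, using made-up sales rows and hypothetical function names. Filtering drops items outright, so the removed ones are "out of sight". Aggregating keeps every item's contribution but replaces the individual values with a per-group summary.

```python
from statistics import mean


def filter_items(items, predicate):
    """Filter: keep only the items that satisfy the predicate."""
    return [item for item in items if predicate(item)]


def aggregate_items(items, group_key, value_key):
    """Aggregate: one summary (count and mean) per group, covering all items."""
    groups = {}
    for item in items:
        groups.setdefault(item[group_key], []).append(item[value_key])
    return {
        group: {"count": len(values), "mean": mean(values)}
        for group, values in groups.items()
    }


# Hypothetical data for illustration only.
ITEMS = [
    {"region": "north", "sales": 10},
    {"region": "north", "sales": 30},
    {"region": "south", "sales": 5},
    {"region": "south", "sales": 100},
]

FILTERED = filter_items(ITEMS, lambda item: item["sales"] >= 10)
AGGREGATED = aggregate_items(ITEMS, "region", "sales")

if __name__ == "__main__":
    print(FILTERED)
    print(AGGREGATED)
```

In this toy data the aggregate for "south" averages a very small and a very large value, a reminder of the "difficult to avoid losing signal" caveat. The two functions also compose, which matches the point that the strategies are not mutually exclusive.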

Idiom: boxplot
• static item aggregation
• task: find distribution
• data: table
• derived data
  – 5 quant attribs
    median: central line
    lower and upper quartile: boxes
    lower upper fences: whiskers
    – values beyond which items are outliers
  – outliers beyond fence cutoffs explicitly shown
[Figure: box plots of several distributions, from "40 years of boxplots", Figure 4.]
[40 years of boxplots. Wickham and Stryjewski. 2012. had.co.nz]
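The five derived attributes can be computed with the standard library. This is a sketch under assumptions the slide does not state: quartiles use the "inclusive" method of `statistics.quantiles`, and the fences use the common Tukey convention of 1.5 times the interquartile range. The function name and sample values are made up.

```python
from statistics import quantiles


def boxplot_summary(values, fence_factor=1.5):
    """Derive the five quantitative attributes a boxplot draws, plus outliers.

    Assumptions: inclusive quartile method, and fences placed fence_factor
    times the interquartile range beyond the quartiles (Tukey convention).
    """
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - fence_factor * iqr
    upper_fence = q3 + fence_factor * iqr
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        "lower_fence": lower_fence,
        "upper_fence": upper_fence,
        # Items beyond the fences are the ones shown explicitly.
        "outliers": [v for v in values if v < lower_fence or v > upper_fence],
    }


# Made-up sample with one obvious outlier.
SAMPLE = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]
SUMMARY = boxplot_summary(SAMPLE)

if __name__ == "__main__":
    print(SUMMARY)
```

Many plotting libraries draw the whiskers at the most extreme data points that fall inside the fences, not at the fences themselves; that step is left out here.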

Idiom: Dimensionality reduction for documents
• attribute aggregation
  – derive low-dimensional target space from high-dimensional measured space
Task 1: What? In: high-dimensional data; Out: 2D data. Why? Produce: derive.
Task 2: What? In: 2D data; Out: scatterplot, clusters & points.
Task 3: What? In: scatterplot, clusters & points; Out: labels for clusters. Why? Produce: annotate.

What? Why? How?
[Summary diagram combining the overviews from the preceding slides: data and dataset types and attribute types (What?), actions and targets (Why?), and the encode, manipulate, facet, and reduce design choices (How?).]

More information
• this talk
  http://www.cs.ubc.ca/~tmm/talks.html#vad15d3
• book page (including tutorial lecture slides)
  http://www.cs.ubc.ca/~tmm/vadbook
  – 20% promo code for book+ebook combo: HVN17
  – illustrations: Eamonn Maguire
• papers, videos, software, talks, full ...
  http://www.cs.ubc.ca/~tmm
Visualization Analysis and Design. Munzner. A K Peters Visualization Series, CRC Press, 2014.
