Snip!t from collection of Alan Dix

ScatterplotEval.pdf
http://www.cs.ubc.ca/labs/imager/tr/2013/ScatterplotEval/ScatterplotEval.pdf

Categories

/Channels/visualisation

To verify cluster separation in high-dimensional data, analysts often reduce the data with a dimension reduction (DR)
technique, and then visualize it with 2D Scatterplots, interactive 3D Scatterplots, or Scatterplot Matrices (SPLOMs). With the goal
of providing guidance between these visual encoding choices, we conducted an empirical data study in which two human coders
manually inspected a broad set of 816 scatterplots derived from 75 datasets, 4 DR techniques, and the 3 previously mentioned
scatterplot techniques. Each coder scored all color-coded classes in each scatterplot in terms of their separability from other classes.
We analyze the resulting quantitative data with a heatmap approach, and qualitatively discuss interesting scatterplot examples. Our
findings reveal that 2D scatterplots are often ‘good enough’, that is, neither SPLOM nor interactive 3D adds notably more cluster
separability with the chosen DR technique. If 2D is not good enough, the most promising approach is to use an alternative DR
technique in 2D. Beyond that, SPLOM occasionally adds additional value, and interactive 3D rarely helps but often hurts in terms of
poorer class separation and usability. We summarize these results as a workflow model and implications for design. Our results offer
guidance to analysts during the DR exploration process.
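
The order of attempts described in the abstract (2D with the chosen DR technique first, then alternative DR techniques in 2D, then SPLOM, with interactive 3D last) can be sketched as a small search loop. This is only an illustration of that ordering, not the paper's workflow model: the function names, the `is_separable` callback, and the technique labels "PCA" and "MDS" are placeholders assumed here, since the abstract does not name the four DR techniques.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# Visual encodings named in the abstract.
ENC_2D = "2D scatterplot"
ENC_SPLOM = "SPLOM"
ENC_3D = "interactive 3D scatterplot"


def candidate_order(dr_techniques: Sequence[str]) -> List[Tuple[str, str]]:
    """List (DR technique, encoding) pairs in the order the findings suggest.

    The first entry of dr_techniques is treated as the analyst's chosen technique.
    """
    if not dr_techniques:
        return []
    chosen, alternatives = dr_techniques[0], list(dr_techniques[1:])
    order = [(chosen, ENC_2D)]                           # 2D is often good enough
    order += [(dr, ENC_2D) for dr in alternatives]       # next: alternative DR in 2D
    order += [(dr, ENC_SPLOM) for dr in dr_techniques]   # SPLOM occasionally adds value
    order += [(dr, ENC_3D) for dr in dr_techniques]      # 3D rarely helps, so it goes last
    return order


def first_separable(
    dr_techniques: Sequence[str],
    is_separable: Callable[[str, str], bool],
) -> Optional[Tuple[str, str]]:
    """Return the first (DR technique, encoding) pair the analyst judges separable."""
    for dr, encoding in candidate_order(dr_techniques):
        if is_separable(dr, encoding):
            return dr, encoding
    return None


# Hypothetical judgments: the chosen technique fails in 2D, an alternative succeeds.
verdicts = {("PCA", ENC_2D): False, ("MDS", ENC_2D): True}
result = first_separable(["PCA", "MDS"], lambda dr, enc: verdicts.get((dr, enc), False))
print(result)  # ('MDS', '2D scatterplot')
```

In practice `is_separable` would stand for a human judgment of class separation, as made by the coders in the study, rather than an automated test.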
