Monday, April 7, 2014

Notes on Visualisation and Interfaces for Large Datasets

For anyone interested in visualising large datasets, here are a few leads (from the literature search I've been doing as part of my research work) to save you some time. I actually have more comprehensive write-ups on many of these topics (and on other topics too), but that'll be for another day...

Node-link diagrams (i.e. similar to the Nodes Editor in Blender when all the nodes are collapsed, or basically any of the diagrams commonly used to represent graphs/networks in computer science) do not scale to large datasets. Things break down once you move past 10-20 links per node. For small setups though (i.e. those with small numbers of links between nodes - in effect, every "toy problem" usually used by researchers doing work on these things), things are manageable.

One interesting variant here though is the "SpaceTree" [Grosjean, Plaisant, Bederson, 2002], which dynamically rescales and adjusts the layout to squeeze as much of the tree on screen as it can. It is however limited to trees with a "preferred" direction.

- Someone has looked into the problem of getting lost in multi-level information spaces (aka "Desert Fog" in Zoomable User Interfaces). Check out the work (e.g. [1], etc.) of Jul and Furnas (aka the "Fisheye" guy) for ideas.

- All roads lead to TreeMaps. As soon as you move away from any kind of radial or node-link layouts of hierarchical data, there's only one way that will end, and that is a reinvention of the basic treemaps formulation [Johnson and Shneiderman, 1991]. Any tweaks you achieve will just be yet another spatial partitioning scheme for treemaps. Have fun with that if that's your thing. Just don't go thinking that you've invented anything new... you've just innovated a better wheel.

- Hierarchical lists (i.e. table of contents, bookmarks, folder tree, etc. but also lists of collapsible panels, accordions, etc. as found in tool palettes and Blender's interface) have been evaluated on several occasions (and yes, I have run experiments on this as well). The findings are quite clear on this: we're all doing it wrong! It's more efficient and less annoying when people don't have to scroll to find category headings. The way that they do this scrolling varies a bit with the length of the list and the person - generally, those who strictly stick to madly raking their scrollwheels in all cases are slower than those who just start dragging the scrollthumb around on the scrollbar.

- Scrollbars serve an important role in user interfaces. Although successive iterations of UI designs from certain companies (I'm looking at you Apple and to a lesser extent Google) have been aggressively trimming away scrollbars, guided by the dogma that they are "Useless. Ugly. Clutter", all the evidence I've seen so far points the other way! Instead of trimming them down, we really ought to be putting more information on them, whether that be revisitation info (e.g. like the Read Wear or Footprints scrollbars), search results (e.g. as in Chrome), or markers for important headings/items and other predictions/suggestions for potentially salient targets (as outlined by Willett et al.'s "Scented Widgets" paper). Why?

- Spatial Memory is an important capability we have. For example, without looking, try doing the following exercises: 1) Point to where you'd find the Paste button in Word, 2) now point to the Insert Picture, 3) to select "Heading 3" preset style..., and finally, 4) to bring up the page margins dialog. Unless you haven't actually used some of these commands, chances are you probably had some vague inkling of the items we're talking about - if nothing else, the sight of the relevant buttons comes to mind - and your arm can probably shoot off straight to the rough area without much extra thought/effort. This much the experiments have found.

- Tabs for command selection and "mode errors" - One of the problems with tab-based interfaces for command selection (and really, many other forms of strongly hierarchical organisational structures for command sets) is the mode-error-like situation where you automatically go to select a command, but find that you're on the wrong tab to do so. This has been attributed to "spatial overloading" - that is, more than one item ends up being located at a certain point on screen; this in turn means that while our spatial memory tells us that an item is at a certain location, it might not be there because we've got the wrong context. Perhaps this is the most severe problem with Ribbons today (other aspects of them are really not that bad when you think about what they do - especially in terms of the Fitts' Law and visibility benefits they give to functionality which may otherwise not be noticed/found).

- Consider the "Information Scent" you provide. Pirolli and Card's "Information Foraging Theory" is a very interesting/useful model for thinking about how people go about using various tools/interfaces. The basic idea is that, like animals foraging for food, humans go "foraging" for information by making a series of small near-impulsive decisions (aka "satisficing" - basically, making "best guesses" given incomplete information) based on the "information scent" of potential options. Information scent is basically a measure of the perceived value offered/to be gained from looking into something versus the amount of effort needed to do so.

In essence, this says that we take the path of items which look like they will most likely lead to our goals. This leads into the whole breadth vs depth issue (in short, we are better with shallow and broad hierarchies/taxonomies than deep and narrow). Another issue is the notion of information patches (i.e. different sources of info or different interaction techniques), and the tradeoffs there between staying within the same familiar patch versus branching out to another patch.
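
To put some rough numbers on the breadth vs depth point, here's a minimal sketch (not taken from any of the papers above) that counts how many levels of menu/hierarchy you'd have to descend through to reach one of N items when every level offers the same number of choices. The function name `levels_needed` and the uniform branching factor are my own simplifying assumptions, and it says nothing about how long it takes to scan the options at each level.

```python
def levels_needed(n_items: int, branching: int) -> int:
    """Number of choices needed to reach one of n_items leaves when every
    level of the hierarchy offers `branching` options (a hypothetical,
    perfectly balanced hierarchy)."""
    if n_items < 1 or branching < 2:
        raise ValueError("need n_items >= 1 and branching >= 2")
    levels = 0
    capacity = 1  # how many leaves a hierarchy with `levels` levels can hold
    while capacity < n_items:
        capacity *= branching
        levels += 1
    return levels


# 1000 items: a narrow/deep layout vs a broad/shallow one
for b in (4, 10, 32):
    print(f"{b} options per level -> {levels_needed(1000, b)} levels")
```

Under these assumptions, 1000 items take 5 levels at 4 options per level, but only 2 levels at 32 options per level. Each extra level is another point where the information scent has to be good enough for you to pick the right branch.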

This was just a quick intro to these concepts, but there's quite a bit that can be learned/gained from this.
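
For reference, going back to the treemaps note above: the basic formulation can be sketched in a few lines. This is a minimal "slice-and-dice" sketch, assuming a simple in-memory tree where only the leaves carry sizes; the `Node` class and the `slice_and_dice` and `total_size` names are made up for illustration, and it is not the code from the original paper. Each node's rectangle is split among its children in proportion to their sizes, alternating the split direction at every level.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Rect = Tuple[str, float, float, float, float]  # (name, x, y, width, height)


@dataclass
class Node:
    """Hypothetical tree node: leaves carry a size, inner nodes carry children."""
    name: str
    size: float = 0.0
    children: List["Node"] = field(default_factory=list)


def total_size(node: Node) -> float:
    """Size of a leaf, or the sum of the sizes of everything below an inner node."""
    if not node.children:
        return node.size
    return sum(total_size(child) for child in node.children)


def slice_and_dice(node: Node, x: float, y: float, w: float, h: float,
                   depth: int = 0, out: Optional[List[Rect]] = None) -> List[Rect]:
    """Assign a rectangle to every node, splitting along x on even depths
    and along y on odd depths."""
    if out is None:
        out = []
    out.append((node.name, x, y, w, h))
    total = total_size(node)
    if not node.children or total <= 0:
        return out
    offset = 0.0  # fraction of the parent rectangle already handed out
    for child in node.children:
        frac = total_size(child) / total
        if depth % 2 == 0:
            slice_and_dice(child, x + offset * w, y, frac * w, h, depth + 1, out)
        else:
            slice_and_dice(child, x, y + offset * h, w, frac * h, depth + 1, out)
        offset += frac
    return out


tree = Node("root", children=[
    Node("a", size=1.0),
    Node("b", children=[Node("b1", size=1.5), Node("b2", size=1.5)]),
])
for rect in slice_and_dice(tree, 0, 0, 100, 100):
    print(rect)
```

Swap out the "alternate between x and y" rule for some other way of carving up the parent rectangle, and you've got yet another one of those spatial partitioning schemes.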

1 comment:

  1. Lawrence D’Oliveiro, March 18, 2015 at 5:26 PM

    Have you read any of Edward Tufte’s books? E.g. “Visual Display of Quantitative Information”.