
There was a talk at the University last Wednesday afternoon with Ben Shneiderman of the UMD Human-Computer Interaction Lab.


In short: pretty graphs and charts, but is it science?  On the other hand, does that matter?  I think yes, it does matter, as long as the goal is to improve the sciences, as it is for these folks, rather than to produce art qua art.

Shneiderman's lab has done a lot of work over the last 20-odd years developing new computer interfaces and visualization tools, including the reasonably popular Treemap visualizations.  They are interested in helping people "use vision to think".

A goal of his group is the methodical, scientific analysis of interface design: controlled studies, scientific proof that their interfaces make it faster for people to do their work.  Amusingly, at the end of the talk he admitted how difficult that is to do, especially since their subjects learn new interfaces very slowly, and that in practice they rely on case studies and very little science.  Sigh.

He also said that they aren't following up with research subjects as often as they should.  One of their commercial products, Spotfire, is used at all of the top 25 drug-discovery companies, he says.  But they have little idea how it's used in practice; only that customers are willing to pay $1k/seat/yr for licenses.

Spotfire itself appears to be an interesting way to visualize complicated data.  There's a "demo" that gets into specifics of the user interface at about the one-minute mark.  (Warning: music and annoying marketroid speech.)  It would be awesome if other software, such as Excel, used these same tools.  I liked their slider with drag-bars on both ends, and how quickly they produced result graphics (they say they produce output in 100ms on data with a million elements).
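
To make the slider concrete, here's a minimal sketch (my own, in Python/NumPy; nothing from their code) of the filtering a double-ended range slider performs underneath.  A boolean mask over an in-memory array is cheap enough that 100ms on a million elements seems plausible.

    import numpy as np

    # A million synthetic data points, matching their performance claim.
    values = np.random.default_rng(0).normal(size=1_000_000)

    def range_filter(data, lo, hi):
        """Keep the points between the two drag-bars of the slider."""
        mask = (data >= lo) & (data <= hi)
        return data[mask]

    selected = range_filter(values, -0.5, 1.5)
    print(len(selected), "of", len(values), "points selected")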

He showed some of the sorts of charts they provide in Spotfire, and how they compare.  He briefly showed off an academic comparison of information-visualization environments they put together, called Olive (network diagramming tools, 1-D/2-D/3-D environments, temporal, etc.).

A diversion with treemaps, including a demo of an iTunes interface using treemaps, which is pretty cool.
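
For anyone who hasn't seen one: a treemap fills a rectangle with sub-rectangles sized in proportion to the data (playlist sizes, say, in the iTunes demo).  Here's a one-level sketch of the classic slice-and-dice layout, my own toy rather than anything from the lab; a real treemap recurses into each rectangle, alternating the split direction per level.

    def slice_and_dice(items, x, y, w, h, horizontal=True):
        """Split the rectangle (x, y, w, h) into strips sized in
        proportion to each item; a full treemap recurses per level."""
        total = sum(size for _, size in items)
        rects = []
        offset = 0.0
        for name, size in items:
            frac = size / total
            if horizontal:
                rects.append((name, x + offset * w, y, w * frac, h))
            else:
                rects.append((name, x, y + offset * h, w, h * frac))
            offset += frac
        return rects

    print(slice_and_dice([("rock", 120), ("jazz", 60), ("folk", 20)], 0, 0, 1.0, 1.0))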

He demoed one of their time-series interfaces, TimeSearcher (downloadable for Windows), which lets the user select multiple "interesting" regions of line charts to winnow the interesting series out of many thousands of total pieces of data.  It was quite slick-looking, since he was using the Dow Jones averages and showing how to find underpriced stocks.
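
The queries are rectangles drawn on the chart ("timeboxes" is the term the TimeSearcher papers use, if I remember right): a series survives the filter only if it stays inside the box for the whole time window.  Here's my guess at those semantics in Python/NumPy, with made-up random walks standing in for stock prices.

    import numpy as np

    def timebox_query(series, t0, t1, v_lo, v_hi):
        """Keep the series whose values stay inside [v_lo, v_hi]
        at every time step in the window [t0, t1]."""
        window = series[:, t0:t1 + 1]
        inside = (window >= v_lo) & (window <= v_hi)
        return np.where(inside.all(axis=1))[0]

    rng = np.random.default_rng(1)
    walks = 100 + rng.normal(0, 1, size=(5000, 250)).cumsum(axis=1)
    hits = timebox_query(walks, 50, 80, 95, 105)
    print(len(hits), "of", len(walks), "series pass through the box")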

He demoed a program called Hierarchical Clustering Explorer (downloadable) for the task of finding features in many-variable data.  One big fraction of the screen is occupied by a "dendrogram view", which I didn't understand terribly well from the two-minute description.  However, another chunk of the interface was devoted to comparing any pair out of a large number of variables, and I think that worked well for its intentions.
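
As far as I understand it, the dendrogram is just the tree of merges produced by agglomerative clustering.  A sketch of the underlying computation (using SciPy, not their code):

    import numpy as np
    from scipy.cluster.hierarchy import linkage, fcluster

    rng = np.random.default_rng(2)
    data = rng.normal(size=(30, 14))   # 30 records, 14 variables

    # The linkage matrix is the tree of pairwise merges that a
    # dendrogram view draws; fcluster cuts it into flat clusters.
    Z = linkage(data, method="average")
    labels = fcluster(Z, t=3, criterion="maxclust")
    print(labels)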

He demoed with 14 variables.  You can compare any two in a 14x13 triangle of boxes, where clicking one of the boxes brings up a scatterplot of the two variables.  Taking that a step further, you can choose a basis for comparison, such as correlation, and it will highlight all of the squares in the triangle which are "interesting" according to that basis.  They can do correlation, quadratics, exponentials, and least-square error, and they're working on uniformity.  He said it does interesting things to find outliers, even ones which are close to a clump.
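
The correlation basis, at least, is easy to imitate: score every cell of the triangle and highlight the ones that cross a threshold.  A toy version (mine, with one planted correlation so something lights up):

    import numpy as np
    from itertools import combinations

    rng = np.random.default_rng(3)
    data = rng.normal(size=(500, 14))                      # 500 records, 14 variables
    data[:, 3] = 2 * data[:, 7] + rng.normal(0, 0.1, 500)  # plant one strong pair

    # One score per cell of the triangle; here the basis is correlation.
    interesting = []
    for i, j in combinations(range(14), 2):
        r = np.corrcoef(data[:, i], data[:, j])[0, 1]
        if abs(r) > 0.5:                                   # "interesting" threshold
            interesting.append((i, j, round(r, 2)))
    print(interesting)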

I spoke to dan about this, and he brought up a large blind spot, which I've verified from the software's website: it apparently does nothing with statistics.  A large fraction of science is the annoying task of determining whether the results could have been due to chance, and as far as I can tell, statistics are indeed a weakness of this tool.  (In a bit of brainstorming with dan, I think this could be improved if they were willing to make the interface act more cleanly as a middle layer between statistics tools and a database.  They said that they want to do that, but to really serve that role, instead of a menu of simplistic stats tools it should be able to do chi-square tests, z-tests, and so on.  But I expect they've considered this, and I don't think my opinion is terribly informed on the matter.)
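
For instance, the kind of test we had in mind, which a stats middle-layer ought to hand off to a real library rather than reimplement (SciPy here; the contingency table is made up):

    import numpy as np
    from scipy.stats import chi2_contingency

    # Hypothetical 2x2 table: responders vs. non-responders
    # under two treatments.
    table = np.array([[42, 18],
                      [30, 35]])

    chi2, p, dof, expected = chi2_contingency(table)
    print(f"chi2 = {chi2:.2f}, p = {p:.4f}, dof = {dof}")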

Interesting problems Shneiderman said they did solve: fitting 30 thousand pieces of visual data on the screen in an understandable arrangement, and designing so that selecting data in any view affects the selections in all the other views.  Also, they decided not to use a heavyweight database; they're using in-memory caches with linear data structures, plus a copy of the data for each bit that's currently visible in the interface, so it doesn't get bogged down.  That was a bit of a surprise to me; I'd have thought a SQL db would be fast enough.
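
My rough guess at why the linear structures win: linked-view brushing is one shared boolean mask applied to flat in-memory columns, with no query parsing or round-trip.  A sketch of that pattern (again my own illustration, not their implementation):

    import numpy as np

    rng = np.random.default_rng(4)
    n = 1_000_000
    x = rng.normal(size=n)
    y = rng.normal(size=n)

    # One shared selection mask; every linked view applies it to its
    # own in-memory copy of the columns, so brushing in one view
    # updates all of them without touching a database.
    selection = (x > 0.5) & (x < 1.5)
    visible_y = y[selection]          # what a linked scatterplot redraws
    print(selection.sum(), "rows selected; mean y =", visible_y.mean())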

Finally, they discussed network data.  He says, and I agree, that nodes and lines can be a difficult means of understanding relationships and communication.  There's often too much data obscuring the relationships, and positioning is usually arbitrary (wherever it fits, or causes the fewest lines to overlap).  His group proposes "network visualization with semantic substrates", where the layout of the nodes is meaningful and the user controls which sorts of links are visible.  A project they worked on has the goal that every node is visible along with its in/out edges, and every edge lets you find its source and destination; if you can't do that, a lot of visual cues are wasted.
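
A toy version of the idea, with invented data: place nodes in non-overlapping bands according to a meaningful attribute, and let the user toggle which link types get drawn.

    # Nodes are assigned to bands by an attribute (made-up categories
    # here), so position carries meaning; edges are filtered by type.
    nodes = {"A": "supreme", "B": "appeals", "C": "appeals", "D": "district"}
    edges = [("A", "B", "cites"), ("B", "D", "cites"), ("C", "D", "overrules")]
    bands = {"supreme": 0, "appeals": 1, "district": 2}

    visible_types = {"cites"}          # user-controlled link filter
    for src, dst, kind in edges:
        if kind in visible_types:
            print(f"{src} (band {bands[nodes[src]]}) -> "
                  f"{dst} (band {bands[nodes[dst]]}) [{kind}]")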
