Sorting "case value" plots


This topic contains 7 replies, has 4 voices, and was last updated by Tim Erickson 5 months ago.

    #799

    Andee Rubin
    Participant

    I have a data set with the states of the US in one column and a number (e.g. incidence of Lyme disease) in the second.  I would like to make a graph that has state on the X-axis and incidence on the Y-axis but I would like the states to be plotted in order of increasing values of incidence.  So, I sorted the table by the numerical variable, then plotted state on the X-axis and incidence on the Y-axis — but the states still came out in alphabetical order.  Is there a way around this?  It does seem like something people would want to do with a categorical and a numerical variable.

    Link to dataset: http://bit.ly/lyme17

    • This topic was modified 5 months, 2 weeks ago by Bill Finzer.
    #801

    Bill Finzer
    Keymaster

    Hi Andee,

    So you didn’t want to drag the states to the desired order?

    I was stuck for quite a while and almost gave up the search for an automated solution. But, as you can see in the screen captures, I was able to create a new attribute whose alphabetic order is in order of decreasing incidence.

    So that gives you a “workaround” but it’s hardly a generally useful solution.

    One wrinkle is that most of the time a given category (State in this case) has more than one value. So you would have to plot something like a mean to have a single value per category to order by. Then you could have a command somewhere in the graph interface to Order by Value.
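
To make the "order by mean" idea concrete, here is a minimal Python sketch. It is not CODAP code and says nothing about how CODAP would implement such a command; the function name `order_categories_by_mean` and the sample numbers are made up for illustration.

```python
from collections import defaultdict
from statistics import mean


def order_categories_by_mean(cases):
    """Return category names ordered by the mean of their values, ascending.

    `cases` is an iterable of (category, value) pairs. This is a hypothetical
    helper for illustration, not part of CODAP.
    """
    grouped = defaultdict(list)
    for category, value in cases:
        grouped[category].append(value)
    # Sort by mean; break ties alphabetically so the result is deterministic.
    return sorted(grouped, key=lambda c: (mean(grouped[c]), c))


# Made-up data: categories A and B each have two values, C has one.
cases = [("B", 4.0), ("A", 10.0), ("B", 6.0), ("C", 1.0), ("A", 2.0)]
print(order_categories_by_mean(cases))
```

When every category has exactly one value, as in the Lyme disease table, the mean is just that value, so the same ordering rule covers both situations.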

    But do we want to add something to the interface to automate a task seldom encountered and that can be done manually?

    Bill

    #804

    Andee Rubin
    Participant

    No, I didn’t want to drag 50 states into the right order, especially when some of the values are relatively close and I couldn’t see them well.  (Or maybe you weren’t seriously asking that?)

    I actually disagree that most of the time there would be multiple values for a single category – maybe that’s the case in the data you’ve been working with, but it’s not the case in most of the data I’ve been working with.  Consider, for example, any data set that has one row/case per person – and several numbers associated with each person (height, weight, shoe size, etc.) and I’d like to see the distribution of heights as a case value plot, ordered, in the way they would be if people were standing in a line.

    Perhaps the whole realm of case-value plots is not one that CODAP wants to support – but I would argue, at least for the age group I’m working with, they are a very useful representational form for students to work with, certainly before they learn to write formulas.  (I’m not quite sure how the formula works, by the way – can you explain?  But it’s certainly not an entry-level formula.)

    #805

    Dan Damelin
    Keymaster

    One possibility would be to add some options to the menu that pops up when you click an axis title. If the axis is categorical we could provide a “Sort…” submenu and give options like [alphabetically (ascending), alphabetically (descending), by value (ascending), by value (descending)]. I think it needs some design thinking, but may be possible.

    #806

    Andee Rubin
    Participant

    I’m still a little confused about why this DOESN’T work… If I sort the TABLE, then the rows are in a different order – why don’t they end up on the graph that way?

    #807

    Bill Finzer
    Keymaster

    Regarding the connection between case order in the table and order of categories: We regard these two things as conceptually distinct. Suppose, for example that you have census records of people, each with a marital status. We don’t say that the order of the records should determine the order in which ‘married,’ ‘divorced,’ ‘never married,’ etc should appear on a categorical axis. The default order is alphabetical, and the user can change that order manually.

    But perhaps this brings up a research question: How do learners regard categorical values, and how do they come to understand the different ways they can be used in data visualization and analysis?

    #808

    Bill Finzer
    Keymaster

    Hi Andee,

    Regarding the formula workaround, I just thought of a simplification:

    (caseIndex+10)+"_"+State

    The goal is for the computed values to sort alphabetically into the same order in which the states appear in the table.

    caseIndex is the row number in the table. By adding 10, we avoid the complication that alphameric sorting places “10_” before “2_”. So once the states are sorted in decreasing order of Lyme disease incidence, the values computed by the formula will also sort in that order and that will be the default order in which they appear on a categorical axis.
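
The effect of the offset can be checked outside CODAP. The following Python sketch mimics the formula under two assumptions: the row number starts at 1, and plain Python string comparison stands in for the alphabetical sort on the axis (CODAP's own ordering may differ in details). The helper `sort_label` and the placeholder names are hypothetical.

```python
def sort_label(case_index, name, offset=10):
    """Mimic (caseIndex+10)+"_"+State for a 1-based row number (assumed)."""
    return f"{case_index + offset}_{name}"


# Placeholder names standing in for states already sorted in table order.
names = ["Q", "D", "M", "A", "Z", "B", "K", "C", "X", "E", "P", "F"]

# Without the offset the prefixes run 1..12, mixing one and two digits.
plain = [sort_label(i, n, offset=0) for i, n in enumerate(names, start=1)]
# With the offset the prefixes run 11..22, all two digits.
shifted = [sort_label(i, n) for i, n in enumerate(names, start=1)]

print(sorted(plain) == plain)      # table order is lost
print(sorted(shifted) == shifted)  # table order is preserved
```

In this sketch an offset of 10 keeps every prefix at two digits only while the shifted row number stays below 100, which covers 50 states; a longer table would need a larger offset or zero-padding.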

    Confusing? You bet! 🙂

    #813

    Tim Erickson
    Participant

    Another take on this: Andee’s situation is one in which the State’s name or abbreviation is _unique_ in the data set. If we knew an attribute was not only categorical but also unique, we could conceivably do what Andee found natural: sort the table any way you want, and then plot the cases.

    I bet it would not be too hard (ha ha ha) to determine automatically if an attribute had this “unique” property.
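
For what it is worth, the check itself is short when written out. This Python sketch is only an illustration of the idea, not a description of CODAP internals; `is_unique_attribute` and `category_order` are hypothetical names, and the fallback to alphabetical order reflects the default described earlier in this thread.

```python
def is_unique_attribute(values):
    """True when no value appears more than once in the column."""
    return len(set(values)) == len(values)


def category_order(values):
    """Keep table order for a unique attribute, else fall back to alphabetical.

    `values` is the attribute's column, listed in current table order.
    """
    if is_unique_attribute(values):
        return list(values)
    return sorted(set(values))


print(category_order(["b", "a", "c"]))  # unique: table order is kept
print(category_order(["b", "a", "b"]))  # repeated value: alphabetical
```

The harder design questions are what should happen when a new case breaks uniqueness, and how the user is told which rule is in effect.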

    This may be part of the broader issue of “roles of attributes” where another one we have discussed is “time-like,” so that some time-series things might happen naturally in visualizations if it were on an axis.
