Topic: feature request—setting to show datapoints with missing values in graph : CODAP

This topic has 2 replies, 2 voices, and was last updated 3 months ago by Dan Damelin.

January 24, 2024 at 8:47 pm #8092 Score: 0
Bill Finzer
Keymaster
Andrew Ross was unable to post the following due to problems within the forum, so I’m posting it for him.

If I make a graph of a categorical variable, it will not show dots for the values that are missing. For example, using this document <https://codap.concord.org/app/static/dg/en/cert/index.html#shared=https%3A%2F%2Fcfm-shared.concord.org%2FHIJpiUlEXXzUy13gMIqJ%2Ffile.json>, make a new graph, and drag “InterestLocalSchools” to the x-axis. It will show 3 categories, with no indication that roughly half (1480ish out of 2867ish) of the cases have missing data for that attribute. I realize that not showing missing data is, in some sense, a feature. But as we all get more used to asking bias-related questions like “Who or what is missing from this dataset”, it would be nice to have some automatic indication of how much data is missing. I’m basically asking for CODAP to put it right in our face: hey, here’s info on missingness in the data for this graph. So we don’t have to remember (though reminding students to ask about it is good pedagogy I suppose).

Maybe it could be a per-graph setting via the eye icon: show missing data in this graph, or not? I would argue for it to default to showing the missing data just to be safe, though that might break a fair number of existing analyses. If missing data is not shown, it would be nice to have a notice on the graph saying “data with missing values not shown”, again to be safe. Or it could be a global setting somewhere.

I’m not even sure how I would get it to show the missing data right now. Introduce a new column that recodes the current column, including a code for missing, then plot that? Seems like a lot of work, especially if I needed to do it for a lot of columns.
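For anyone working with the data outside CODAP, the recoding idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rows are made up (they are not taken from the shared document), `recode_missing` and the `(missing)` label are hypothetical names, and nothing here is a CODAP API. Passing several attribute names at once is meant to address the "lot of columns" concern.

```python
from collections import Counter

# Hypothetical label; any string not already used as a category would work.
MISSING_LABEL = "(missing)"


def recode_missing(rows, attributes, label=MISSING_LABEL):
    """Return copies of the rows with empty values in `attributes` replaced by `label`."""
    recoded = []
    for row in rows:
        new_row = dict(row)  # copy so the original rows are left untouched
        for attr in attributes:
            value = new_row.get(attr)
            # Treat None, empty strings, and whitespace-only strings as missing.
            if value is None or str(value).strip() == "":
                new_row[attr] = label
        recoded.append(new_row)
    return recoded


# Made-up example rows for illustration only.
rows = [
    {"InterestLocalSchools": "High"},
    {"InterestLocalSchools": ""},
    {"InterestLocalSchools": "Low"},
    {"InterestLocalSchools": None},
]

recoded = recode_missing(rows, ["InterestLocalSchools"])

# Count cases per category, now including the explicit missing category.
counts = Counter(row["InterestLocalSchools"] for row in recoded)
print(counts)
```

Plotting the recoded attribute would then show the missing cases as their own category, next to the real ones.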
- This topic was modified 3 months ago by Bill Finzer.
January 24, 2024 at 9:37 pm #8095
Bill Finzer
Keymaster
Hello Andrew,

Thank you for your thoughtful analysis of CODAP’s current treatment (or non-treatment?) of missing values. I especially appreciate your reference to bias-related questions.

Just for fun, I recoded the InterestLocalSchools attribute and attached the resulting dot chart. I agree that it could be quite a bit of work in datasets with many attributes. And recoding doesn’t help when students are working with data they find themselves, unless they’re sophisticated enough to recognize that there may be a problem and do the recoding.

As you point out, one can see CODAP’s current behavior as a “feature” in that it keeps things simple for novices: one less complication to cope with as they get started on their data journey.

I’m going to log a feature request for CODAP to design an unobtrusive way to indicate the count of missing values in graphs.

Thanks again, Bill
Attachments:
1. missing-cases.png
January 26, 2024 at 3:44 am #8097
Dan Damelin
Keymaster
Andrew,

One quick way to recode the categorical variable the way Bill described above is to:
1. Drag the attribute to the far left to group by those categories
2. Fill in the empty cell with whatever you want to designate “missing”.
3. Drag it back to where it was before.
Click to see a video example.