Random sample of a dataset

  • #604 Score: 0
    Tim Erickson
    Participant

    There are two basic strategies I know of for taking a sample:

    (1) Use the sampler.
    (a) With your data already in a table, choose the Sampler from the Plugins tool in the toolbar.
    (b) Choose Collector at the bottom of the sampler, then set the sample size and the number of samples.
    (c) Click Start. You will get a new table with your sample. This new table is hierarchical; if you take another sample, it will appear in a new sub-table.

    (By the way: in its Options panel, the Sampler lets you choose whether to sample with or without replacement. If you are taking repeated samples, the Sampler is the way to do it!)
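For anyone who wants to see the with/without-replacement distinction outside the tool, here is a minimal Python sketch. It is not how the Sampler is implemented; the function name `take_samples` and its parameters are made up for illustration, and `sample_size` and `num_samples` simply mirror the two settings mentioned in step (b).

```python
import random

def take_samples(cases, sample_size, num_samples, with_replacement=True, seed=None):
    """Draw num_samples samples of sample_size cases each (hypothetical helper)."""
    rng = random.Random(seed)  # a seed makes the draws repeatable
    samples = []
    for _ in range(num_samples):
        if with_replacement:
            # The same case may be drawn more than once within a sample.
            sample = rng.choices(cases, k=sample_size)
        else:
            # Each case appears at most once within a sample.
            sample = rng.sample(cases, sample_size)
        samples.append(sample)
    return samples

cases = list(range(1, 21))  # stand-in for 20 cases in a table
with_repl = take_samples(cases, sample_size=5, num_samples=3, with_replacement=True, seed=1)
without_repl = take_samples(cases, sample_size=5, num_samples=3, with_replacement=False, seed=1)
```

Each inner list plays the role of one sample in the hierarchical output table. Note that sampling without replacement requires `sample_size` to be no larger than the number of cases.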

    (2) Make a new attribute in the existing table, like this:
    (a) Make a new attribute and give it a name, such as “r” (for random).
    (b) Give it the formula random().
    (c) Sort on that attribute. You have now “scrambled” the order of cases.
    (d) Select the first n cases, where n is the sample size you want.
    (e) In the table’s “eyeball” menu, choose Set Aside Unselected Cases.

    (This technique produces samples without replacement. If you are making a “training set”, this might be the technique you want.)
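The same idea (attach a random number to each case, sort on it, keep the first n) can be sketched in Python. This is only an illustration of the technique, not what the table does internally; `sample_by_random_key` and the case names are hypothetical.

```python
import random

def sample_by_random_key(cases, sample_size, seed=None):
    """Shuffle cases by sorting on a random key, then split off the first sample_size."""
    rng = random.Random(seed)
    # Steps (a)-(b): give every case a random value, like the "r" attribute.
    keyed = [(rng.random(), case) for case in cases]
    # Step (c): sorting on the random value scrambles the order of cases.
    keyed.sort(key=lambda pair: pair[0])
    # Step (d): the first sample_size cases form the sample.
    kept = [case for _, case in keyed[:sample_size]]
    # Step (e): everything else is set aside.
    set_aside = [case for _, case in keyed[sample_size:]]
    return kept, set_aside

cases = ["case%d" % i for i in range(1, 11)]
kept, set_aside = sample_by_random_key(cases, 4, seed=42)
```

Because every case gets exactly one random value, no case can land in the sample twice, which is why this approach samples without replacement. For a training/test split, `kept` would be the training set and `set_aside` the held-out cases.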
