Sunday, January 31, 2010

Data Classification Lab – Andrew V Murphy


Notice that the data only goes out to two or three decimal places, since I limited the values to one digit more than the significant value. Presenting the additional zeros would simply have added map clutter.

The question asked which of these four classification methods best fit or represented the given data.
Standard Deviation was ruled out first because it was limited in the number of breaks, which seemed unsatisfactory to me. Another reason to rule out Standard Deviation is captured in the name: since the data does not follow a normal distribution, this option is simply not a good choice.
Natural Breaks held promise, but since I did not know much about the data, the weakness of this choice became evident. As noted in our text, "the class limits are subjective and can vary," so I ruled it out.
Equal Intervals and Quantiles both seemed to work for me.
Quantiles, like Equal Intervals, can be computed manually. Another advantage is that each class contains an equal number of data values. Clarke goes into great detail on page 62 of our text. In the end I ruled it out because the classes can have gaps between them, which could confuse readers as they wonder why some data was left out.
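To make the Standard Deviation method concrete, here is a minimal Python sketch that places class breaks at whole standard deviations around the mean. The function name `std_dev_breaks` and the sample values are made up for illustration; they are not the lab data or the software's actual routine.

```python
import statistics

def std_dev_breaks(values, max_sd=2):
    """Return class breaks at mean + k * sd for k = -max_sd .. +max_sd."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # population standard deviation
    return [mean + k * sd for k in range(-max_sd, max_sd + 1)]

# Hypothetical sample values, not the lab data.
sample_sd = [2, 4, 4, 4, 5, 5, 7, 9]
sd_breaks = std_dev_breaks(sample_sd)
print(sd_breaks)
```

Because the breaks are tied to the mean and the spread, they describe the data well only when the values are roughly symmetric around the mean, which is the concern raised above.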
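One way to see both the equal counts and the gaps is a small Python sketch. It assumes the class limits are taken from the lowest and highest value actually observed in each class; the function name `quantile_classes` and the sample values are hypothetical, not the lab data.

```python
def quantile_classes(values, n_classes):
    """Split sorted values into classes of (nearly) equal count.

    Returns a list of (lowest, highest) observed values per class.
    """
    ordered = sorted(values)
    n = len(ordered)
    classes = []
    for i in range(n_classes):
        start = i * n // n_classes
        end = (i + 1) * n // n_classes
        members = ordered[start:end]
        if members:  # skip empty classes when there are few values
            classes.append((members[0], members[-1]))
    return classes

# Hypothetical sample values, not the lab data.
sample_q = [1, 2, 3, 10, 11, 12, 30, 31]
q_classes = quantile_classes(sample_q, 4)
print(q_classes)
```

Each class holds two values, but the third class ends at 12 and the fourth starts at 30, so a legend built from these limits shows a gap where no class covers the range in between.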

I chose Equal Intervals for my final selection. It has several advantages: it is easy to calculate (you can choose how many classes are needed), easy to interpret, and the limits do not contain gaps.
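The calculation is simple enough to sketch in a few lines of Python. The function name `equal_interval_breaks` and the sample values are assumptions for illustration only; the lab itself used the mapping software's built-in classification.

```python
def equal_interval_breaks(values, n_classes):
    """Return n_classes + 1 break points that divide the data range evenly."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_classes  # every class spans the same width
    return [lo + i * width for i in range(n_classes + 1)]

# Hypothetical sample values, not the lab data.
sample_ei = [0.0, 1.5, 2.0, 4.0, 10.0]
ei_breaks = equal_interval_breaks(sample_ei, 4)
print(ei_breaks)
```

Each break is both the upper limit of one class and the lower limit of the next, which is why the limits contain no gaps. The trade-off is that some classes may hold few or no values when the data is skewed.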


                                        

1 comment:

  1. The point of the lab was to understand the methods and use that logic, so great job. There was not one correct answer.
