Abnormal Data

Over the past few days I’ve been analyzing data sets from some of the experiments I have conducted. It seems that the data I am generating are consistently not normal, and the cause is the design of the experiment. There is a cluster of values at the lowest end of the scale representing plants that died during the experiment, followed by a roughly normal distribution from the survivors. Here is an example:

Non-transformed root weight data (blue line = mean).

Notice the spike on the far left (those are the dead plants). I can make the data set look a little prettier (but still far from clean) by applying a square root transformation.

Square root transformed data (blue line = mean).
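
To make the point concrete, here is a minimal sketch of why the transformation only helps so much. The numbers are simulated and purely hypothetical (15 dead plants recorded as 0 and 45 survivors drawn from a normal distribution); they are not the experimental data, and the `skewness` helper is an illustrative name, not part of any library.

```python
import math
import random
import statistics


def skewness(values):
    """Skewness as the mean cubed z-score (population form)."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


# Hypothetical root weights in grams (assumption: dead plants are recorded as 0).
rng = random.Random(1)
dead = [0.0] * 15
survivors = [max(0.05, rng.gauss(2.0, 0.4)) for _ in range(45)]
raw = dead + survivors

# Square root transformation of every observation.
transformed = [math.sqrt(w) for w in raw]

for label, data in (("raw", raw), ("sqrt", transformed)):
    print(
        f"{label:>5}: mean={statistics.mean(data):.2f} "
        f"median={statistics.median(data):.2f} "
        f"skewness={skewness(data):.2f} "
        f"zeros={data.count(0.0)}"
    )
```

Because the square root of 0 is still 0, the dead plants stay piled up at the left edge after the transformation; only the spacing among the survivors changes.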

No simple bell curves here. When I was planning my analyses I was thinking about parametric tests. Now I’m starting to set up SAS code for non-parametric tests.
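
As an illustration of the kind of non-parametric test that could apply here, the sketch below implements a Mann-Whitney U (Wilcoxon rank-sum) test in plain Python rather than SAS. The choice of this particular test is an assumption, since the post does not say which tests are planned, and the two treatment groups and their weights are made up. The p-value uses the normal approximation with a tie correction and no continuity correction, so it is rough for samples this small.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U statistic for x, two-sided p-value from the normal approximation)."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    combined = list(x) + list(y)
    ranks = average_ranks(combined)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2

    # Tie correction: sum of (t^3 - t) over each group of t tied values.
    counts = {}
    for v in combined:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:  # every observation identical: no evidence of a difference
        return u1, 1.0

    z = (u1 - n1 * n2 / 2) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return u1, p_value


# Hypothetical root weights (g); 0.0 marks a dead plant.
control = [0.0, 0.0, 1.8, 2.1, 2.4, 1.9, 2.2]
treated = [0.0, 0.0, 0.0, 0.0, 1.2, 1.5, 1.1]

u, p = mann_whitney_u(control, treated)
print(f"U = {u:.1f}, approximate two-sided p = {p:.3f}")
```

The many tied zeros are handled by average ranks, which is why a rank-based test can cope with the spike of dead plants without any transformation.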

So my data are not normal, they are extraordinary.



4 responses to “Abnormal Data”

  1. Byran

    Have you considered first doing logistic regression in which you’d collapse your data into just two categories, dead or alive? I know that throws away a lot of information, but potentially you could do that analysis first, make conclusions about what factors are killing things, and then do another traditional regression, conditional on the fact that the plants lived (i.e. get rid of all the dead plants).

    • That’s an interesting thought, I think I’ll look into that. I have to admit, the thought of running an analysis conditional on the plants living never crossed my mind. Thanks for the suggestion.
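
The two-step idea suggested above can be sketched roughly as follows. Everything here is hypothetical: the treatment names, the weights, and the helper functions are made up for illustration. Step one is reduced to a log odds ratio of survival, which is what a logistic regression with a single two-level factor would estimate as its slope; step two summarizes root weight only for the plants that lived.

```python
import math
import statistics

# Hypothetical records: (treatment, root weight in g); 0.0 marks a dead plant.
plants = [
    ("control", 0.0), ("control", 2.1), ("control", 1.9),
    ("control", 2.4), ("control", 2.0), ("control", 1.7),
    ("stress", 0.0), ("stress", 0.0), ("stress", 0.0),
    ("stress", 1.4), ("stress", 1.6), ("stress", 1.2),
]


def survival_summary(records, group):
    """Count dead and living plants in a group and average the survivors."""
    weights = [w for g, w in records if g == group]
    alive = [w for w in weights if w > 0]
    return {
        "n": len(weights),
        "alive": len(alive),
        "dead": len(weights) - len(alive),
        "mean_alive": statistics.mean(alive),  # conditional on survival
    }


def log_odds_ratio(reference, other):
    """Log odds of survival in `other` relative to `reference`.

    Assumes every group has at least one dead and one living plant;
    otherwise the odds are zero or undefined.
    """
    odds_ref = reference["alive"] / reference["dead"]
    odds_other = other["alive"] / other["dead"]
    return math.log(odds_other / odds_ref)


control = survival_summary(plants, "control")
stress = survival_summary(plants, "stress")

# Part 1: which treatment is killing plants?
print(f"log odds ratio of survival (stress vs control): "
      f"{log_odds_ratio(control, stress):.2f}")

# Part 2: among survivors only, how do root weights compare?
print(f"mean survivor weight: control={control['mean_alive']:.2f} g, "
      f"stress={stress['mean_alive']:.2f} g")
```

A real analysis would fit both parts with proper models and standard errors; the point of the sketch is only that mortality and survivor size are answered by two separate questions.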

  2. Adam

    You and your fancy numbers and graphs. Back in my day, the only thing that mattered about a plant was how dirty its uniform got when it hit the ball the other way and moved a runner over.

