[R-group] Post on ANOVAs

Michael Renton michael.renton at uwa.edu.au
Tue Oct 7 09:03:07 WST 2014


Good discussion, I agree. I'll add my two cents' worth...

I get a bit bemused by the idea that there should even be some clear right or wrong about whether to drop non-significant terms. Essentially they are just different models of the data, and as the quote says, all models are wrong, but some are useful. 

I think the way to approach this stuff depends quite a bit on whether I have a specifically designed and controlled experiment, or whether I'm doing a more exploratory analysis with lots of covariates and without a specific design to test each possible factor independently. 



If I'm analysing a specifically designed and controlled experiment, then the questions should be pretty focussed - presumably someone designed the experiment to test some specific hypotheses. If they designed a factorial experiment then I'd include interactions in the model; otherwise, why would we have bothered doing a factorial experiment? If we weren't interested in interactions then we wouldn't have included them in the design. But it would probably be a fairly boring experiment without at least second-order interactions? :)

Simplest case, a 2-way factorial design: it would make no sense not to include the interaction. If the interaction's significant, or even close to significant, then that is evidence that the effects of the factors depend on each other, and so any main effects have to be interpreted carefully. It's not really a question of whether it's right or wrong to look at the significance of the main effect, but I guess if we want a black/white rule it's fair to say that you really can't blindly interpret main effects if the interaction is significant. Like Bruno says - graphical interpretation is the key. The data should drive the interpretation, not the p-values. And if the interaction isn't significant, would you drop it? If it's really insignificant then it should make little difference whether we drop it or not. If it does make a difference, say it changes one of the main effects from non-significant to significant, then that main effect is pretty borderline whether or not the interaction is included, regardless of which side of the magic p=0.05 it happens to land on. And if the interaction itself is borderline significant then really I think you've got two alternative viable models - one where the effects of the factors depend on each other and one where they don't - and you should think about each of them. Still, they shouldn't say anything really different from each other.
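
To make the point about main effects concrete, here is a minimal sketch in plain Python (standard library only). The data are invented for illustration and the factor names (A, B) and level names (a1, a2, b1, b2) are hypothetical. It builds a balanced 2x2 design with a crossover interaction and compares the effect of A within each level of B against the averaged main effect of A.

```python
from statistics import mean

# Hypothetical balanced 2x2 data (invented for illustration): response values
# for each combination of factor A (a1, a2) and factor B (b1, b2).
cells = {
    ("a1", "b1"): [10.0, 11.0, 9.0],
    ("a2", "b1"): [14.0, 15.0, 13.0],
    ("a1", "b2"): [14.0, 13.0, 15.0],
    ("a2", "b2"): [10.0, 9.0, 11.0],
}

cell_mean = {key: mean(values) for key, values in cells.items()}

# Simple effects: the effect of A (a2 minus a1) within each level of B.
effect_a_at_b1 = cell_mean[("a2", "b1")] - cell_mean[("a1", "b1")]
effect_a_at_b2 = cell_mean[("a2", "b2")] - cell_mean[("a1", "b2")]

# Main effect of A: the simple effects averaged over the levels of B.
main_effect_a = (effect_a_at_b1 + effect_a_at_b2) / 2

# Interaction contrast: how much the effect of A changes between levels of B.
interaction = effect_a_at_b2 - effect_a_at_b1

print("effect of A at b1:", effect_a_at_b1)
print("effect of A at b2:", effect_a_at_b2)
print("main effect of A: ", main_effect_a)
print("interaction:      ", interaction)
```

With these made-up numbers the averaged main effect of A is zero even though A clearly matters within each level of B, which is why the cell means (or a plot of them) need to be looked at before reading anything into a main effect.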


In a more exploratory analysis with lots of covariates I would agree that it makes sense to only include terms and interactions that you are interested in (that make biological sense) - but we are probably interested in just about everything, and suspect that all the variables and many of the interactions might be biologically important (otherwise why would we have measured them?). But if there are interactions that clearly don't make sense then I'd leave them out of the initial model. They might turn out to be significant and then we'd have to try to explain them! :)
In this more exploratory analysis with lots of covariates I think a model selection approach that drops terms that aren't significant (or whose removal improves AIC, or whatever) makes a lot of sense - we probably need to do that to get any kind of clarity about the data. But we must be aware it's a very 'grey' process (as opposed to black and white) - and the conclusions we get to are going to be tentative in most cases. We'll often have lots of alternative viable models that are hard to prioritise, especially if there is correlation and collinearity in the explanatory covariates. People have proposed all kinds of techniques for trying to deal with having lots of viable alternative models in a rule-based way, like model-weighting, but I don't think they add much clarity. Fact is, you just have different possible models that you need to consider. And we have to understand that any kind of automated model selection process based on dropping non-significant terms (or AIC) may well miss good models, or even 'the best'.


And I guess if we have a lot of covariates, but clear hypotheses about what we want to test between, then model selection among a set of carefully chosen candidate models rather than exploratory model simplification makes sense. But still the question of model selection is rarely going to be black and white. 
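
As a rough sketch of what comparing a small, pre-chosen set of candidate models by AIC can look like, here is some plain Python (standard library only). Everything in it is an assumption made for illustration: the data are invented, the factor and level names are hypothetical, and the shortcut used for the additive fit (row mean + column mean - grand mean) is only the least-squares fit because the design is balanced. AIC is computed from the Gaussian log-likelihood, counting the residual variance as one extra parameter.

```python
import math
from statistics import mean

# Hypothetical balanced 2x2 data, invented for illustration only.
data = {
    ("a1", "b1"): [10.1, 11.3, 9.4, 10.8],
    ("a2", "b1"): [13.2, 12.1, 14.0, 12.9],
    ("a1", "b2"): [11.0, 10.2, 12.1, 11.5],
    ("a2", "b2"): [14.8, 13.1, 15.2, 14.1],
}

obs = [(a, b, y) for (a, b), ys in data.items() for y in ys]
n = len(obs)

grand = mean([y for _, _, y in obs])
a_mean = {lvl: mean([y for a, _, y in obs if a == lvl]) for lvl in ("a1", "a2")}
b_mean = {lvl: mean([y for _, b, y in obs if b == lvl]) for lvl in ("b1", "b2")}
cell = {key: mean(ys) for key, ys in data.items()}

# Candidate models: (fitted-value function, number of mean parameters).
# The additive formula below is valid only because the design is balanced.
candidates = {
    "null": (lambda a, b: grand, 1),
    "A": (lambda a, b: a_mean[a], 2),
    "B": (lambda a, b: b_mean[b], 2),
    "A + B": (lambda a, b: a_mean[a] + b_mean[b] - grand, 3),
    "A * B": (lambda a, b: cell[(a, b)], 4),
}

def rss(fit):
    """Residual sum of squares for a fitted-value function."""
    return sum((y - fit(a, b)) ** 2 for a, b, y in obs)

def aic(fit, n_coef):
    """Gaussian AIC; the +1 counts the residual variance as a parameter."""
    log_lik = -0.5 * n * (math.log(2 * math.pi) + math.log(rss(fit) / n) + 1)
    return 2 * (n_coef + 1) - 2 * log_lik

scores = {name: aic(fit, k) for name, (fit, k) in candidates.items()}
best = min(scores.values())

# Print models from lowest to highest AIC, with the difference from the best.
for name, score in sorted(scores.items(), key=lambda kv: kv[1]):
    print(f"{name:6s} AIC = {score:7.2f}  delta = {score - best:5.2f}")
```

The table of AIC differences is only a starting point: models within a couple of AIC units of each other are all still viable, so the ranking does not make the choice black and white.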



-----Original Message-----
From: r-group-bounces at maillists.uwa.edu.au [mailto:r-group-bounces at maillists.uwa.edu.au] On Behalf Of Clelia Gasparini
Sent: Monday, 6 October 2014 3:15 PM
To: Bruno Buzatto; r-group at maillists.uwa.edu.au
Subject: RE: [R-group] Post on ANOVAs


I agree completely with you Bruno.
I also think that model selection based on what is known to be biologically important makes more sense than deciding to include or drop an effect based solely on statistical significance.

Btw, I love this discussion mailing list a lot!

cheers,
clelia 

*********************************************

Dr Clelia Gasparini

Dipartimento di Biologia
Universita' di Padova
via U Bassi 58/B
35100 Padova
Italy

Centre for Evolutionary Biology
School of Animal Biology (M092)
The University of Western Australia
Crawley, WA 6009
Australia
________________________________________
From: r-group-bounces at maillists.uwa.edu.au [r-group-bounces at maillists.uwa.edu.au] On Behalf Of Bruno Buzatto [bruno.buzatto at uwa.edu.au]
Sent: Monday, 6 October 2014 10:41 AM
To: r-group at maillists.uwa.edu.au
Subject: RE: [R-group] Post on ANOVAs

Thanks for sharing that post, very interesting.

I agree with the central message about how we should follow this recommendation: "decide ahead of time if you care about the interaction term and/or the main effects and only include the terms you are interested in in the model, then interpret them all together".

More specifically about the interactions, I also agree that they can be dropped when not significant (some people only do it if p > 0.25, which is a conservative approach that I endorse), and they should always be visually interpreted when significant. The importance of main effects, when interactions are significant, can vary greatly - an effect can go in opposite directions in different treatments, have different intensities in different treatments, etc...

Model simplification is more controversial I guess. I have never liked the practice of dropping non-significant main effects; I would stick with only dropping non-significant interactions and drawing conclusions from full models most of the time. And I would also choose proper model selection, rather than model simplification, again emphasizing the importance of using biology to decide on which models should be put to compete with each other, if possible. But I also need to acknowledge that model selection is a big topic to study, and one on which I am no expert for sure!

Cheers,
Bruno



--
Bruno Alves Buzatto
Postdoctoral Research Associate
Centre for Evolutionary Biology
School of Animal Biology; University of Western Australia - Crawley, WA - Australia
emails: bruno.buzatto at gmail.com / bruno.buzatto at uwa.edu.au
Phones: +61 8 425831125 / +61 8 64882699
ResearcherID: B-6583-2011
ORCID: 0000-0002-2711-0336
Personal website: www.buzatto.info

****** NEWS: The Evolution of Insect Mating Systems, now available from bookshops or directly from here:
http://ukcatalogue.oup.com/product/9780199678037.do
________________________________________
From: r-group-bounces at maillists.uwa.edu.au [r-group-bounces at maillists.uwa.edu.au] On Behalf Of Amy Prendergast [20354565 at student.uwa.edu.au]
Sent: Saturday, October 04, 2014 5:11 PM
To: r-group at maillists.uwa.edu.au
Subject: [R-group] Post on ANOVAs

Thought I'd share - any opinions on best practice for how to interpret interactions and main effects, and on model simplification? http://dynamicecology.wordpress.com/2014/10/02/interpreting-anova-interactions-and-model-selection/
_______________________________________________
Ready to present an R meeting? Book a date on
http://goo.gl/tws96 and send the group a timely announcement to the R-group mailing list (R-group at maillists.uwa.edu.au).

You can subscribe and unsubscribe via our web page:
http://maillists.uwa.edu.au/mailman/listinfo/r-group