Sunday, August 28, 2011

Comparing two regression slopes by means of an ANCOVA

Regressions are commonly used in biology to determine the causal relationship between two variables. This analysis is most commonly used in morphological studies, where the allometric relationship between two morphological variables is of fundamental interest. Comparing scaling parameters (i.e. slopes) between groups can be used by biologist to assess different growth patterns or the development of different forms or shapes between groups. For examples, the regression between head size and body size may be different between males and females if they grow differently. This difference in allometric growth should manifest itself as a different slope in both regression lines.


The analysis of covariance (ANCOVA) is used to compare two or more regression lines by testing the effect of a categorical factor on a dependent variable (y-var) while controlling for the effect of a continuous co-variable (x-var). When we want to compare two or more regression lines, the categorical factor splits the relationship between x-var and y-var into several linear equations, one for each level of the categorical factor.
Regression lines are compared by studying the interaction of the categorical variable (i.e. treatment effect) with the continuous independent variable (x-var). If the interaction is significantly different from zero it means that the effect of the continuous covariate on the response depends on the level of the categorical factor. In other words, the regression lines have different slopes (Right graph on the figure below). A significant treatment effect with no significant interaction shows that the covariate has the same effect for all levels of the categorical factor. However, since the treatment effect is important, the regression lines although parallel have different intercepts. Finally, if the treatment effect is not significant nor its interaction with the covariate (but the coariate is significant), this means there is a single regression line. A reaction norm is used to graphically represent the possible outcomes of an ANCOVA.
For this example we want to determine how body size (snout-vent length) relates to pelvic canal width in both male and female alligators (data from Handbook of Biological Statistics). In this specific case, sex is a categorical factor with two levels (i.e. male and female) while snout-vent length is the regressor (x-var) and pelvic canal width is the response variable (y-var). The ANCOVA will be used to assess if the regression between body size and pelvic width are the comparable between the sexes.


An ANCOVA is able to test for differences in slopes and intercepts among regression lines. Both concepts have different biological interpretations. Differences in intercepts are interpreted as differences in magnitude but not in the rate of change. If we are measuring sizes and regression lines have the same slope but cross the y-axis at different values, lines should be parallel. This means that growth is similar for both lines but one group is simply larger than the other. A difference in slopes is interpreted as differences in the rate of change. In allometric studies, this means that there is a significant change in growth rates among groups.
Slopes should be tested first, by testing for the interaction between the covariate and the factor. If slopes are significantly different between groups, then testing for different intercepts is somewhat inconsequential since it is very likely that the intercepts differ too (unless they both go through zero). Additionally, if the interaction is significant testing for natural effects is meaningless (see The Infamous Type III SS). If the interaction between the covariate and the factor is not significantly different from zero, then we can assume the slopes are similar between equations. In this case, we may proceed to test for differences in intercept values among regression lines.

Performing an ANCOVA

For an ANCOVA our data should have a format very similar to that needed for an Analysis of Variance. We need a categorical factor with two or more levels (i.e. sex factor has two levels: male and female) and at least one independent variable and one dependent or response variable (y-var).
> head(gator)
sex snout pelvic
1 male  1.10   7.62
2 male  1.19   8.20
3 male  1.13   8.00
4 male  1.15   9.60
5 male  0.96   6.50
6 male  1.19   8.17
The preceding code shows the first six lines of the gator object which includes three variables: sex, snout and pelvic, which hold the sex, snout-vent size and the pelvic canal width of alligators, respectively. The sex variable is a factor with two levels, while the other two variables are numeric in their type.
We can do an ANCOVA both with the lm() and aov() commands. For this tutorial, we will use the aov() command due to its simplicity.
> mod1 <- aov(pelvic~snout*sex, data=gator)

> summary(mod1)             Df Sum Sq Mean Sq  F value    Pr(>F)
snout        1 51.871  51.871 134.5392 8.278e-13 ***
sex          1  2.016   2.016   5.2284   0.02921 *
snout:sex    1  0.005   0.005   0.0129   0.91013
Residuals   31 11.952   0.386                   

Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 
The previous code shows the ANCOVA model, pelvic is modeled as the dependent variable with sex as the factor and snout as the covariate. The summary of the results show a significant effect of snout and sex, but no significant interaction. These results suggest that the slope of the regression between snout-vent length and pelvic width is similar for both males and females.
A second more parsimonious model should be fit without the interaction to test for a significant differences in the slope.
> mod2 <- aov(pelvic~snout+sex, data=gator)
> summary(mod2)
        Df Sum Sq Mean Sq  F value    Pr(>F)
snout        1 51.871  51.871 138.8212 3.547e-13 ***
sex          1  2.016   2.016   5.3948   0.02671 *
Residuals   32 11.957   0.374                   
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 
The second model shows that sex has a significant effect on the dependent variable which in this case can be interpreted as a significant difference in ‘intercepts’ between the regression lines of males and females. We can compare mod1 and mod2 with the anova() command to assess if removing the interaction significantly affects the fit of the model:
> anova(mod1,mod2)
Analysis of Variance Table

Model 1: pelvic ~ snout * sex
Model 2: pelvic ~ snout + sex
Res.Df    RSS Df  Sum of Sq      F Pr(>F)
1     31 11.952                        
2     32 11.957 -1 -0.0049928 0.0129 0.9101
The anova() command clearly shows that removing the interaction does not significantly affect the fit of the model (F=0.0129, p=0.91). Therefore, we may conclude that the most parsimonious model is mod2. Biologically we observe that for alligators, body size has a significant and positive effect on pelvic width and the effect is similar for males and females. However, we still don’t know how the slopes change.
At this point we are going to fit linear regressions separately for males and females. In most cases, this should have been performed before the ANCOVA. However, in this example we first tested for differences in the regression lines and once we were certain of the significant effects we proceeded to fit regression lines.
To accomplish this, we are now going to sub-set the data matrix into two sets, one for males and another for females. We can do this with the subset() command or using the extract functions []. We will use both in the following code for didactic purposes:
> machos <- subset(gator, sex=="male")
> hembras <- gator[gator$sex=='female',]
Separate regression lines can also be fitted using the subset option within the lm() command, however we will use separate data frames to simplify the creation of graphs:
> reg1 <- lm(pelvic~snout, data=machos); summary(reg1)
lm(formula = pelvic ~ snout, data = machos)

 Min       1Q   Median       3Q      Max
-0.85665 -0.40653 -0.08933  0.04518  1.57408

        Estimate Std. Error t value Pr(>|t|)
(Intercept)   0.4527     0.9697   0.467    0.647
snout         6.5854     0.8625   7.636 6.85e-07 ***
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.7085 on 17 degrees of freedom
Multiple R-squared: 0.7742, Adjusted R-squared: 0.761
F-statistic:  58.3 on 1 and 17 DF,  p-value: 6.846e-07

> reg2 <- lm(pelvic~snout, data=hembras); summary(reg2)
lm(formula = pelvic ~ snout, data = hembras)

 Min       1Q   Median       3Q      Max
-0.69961 -0.19364 -0.07634  0.04907  1.15098

        Estimate Std. Error t value Pr(>|t|)
(Intercept)  -0.2199     0.9689  -0.227    0.824
snout         6.7471     0.9574   7.047  5.8e-06 ***
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.4941 on 14 degrees of freedom
Multiple R-squared: 0.7801, Adjusted R-squared: 0.7644
F-statistic: 49.67 on 1 and 14 DF,  p-value: 5.797e-06 
The regression lines indicate that males have a higher intercept (a=0.45) than females (a=-0.2199), which means that males are larger. We can now plot both regression lines as follows:
> plot(pelvic~snout, data=gator, type='n')
> points(machos$snout,machos$pelvic, pch=20)
> points(hembras$snout,hembras$pelvic, pch=1)
> abline(reg1, lty=1)
> abline(reg2, lty=2)
> legend("bottomright", c("Male","Female"), lty=c(1,2), pch=c(20,1) )
The resulting plot shows the regression lines for males and females on the same plot.


We can fit both regression models with a single call to the lm() command using the nested structure of snout nested within sex (i.e. sex/snout) and removing the single intercept for the model so that separate intercepts are fit for each equation.
> reg.todo <- lm(pelvic~sex/snout - 1, data=gator)
> summary(reg.todo)
lm(formula = pelvic ~ sex/snout - 1, data = gator)
  Min       1Q   Median       3Q      Max
-0.85665 -0.33099 -0.08933  0.05774  1.57408
             Estimate Std. Error t value Pr(>|t|)
sexfemale        -0.2199     1.2175  -0.181    0.858
sexmale           0.4527     0.8498   0.533    0.598
sexfemale:snout   6.7471     1.2031   5.608 3.76e-06 ***
sexmale:snout     6.5854     0.7558   8.713 7.73e-10 ***
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.6209 on 31 degrees of freedom
Multiple R-squared: 0.9936,     Adjusted R-squared: 0.9928
F-statistic:  1213 on 4 and 31 DF,  p-value: < 2.2e-16


The results shown in the previous figure are consistent with our previous findings, pelvic width is positively related to snout-vent length. The relationship is linear for both males and females. The regression slope is positive and similar for both males and females (b ≈ 7.07; weighted average), which means that pelvic width grows faster than snout-vent length. Finally, the regression line of males intercepts with the y-axis at a higher value than for females, which means that males are larger.


  1. in your example, you showed an ancova where the interaction term is not significant (ie best model has different intercepts but same slope)...yet you have plotted it with separate slopes. How would you plot it with different intercepts but the same slope? thanks.

  2. Tom,

    If you have estimated the common slope, then you can plot a differnt line for each group with different intercepts and the same slope using the abline() command


    where a is the intercept for males or females and b is the common slope.

  3. An interesting alternative to:
    reg.todo <- lm(pelvic~sex/snout - 1, data=gator)

    is this formula:
    reg.todo <- lm(pelvic~snout*sex, data=gator)

    All of the coefficients will be the same for the two models (i.e. both fit a separate slope and intercept for M and F gators) but there are two differences, one superficial, one really interesting. The superficial one is the way the coefficients are presented. In your example, you get the true coefficients, in the second example (snout*sex) you'll get the coefficients for the first factor (females) and then the differences in the coefficient for the male factor. You can just add the male coefficients to the female ones so they're on the same scale.

    The interesting thing in the summary of the second model is that the t score and p-value test different hypotheses than they do in the first model. In the first one they test whether or not the regression has a nonzero slope, in the second one, they test whether or not the slopes and intercepts from different factors are statistically significant! Nifty, right?

  4. This comment has been removed by the author.

  5. Thanks James W. for your comment. I knew about lm(y~x*f) but I didn't know the differences regarding hypotheses testing. Thanks for the insight.

  6. Very useful Campitor!!

    James W I did not know that neither! so interesting. I did it as individual regresions to get de R^2 and P of the individuals valors for each level of the factor.

  7. Thank you- this was very helpful. If you have 3 factors (Treatments A,B,C) and you find that the slopes are not different among the treatments (no significant interaction- as in model 1), but that the intercepts are different (as in model 2). How do you test for which treatments are significantly different from each other? Thank you!

    1. This comment has been removed by the author.

    2. Hope this URL would be helpful. ####

  8. AR - I've seen people use the Tukey HSD test to do this, and this should be okay because theoretically the different treatments/factors are independent, but I'm not sure. You could also bypass the formal test and simply present the intercepts and their standard error to demonstrate how they relate to each other.

  9. Hi,
    Thank you for this guide. I am a bit confused that the slopes on the plot for male and female regression lines look the same. However, you say that the slops are totally different (one positive and one negative), but that does not seem to be the case. Am I misunderstanding something, or is this a typo?

  10. Aras,
    You are right, that was a typo. the difference is in "intercepts" rather than slopes. Slopes are very similar between males and females indeed (6.5 and 6.7, respectively). the slopes are different and they indicate a change in the mean size of each group. In this case, males are significantly larger than females, as was shown by the ANOVA.

  11. Thanks for clarifying that. So when you say "A second more parsimonious model should be fit without the interaction to test for a significant differences in the slope." should that also be intercept and not "slope"? I am not trying to be picky, its just that in my data I think there is a significant difference in intercept and not slope, and I am following your tutorial to prove it. Just want to cover all the bases.

  12. Aras,

    No, the interaction term test for differences in slopes. The difference in intercepts (or means) is tested by the natural factor (i.e. sex). If both regression lines have the same intercept, but dramatically different slopes (imagine two lines diverging from the same point on the y-axis), the interaction would be significant. You may use James W procedure (in the comments) to test for differences in intercepts. Hope this helps.

  13. Thank you for your reply, its much more clear now.

  14. Hello Campitor,

    thank you very much for your easy to follow and excellently structured walk-through ANCOVA.

    Just for curiosity. You determine the regression coefficients with subsets for "hembras" and "machos".

    Is there no function in R which can calculate these parameters based on the model Y ~ X + A for all levels of A without the need of subsets?

    Thanks a lot for your nice blog.

    By the way... I could not find an "impressum". Is that by intention?



  15. Hi Karl,

    thanks for your comments. Regarding your first question, you can get parameters for all levels in two ways, the first one is described in the Advanced section. The other way is following James W instructions in the comments.

    I'm not sure what you mean by "Impressum", if you mean a printer ready version of this topic, the reason it's not there is because I don't know how to do it. :).

  16. Hi Campitor,

    Great tutorial! I have a question about adding a second covariate. Say that you want to look at the effect of an intervention. You have a pre-test score and a post-test score, two groups (treatment, controls). Normally, you would model it as:

    post_test~pre_test + group

    But what if you wanted to add a second covariate. How do you factor out the influence of gender(M,F)? Does it look like this:


  17. Hi,

    thanks a lot for your very helpful tutorial.
    I wonder, how would you interpret a result that indicates that the slopes of males and females ARE different and, for example, cross over in the middle? What if the slopes are different BUT they will actually cross over at a point far beyond the range of your data?


  18. A really helpful tutorial, many thanks!

    I may have missed something but I was wondering where the '(b ≈ 7.07; weighted average)' came from you speak of in the conclusion. I don't think this value is found in any of the outputs.


  19. Hello, this is a great tutorial! I used it a month ago for my own research. Now, I am wondering if I could use the figures and logic for a course that I am teaching. I'm wondering too if you made all those figures yourself for this post, so that I could attribute things correctly.

    Thanks, hopefully. Great job!

  20. @Jeff Cardille

    Hi. I'm glad the tutorial was useful. You may definitely use the logic and images for your class. I did make all of them.

    I'm going to post a new tutorial soon, but I'm still deciding on what specific topic. Be sure to keep posted.

  21. Ok, thanks much! Your other postings are excellent as well by the way...

  22. Thank you so much. I search for R help a lot and never find such concise, well-written answers. I got everything on the first try and it actually looks the way you said it would....that never happens.

  23. Thank you so much. I search for R help a lot and never find such concise, well-written answers. I got everything on the first try and it actually looks the way you said it would....that never happens.

  24. Hi, this is a nice post.
    I am wondering whether the non-significant intercepts (in reg1 and reg2) are indeed different. According to the average value and the sd of the value its seems that probably the "true" value include zero for both cases, so I do not get how it is possible to say that are different. Perhaps I am missing something.

  25. Greetings, I am curious how you would modify this code to include an interaction model that is adjusted for other covariates?

  26. Greetings, I am curious how you would modify this code to include an interaction model that is adjusted for other covariates?

  27. Is there any way you can share the data? (gator) I would be very interesting in being able to replicate the analysis.

    Many thanks,


  28. This was incredibly useful, thank you!!! First time I've come across help with R that i actually understand. If you ever felt like writing something about how to look for statistical differences between nonlinear regressions that would be amazing :)

  29. Hi Rachel, thanks for your comments. I have not had much time to write in this blog, but I should probably get back to it this summer.

    By non-linear do you mean glm or really non-linear regression (like exponential or Gompertz?)


  30. Hello,

    I have a similar case, but my categorical factor has 4 treatments. The interaction is non significant, but the treatment effect and coavariate are significant. Does that means that the 4 treatments have different intercepts? Or it could be that just some of them have differences between each other? In that case, is it possible to do a contrast within intercepts?

    Thanks a lot, your blog is very helpful!


  31. Monica,

    If you have significant effect of treatment, you are going to have 4 different intercepts (provided all 4 treatment levels are different from each other). If the covariate has a significant effect, you are going to have a slope different from 0 (think about it, if the covariate didn't effect the response, then across the entire range of the covariate, the response would be the same and you would have a horizontal line). If you don't have an interaction, that means the slope of that line (which is significantly different from 0 since the covariate is significant) is not statistically different between treatments, so the lines will be parallel with different intercepts.

    IF the interaction was significant, you would have different intercepts, and different slopes for each treatment line, indicating that the effects of your treatment change along the gradient of your covariate!

    Hope this helps.

  32. Thank you for the great guide!
    I have a simple question that might be a bit silly. Regarding scaling relationships, the common function used to describe how one trait scales with body size is: y = bx^a. Only when you log it, it becomes a linear relationship: logy=alogx+logb
    My question is, when we are analyzing data for allometry using linear regression, do we have to log it? It seams so taking the equations into account.
    And then when you used the log data for the analysis, the given intercept is a real b or a log(b)?
    Thank you!

  33. Thank you so much for the much needed tutorial.
    I failed to transfer the points into the plot window with the given function. I got the following error message.
    "Error in points(FIN$scale(CP), FIN$log(Cstock + 1), pch = 20) :
    attempt to apply non-function"

    Would you help me, please?

  34. Thank you for this very helpful guide.

    I have a similar situation where the interaction term is significant, but I have many treatments. Do you know of any way to test between the individual slopes applied to each treatment? I expect that there will be a few treatments which respond much more strongly to the covariate and would like to identify them statistically rather than visually.

  35. HI, first of all, thank you very much for this tutorial, it is by far the most useful I've ever read. I have one doubt: in my study, I found no significant interaction effect, but the regression lines overlap.

    For instance in mod1 the effect of sex is significant, but the interaction effect is not
    In mod2, the sex is still significant.

    Then when I run anova(mod1,mod2), the F is non-significant, like in your example.

    So, why my regression lines overlap?

    Where did I go wrong? Is this normal?

  36. How do you compare regression lines with nonparametric data? Can you still use ANCOVA?