As we talked about previously, 'sig.level' is usually set to 0.05 and 'power' to 0.8. A power of 0.8 means that you have an 80% chance of getting a significant result if the effect really exists.
Calculate effect size:
The effect size used for comparing proportions is known as h, and is calculated according to the following equation:
h = 2*asin(sqrt(p1)) - 2*asin(sqrt(p2))
'asin' is short for 'arcsine' (which is the inverse trigonometric function of 'sine', if you are interested!), and p1 and p2 are proportions.
DON'T PANIC!! R has a function, ES.h in library(pwr), that will calculate this for you very straightforwardly. All you need to do is enter the two proportions (e.g., for group 1 and group 2) into R as:
ES.h(p1, p2)
Remember, p1 and p2 are proportions 1 and 2.
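As a quick check, here is a minimal sketch that works out h both by hand and with ES.h. It assumes the pwr package is installed, and the proportions 0.75 and 0.50 are made-up values chosen purely for illustration.

```r
library(pwr)  # assumes install.packages("pwr") has been run

# Hypothetical proportions for group 1 and group 2 (illustration only)
p1 <- 0.75
p2 <- 0.50

# Cohen's h worked out directly from the arcsine formula
h_formula <- 2 * asin(sqrt(p1)) - 2 * asin(sqrt(p2))

# The same effect size from the pwr helper function
h_pwr <- ES.h(p1, p2)

print(c(formula = h_formula, pwr = h_pwr))  # both are about 0.524
```

Both routes should give the same number, so in practice you can just use ES.h.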
Cohen (1988) suggested that h values of 0.2, 0.5, and 0.8 represent small, medium, and large effect sizes respectively.
Calculate sample size:
OK, so now we have the effect size, we can carry out the power analysis to calculate the required sample size. The basic command in library(pwr) in R is:
pwr.2p.test(h = , n = , sig.level = , power = , alternative = )
where h is the effect size you calculated above, n is the sample size (the same in each group), sig.level is the significance level (usually 0.05) and power is the required power (usually 0.8). The 'alternative' argument specifies whether the hypothesis is two-tailed ("two.sided") or one-tailed ("less" or "greater"). Leave out whichever of h, n or power you want R to calculate for you.
So, if the effect size (h) = 0.5 (medium effect), and the sig.level and power are set at 0.05 and 0.8 respectively, R calculates the sample size for the difference between two proportions as:
pwr.2p.test(h = 0.5, sig.level = 0.05, power = 0.8)
which means that, rounding up, I will need a minimum of 63 subjects in each group to have an 80% chance of detecting an effect of h = 0.5 at the 0.05 level. Note that I left the 'n = ...' argument out, as that is what I want R to tell me.
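Here is a small sketch of that calculation, assuming the pwr package is installed. It stores the result so the sample size can be rounded up, and adds a one-tailed version for comparison. The variable names are my own, and the one-tailed call is an extra illustration rather than part of the example above.

```r
library(pwr)  # assumes the pwr package is installed

# Two-tailed test: n is left out, so R solves for it
two_tailed <- pwr.2p.test(h = 0.5, sig.level = 0.05, power = 0.8,
                          alternative = "two.sided")

# One-tailed test with the same settings (extra illustration)
one_tailed <- pwr.2p.test(h = 0.5, sig.level = 0.05, power = 0.8,
                          alternative = "greater")

# You cannot recruit a fraction of a subject, so always round up
n_two <- ceiling(two_tailed$n)  # 63 per group
n_one <- ceiling(one_tailed$n)  # 50 per group

print(c(two_tailed = n_two, one_tailed = n_one))
```

A one-tailed test needs fewer subjects, but it is only appropriate if you decided on the direction of the effect before collecting the data.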
That is quite a lot! What happens if I only have 70 subjects at my disposal (i.e., 35 in each group)? Well, let's ask R. Note that this time, I will put the 'n = ...' argument in, as I know what I have. I also want to know how likely I am to detect the effect with my sample size, so I leave out the 'power = 0.8' argument:
pwr.2p.test(h = 0.5, n = 35, sig.level = 0.05)
This means that if you run the experiment, you have only about a 55% chance of detecting the effect at the 0.05 significance level. Time to re-design your experiment, I think!
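To see where that figure comes from, here is a minimal sketch, again assuming the pwr package is installed. It pulls the power out of the result, then tries a few other group sizes to show how quickly power climbs. The candidate sizes are arbitrary values picked for illustration.

```r
library(pwr)  # assumes the pwr package is installed

# Power is left out, so R solves for it given n = 35 per group
res <- pwr.2p.test(h = 0.5, n = 35, sig.level = 0.05)
power_35 <- res$power
print(round(power_35, 2))  # about 0.55

# Power for a few hypothetical group sizes (illustration only)
sizes <- c(35, 50, 63, 80)
powers <- sapply(sizes, function(n) {
  pwr.2p.test(h = 0.5, n = n, sig.level = 0.05)$power
})
print(data.frame(n_per_group = sizes, power = round(powers, 2)))
```

Power first reaches 0.8 at 63 subjects per group, which matches the sample size calculated earlier.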
If you have unequal sample sizes, the command is basically the same, but you need to tell R the sample size in each group:
pwr.2p2n.test(h = , n1 = , n2 = , sig.level = , power = )
where n1 and n2 are the two sample sizes.
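As a rough illustration, the sketch below compares the same 70 subjects split evenly (35 and 35) and unevenly (30 and 40). It assumes the pwr package is installed, and the 30/40 split is a made-up example rather than anything from the text above.

```r
library(pwr)  # assumes the pwr package is installed

h <- 0.5  # medium effect size, as in the earlier examples

# Equal groups: 35 subjects in each
equal <- pwr.2p.test(h = h, n = 35, sig.level = 0.05)

# Unequal groups: the same 70 subjects split 30 / 40 (hypothetical split)
unequal <- pwr.2p2n.test(h = h, n1 = 30, n2 = 40, sig.level = 0.05)

print(round(c(equal = equal$power, unequal = unequal$power), 3))
```

For the same total number of subjects, the unequal split gives slightly lower power, so balanced groups are the better choice when you can manage it.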