University of Michigan Center for Statistical 


Part 3: Finding Optimal Two-Stage Designs

In the previous examples, you explored power for various two-stage designs and probably noticed that these designs can substantially reduce the total number of genotypes required to achieve any given power. Identifying the optimal two-stage design requires consideration not only of the total number of genotypes required to complete the study, but also of the relative genotyping costs for stage 1 and stage 2. On a per-genotype basis, costs are typically very low on the high-throughput platforms used for stage 1 of these studies, but much higher in stage 2.

CaTS provides an automatic way to search for these optimal designs, taking into account the genotyping costs that you specify. First, select the Optimization tab and specify the genotyping cost ratio, which is the per-genotype cost in stage 2 relative to stage 1. Then decide whether you want to find the least expensive two-stage design for a given power (if so, select Target Power (%)) or the most powerful design for a given cost (if so, select Target Cost (%)).

Let's return to the genome-wide association study example with 1500 cases, 1500 controls and 300,000 independent markers. With a type I error rate of 0.00001, we had 82% power to detect a disease variant with a GRR of 1.30 and a population allele frequency of 0.30 (with a prevalence of 0.10). Suppose we are willing to settle for 80% power in a two-stage design that requires a minimal amount of genotyping. To find this optimal design, first select the Optimization tab. Within that tab, set the target power to 80%, change the genotyping cost ratio to 1, and press the Optimize! button. After a few seconds, your results should match the screenshot below:


The optimal two-stage design will genotype 35.37% of the sample in stage 1 and follow up 11.96% of the markers in stage 2. Assuming genotyping costs are the same in both stages, this design reduces the cost of the study by nearly 57%. It is straightforward to repeat the analysis for other genotyping cost ratios. For example, set the genotyping cost ratio to 20 and press Optimize! again. You will see an increase in the proportion of samples genotyped in stage 1 (to 56.70%) and a decrease in the proportion of markers followed up in stage 2 (to 0.83%). Overall, the cost of the study is now higher, and a savings of only 36.1% is possible relative to a one-stage design.
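The savings quoted above can be checked by hand. The sketch below is not part of CaTS; it assumes a simple cost model in which stage 1 genotypes the chosen fraction of samples at every marker, and stage 2 genotypes the remaining samples at the followed-up markers only, at a per-genotype cost multiplied by the cost ratio. The function name relative_cost is hypothetical. Under that assumption, the model reproduces the two savings figures reported in this section.

```python
def relative_cost(stage1_samples, stage2_markers, cost_ratio):
    """Cost of a two-stage design as a fraction of the one-stage cost.

    Assumed cost model (an illustration, not taken from CaTS itself):
      - stage 1 genotypes a fraction `stage1_samples` of the samples
        at every marker;
      - stage 2 genotypes the remaining samples at a fraction
        `stage2_markers` of the markers, with each genotype costing
        `cost_ratio` times a stage 1 genotype.
    """
    stage1 = stage1_samples
    stage2 = cost_ratio * (1.0 - stage1_samples) * stage2_markers
    return stage1 + stage2


# (fraction of samples in stage 1, fraction of markers in stage 2, cost ratio)
designs = [
    (0.3537, 0.1196, 1),
    (0.5670, 0.0083, 20),
]

for samples, markers, ratio in designs:
    cost = relative_cost(samples, markers, ratio)
    print(f"cost ratio {ratio}: cost {cost:.1%}, savings {1 - cost:.1%}")

# cost ratio 1: cost 43.1%, savings 56.9%
# cost ratio 20: cost 63.9%, savings 36.1%
```

Note how the higher cost ratio shifts the optimum toward a larger stage 1: each stage 2 genotype is now expensive, so it pays to follow up far fewer markers.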

If the approach above results in a design that is too expensive, you can specify the maximum cost you can afford, as a proportion of the one-stage design cost. Let's say that genotyping all the samples at all markers costs $3,000,000 and you only have $1,500,000 available. Click on the drop-down box that says Target Power (%) and select Target Cost (%) instead. Then, set the target slider to 50% and click the Optimize! button. You should see a screen like that below:


Unfortunately, 80% power cannot be achieved in this setting; the best you can do appears to be 71.28% power. If you decide that it is important to have power of 80%, you could try adjusting the false positive rate or increasing the total sample size available.
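The 50% used for the target slider is simply the available budget expressed as a percentage of the one-stage cost. A minimal sketch of that conversion, with a hypothetical helper name (target_cost_percent) that is not part of CaTS:

```python
def target_cost_percent(budget, one_stage_cost):
    """Budget as a percentage of the cost of genotyping every sample at every marker."""
    if one_stage_cost <= 0:
        raise ValueError("one_stage_cost must be positive")
    return 100.0 * budget / one_stage_cost


# Figures from the example: $1,500,000 available, $3,000,000 one-stage cost.
print(target_cost_percent(1_500_000, 3_000_000))  # 50.0
```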

The last section in the tutorial provides practical guidance on the calculation of test statistics for two-stage studies.


University of Michigan | School of Public Health | Abecasis Lab