blastercros.blogg.se

Skat package






Genome-wide association studies have revealed associations between single-nucleotide polymorphisms (SNPs) and phenotypes such as disease symptoms and drug tolerance. To address the small sample size for rare variants, association studies tend to group variants at the gene or pathway level and evaluate the effect of the set of variants. One such strategy, the sequence kernel association test (SKAT), is a widely used collapsing method. However, the p-values reported by SKAT tend to be biased because they are calculated from the asymptotic distribution of the statistic. Although this bias can be corrected by applying permutation procedures to the test statistics, the computational cost of obtaining p-values with high resolution is prohibitive. To address this problem, we devise an adaptive SKAT procedure termed AP-SKAT that efficiently classifies significant SNP sets and ranks them according to the permuted p-values. Our procedure adaptively stops the permutation test when the significance level falls outside a confidence interval for the estimated p-value based on a binomial distribution.

To evaluate the performance, we first compare the power and sample size calculations and the type I error rate estimates of SKAT, SKAT-O, and the proposed procedure using genotype data in the SKAT R package and from the 1000 Genomes Project. Through computational experiments using whole genome sequencing and SNP array data, we show that our proposed procedure is highly efficient and has comparable accuracy to the standard procedure.

Conclusions: For several types of genetic data, the developed procedure achieved competitive power and sample size under both small and large sample size conditions while controlling type I error rates, and estimated p-values of significant SNP sets that are consistent with those estimated by the standard permutation test within a realistic time. This demonstrates that the procedure is sufficiently powerful for recent whole genome sequencing and SNP array data with increasing numbers of phenotypes. Additionally, this procedure can be used in other association tests by employing alternative methods to calculate the statistics.

High-throughput sequencing (HTS) technologies enable the detection of rare and common variants at the genome-wide scale for thousands of individuals. In addition, with population-specific reference panels composed of variants detected by HTS, low-frequency variants can be imputed accurately from SNP array genotype data. Thus far, associations between SNPs and disease phenotypes have been studied for genotype data from HTS and SNP arrays, and the recent focus has moved to rare and low-frequency variants. Unlike for common variants, the power of single-variant association tests for rare and low-frequency variants is low because of the lack of allele counts, even with thousands of individuals. To address this issue, rare and low-frequency variants are often grouped at the gene or pathway level, and the effects of multiple variants are evaluated. This type of strategy is called collapsing, and the sequence kernel association test (SKAT) is one of the most effective collapsing methods. Because the p-values from SKAT are derived from an asymptotic distribution of its statistic, the p-values for datasets with an insufficient number of samples may be inaccurate, which causes inflation or power loss. To obtain accurate p-values, resampling methods such as the permutation test can be implemented in SKAT. However, resampling requires a huge amount of computation time to obtain p-values with resolution high enough for multiple-comparison correction, and hence a more efficient resampling method is necessary. Therefore, we propose an adaptive procedure, termed AP-SKAT, for the highly efficient calculation of SKAT statistics.
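The adaptive stopping idea described above (stop permuting once the significance level lies outside a confidence interval for the estimated p-value) can be illustrated with a small Python sketch. This is not the AP-SKAT implementation and does not use the SKAT R package. The function names, the Wilson score interval, the batch size, the z value and the intercept-only weighted linear-kernel statistic are all assumptions chosen for illustration.

```python
import math
import random


def skat_like_stat(genotypes, phenotype, weights):
    """Weighted linear-kernel score statistic, intercept-only null model.

    Q = sum_j w_j * (sum_i g_ij * (y_i - mean(y)))^2
    (an illustrative stand-in for the SKAT statistic, without covariates).
    """
    mean_y = sum(phenotype) / len(phenotype)
    resid = [y - mean_y for y in phenotype]
    q = 0.0
    for j, w in enumerate(weights):
        score = sum(row[j] * r for row, r in zip(genotypes, resid))
        q += w * score * score
    return q


def wilson_interval(hits, n, z):
    """Wilson score interval for a binomial proportion hits / n."""
    p_hat = hits / n
    denom = 1.0 + z * z / n
    centre = (p_hat + z * z / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


def adaptive_permutation_pvalue(genotypes, phenotype, weights, alpha=0.05,
                                z=3.29, batch=100, max_perm=100_000, seed=0):
    """Permute the phenotype in batches and stop early once alpha falls
    outside the interval around the running p-value estimate.

    Returns (p_value, number_of_permutations_used).
    """
    rng = random.Random(seed)
    observed = skat_like_stat(genotypes, phenotype, weights)
    shuffled = list(phenotype)
    hits = 0  # permutations whose statistic is at least the observed one
    n = 0     # permutations performed so far
    while n < max_perm:
        step = min(batch, max_perm - n)
        for _ in range(step):
            rng.shuffle(shuffled)
            if skat_like_stat(genotypes, shuffled, weights) >= observed:
                hits += 1
        n += step
        low, high = wilson_interval(hits, n, z)
        if alpha < low or alpha > high:
            break  # the decision at level alpha is already clear
    # Add-one estimator keeps the permutation p-value strictly positive.
    return (hits + 1) / (n + 1), n
```

With a fixed budget of B permutations, the smallest p-value this estimator can report is 1/(B+1), which is why high-resolution p-values are expensive. The early stop spends few permutations on SNP sets that are clearly non-significant and reserves the budget for sets whose estimate stays close to the threshold.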






