Projects

Biostatistical Analyses of Population Level Data for the 14th IHIWS

Chairs: Glenys Thomson and Richard Single

In collaboration with:
(Biostatistics and Anthropology) Alex Lancaster, Diogo Meyer, Owen Solberg, Steven Mack, Henry Erlich,
(Type 1 diabetes) Jan Dorman, Ann Steenkiste, Ana Maria Valdes, Alberto Pugliese, and
(Transplantation) Mari Malkki, Effie Petersdorf, Mary Carrington

PyPop

Our population genetics analysis package (PyPop) for analyses of HLA multi-locus data, as well as non-HLA data, is available at http://allele5.biol.berkeley.edu/pypop/. At this time PyPop performs locus-specific analyses of allele counts, Hardy-Weinberg proportions (HWP), and the Ewens-Watterson test of neutrality. For multi-locus data, haplotype frequencies are estimated, along with linkage disequilibrium (LD) parameters and a test of the significance of the LD.

PyPop 0.6.0 is now in beta testing and due for release in the next few weeks. New features include a Monte Carlo test for the Hardy-Weinberg "exact test" (randomization without the Markov chain component) and the code for the Guo & Thompson test (Markov chain Monte Carlo), which is now under the GNU GPL. Allele count files can now be passed through the filter apparatus (particularly the Sequence and Anthony Nolan filters) in the same way as genotype files. Other enhancements include a number of new command line options to allow processing of multiple files. This release also fixes a bug in the original Guo & Thompson test, makes numerical fixes to other modules, and resolves minor problems reported in earlier releases.
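The randomization approach to the Hardy-Weinberg "exact test" can be illustrated with a minimal Python sketch. This is not PyPop's code: the function names, the default number of iterations, and the choice of test statistic (the part of the conditional log-probability of the genotype table that changes when alleles are re-paired) are assumptions made for illustration only.

```python
import math
import random
from collections import Counter


def log_prob_stat(genotypes):
    """Return the part of the log-probability of a genotype table that
    varies when alleles are shuffled (allele counts stay fixed)."""
    counts = Counter(tuple(sorted(g)) for g in genotypes)
    hets = sum(c for (a, b), c in counts.items() if a != b)
    return hets * math.log(2) - sum(math.lgamma(c + 1) for c in counts.values())


def hw_monte_carlo(genotypes, n_iter=2000, seed=0):
    """Estimate a p-value by randomly re-pairing the observed alleles.

    The p-value is the fraction of shuffled samples whose genotype table
    is no more probable than the observed table.
    """
    rng = random.Random(seed)
    observed = log_prob_stat(genotypes)
    alleles = [a for g in genotypes for a in g]
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(alleles)
        # Pair consecutive alleles to form a randomized sample of genotypes.
        shuffled = list(zip(alleles[0::2], alleles[1::2]))
        if log_prob_stat(shuffled) <= observed + 1e-9:
            hits += 1
    return hits / n_iter
```

Because the allele counts are held fixed, every shuffled sample is drawn under the null hypothesis of random union of alleles, so no Markov chain is needed; the cost is that each iteration reshuffles the whole sample.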

We have also extended these analyses to the amino acid level (see below) and have implemented an exact test of Hardy-Weinberg proportions for individual genotypes, to complement the current overall “exact test.” These features are currently undergoing testing and will be included in the release subsequent to 0.6.0.

Anthropology/Human Diversity data

Analyses for all populations submitted to the 13th Workshop (WS) have been completed and are part of the HLA 2004 book. We are currently extending these analyses for a subset of populations from the 13th WS data. Populations were selected to meet the following criteria: the sample size was 40 or more individuals; there was no significant deviation from HWP; typing for the population was available either for A, B, and C (20 populations), or for DRB1 and DQB1 (17 populations); the populations were not known to have experienced recent admixture; and the HLA typing was carried out at high molecular resolution. Beyond testing the null hypothesis of neutral evolution, our goals are to study interpopulation differentiation and the effects of demography on HLA variation, and to compare these results with appropriate non-HLA data.

To better determine the “level” at which selection is acting on HLA loci, we are analyzing the 12th and 13th WS data, in addition to data from the literature, at the individual amino acid codon level, and later with combinations of codons. We are applying single and multi-locus analyses from PyPop.
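One way to picture the codon-level analyses is to recode each allele by the residue it carries at a given aligned position, after which the usual single-locus analyses apply to the recoded data. The sketch below is only an illustration under stated assumptions: the allele names and sequence fragments are toy values (not real HLA sequences), and the helper functions are hypothetical rather than part of PyPop.

```python
from collections import Counter

# Toy aligned protein fragments keyed by made-up allele names.
# These are NOT real HLA sequences.
TOY_SEQUENCES = {
    "allele_1": "GSHSMRYF",
    "allele_2": "GSHSLRYF",
    "allele_3": "GSHSMKYF",
}


def recode_at_position(genotypes, sequences, position):
    """Replace each allele by its residue at a 0-based aligned position."""
    return [tuple(sequences[a][position] for a in g) for g in genotypes]


def residue_frequencies(recoded):
    """Return the frequency of each residue among the recoded genotypes."""
    counts = Counter(res for g in recoded for res in g)
    total = sum(counts.values())
    return {res: c / total for res, c in counts.items()}
```

Recoding several positions at once (taking a tuple of residues per allele) would give the corresponding data for combinations of codons.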

In collaboration with Steven J. Mack and Henry Erlich of the Anthropology component we will apply these analyses to data sets submitted to the 14th WS.

Hematopoietic Cell Transplant (HCT) data

A new index called “haplotype specific heterozygosity” (HSH) has been developed to measure microsatellite (msat) diversity on HLA haplotypes. This index was tested on an independent transplantation data set and a paper has been submitted. Each of the five common Caucasian haplotypes (HLA-A1-B8-DR3, A3-B7-DR15, A2-B44-DR4, A29-B44-DR7 and A2-B7-DR15) studied had at least one msat marker with an HSH value of zero, indicating that only one msat allele was observed for the particular HLA haplotype. In addition, the ability of msats to predict HLA-A-B-DRB1 haplotypes was studied. Over 90% prediction probability of two common haplotypes (HLA-A1-B8-DR3 and HLA-A3-B7-DR15) was achieved with information from three msats.
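The exact formula for HSH is not given here. As a rough sketch, assume it is the ordinary heterozygosity (one minus the sum of squared allele frequencies) of a msat marker, computed only among chromosomes that carry a given HLA haplotype. That assumption is consistent with an HSH of zero when a single msat allele is observed, but it may differ from the published definition (for example, in any sample-size correction). The function name and the msat allele sizes used with it are illustrative only.

```python
from collections import Counter, defaultdict


def haplotype_specific_heterozygosity(observations):
    """Compute an assumed HSH for one msat marker.

    `observations` is an iterable of (hla_haplotype, msat_allele) pairs,
    one per phased chromosome. Returns {hla_haplotype: HSH}, where HSH is
    taken to be 1 - sum(p_i ** 2) over msat alleles on that haplotype.
    """
    by_haplotype = defaultdict(Counter)
    for hla_haplotype, msat_allele in observations:
        by_haplotype[hla_haplotype][msat_allele] += 1

    hsh = {}
    for haplotype, counts in by_haplotype.items():
        total = sum(counts.values())
        hsh[haplotype] = 1.0 - sum((c / total) ** 2 for c in counts.values())
    return hsh
```

For example, four A1-B8-DR3 chromosomes that all carry the same msat allele give an HSH of 0.0 under this definition, while a haplotype whose chromosomes are split evenly between two msat alleles gives 0.5.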

The HSH index can be used in the selection of informative msats for transplantation and disease association studies. Markers with low HSH values can be used to predict specific HLA haplotypes or multi-locus genotypes to supplement the screening of HLA matched donors for transplantation. Markers with high HSH values will be most informative in studies investigating MHC-region disease susceptibility genes where HLA haplotypic effects are known to exist.

Type 1 diabetes data (see Type 1 Diabetes project page)

Contacts:

Glenys Thomson
Department of Integrative Biology, MC#3140
3060 Valley Life Sciences Building
University of California
Berkeley, CA 94720-3140
Ph: 510-642-7025

Richard Single
Statistics Program
Department of Mathematics and Statistics
University of Vermont
16 Colchester Avenue
Burlington, VT 05401-1455