Wednesday, August 25, 2010

Genetic Privacy

The following article is interesting from several angles: its use of the SMGF results in combination with the FamilySearch database, and the question of maintaining the privacy of genetic results. Medical researchers, who have Institutional Review Boards to consider, and genealogists, who search for identifiable links, bring different research protocols to the reading of this article, and the difference makes for an interesting tension.

Since I am a multiple relative of Emma Hale, wife of Joseph Smith, I have an interest in the early Mormon families.


Steven C. Perkins


Copyright © 2009 The American Society of Human Genetics. All rights reserved. The American Journal of Human Genetics, Volume 84, Issue 2, 251-258, 13 February 2009

Inferential Genotyping of Y Chromosomes in Latter-Day Saints Founders and Comparison to Utah Samples in the HapMap Project [Link to PDF here]

Jane Gitschier
One concern in human genetics research is maintaining the privacy of study participants. The growth in genealogical registries may contribute to loss of privacy, given that genotypic information is accessible online to facilitate discovery of genetic relationships. Through iterative use of two such web archives, FamilySearch and Sorenson Molecular Genealogy Foundation, I was able to discern the likely haplotypes for the Y chromosomes of two men, Joseph Smith and Brigham Young, who were instrumental in the founding of the Latter-Day Saints Church. I then determined whether any of the Utahns who contributed to the HapMap project (the “CEU” set) is related to either man, on the basis of haplotype analysis of the Y chromosome. Although none of the CEU contributors appear to be a male-line relative, I discovered that predictions could be made for the surnames of the CEU participants by a similar process. For 20 of the 30 unrelated CEU samples, at least one exact match was revealed, and for 17 of these, a potential ancestor from Utah or a neighboring state could be identified. For the remaining ten samples, a match was nearly perfect, typically deviating by only one marker repeat unit. The same query performed in two other large databases revealed fewer individual matches and helped to clarify which surname predictions are more likely to be correct. Because large data sets of genotypes from both consenting research subjects and individuals pursuing genetic genealogy will be accessible online, this type of triangulation between databases may compromise the privacy of research subjects.
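
As a rough illustration of the kind of matching the abstract describes (exact matches, plus near matches that deviate by one marker repeat unit), here is a minimal Python sketch. It is not the article's actual procedure and does not use any real database interface. The function names, the surnames, and the repeat counts are all hypothetical, chosen only for illustration; the marker labels are ordinary Y-STR marker names.

```python
from typing import Dict, List, Tuple

# A Y-STR haplotype is modeled as: marker name -> number of repeat units.
Haplotype = Dict[str, int]


def repeat_distance(a: Haplotype, b: Haplotype) -> Tuple[int, int]:
    """Return (total repeat-unit difference, number of shared markers)."""
    shared = a.keys() & b.keys()
    total = sum(abs(a[m] - b[m]) for m in shared)
    return total, len(shared)


def find_matches(
    query: Haplotype,
    database: List[Tuple[str, Haplotype]],
    max_distance: int = 1,
    min_shared: int = 3,
) -> List[Tuple[str, int]]:
    """List (surname, distance) for entries within max_distance of the query.

    Entries sharing fewer than min_shared markers with the query are skipped,
    since a match on very few markers says little.
    """
    matches = []
    for surname, haplotype in database:
        distance, shared = repeat_distance(query, haplotype)
        if shared >= min_shared and distance <= max_distance:
            matches.append((surname, distance))
    # Exact matches first, then near matches, alphabetical within each group.
    return sorted(matches, key=lambda item: (item[1], item[0]))


# Hypothetical data: invented surnames and invented repeat counts.
query = {"DYS19": 14, "DYS390": 24, "DYS391": 11, "DYS392": 13}
database = [
    ("Alpha", {"DYS19": 14, "DYS390": 24, "DYS391": 11, "DYS392": 13}),
    ("Beta", {"DYS19": 14, "DYS390": 23, "DYS391": 11, "DYS392": 13}),
    ("Gamma", {"DYS19": 15, "DYS390": 22, "DYS391": 10, "DYS392": 13}),
]

for surname, distance in find_matches(query, database):
    label = "exact" if distance == 0 else "near"
    print(f"{surname}: {label} match (distance {distance})")
```

In this invented example, "Alpha" is an exact match and "Beta" differs by a single repeat unit at one marker, while "Gamma" differs at three markers and is excluded. Running the same query against a second, independent list and keeping only the surnames that appear in both would be one simple way to picture the triangulation between databases that the abstract mentions.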