Genetic assignment methods for the direct, real-time estimation of migration rate: a simulation-based exploration of accuracy and power

Affiliation.

  • 1 Department of Zoology and Entomology, University of Queensland, St. Lucia, QLD 4072, Australia.
  • PMID: 14653788
  • DOI: 10.1046/j.1365-294x.2004.02008.x

Genetic assignment methods use genotype likelihoods to draw inference about where individuals were or were not born, potentially allowing direct, real-time estimates of dispersal. We used simulated data sets to test the power and accuracy of Monte Carlo resampling methods in generating statistical thresholds for identifying F0 immigrants in populations with ongoing gene flow, and hence for providing direct, real-time estimates of migration rates. The identification of accurate critical values required that resampling methods preserved the linkage disequilibrium deriving from recent generations of immigrants and reflected the sampling variance present in the data set being analysed. A novel Monte Carlo resampling method taking into account these aspects was proposed and its efficiency was evaluated. Power and error were relatively insensitive to the frequency assumed for missing alleles. Power to identify F0 immigrants was improved by using large sample size (up to about 50 individuals) and by sampling all populations from which migrants may have originated. A combination of plotting genotype likelihoods and calculating mean genotype likelihood ratios (DLR) appeared to be an effective way to predict whether F0 immigrants could be identified for a particular pair of populations using a given set of markers.
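The general idea of a Monte Carlo critical value can be sketched as follows. This is a deliberately simplified illustration, not the novel resampling method the abstract proposes: alleles are drawn independently, so it does not preserve linkage disequilibrium from recent immigrants. The allele frequencies, the function names, and the default `alpha` are all hypothetical.

```python
import math
import random

# Hypothetical allele frequencies for one population at two loci.
HYPOTHETICAL_FREQS = [
    {"a": 0.6, "b": 0.3, "c": 0.1},
    {"k": 0.5, "l": 0.5},
]

def simulate_genotype(freqs_by_locus, rng):
    """Draw one diploid multilocus genotype, sampling alleles independently."""
    genotype = []
    for freqs in freqs_by_locus:
        alleles = list(freqs)
        weights = [freqs[a] for a in alleles]
        genotype.append(tuple(rng.choices(alleles, weights=weights, k=2)))
    return genotype

def log_likelihood(genotype, freqs_by_locus):
    """log10 likelihood of a genotype under Hardy-Weinberg proportions."""
    total = 0.0
    for (a1, a2), freqs in zip(genotype, freqs_by_locus):
        p = freqs[a1] * freqs[a2]
        if a1 != a2:
            p *= 2  # a heterozygote can arise in two ways
        total += math.log10(p)
    return total

def critical_value(freqs_by_locus, alpha=0.01, n_sim=10000, seed=1):
    """Lower-tail quantile of simulated resident log-likelihoods."""
    rng = random.Random(seed)
    sims = sorted(
        log_likelihood(simulate_genotype(freqs_by_locus, rng), freqs_by_locus)
        for _ in range(n_sim)
    )
    return sims[int(alpha * n_sim)]
```

Under this sketch, an individual whose log-likelihood in the population where it was sampled falls below `critical_value(...)` would be flagged as a putative F0 immigrant, with `alpha` acting as the nominal type I error rate.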

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animal Migration*
  • Computer Simulation
  • Genetics, Population*
  • Likelihood Functions
  • Linkage Disequilibrium
  • Models, Genetic*
  • Monte Carlo Method
  • Population Dynamics
  • Sample Size

Genetic Assignment Methods for Gaining Insight into the Management of Infectious Disease by Understanding Pathogen, Vector, and Host Movement

Affiliation Department of Environmental Health, Rollins School of Public Health, Emory University, Atlanta, Georgia, United States of America

Affiliation Institute of Parasitic Disease, Sichuan Provincial Center for Disease Control and Prevention, Chengdu, Sichuan, People's Republic of China

Affiliation Environmental Health Sciences, School of Public Health, University of California, Berkeley, Berkeley, California, United States of America

Affiliation School of Marine and Tropical Biology, James Cook University, Townsville, Queensland, Australia

  • Justin V. Remais, 
  • Ning Xiao, 
  • Adam Akullian, 
  • Dongchuan Qiu, 
  • David Blair

PLOS

Published: April 28, 2011

  • https://doi.org/10.1371/journal.ppat.1002013

Table 1

For many pathogens with environmental stages, or those carried by vectors or intermediate hosts, disease transmission is strongly influenced by pathogen, host, and vector movements across complex landscapes, and thus quantitative measures of movement rate and direction can reveal new opportunities for disease management and intervention. Genetic assignment methods are a set of powerful statistical approaches useful for establishing population membership of individuals. Recent theoretical improvements allow these techniques to be used to cost-effectively estimate the magnitude and direction of key movements in infectious disease systems, revealing important ecological and environmental features that facilitate or limit transmission. Here, we review the theory, statistical framework, and molecular markers that underlie assignment methods, and we critically examine recent applications of assignment tests in infectious disease epidemiology. Research directions that capitalize on use of the techniques are discussed, focusing on key parameters needing study for improved understanding of patterns of disease.

Citation: Remais JV, Xiao N, Akullian A, Qiu D, Blair D (2011) Genetic Assignment Methods for Gaining Insight into the Management of Infectious Disease by Understanding Pathogen, Vector, and Host Movement. PLoS Pathog 7(4): e1002013. https://doi.org/10.1371/journal.ppat.1002013

Editor: Marianne Manchester, University of California San Diego, United States of America

Copyright: © 2011 Remais et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Funding: This work was supported in part by the Ecology of Infectious Disease program of the National Science Foundation under Grant No. 0622743, by the National Institute for Allergy and Infectious Disease (grant K01AI091864), and the Global Health Institute Faculty Distinction Fund at Emory University. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Introduction

For many infectious diseases, transmission is strongly influenced by pathogen, host, and vector migration across complex landscapes [1] . This is especially true for pathogens with environmental stages, or those carried by vectors and intermediate hosts. The spread of rabies, for instance, has been shown to be regulated by rivers that act as barriers to host movement [2] , and the onset of diseases such as measles or foot-and-mouth disease is governed in part by human or animal hosts migrating across heterogeneous landscapes [3] , [4] . Disease persistence, synchrony, and establishment are known to be modified by host migrations between populations [5] – [9] , and thus direct measures of migration rates in real transmission systems are very much needed to optimize disease management and improve intervention campaigns.

Genetic assignment methods can provide such measures; they are a set of powerful statistical approaches that, at their most basic, can be used to establish population membership of individuals. When applied to organisms distributed among spatially distinct, interconnected populations, the techniques can be used to derive quantitative estimates of movement across a network, and determine the degree to which landscape features aid or impede movement. Genetic assignment methods have, for the most part, been limited to applications in ecology and conservation biology. This is despite their utility for estimating the magnitude and direction of key movements in infectious disease systems, where they could reveal important environmental and ecological features that facilitate or limit the spread of disease with important implications for control.

For example, estimates of pathogen transport can be used to design more efficient anthelmintic treatment campaigns for important macroparasites of humans [10] , and where environmental change is occurring, estimates of the associated change in migration can aid in the identification of new risks that arise from vectors and hosts moving effectively closer than they have been historically [1] . Genetic assignment tests (ATs) have potential for estimating these pathogen, host, and vector movements, and recent improvements in theory underpinning ATs have increased their utility at fine spatial and temporal scales, while overcoming the cost, time, and scale limitations of traditional approaches such as mark-recapture experiments [11] . Here, we discuss the molecular and statistical methodologies that make possible the application of ATs. We review current applications of ATs in infectious disease epidemiology, and discuss research directions that are positioned to capitalize on use of the techniques. We use the term “migration” to encompass the movement of human hosts, the dispersal of animal hosts and vectors, and the transport of pathogens in environmental media (e.g., flowing water).

Estimating Migration Rates

While many free-living pathogens, vectors, and intermediate hosts are capable of moving several kilometers, their specific mobilities are rarely estimated or incorporated into efforts to control disease [10] , [12] . Historically, ecological migration rates were estimated using direct measures such as mark-recapture and radio tagging, which obviously present limitations when applied to small organisms, large populations with small numbers of migrants, or organisms that are difficult to durably mark [13] . Indirect genetic methods are also available, such as inferring Nm , the number of migrants exchanged between populations per generation, using gene flow estimators based on Wright's infinite island model [14] , [15] . This approach makes a number of simplifying assumptions, such as assuming symmetrical, constant migration and constant population size, assumptions which were partially relaxed with the development of coalescent-based methods [16] .
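As a rough illustration of the indirect approach, Wright's island model gives the standard approximation F_ST ≈ 1/(4Nm + 1), which can be inverted to obtain Nm. The sketch below assumes that approximation and all of its simplifying assumptions (symmetric, constant migration and constant population size); the function name is hypothetical.

```python
def migrants_per_generation(fst):
    """Invert F_ST ~ 1 / (4*Nm + 1) to estimate Nm under the island model."""
    if not 0.0 < fst < 1.0:
        raise ValueError("fst must lie strictly between 0 and 1")
    return (1.0 / fst - 1.0) / 4.0

# Example: F_ST = 0.2 corresponds to about one migrant per generation.
nm = migrants_per_generation(0.2)
```

Because the relationship is steeply nonlinear near F_ST = 0, small errors in a low F_ST estimate translate into large errors in Nm, which is one reason such estimates are best read as long-term averages.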

Coalescent theory describes the statistical properties of gene trees under a standard demographic model (namely the Fisher-Wright model). Present day samples of a non-recombining gene can be seen as lying on a branch of a gene tree rooted at the most recent common ancestor of the sample. Moving backward in time from each branch, genes coalesce until the common ancestor is reached, and in this way, present-day samples can be used to infer the past, including past migration among mating populations. Coalescent-based estimates of migration rates, obtained by comparison of allele frequency distributions observed in population samples, assume that all potential source populations have been sampled and that populations have followed relatively simple demographic progressions (constant size or deterministic expansion) while experiencing constant migration [16] , [17] . Migration rates obtained in this fashion reflect the effect of migration occurring over long time scales, and do not reflect (i.e., are insensitive to) contemporary changes such as interventions (e.g., vector control) and recent environmental change. ATs, through the combination of highly variable genetic markers with Bayesian statistical methods, allow the estimation of recent migration rates that strongly reflect the influence of contemporary changes.

Assignment Tests

ATs use multilocus genotypes to identify the source population of individuals that have migrated within the past several generations [18] . Early ATs estimated the probability of an individual's multilocus genotype in relation to the frequency of alleles at different loci in potential source populations. After all sampled individuals were assigned, the migration rate between two populations was estimated by dividing the number of identified migrants by the sample size of the origin population [18] – [20] . A notable recent Bayesian method [21] directly estimates migration rates (and infers inbreeding coefficients and individual migrant ancestries) by detecting the temporary disequilibrium in immigrants' genotypes relative to the population under consideration, while relaxing the assumption that genotypes within subpopulations are in Hardy–Weinberg equilibrium. A related class of clustering methods [19] , [22] , [23] aims to partition individuals into genetically distinct subpopulations without prior assumptions about population membership; i.e., the methods calculate the probability that each individual genotype originates from one of K populations, with K , the number of subpopulations, among the inferred parameters.
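The counting step used by early ATs can be sketched in a few lines. The example below follows the wording above literally, dividing the number of identified migrants by the sample size of the origin population; the function name and the toy labels are hypothetical, and the assignments are taken as given rather than computed.

```python
from collections import Counter

def naive_migration_rates(sampled_in, assigned_to):
    """Return {(origin, destination): rate} from per-individual labels.

    sampled_in[i]  -- population where individual i was collected
    assigned_to[i] -- population to which the test assigned individual i
    """
    sample_sizes = Counter(sampled_in)
    migrants = Counter(
        (origin, dest)
        for dest, origin in zip(sampled_in, assigned_to)
        if origin != dest
    )
    return {
        (origin, dest): count / sample_sizes[origin]
        for (origin, dest), count in migrants.items()
        if sample_sizes[origin] > 0  # skip origins that were never sampled
    }

# Toy data: four individuals collected in A, five collected in B.
sampled = ["A"] * 4 + ["B"] * 5
assigned = ["A", "A", "A", "B", "B", "B", "B", "B", "A"]
rates = naive_migration_rates(sampled, assigned)
```

Here one individual collected in A is assigned to B and one collected in B is assigned to A, so the sketch reports a rate of 1/5 from B into A and 1/4 from A into B.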

Bayesian models (also known as fully probabilistic models) provide a convenient means to deal with complex (and inherently stochastic) phenomena that determine the genetic properties of individuals and populations [24] . Like other Bayesian approaches, Bayesian ATs take the position that model parameters and data are random variables with a joint probability distribution specified by a probabilistic model. The model structure and parameters proposed by Wilson and Rannala's [21] notable recent method are described in detail in Text S1 . The data and parameters of the inference model implemented in [21] are summarized in Table S1 , and Figure S1 shows a probabilistic graphical model indicating the conditional dependencies in [21] . Population assignment is a trivial task if there are fixed differences (no shared alleles) between populations. However, this is rarely the case: typically historical connections, ongoing gene flow, and perhaps convergent evolution lead to the sharing of alleles between populations. Consequently, computationally intensive approaches are required to identify the likely source population of any given individual (see Text S1 ). Software implementations of Bayesian and maximum likelihood–based methods for inferring migration and population clustering parameters are widely available ( Table 1 ). The extent of population differentiation, the number of individuals that can be sampled, the number of loci, and the specific genetic markers and their polymorphism, all interact in determining the power of any approach [25] . Markers appropriate for ATs are reviewed in detail in Text S2 , and different classes of genetic markers and their corresponding advantages and disadvantages are summarized in Table S2 .

https://doi.org/10.1371/journal.ppat.1002013.t001

Application of ATs in Infectious Disease Systems

Recent infectious disease applications of ATs have estimated pathogen, vector, and host dispersal characteristics in order to explain patterns of transmission and better target control activities. Here, we review four such applications.

Case 1: Chagas Disease

In the absence of a vaccine or effective therapeutics, Chagas disease control is largely dependent on elimination of the vector, members of the genus Triatoma, using insecticides. The hematophagous triatomines carry Trypanosoma cruzi, the protozoan parasite that causes Chagas disease in much of Latin America. The insects are present in sylvatic and peridomestic populations, with transient and seasonal invasion of homes leading to blood meals and transmission [26]. In the Mexican Yucatán, Dumonteil, Tripet, and colleagues [26] evaluated the genetic structure of T. dimidiata to assess dispersal of individuals, better understand domestic infestation, and inform vector control. Insects were sampled from domestic, peridomestic, and sylvatic populations, genotyped at eight microsatellite loci, and analyzed using F statistics and both Bayesian- and likelihood-based ATs [18], [27]. The authors found that T. dimidiata is capable of dispersal over large geographic distances in the Yucatán Peninsula (up to 280 km), as suggested by low population differentiation and weak genetic structure. In this case, ATs provided a clearer picture than conventional Fst, allowing for the identification of immigrants even among populations with low genetic differentiation and no detectable correlation between genetic and geographic distance (isolation by distance). ATs indicated that 10%–22% of the insects collected within homes were immigrants from the peridomestic and sylvatic areas. Dispersal was detected in the opposite direction as well, with several insects in peridomestic and sylvatic areas having originated from populations within homes. The ecological basis of genetic structure in this study provided dispersal information that supports pesticide application and refuge removal in peridomestic areas. This zone appears to serve as an important “transit area” between sylvatic and domestic populations, contributing to household reinfestation after control, and largely agreeing with the findings from a small study in Bolivia [28].

Case 2: Coccidioides Species

The Coccidioides soil fungi, found in arid zones of the southwestern United States and northwestern Mexico, can cause community-acquired pneumonia and severe disseminated disease (coccidioidomycosis) when inhaled by a vertebrate host [29] . Several western US states have seen dramatic increases in the incidence of coccidioidomycosis (from 2.5 to 8.4 cases per 100,000 in California between 1996 and 2006, and from 21 to 91 cases per 100,000 in Arizona between 1997 and 2006), raising the need for improved surveillance measures [30] , [31] . The diagnosis and clinical management of coccidioidomycosis in areas such as New York, where the disease is not endemic, pose unique challenges, and the source of Coccidioides infections in these settings is poorly understood. To improve molecular surveillance, identify sources of infection, and allow the early detection and management of outbreaks, Fisher et al. [32] used an AT to assign Coccidioides spp. clinical isolates to their populations of origin. The application of ATs to these organisms was complicated by their haploid, rather than diploid, genome, requiring the authors to modify existing AT methods.

More than 160 isolates from eight geographical populations of Coccidioides immitis and Coccidioides posadasii were genotyped at nine microsatellite loci. Isolates were both clinical and environmental in origin, and spanned the worldwide distribution of Coccidioides spp. Sixteen clinical isolates of unknown origin were obtained from patients diagnosed in the nonendemic state of New York. Using a modified AT procedure, 12 of these isolates were assigned to source populations with high probability, most to a source that matched the recent travel history of the patient. Thus, source identification in this nonendemic area was able to detect common-source infections. In two cases, however, travel history did not match assignment, raising questions about whether genetic differentiation was driven by host travel or pathogen dispersal; either an incomplete travel history or exposure to an isolate that had dispersed a great distance could explain the mismatches [32] .

Case 3: Hosts and Vectors of Yersinia pestis

Yersinia pestis, the bacterium that causes plague, is readily passed between wildlife and humans via flea vectors. In the plains regions of North America, black-tailed prairie dogs (Cynomys ludovicianus) live in high-density, communal colonies that favor the spread of plague, making this species an important host for Y. pestis. Oropsylla hirsuta is a flea very commonly associated with C. ludovicianus, and is thought to contribute substantially to Y. pestis transmission [33]. Because fleas (and many other ectoparasitic disease vectors) rely on their hosts for dispersal, quantifying host movement can aid in understanding the spread of flea-borne diseases. In a study in the northern US, Jones and Britten [33] investigated the role that prairie dogs play in dispersing fleas infected with Y. pestis. The dominant hypothesis in this transmission system, and many others, is that host movements determine vector movements, and thus concordance between host and vector population genetic characteristics would be expected. The study used ATs, among other genetic analyses, to test this hypothesis, sampling 112 prairie dogs from six colonies in north-central Montana and genotyping them at 14 microsatellite loci. At the same time, 84 fleas were collected directly from prairie dog burrows and genotyped at seven microsatellite loci. Genetic structure and variability were analyzed using multiple methods, including the estimation of recent migration rates of prairie dogs and fleas using the Bayesian technique described in detail in Text S1 [21].

The authors found that the host and vector differed widely in genetic structure: prairie dog hosts exhibited low intercolony migration (eight of 30 intercolony migration rates showed m ≥0.05), and the scale of their genetic neighborhood was on the order of a typical colony size. In contrast, the vector was well mixed, showing considerable migration between colony pairs (22 of 30 intercolony migration rates showed m ≥0.05) and limited colony-level population structure. Because fleas and prairie dog hosts sampled from the same locations show limited concordance in population genetics, it is likely that prairie dogs are not the primary means of O. hirsuta dispersal in these colonies. Thus, the authors concluded that other hosts should be considered when responding to plague outbreaks, as O. hirsuta occurs on a variety of host species that may be important in dispersing Y. pestis –infected fleas [33] .
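The "of 30" denominator follows from the six colonies sampled: migration rates are directional, so there are 6 × 5 = 30 ordered colony pairs. A minimal sketch of the tally is given below; the 3 × 3 matrix is invented for illustration and does not reproduce any values from [33].

```python
def count_rates_at_least(m, threshold=0.05):
    """Count off-diagonal entries of a square migration matrix >= threshold.

    Returns (number of qualifying rates, number of ordered population pairs).
    """
    n = len(m)
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    hits = sum(1 for i, j in pairs if m[i][j] >= threshold)
    return hits, len(pairs)

# Hypothetical asymmetric matrix for three populations (rows sum to 1).
example = [
    [0.90, 0.06, 0.04],
    [0.01, 0.97, 0.02],
    [0.05, 0.10, 0.85],
]
hits, total = count_rates_at_least(example)
```

Applied to a 6 × 6 matrix of host or vector rates, the same function would return the "eight of 30" and "22 of 30" style summaries quoted above.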

Case 4: Oral Rabies Vaccination of Raccoons

The common raccoon (Procyon lotor) is widely distributed throughout North and Central America, and is capable of occupying a broad range of habitats in close proximity to humans. P. lotor is also the most frequently reported rabid wildlife species, and is a particularly important carrier of the rabies virus in the mid-Atlantic and northeastern US. Because of the risk of transmission of rabies to humans, the US Department of Agriculture conducts routine oral rabies vaccination programs targeting P. lotor and several other important wildlife species. In a large and expensive annual program, recombinant virus vaccine is delivered to P. lotor populations in the eastern US in attractive baits. A key question in optimizing these oral rabies vaccine programs is how geographic features (e.g., rivers and mountains) can be used to better target delivery of baits along important P. lotor dispersal corridors, reducing their virus trafficking potential. In a study in southwestern Pennsylvania, Root, Puskas, and colleagues [34] used ATs to investigate which geographic features, if any, hinder or enhance P. lotor dispersal, and thus can be used to improve oral vaccination programs.

Live raccoons were trapped from five study sites distributed along valleys separated by a high elevation ridge; the authors aimed to test the hypothesis that the ridge isolated the populations on either side. DNA from a total of 185 raccoons was genotyped at nine microsatellite loci, and Bayesian clustering [19] and ATs [18] were used to assess the number of genetic clusters and infer the population of origin of P. lotor specimens. Specimens from all five study sites were found to compose a single genetic population, and few animals were assigned to their population of origin, with many assigned to sources across the ridge (i.e., sampled from one valley, but assigned to the valley on the opposite side of the ridge; [34] ). The results indicate that neither ridge nor valley features in this setting influence P. lotor dispersal, as individuals can transcend ridges and can readily traffic virus between (and within) valleys. Thus, ridge and valley features may not be suitable for use in optimizing the geographic placement of oral vaccine baits, despite the finding in other settings that major rivers and mountains may constrain P. lotor dispersal [34] .

Contemporary movements of hosts can contribute to increased frequency and intensity of malaria epidemics in some regions [35] , [36] , while transport of free-living pathogen stages can determine the effectiveness of strategies for reducing schistosomiasis infections [10] . Thus, quantifying these movements is of great interest to the study of complex epidemiological systems, and the routine use of ATs for this purpose is anticipated [24] .

Among the epidemiological methods that can benefit from ATs are spatial models of infectious disease transmission, which incorporate knowledge of the location, movement rate, and travel direction of hosts, vectors, and pathogens to explain observed patterns of transmission and evaluate intervention options. ATs can provide a quantitative description of migration between populations in transmission models, particularly in the context of network models that explicitly represent the exchange of individuals between populations [1] . Indeed, rigorous quantification of movement between nodes has been called for in network models [4] , [37] , and ATs offer a powerful alternative to traditional methods (e.g., mark-recapture) that are difficult to apply to these systems.

Challenging epidemiological questions can be addressed by ATs. The source of infection for recombining organisms (as opposed to those organisms where genetic structure is principally clonal) can be determined. As in the Coccidioides case, independent loci can be used to estimate the relatedness between isolates and, when combined with travel patterns of infected hosts, assignments can be used to improve surveillance in nonendemic areas, leading to the identification of common source cases that may have otherwise gone undiagnosed [32] . Moreover, ATs can also provide valuable confirmation (or refutation) that a particular host is responsible for the spread of pathogens or vectors [33] .

Another key epidemiological use for ATs is in assessing the landscape determinants of disease spread. ATs make it possible to formally test previously held beliefs about the role of specific landscape features in governing the mobility of vectors, hosts, and pathogens. Just as valleys and ridges were found not to govern the movement of raccoon vectors of rabies [34], conventional wisdom on other landscape determinants of spread can give way to quantitative evidence from ATs. For this to happen, landscape factors must be rigorously characterized and included in the analysis. Simple Euclidean distance between populations has been shown to be inadequate for this purpose [3], [4], and thus alternative (non-Euclidean) distance measures that account for landscape complexity [1] must be employed, following the lead of the ecological sciences, where much has been learned using this approach [38], [39].

Diffusive processes are ubiquitous in infectious disease transmission [1], and despite limited efforts to quantify these processes in the past, research interest is growing rapidly. The authors of this review are engaged in an application of ATs to Schistosoma japonicum, the parasite that causes schistosomiasis in East and Southeast Asia. This organism is subject to transport in the environment via multiple pathways [10]: parasites are carried in advective flows along canals and streams as both larvae and ova; within snail intermediate hosts, parasites are conveyed among and between aquatic and riparian habitats; and for adult worms, human and animal hosts serve as vehicles. ATs provide a powerful means to comprehensively assess the role of these diffusive processes in schistosome transmission, and when combined with landscape data, can offer insights into how anthropogenic change can modify diffusion parameters, thereby influencing transmission. High-priority research questions can be addressed, such as which environmental pathways are most influential in maintaining parasite transmission in endemic areas, and which are efficient at spreading the parasite into new regions or among new vulnerable subpopulations.

ATs represent just one analytical avenue in a sophisticated suite of powerful genetic analysis tools available for such epidemiological applications, including other methods for inferring demographic parameters and for identifying genes or genomic regions involved in human diseases [24] , [40] . There is diversity even within the set of techniques for estimating migration, and thus, looking forward, comparisons among estimators will be increasingly important, both to validate methods for application to specific hypotheses and to establish confidence in estimates for a particular system.

Supporting Information

Probabilistic graphical model indicating the conditional dependencies (directed edges) in the Wilson and Rannala [21] method. Nodes represent observed (data; squares) and unobserved (parameters; circles) random variables. The observed variables are the vector of sampled source populations S and the matrix of multilocus genotypes of sampled specimens, X . Among the unobserved variables (parameters) are the quantities of interest in infectious disease systems, including the interpopulation migration rates in matrix m and the specific migrant ancestry of individuals in vector M .

https://doi.org/10.1371/journal.ppat.1002013.s001

Data and parameters of the inference model implemented in Wilson and Rannala's [21] Bayesian assignment test.

https://doi.org/10.1371/journal.ppat.1002013.s002

Descriptions of different types of genetic markers and the corresponding advantages and disadvantages when analyzed using assignment tests.

https://doi.org/10.1371/journal.ppat.1002013.s003

Bayesian assignment tests.

https://doi.org/10.1371/journal.ppat.1002013.s004

Genetic markers.

https://doi.org/10.1371/journal.ppat.1002013.s005

Acknowledgments

We are grateful for the assistance of Paul Brindley of George Washington University and Jessica McCoury at Emory University.

  • 15. Wright S (1969) Evolution and the genetics of populations: the theory of gene frequencies. Volume 2. Chicago: University of Chicago Press.
  • 16. Clobert J (2001) Dispersal. New York: Oxford University Press.
  • 36. Prothero RM (1965) Migrants and malaria. London: Longmans.
  • 40. Ziegler A, König I (2006) A statistical approach to genetic epidemiology: concepts and applications. Weinheim: Wiley-VCH. 335 p.

Assignment tests

The idea behind assignment tests is to use individual genotypes to assign individuals to populations or clusters. Paetkau et al. (1995) developed the first assignment test approach for use on bears. The idea was fairly simple. Given a set of populations, and the allele frequencies of those populations, what is the likelihood of a given individual's genotype in the population in which it was sampled versus its likelihood in the other populations in the set? An individual is assigned to the population for which it has the highest likelihood.

Let's take a simple example with four alleles at one locus (a, b, c, and d) and three alleles at a second locus (k, l, and m):

Now, say we have an individual with the genotype abll. Its probability in Pop 1 is (2*0.2*0.1)*0.3^2 = 0.0036. Its probability in Pop 2 is (2*0.5*0.3)*0.6^2 = 0.108. The likelihood in Pop 2 is considerably higher than the likelihood in Pop 1. We "assign" it to Pop 2. Conversely, an individual with the genotype cdkl would be much more likely in Pop 1 (0.072) than in Pop 2 (0.0024). [We need to calculate heterozygote probabilities as 2 times the product of the gene frequencies; remember the Punnett square idea that it could get A from Dad and a from Mom OR a from Dad and A from Mom.] With a large number of loci, and the presence of rare alleles, we use logarithms to make the numbers easier to assess.
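The abll calculation can be reproduced with a short sketch. Only the allele frequencies quoted in the text (a, b, and l in each population) are used; the dictionary layout and the function name are assumptions made for illustration.

```python
import math

# Frequencies quoted in the worked example; alleles not needed for the
# genotype "abll" are omitted because their values are not given here.
FREQS = {
    "Pop 1": [{"a": 0.2, "b": 0.1}, {"l": 0.3}],
    "Pop 2": [{"a": 0.5, "b": 0.3}, {"l": 0.6}],
}

def genotype_probability(genotype, freqs_by_locus):
    """Multiply per-locus Hardy-Weinberg genotype probabilities."""
    prob = 1.0
    for (a1, a2), freqs in zip(genotype, freqs_by_locus):
        p = freqs[a1] * freqs[a2]
        if a1 != a2:
            p *= 2  # heterozygote: two parental orderings
        prob *= p
    return prob

genotype = [("a", "b"), ("l", "l")]  # the individual "abll"
probs = {pop: genotype_probability(genotype, f) for pop, f in FREQS.items()}
assigned = max(probs, key=probs.get)

# On a log10 scale the two populations differ by about 1.48 units,
# i.e. the genotype is roughly 30 times more likely in Pop 2.
log_ratio = math.log10(probs["Pop 2"] / probs["Pop 1"])
```

Working with `log_ratio` rather than raw probabilities is what keeps the numbers manageable once many loci are multiplied together.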

Fig. 1. Assignment graph for individuals sampled in two populations. Individuals in red were sampled in Pop 1, those in blue in Pop 2. Above the line are individuals "assigned" to Pop 2, below the line individuals assigned to Pop 1. Note that, for this example, one Pop 1 individual is assigned to Pop 2, and one Pop 2 individual is assigned to Pop 1. Such "misassigned" individuals might represent immigrants from the other population, or the descendants of such immigrants. Alternatively, an individual could fall on the wrong side of the line by chance; the further from the line, the less likely it is that the deviation occurred by chance, and the more highly polymorphic loci we have, the stronger the evidence.

The problem of zeros. If an allele does not occur in a population (p_x = 0) then the assignment probability of an individual to that population will be 0. Several methods (described on the now-unsupported assignment calculator web site http://www2.biology.ualberta.ca/jbrzusto/Doh.php) exist for adjusting the allele frequencies in the event of zeros.
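One simple way to avoid zero probabilities is sketched below: any allele with an observed frequency of zero is given a small constant frequency, and the frequencies are then rescaled to sum to 1. This is only one of several possible adjustments, and the function name and the default `epsilon` of 0.01 are assumptions rather than values taken from the calculator mentioned above.

```python
def adjust_zero_frequencies(freqs, alleles, epsilon=0.01):
    """Replace zero or missing allele frequencies with epsilon, then renormalise.

    freqs   -- observed allele frequencies in one population at one locus
    alleles -- every allele seen at this locus across all populations
    """
    raw = {a: (freqs.get(a, 0.0) or epsilon) for a in alleles}
    total = sum(raw.values())
    return {a: value / total for a, value in raw.items()}

# Allele "c" was never observed in this (hypothetical) population sample.
observed = {"a": 0.5, "b": 0.5}
adjusted = adjust_zero_frequencies(observed, ["a", "b", "c"])
```

After adjustment, a genotype carrying allele c has a small but nonzero probability in this population, so it can still be ranked against the other candidate populations.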

Several alternatives now exist for assignment tests. One of these is to use Bayesian methods for deciding on the likelihood of assignment. J.-M. Cornuet has a package of such approaches at his web site:

http://www.montpellier.inra.fr/URLB/geneclass/geneclass.html

The program runs in Windows and has a range of options for dealing with the zero problem.

A variant on the assignment approach is to allow the data themselves to determine whether the population has subpopulations and, if so, how many. The program Structure (Pritchard et al. 2000) uses Hardy-Weinberg equilibrium (HWE) assumptions to create clusters. That is, it assesses the fit of the data to a set of k clusters, with each cluster conforming as closely as possible to HWE. Because an explicit model underlies the expected genotype frequencies, likelihood-based inference is suitable for decisions about the k clusters; Structure itself estimates cluster membership with a Bayesian Markov chain Monte Carlo procedure.

The program Structurama by Huelsenbeck (http://www.structurama.org/) also implements the approach of Pritchard et al. (2000), with a few extra twists.


Fig. 2. Assignment graphic for a three-cluster (k = 3) assessment of Burrowing Owl populations, using the program Structure. Each of the three vertices represents exclusive assignment to one of the three clusters; points nearer the center have affinities with all three clusters. The square symbols are Florida owls (note that their centroid is very close to the Florida vertex). The triangles are California (CA) owls; their centroid lies almost exactly midway along the line between the CA and MW vertices, meaning we cannot tell much about owls sampled in California, which could just as well belong to the MW cluster. The stars are Rocky Mountain region individuals. Many of the California and Rocky Mountain region individuals were "misassigned".

References:

Paetkau, D., W. Calvert, I. Stirling, and C. Strobeck. 1995. Microsatellite analysis of population structure in Canadian polar bears. Mol. Ecol. 4: 347-354.

Pritchard, J.K., M. Stephens, and P. Donnelly. 2000. Inference of population structure using multilocus genotype data. Genetics 155: 945-959.

Genetic assignment of individuals to source populations using network estimation tools

Kuismin, Markku; Saatoglu, Dilan; Niskanen, Alina Katariina; Jensen, Henrik; Sillanpää, Mikko J. Peer reviewed, journal article, accepted version.
