Mendelian inheritance¶
-
allel.
mendel_errors
(parent_genotypes, progeny_genotypes)[source]¶ Locate genotype calls not consistent with Mendelian transmission of alleles.
Parameters: - parent_genotypes : array_like, int, shape (n_variants, 2, 2)
Genotype calls for the two parents.
- progeny_genotypes : array_like, int, shape (n_variants, n_progeny, 2)
Genotype calls for the progeny.
Returns: - me : ndarray, int, shape (n_variants, n_progeny)
Count of Mendel errors for each progeny genotype call.
Examples
The following are all consistent with Mendelian transmission. Note that a value of 0 is returned for missing calls:
>>> import allel >>> import numpy as np >>> genotypes = np.array([ ... # aa x aa -> aa ... [[0, 0], [0, 0], [0, 0], [-1, -1], [-1, -1], [-1, -1]], ... [[1, 1], [1, 1], [1, 1], [-1, -1], [-1, -1], [-1, -1]], ... [[2, 2], [2, 2], [2, 2], [-1, -1], [-1, -1], [-1, -1]], ... # aa x ab -> aa or ab ... [[0, 0], [0, 1], [0, 0], [0, 1], [-1, -1], [-1, -1]], ... [[0, 0], [0, 2], [0, 0], [0, 2], [-1, -1], [-1, -1]], ... [[1, 1], [0, 1], [1, 1], [0, 1], [-1, -1], [-1, -1]], ... # aa x bb -> ab ... [[0, 0], [1, 1], [0, 1], [-1, -1], [-1, -1], [-1, -1]], ... [[0, 0], [2, 2], [0, 2], [-1, -1], [-1, -1], [-1, -1]], ... [[1, 1], [2, 2], [1, 2], [-1, -1], [-1, -1], [-1, -1]], ... # aa x bc -> ab or ac ... [[0, 0], [1, 2], [0, 1], [0, 2], [-1, -1], [-1, -1]], ... [[1, 1], [0, 2], [0, 1], [1, 2], [-1, -1], [-1, -1]], ... # ab x ab -> aa or ab or bb ... [[0, 1], [0, 1], [0, 0], [0, 1], [1, 1], [-1, -1]], ... [[1, 2], [1, 2], [1, 1], [1, 2], [2, 2], [-1, -1]], ... [[0, 2], [0, 2], [0, 0], [0, 2], [2, 2], [-1, -1]], ... # ab x bc -> ab or ac or bb or bc ... [[0, 1], [1, 2], [0, 1], [0, 2], [1, 1], [1, 2]], ... [[0, 1], [0, 2], [0, 0], [0, 1], [0, 1], [1, 2]], ... # ab x cd -> ac or ad or bc or bd ... [[0, 1], [2, 3], [0, 2], [0, 3], [1, 2], [1, 3]], ... ]) >>> me = allel.stats.mendel_errors(genotypes[:, :2], genotypes[:, 2:]) >>> me array([[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]])
The following are cases of ‘non-parental’ inheritance where one or two alleles are found in the progeny that are not present in either parent. Note that the number of errors may be 1 or 2 depending on the number of non-parental alleles:
>>> genotypes = np.array([ ... # aa x aa -> ab or ac or bb or cc ... [[0, 0], [0, 0], [0, 1], [0, 2], [1, 1], [2, 2]], ... [[1, 1], [1, 1], [0, 1], [1, 2], [0, 0], [2, 2]], ... [[2, 2], [2, 2], [0, 2], [1, 2], [0, 0], [1, 1]], ... # aa x ab -> ac or bc or cc ... [[0, 0], [0, 1], [0, 2], [1, 2], [2, 2], [2, 2]], ... [[0, 0], [0, 2], [0, 1], [1, 2], [1, 1], [1, 1]], ... [[1, 1], [0, 1], [1, 2], [0, 2], [2, 2], [2, 2]], ... # aa x bb -> ac or bc or cc ... [[0, 0], [1, 1], [0, 2], [1, 2], [2, 2], [2, 2]], ... [[0, 0], [2, 2], [0, 1], [1, 2], [1, 1], [1, 1]], ... [[1, 1], [2, 2], [0, 1], [0, 2], [0, 0], [0, 0]], ... # ab x ab -> ac or bc or cc ... [[0, 1], [0, 1], [0, 2], [1, 2], [2, 2], [2, 2]], ... [[0, 2], [0, 2], [0, 1], [1, 2], [1, 1], [1, 1]], ... [[1, 2], [1, 2], [0, 1], [0, 2], [0, 0], [0, 0]], ... # ab x bc -> ad or bd or cd or dd ... [[0, 1], [1, 2], [0, 3], [1, 3], [2, 3], [3, 3]], ... [[0, 1], [0, 2], [0, 3], [1, 3], [2, 3], [3, 3]], ... [[0, 2], [1, 2], [0, 3], [1, 3], [2, 3], [3, 3]], ... # ab x cd -> ae or be or ce or de ... [[0, 1], [2, 3], [0, 4], [1, 4], [2, 4], [3, 4]], ... ]) >>> me = allel.stats.mendel_errors(genotypes[:, :2], genotypes[:, 2:]) >>> me array([[1, 1, 2, 2], [1, 1, 2, 2], [1, 1, 2, 2], [1, 1, 2, 2], [1, 1, 2, 2], [1, 1, 2, 2], [1, 1, 2, 2], [1, 1, 2, 2], [1, 1, 2, 2], [1, 1, 2, 2], [1, 1, 2, 2], [1, 1, 2, 2], [1, 1, 1, 2], [1, 1, 1, 2], [1, 1, 1, 2], [1, 1, 1, 1]])
The following are cases of ‘hemi-parental’ inheritance, where progeny appear to have inherited two copies of an allele found only once in one of the parents:
>>> genotypes = np.array([ ... # aa x ab -> bb ... [[0, 0], [0, 1], [1, 1], [-1, -1]], ... [[0, 0], [0, 2], [2, 2], [-1, -1]], ... [[1, 1], [0, 1], [0, 0], [-1, -1]], ... # ab x bc -> aa or cc ... [[0, 1], [1, 2], [0, 0], [2, 2]], ... [[0, 1], [0, 2], [1, 1], [2, 2]], ... [[0, 2], [1, 2], [0, 0], [1, 1]], ... # ab x cd -> aa or bb or cc or dd ... [[0, 1], [2, 3], [0, 0], [1, 1]], ... [[0, 1], [2, 3], [2, 2], [3, 3]], ... ]) >>> me = allel.stats.mendel_errors(genotypes[:, :2], genotypes[:, 2:]) >>> me array([[1, 0], [1, 0], [1, 0], [1, 1], [1, 1], [1, 1], [1, 1], [1, 1]])
The following are cases of ‘uni-parental’ inheritance, where progeny appear to have inherited both alleles from a single parent:
>>> genotypes = np.array([ ... # aa x bb -> aa or bb ... [[0, 0], [1, 1], [0, 0], [1, 1]], ... [[0, 0], [2, 2], [0, 0], [2, 2]], ... [[1, 1], [2, 2], [1, 1], [2, 2]], ... # aa x bc -> aa or bc ... [[0, 0], [1, 2], [0, 0], [1, 2]], ... [[1, 1], [0, 2], [1, 1], [0, 2]], ... # ab x cd -> ab or cd ... [[0, 1], [2, 3], [0, 1], [2, 3]], ... ]) >>> me = allel.stats.mendel_errors(genotypes[:, :2], genotypes[:, 2:]) >>> me array([[1, 1], [1, 1], [1, 1], [1, 1], [1, 1], [1, 1]])
-
allel.
paint_transmission
(parent_haplotypes, progeny_haplotypes)[source]¶ Paint haplotypes inherited from a single diploid parent according to their allelic inheritance.
Parameters: - parent_haplotypes : array_like, int, shape (n_variants, 2)
Both haplotypes from a single diploid parent.
- progeny_haplotypes : array_like, int, shape (n_variants, n_progeny)
Haplotypes found in progeny of the given parent, inherited from the given parent. I.e., haplotypes from gametes of the given parent.
Returns: - painting : ndarray, uint8, shape (n_variants, n_progeny)
An array of integers coded as follows: 1 = allele inherited from first parental haplotype; 2 = allele inherited from second parental haplotype; 3 = reference allele, also carried by both parental haplotypes; 4 = non-reference allele, also carried by both parental haplotypes; 5 = non-parental allele; 6 = either or both parental alleles missing; 7 = missing allele; 0 = undetermined.
Examples
>>> import allel >>> haplotypes = allel.HaplotypeArray([ ... [0, 0, 0, 1, 2, -1], ... [0, 1, 0, 1, 2, -1], ... [1, 0, 0, 1, 2, -1], ... [1, 1, 0, 1, 2, -1], ... [0, 2, 0, 1, 2, -1], ... [0, -1, 0, 1, 2, -1], ... [-1, 1, 0, 1, 2, -1], ... [-1, -1, 0, 1, 2, -1], ... ], dtype='i1') >>> painting = allel.stats.paint_transmission(haplotypes[:, :2], ... haplotypes[:, 2:]) >>> painting array([[3, 5, 5, 7], [1, 2, 5, 7], [2, 1, 5, 7], [5, 4, 5, 7], [1, 5, 2, 7], [6, 6, 6, 7], [6, 6, 6, 7], [6, 6, 6, 7]], dtype=uint8)
-
allel.
phase_by_transmission
(g, window_size, copy=True)[source]¶ Phase genotypes in a trio or cross where possible using Mendelian transmission.
Parameters: - g : array_like, int, shape (n_variants, n_samples, 2)
Genotype array, with parents as first two columns and progeny as remaining columns.
- window_size : int
Number of previous heterozygous sites to include when phasing each parent. A number somewhere between 10 and 100 may be appropriate, depending on levels of heterozygosity and quality of data.
- copy : bool, optional
If False, attempt to phase genotypes in-place. Note that this is only possible if the input array has int8 dtype, otherwise a copy is always made regardless of this parameter.
Returns: - g : GenotypeArray
Genotype array with progeny phased where possible.
-
allel.
phase_progeny_by_transmission
(g)[source]¶ Phase progeny genotypes from a trio or cross using Mendelian transmission.
Parameters: - g : array_like, int, shape (n_variants, n_samples, 2)
Genotype array, with parents as first two columns and progeny as remaining columns.
Returns: - g : ndarray, int8, shape (n_variants, n_samples, 2)
Genotype array with progeny phased where possible.
Examples
>>> import allel >>> g = allel.GenotypeArray([ ... [[0, 0], [0, 0], [0, 0]], ... [[1, 1], [1, 1], [1, 1]], ... [[0, 0], [1, 1], [0, 1]], ... [[1, 1], [0, 0], [0, 1]], ... [[0, 0], [0, 1], [0, 0]], ... [[0, 0], [0, 1], [0, 1]], ... [[0, 1], [0, 0], [0, 1]], ... [[0, 1], [0, 1], [0, 1]], ... [[0, 1], [1, 2], [0, 1]], ... [[1, 2], [0, 1], [1, 2]], ... [[0, 1], [2, 3], [0, 2]], ... [[2, 3], [0, 1], [1, 3]], ... [[0, 0], [0, 0], [-1, -1]], ... [[0, 0], [0, 0], [1, 1]], ... ], dtype='i1') >>> g = allel.stats.phase_progeny_by_transmission(g) >>> print(g.to_str(row_threshold=None)) 0/0 0/0 0|0 1/1 1/1 1|1 0/0 1/1 0|1 1/1 0/0 1|0 0/0 0/1 0|0 0/0 0/1 0|1 0/1 0/0 1|0 0/1 0/1 0/1 0/1 1/2 0|1 1/2 0/1 2|1 0/1 2/3 0|2 2/3 0/1 3|1 0/0 0/0 ./. 0/0 0/0 1/1 >>> g.is_phased array([[False, False, True], [False, False, True], [False, False, True], [False, False, True], [False, False, True], [False, False, True], [False, False, True], [False, False, False], [False, False, True], [False, False, True], [False, False, True], [False, False, True], [False, False, False], [False, False, False]], dtype=bool)
-
allel.
phase_parents_by_transmission
(g, window_size)[source]¶ Phase parent genotypes from a trio or cross, given progeny genotypes already phased by Mendelian transmission.
Parameters: - g : GenotypeArray
Genotype array, with parents as first two columns and progeny as remaining columns, where progeny genotypes are already phased.
- window_size : int
Number of previous heterozygous sites to include when phasing each parent. A number somewhere between 10 and 100 may be appropriate, depending on levels of heterozygosity and quality of data.
Returns: - g : GenotypeArray
Genotype array with parents phased where possible.