Welcome to Data Module’s documentation!
!!! This module is a work in progress and is not finished or well documented !!!
Docs
-
class
asmc::HapsMatrixType
A class that stores information in a #sites x #haps matrix of booleans.
Public Functions
-
unsigned long
getNumIndividuals() const
- Return
the number of individuals, determined from the .sample[s] file
-
unsigned long
getNumHaps() const
- Return
the number of haps: twice the number of individuals
-
unsigned long
getNumSites() const
- Return
the number of sites, determined from the .map file
-
const std::vector<unsigned long> &
getPhysicalPositions() const
- Return
a vector of physical positions, read in from the .map file
-
const std::vector<double> &
getGeneticPositions() const
- Return
a vector of genetic positions, in centimorgans, read in from the .map file
-
const mat_uint8_t &
getData() const
- Return
the matrix of raw uint8_t data
-
mat_float_t
getDataAsFloat() const
- Return
the matrix of raw data, cast to float
-
rvec_uint8_t
getSite(unsigned long siteId) const
Get all haplotype data for a single site. This is a row from the data matrix and will be a boolean row vector of length 2N, where N is the number of individuals.
- Return
the ith row of the data matrix, where i is siteId.
- Parameters
siteId: the id of the site
-
cvec_uint8_t
getHap(unsigned long hapId) const
Get all site data for a single haplotype. This is a column from the data matrix and will be a boolean column vector of length N, where N is the number of sites.
- Return
the jth column of the data matrix, where j is hapId.
- Parameters
hapId: the id of the haplotype
-
mat_uint8_t
getIndividual(unsigned long individualId) const
Get all site data for a single individual. This is two adjacent columns from the data matrix, returned as a matrix.
- Return
the two adjacent columns of the data matrix that belong to the individual with id individualId, as a matrix.
- Parameters
individualId: the id of the individual
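The row and column conventions of getSite, getHap and getIndividual can be illustrated with a small stand-alone sketch. It does not use the library's own matrix types: the `Matrix` alias and the helpers `siteRow`, `hapColumn` and `individualColumns` are hypothetical names introduced here. The assumption that individual i owns columns 2i and 2i + 1 is only one reading of "two adjacent columns" and is not confirmed by this documentation.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the #sites x #haps matrix: data[site][hap].
using Matrix = std::vector<std::vector<std::uint8_t>>;

// All haplotypes at one site: a row of length #haps.
std::vector<std::uint8_t> siteRow(const Matrix& data, std::size_t siteId) {
  return data.at(siteId);
}

// All sites for one haplotype: a column of length #sites.
std::vector<std::uint8_t> hapColumn(const Matrix& data, std::size_t hapId) {
  std::vector<std::uint8_t> column;
  column.reserve(data.size());
  for (const auto& row : data) {
    column.push_back(row.at(hapId));
  }
  return column;
}

// Both haplotypes of one individual: a #sites x 2 matrix.
// Assumption (not confirmed): individual i owns columns 2i and 2i + 1.
Matrix individualColumns(const Matrix& data, std::size_t individualId) {
  Matrix result;
  result.reserve(data.size());
  for (const auto& row : data) {
    result.push_back({row.at(2 * individualId), row.at(2 * individualId + 1)});
  }
  return result;
}
```

With three sites and two individuals (four haplotypes), `siteRow` returns four values, `hapColumn` returns three, and `individualColumns` returns a 3 x 2 matrix.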
-
unsigned long
getMinorAlleleCount(unsigned long siteId) const
Get the minor allele count for a given site. This is a number in [0, #haps/2].
- Return
the minor allele count for the given site
- Parameters
siteId: the site ID
-
unsigned long
getDerivedAlleleCount(unsigned long siteId) const
Get the derived allele count for a given site. This is the raw count, assuming 1 means derived.
- Return
the derived allele count for the given site
- Parameters
siteId: the site ID
-
cvec_ul_t
getMinorAlleleCounts() const
Get the raw minor allele counts for all sites. Each is a number in [0, #haps/2].
- Return
a vector of minor allele counts, one for each site
-
cvec_ul_t
getDerivedAlleleCounts() const
Get the derived allele counts for all sites. These are the raw counts, assuming 1 means derived.
- Return
a vector of derived allele counts, one for each site
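The difference between the derived and minor allele counts can be sketched on a single site row. This is an illustration, not the library's implementation: `derivedAlleleCount` and `minorAlleleCount` are hypothetical free functions, the row is assumed to hold only 0s and 1s, and the minor count is assumed to be the smaller of the number of 1s and the number of 0s, which matches the documented range [0, #haps/2].

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Derived allele count: number of haplotypes carrying a 1 at this site.
unsigned long derivedAlleleCount(const std::vector<std::uint8_t>& site) {
  return static_cast<unsigned long>(
      std::count(site.begin(), site.end(), std::uint8_t{1}));
}

// Minor allele count: the smaller of the derived count and its complement,
// so the result always lies in [0, #haps/2].
unsigned long minorAlleleCount(const std::vector<std::uint8_t>& site) {
  const unsigned long derived = derivedAlleleCount(site);
  const unsigned long total = static_cast<unsigned long>(site.size());
  return std::min(derived, total - derived);
}
```

For a site where three of four haplotypes carry a 1, the derived count is 3 while the minor count is 1.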
-
double
getMinorAlleleFrequency(unsigned long siteId) const
Get the minor allele frequency for a given site. This is a number in [0, 0.5].
- Return
the minor allele frequency for the given site
- Parameters
siteId: the site ID
-
double
getDerivedAlleleFrequency(unsigned long siteId) const
Get the derived allele frequency for a given site. This is the raw frequency, assuming 1 means derived.
- Return
the derived allele frequency for the given site
- Parameters
siteId: the site ID
-
cvec_dbl_t
getMinorAlleleFrequencies() const
Get the minor allele frequencies for all sites. These are numbers in [0, 0.5].
- Return
a vector of minor allele frequencies, one for each site
-
cvec_dbl_t
getDerivedAlleleFrequencies() const
Get the derived allele frequencies for all sites. These are the raw frequencies, assuming 1 means derived.
- Return
a vector of derived allele frequencies, one for each site
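The frequency accessors can be read as the corresponding counts divided by the number of haplotypes. The sketch below works under that assumption on a single site row; `derivedAlleleFrequency` and `minorAlleleFrequency` are hypothetical free functions, not the library's methods, and the row is assumed to hold only 0s and 1s.

```cpp
#include <algorithm>
#include <cstdint>
#include <numeric>
#include <vector>

// Derived allele frequency: fraction of haplotypes carrying a 1 at this site.
double derivedAlleleFrequency(const std::vector<std::uint8_t>& site) {
  if (site.empty()) {
    return 0.0;  // avoid dividing by zero for an empty row
  }
  const unsigned long ones = std::accumulate(site.begin(), site.end(), 0UL);
  return static_cast<double>(ones) / static_cast<double>(site.size());
}

// Minor allele frequency: the smaller of the derived frequency and its
// complement, so the result always lies in [0, 0.5].
double minorAlleleFrequency(const std::vector<std::uint8_t>& site) {
  const double derived = derivedAlleleFrequency(site);
  return std::min(derived, 1.0 - derived);
}
```

The two values agree whenever the derived allele is carried by at most half of the haplotypes, and differ otherwise.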
-
double
getFrequency(unsigned long siteId) const
Public Static Functions
-
static HapsMatrixType
createFromHapsPlusSamples(std::string_view hapsFile, std::string_view samplesFile, std::string_view mapFile)
Create a HapsMatrixType from a .hap[s][.gz] file, a .sample[s] file, and a .map file.
- Return
instance of a HapsMatrixType
- Parameters
hapsFile: path to the .hap[s][.gz] file
samplesFile: path to the .sample[s] file
mapFile: path to the .map file
-
unsigned long