Reads a size regression file
Arguments
- filename
Path to file (character).
- exceptions
Optionally a list providing sizes for alleles not covered by the regression. See examples for how this can be used to assign sizes to X and Y at the Amelogenin locus.
- repeat_length_by_marker
Optionally a named integer vector with repeat lengths by marker. If not provided, the fractional part of a microvariant allele is used as is: a .3 allele is not converted to, e.g., .75 of a repeat for a tetranucleotide marker.
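The conversion mentioned for repeat_length_by_marker can be sketched in a few lines of base R. This is only an illustration of the arithmetic, assuming that the digit after the decimal point counts extra bases; allele_to_repeats is a hypothetical helper and not a function of simDNAmixtures.

```r
# Hypothetical helper (not part of simDNAmixtures): express a microvariant
# allele such as 15.3 as a number of repeats, given the repeat length in bp.
# Assumption: the digit after the decimal point is a count of extra bases.
allele_to_repeats <- function(allele, repeat_length) {
  whole <- floor(allele)                       # complete repeats
  extra_bases <- round((allele - whole) * 10)  # e.g. 3 for a .3 allele
  whole + extra_bases / repeat_length          # 3 bases of a 4 bp repeat = .75
}

allele_to_repeats(15.3, 4)
#> [1] 15.75
```

With a repeat length of 4, the 15.3 allele is treated as 15.75 repeats, which matches the value used in the D1S1656 example at the end of this page.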
Details
Reads a regression file from disk and returns a function that provides the fragment length (bp) for a given locus and allele.
DNA profiles consist of the observed peaks (alleles or stutter products) at several loci, together with the peak heights and sizes. The size refers to the fragment length (bp). Within a locus, a linear relationship exists between the allele designation (number of repeats) of a peak and its size. When peaks are sampled in the sample_mixture_from_genotypes function, a size is assigned using a size regression. The read_size_regression function reads such a regression from disk.
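As a rough sketch of what the returned function does, the linear relationship can be written as size = intercept + slope * allele, with any exceptions looked up first. The code below is an illustration under stated assumptions, not the package's implementation: make_size_function, size_of, and the column names of coefs are hypothetical, and the D1S1656 intercept and slope are the values that appear in the example at the end of this page.

```r
# Hypothetical stand-in for a parsed regression table. The column names are
# assumptions, not the layout of the CSV file shipped with the package.
coefs <- data.frame(
  locus     = "D1S1656",
  intercept = 121.628203912362,
  slope     = 4.2170043570021
)

# Hypothetical constructor: returns a function(locus, allele) that first
# checks the exceptions and otherwise evaluates the per-locus linear fit.
make_size_function <- function(coefs, exceptions = list()) {
  function(locus, allele) {
    key <- as.character(allele)
    if (!is.null(exceptions[[locus]]) && key %in% names(exceptions[[locus]])) {
      return(unname(exceptions[[locus]][key]))
    }
    row <- coefs[coefs$locus == locus, ]
    if (nrow(row) != 1) stop("no regression available for locus ", locus)
    row$intercept + row$slope * as.numeric(allele)
  }
}

size_of <- make_size_function(
  coefs,
  exceptions = list(AMEL = c(X = 98.5, Y = 104.5))
)

size_of("D1S1656", 15.75)  # intercept + 15.75 * slope
size_of("AMEL", "X")       # taken from the exceptions list
```

Returning a closure keeps the coefficients and exceptions bundled with the lookup, so callers only need to supply a locus and an allele.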
Examples
filename <- system.file("extdata",
                        "GlobalFiler_SizeRegression.csv",
                        package = "simDNAmixtures")
regression <- read_size_regression(filename)
# obtain size for the 12 allele at the vWA locus
regression("vWA", 12)
#> [1] 160.7627
# now add AMEL sizes
regression_with_AMEL <- read_size_regression(filename, exceptions = list(
  AMEL = setNames(c(98.5, 104.5), nm = c("X", "Y"))))
# check that we can obtain size for X at AMEL
stopifnot(regression_with_AMEL("AMEL", "X") == 98.5)
# pass in repeat_length_by_marker for more precise estimates
gf <- gf_configuration()
regression_with_repeat_length <- read_size_regression(filename,
  repeat_length_by_marker = gf$repeat_length_by_marker)
# obtain size for the 15.3 allele at the D1S1656 locus
stopifnot(regression_with_repeat_length("D1S1656", 15.3) ==
            121.628203912362 + 15.75 * 4.2170043570021)