Title: Haplotype-Aware Hidden Markov Model for RNA
Version: 1.0.0
Description: Haplotype-aware Hidden Markov Model for RNA (HaHMMR) is a method for detecting copy number variations (CNVs) from bulk RNA-seq data. Additional examples, documentations, and details on the method are available at https://github.com/kharchenkolab/hahmmr/.
Depends: R (≥ 4.1.0)
Imports: data.table, dplyr, GenomicRanges, ggplot2, glue, IRanges, methods, patchwork, Rcpp, stringr, tibble, zoo
Suggests: ggrastr, testthat
LinkingTo: Rcpp, RcppArmadillo, roptim
NeedsCompilation: yes
License: MIT + file LICENSE
Encoding: UTF-8
RoxygenNote: 7.2.3
LazyData: true
Packaged: 2023-10-25 14:22:02 UTC; tenggao
Author: Teng Gao ORCID iD [aut, cre], Evan Biederstedt [aut], Peter Kharchenko [aut]
Maintainer: Teng Gao <tgaoteng@gmail.com>
Repository: CRAN
Date/Publication: 2023-10-25 18:00:10 UTC

centromere regions (hg19)

Description

centromere regions (hg19)

Usage

acen_hg19

Format

An object of class tbl_df (inherits from tbl, data.frame) with 22 rows and 3 columns.


centromere regions (hg38)

Description

centromere regions (hg38)

Usage

acen_hg38

Format

An object of class tbl_df (inherits from tbl, data.frame) with 22 rows and 3 columns.


Analyze allele profile

Description

Analyze allele profile

Usage

analyze_allele(
  bulk,
  t = 1e-05,
  theta_min = 0.08,
  gamma = 20,
  nu = 0.5,
  r = 0.015,
  hmm = "S5",
  fit_theta = FALSE,
  fit_gamma = FALSE,
  theta_start = 0.05,
  verbose = TRUE
)

Arguments

bulk

dataframe Bulk allele profile

t

numeric Transition probability

theta_min

numeric Minimum allele fraction

gamma

numeric Overdispersion parameter

nu

numeric Phase switch rate

r

numeric Alternative allele count bias

hmm

character HMM model to use (S3 or S5)

fit_theta

logical Whether to fit theta_min

fit_gamma

logical Whether to fit gamma

theta_start

numeric Starting value for theta_min

verbose

logical Whether to print progress

Value

dataframe Bulk allele profile with CNV states

Examples

bulk_example = analyze_allele(bulk_example, hmm = 'S5')

Analyze allele and expression profile

Description

Analyze allele and expression profile

Usage

analyze_joint(
  bulk,
  t = 1e-05,
  gamma = 20,
  theta_min = 0.08,
  logphi_min = 0.25,
  hmm = "S15",
  nu = 1,
  min_genes = 10,
  r = 0.015,
  theta_start = 0.05,
  exclude_neu = TRUE,
  fit_gamma = FALSE,
  fit_theta = FALSE,
  verbose = TRUE
)

Arguments

bulk

dataframe Bulk allele and expression profile

t

numeric Transition probability

gamma

numeric Overdispersion parameter

theta_min

numeric Minimum allele fraction

logphi_min

numeric Minimum log2 fold change

hmm

character HMM model to use (S7 or S15)

nu

numeric Phase switch rate

min_genes

integer Minimum number of genes per segment

r

numeric Alternative allele count bias

theta_start

numeric Starting value for theta_min

exclude_neu

logical Whether to exclude neutral segments in retest

fit_gamma

logical Whether to fit gamma

fit_theta

logical Whether to fit theta_min

verbose

logical Whether to print progress

Value

dataframe Bulk allele and expression profile with CNV states

Examples

bulk_example = analyze_joint(bulk_example, hmm = 'S15')

Annotate genetic distance between markers

Description

Annotate genetic distance between markers

Usage

annot_cm(bulk, genetic_map)

Arguments

bulk

dataframe Pseudobulk profile

genetic_map

dataframe Genetic map

Value

dataframe Annotated pseudobulk profile


Annotate copy number segments after HMM decoding

Description

Annotate copy number segments after HMM decoding

Usage

annot_segs(bulk, var = "cnv_state")

Arguments

bulk

dataframe Pseudobulk profile

Value

a pseudobulk dataframe


Laplace approximation of the posterior of expression fold change phi

Description

Laplace approximation of the posterior of expression fold change phi

Usage

approx_phi_post(
  Y_obs,
  lambda_ref,
  d,
  alpha = NULL,
  beta = NULL,
  mu = NULL,
  sig = NULL,
  lower = 0.2,
  upper = 10,
  start = 1
)

Arguments

Y_obs

numeric vector Gene expression counts

lambda_ref

numeric vector Reference expression levels

d

numeric Total library size

alpha

numeric Shape parameter of the gamma distribution

beta

numeric Rate parameter of the gamma distribution

mu

numeric Mean of the normal distribution

sig

numeric Standard deviation of the normal distribution

lower

numeric Lower bound of phi

upper

numeric Upper bound of phi

start

numeric Starting value of phi

Value

numeric MLE of phi and its standard deviation


Find optimal theta of a CNV segment using forward-backward; uses a 2-state allele HMM

Description

Find optimal theta of a CNV segment using forward-backward; uses a 2-state allele HMM

Usage

approx_theta_post_s2(
  pAD,
  DP,
  R,
  p_s,
  upper = 0.499,
  start = 0.25,
  gamma = 20,
  r = 0.015
)

Find optimal theta of a chromosome using forward-backward; uses a 3-state allele HMM

Description

Find optimal theta of a chromosome using forward-backward; uses a 3-state allele HMM

Usage

approx_theta_post_s3(
  pAD,
  DP,
  R,
  p_s,
  t,
  upper = 0.45,
  start = 0.05,
  gamma = 20,
  r = 0.015
)

example pseudobulk dataframe

Description

example pseudobulk dataframe

Usage

bulk_example

Format

An object of class tbl_df (inherits from tbl, data.frame) with 10321 rows and 58 columns.


Calculate LLR for an allele HMM

Description

Calculate LLR for an allele HMM

Usage

calc_allele_LLR(pAD, DP, R, p_s, theta_mle, theta_0 = 0, gamma = 20, r = 0.015)

Arguments

pAD

numeric vector Phased allele depth

DP

numeric vector Total allele depth

R

numeric vector Allelic bias direction

p_s

numeric vector Phase switch probabilities

theta_mle

numeric MLE of imbalance level theta (alternative hypothesis)

theta_0

numeric Imbalance level in the null hypothesis

gamma

numeric Dispersion parameter for the Beta-Binomial allele model

Value

numeric Log-likelihood ratio


Calculate allele likelihoods for 2-state allele HMM

Description

Calculate allele likelihoods for 2-state allele HMM

Usage

calc_allele_lik_s2(pAD, DP, R, p_s, theta, gamma = 20, r = 0.015)

Arguments

pAD

integer vector Paternal allele counts

DP

integer vector Total alelle counts

R

numeric vector Variant mapping bias direction

p_s

numeric vector Phase switch probabilities

theta

numeric Haplotype imbalance

gamma

numeric Overdispersion in the allele-specific expression

Value

numeric vector Allele likelihoods


Calculate allele likelihoods for 3-state allele HMM

Description

Calculate allele likelihoods for 3-state allele HMM

Usage

calc_allele_lik_s3(pAD, DP, R, p_s, t, theta, gamma = 20, r = 0.015)

Arguments

pAD

integer vector Paternal allele counts

DP

integer vector Total alelle counts

p_s

numeric vector Phase switch probabilities

theta

numeric Haplotype imbalance

gamma

numeric Overdispersion in the allele-specific expression


Calculate LLR for an expression HMM

Description

Calculate LLR for an expression HMM

Usage

calc_exp_LLR(
  Y_obs,
  lambda_ref,
  d,
  phi_mle,
  mu = NULL,
  sig = NULL,
  alpha = NULL,
  beta = NULL
)

Arguments

Y_obs

numeric vector Gene expression counts

lambda_ref

numeric vector Reference expression levels

d

numeric vector Total library size

phi_mle

numeric MLE of expression fold change phi (alternative hypothesis)

mu

numeric Mean parameter for the PLN expression model

sig

numeric Dispersion parameter for the PLN expression model

alpha

numeric Hyperparameter for the gamma poisson model (not used)

beta

numeric Hyperparameter for the gamma poisson model (not used)

Value

numeric Log-likelihood ratio


Calculate the transition matrix for 15-state joint HMM

Description

Calculate the transition matrix for 15-state joint HMM

Usage

calc_trans_mat_s15(t, p_s, w, states_cn, states_phase)

Arguments

t

numeric; CNV state transition probability

p_s

numeric; phase switch probability

w

numeric; relative abundance of states

states_cn

character; CNV states

states_phase

character; haplotype phase states

Value

array; transition matrix


Calculate the transition matrix for 7-state joint HMM

Description

Calculate the transition matrix for 7-state joint HMM

Usage

calc_trans_mat_s7(t, p_s, w, states_cn, states_phase)

Arguments

t

numeric; CNV state transition probability

p_s

numeric; phase switch probability

w

numeric; relative abundance of states

states_cn

character; CNV states

states_phase

character; haplotype phase states

Value

array; transition matrix


Check if columns are present in dataframe

Description

Check if columns are present in dataframe

Usage

check_cols(df, cols)

Arguments

df

dataframe Dataframe

cols

character vector Column names

Value

dataframe Dataframe


Check the format of a count matrix

Description

Check the format of a count matrix

Usage

check_matrix(count_mat)

Arguments

count_mat

matrix Count matrix

Value

matrix Count matrix


chromosome sizes (hg19)

Description

chromosome sizes (hg19)

Usage

chrom_sizes_hg19

Format

An object of class data.table (inherits from data.frame) with 22 rows and 2 columns.


chromosome sizes (hg38)

Description

chromosome sizes (hg38)

Usage

chrom_sizes_hg38

Format

An object of class data.table (inherits from data.frame) with 22 rows and 2 columns.


classify alleles using viterbi and forward-backward

Description

classify alleles using viterbi and forward-backward

Usage

classify_alleles(bulk)

Arguments

bulk

dataframe Pesudobulk profile

Value

dataframe Pesudobulk profile


Combine allele and expression pseudobulks

Description

Combine allele and expression pseudobulks

Usage

combine_bulk(allele_bulk, exp_bulk, gtf)

Arguments

allele_bulk

dataframe Bulk allele profile

exp_bulk

dataframe Bulk expression profile

gtf

dataframe Transcript gtf

Value

dataframe Pseudobulk allele and expression profile


Beta-binomial distribution density function A distribution is beta-binomial if p, the probability of success, in a binomial distribution has a beta distribution with shape parameters alpha > 0 and beta > 0 For more details, see extraDistr::dbbinom

Description

Beta-binomial distribution density function A distribution is beta-binomial if p, the probability of success, in a binomial distribution has a beta distribution with shape parameters alpha > 0 and beta > 0 For more details, see extraDistr::dbbinom

Usage

dbbinom(x, size, alpha = 1, beta = 1, log = FALSE)

Arguments

x

vector of quantiles

size

number of trials (zero or more)

alpha

numeric (default=1)

beta

numeric (default=1)

log

boolean (default=FALSE)

Value

numeric Probability density values

Examples

dbbinom(1, 1, 1, 1)

example allele count dataframe

Description

example allele count dataframe

Usage

df_allele_example

Format

An object of class data.table (inherits from data.frame) with 9957 rows and 11 columns.


Returns the density for the Poisson lognormal distribution with parameters mu and sig

Description

Returns the density for the Poisson lognormal distribution with parameters mu and sig

Usage

dpoilog(x, mu, sig, log = FALSE)

Arguments

x

vector of integers, the observations

mu

mean of lognormal distribution

sig

standard deviation of lognormal distribution

log

boolean Return the log density if TRUE (default=FALSE)

Value

numeric Probability density values

Examples

p = dpoilog(1, 1, 1)

filter for mutually expressed genes

Description

filter for mutually expressed genes

Usage

filter_genes(count_mat, lambdas_ref, gtf, verbose = FALSE)

Arguments

count_mat

dgCMatrix Gene expression counts

lambdas_ref

named numeric vector A reference expression profile

gtf

dataframe Transcript gtf

Value

vector Genes that are kept after filtering


fit gamma maximum likelihood

Description

fit gamma maximum likelihood

Usage

fit_gamma(AD, DP, r = 0.015, start = 20)

Arguments

AD

numeric vector Variant allele depth

DP

numeric vector Total allele depth

r

numeric Degree of allele bias

Value

a fit


Fit MLE of log-normal Poisson model

Description

Fit MLE of log-normal Poisson model

Usage

fit_lnpois_cpp(Y_obs, lambda_ref, d)

Arguments

Y_obs

Vector of observed counts

lambda_ref

Vector of reference rates

d

integer Total depth

Value

NumericVector MLE estimates of mu and sigma


Fit a reference profile from multiple references using constrained least square

Description

Fit a reference profile from multiple references using constrained least square

Usage

fit_ref_sse(Y_obs, lambdas_ref, gtf, min_lambda = 2e-06, verbose = FALSE)

Arguments

Y_obs

vector

lambdas_ref

matrix

gtf

dataframe

Value

fitted expression profile


Forward-backward algorithm for allele HMM

Description

Forward-backward algorithm for allele HMM

Usage

forward_back_allele(hmm)

Arguments

hmm

HMM object; expect variables x (allele depth), d (total depth), logPi (log transition prob matrix), delta (prior for each state), alpha (alpha for each state), beta (beta for each state), states (states), p_s (phase switch probs)

Value

numeric matrix; posterior probabilities

Examples

forward_back_allele(pre_likelihood_hmm)

genome gap regions (hg19)

Description

genome gap regions (hg19)

Usage

gaps_hg19

Format

An object of class data.table (inherits from data.frame) with 28 rows and 3 columns.


genome gap regions (hg38)

Description

genome gap regions (hg38)

Usage

gaps_hg38

Format

An object of class data.table (inherits from data.frame) with 30 rows and 3 columns.


example gene expression counts matrix

Description

example gene expression counts matrix

Usage

gene_counts_example

Format

An object of class matrix (inherits from array) with 1758 rows and 1 columns.


Generate alphabetical postfixes

Description

Generate alphabetical postfixes

Usage

generate_postfix(n)

Arguments

n

vector of integers

Value

vector of alphabetical postfixes


Aggregate into pseudobulk alelle profile

Description

Aggregate into pseudobulk alelle profile

Usage

get_allele_bulk(df_allele, gtf, genetic_map = NULL, nu = 0.5, min_depth = 0)

Arguments

df_allele

dataframe Single-cell allele counts

gtf

dataframe Transcript gtf

genetic_map

dataframe Genetic map

nu

numeric Phase switch rate

min_depth

integer Minimum coverage to filter SNPs

Value

dataframe Pseudobulk allele profile

Examples

bulk_example = get_allele_bulk(
    df_allele = df_allele_example,
    gtf = gtf_hg38)

Make a 2-state allele HMM - no transitions to netural state

Description

Make a 2-state allele HMM - no transitions to netural state

Usage

get_allele_hmm_s2(pAD, DP, R, p_s, theta, gamma = 20, r = 0.015)

Arguments

pAD

integer vector Paternal allele counts

DP

integer vector Total alelle counts

p_s

numeric vector Phase switch probabilities

theta

numeric Haplotype imbalance

gamma

numeric Overdispersion in the allele-specific expression

Value

HMM object


Make a 3-state allele HMM - allow transitions to netural state

Description

Make a 3-state allele HMM - allow transitions to netural state

Usage

get_allele_hmm_s3(pAD, DP, R, p_s, t, theta_min, gamma = 20, r = 0.015)

Arguments

pAD

integer vector Paternal allele counts

DP

integer vector Total alelle counts

p_s

numeric vector Phase switch probabilities

t

numeric Transition probability between haplotype states

theta_min

numeric Haplotype imbalance

gamma

numeric Overdispersion in the allele-specific expression

r

numeric Variant mapping bias

Value

HMM object


Produce combined bulk expression and allele profile

Description

Produce combined bulk expression and allele profile

Usage

get_bulk(
  count_mat,
  lambdas_ref,
  df_allele,
  gtf,
  genetic_map = NULL,
  min_depth = 0,
  nu = 1,
  verbose = TRUE
)

Arguments

count_mat

matrix Gene expression counts

lambdas_ref

matrix Reference expression profiles

df_allele

dataframe Allele counts

gtf

dataframe Transcript gtf

genetic_map

dataframe Genetic map

min_depth

integer Minimum coverage to filter SNPs

nu

numeric Phase switch rate

verbose

logical Whether to print progress

Value

dataframe Pseudobulk gene expression and allele profile

Examples

bulk_example = get_bulk(
    count_mat = gene_counts_example,
    lambdas_ref = ref_hca,
    df_allele = df_allele_example,
    gtf = gtf_hg38)

Aggregate into bulk expression profile

Description

Aggregate into bulk expression profile

Usage

get_exp_bulk(count_mat, lambdas_ref, gtf, verbose = FALSE)

Arguments

count_mat

dgCMatrix Gene expression counts

lambdas_ref

matrix Reference expression profiles

gtf

dataframe Transcript gtf

Value

dataframe Pseudobulk gene expression profile


Helper function to calculate transition porbabilities for 15-state joint HMM cn/phase are sclars, only p_s is vectorized

Description

Helper function to calculate transition porbabilities for 15-state joint HMM cn/phase are sclars, only p_s is vectorized

Usage

get_trans_probs_s15(t, p_s, w, cn_from, phase_from, cn_to, phase_to)

Arguments

t

numeric; CNV state transition probability

p_s

numeric; phase switch probability

w

numeric; relative abundance of states

cn_from

character; originating CNV state

phase_from

character; originating haplotype phase state

cn_to

character; destination CNV state

phase_to

character; destination haplotype phase state

Value

numeric; transition probability


Helper function to calculate transition porbabilities for 7-state joint HMM cn/phase are sclars, only p_s is vectorized

Description

Helper function to calculate transition porbabilities for 7-state joint HMM cn/phase are sclars, only p_s is vectorized

Usage

get_trans_probs_s7(t, p_s, w, cn_from, phase_from, cn_to, phase_to)

Arguments

t

numeric; CNV state transition probability

p_s

numeric; phase switch probability

w

numeric; relative abundance of states

cn_from

character; originating CNV state

phase_from

character; originating haplotype phase state

cn_to

character; destination CNV state

phase_to

character; destination haplotype phase state

Value

numeric; transition probability


gene model (hg19)

Description

gene model (hg19)

Usage

gtf_hg19

Format

An object of class data.table (inherits from data.frame) with 26841 rows and 5 columns.


gene model (hg38)

Description

gene model (hg38)

Usage

gtf_hg38

Format

An object of class data.table (inherits from data.frame) with 26807 rows and 5 columns.


gene model (mm10)

Description

gene model (mm10)

Usage

gtf_mm10

Format

An object of class data.table (inherits from data.frame) with 30336 rows and 5 columns.


calculate joint likelihood of allele data

Description

calculate joint likelihood of allele data

Usage

l_bbinom(AD, DP, alpha, beta)

Arguments

AD

numeric vector Variant allele depth

DP

numeric vector Total allele depth

alpha

numeric Alpha parameter of Beta-Binomial distribution

beta

numeric Beta parameter of Beta-Binomial distribution

Value

numeric Joint log likelihood

Examples

l_bbinom(c(1, 2), c(1, 2), 1, 1)

calculate joint likelihood of a PLN model

Description

calculate joint likelihood of a PLN model

Usage

l_lnpois(Y_obs, lambda_ref, d, mu, sig, phi = 1)

Arguments

Y_obs

numeric vector Gene expression counts

lambda_ref

numeric vector Reference expression levels

d

numeric Total library size

mu

numeric Global mean expression

sig

numeric Global standard deviation of expression

phi

numeric Fold change of expression

Value

numeric Joint log likelihood

Examples

l_lnpois(c(1, 2), c(1, 2), 1, 1, 1)

Only compute total log likelihood from an allele HMM

Description

Only compute total log likelihood from an allele HMM

Usage

likelihood_allele(hmm)

Arguments

hmm

HMM object; expect variables x (allele depth), d (total depth), logPi (log transition prob matrix), delta (prior for each state), alpha (alpha for each state), beta (beta for each state), states (states), p_s (phase switch probs)

Value

numeric; total log likelihood

Examples

likelihood_allele(pre_likelihood_hmm)

logSumExp function

Description

logSumExp function

Usage

logSumExp(x)

Arguments

x

NumericVector

Value

double logSumExp of x


Plot a group of pseudobulk HMM profiles

Description

Plot a group of pseudobulk HMM profiles

Usage

plot_bulks(bulks, ..., ncol = 1, title = TRUE, title_size = 8)

Arguments

bulks

dataframe Pseudobulk profiles annotated with "sample" column

...

additional parameters passed to plot_psbulk()

ncol

integer Number of columns

title

logical Whether to add titles to individual plots

title_size

numeric Size of titles

Value

a ggplot object

Examples

p = plot_bulks(bulk_example)

Plot a pseudobulk HMM profile

Description

Plot a pseudobulk HMM profile

Usage

plot_psbulk(
  bulk,
  use_pos = TRUE,
  allele_only = FALSE,
  min_LLR = 5,
  min_depth = 8,
  exp_limit = 2,
  phi_mle = TRUE,
  theta_roll = FALSE,
  dot_size = 0.8,
  dot_alpha = 0.5,
  legend = TRUE,
  exclude_gap = TRUE,
  genome = "hg38",
  text_size = 10,
  raster = FALSE
)

Arguments

bulk

dataframe Pseudobulk profile

use_pos

logical Use marker position instead of index as x coordinate

allele_only

logical Only plot alleles

min_LLR

numeric LLR threshold for event filtering

min_depth

numeric Minimum coverage depth for a SNP to be plotted

exp_limit

numeric Expression logFC axis limit

phi_mle

logical Whether to plot estimates of segmental expression fold change

theta_roll

logical Whether to plot rolling estimates of allele imbalance

dot_size

numeric Size of marker dots

dot_alpha

numeric Transparency of the marker dots

legend

logical Whether to show legend

exclude_gap

logical Whether to mark gap regions and centromeres

genome

character Genome build, either 'hg38' or 'hg19'

text_size

numeric Size of text in the plot

raster

logical Whether to raster images

Value

ggplot Plot of pseudobulk HMM profile

Examples

p = plot_psbulk(bulk_example)

HMM object for unit tests

Description

HMM object for unit tests

Usage

pre_likelihood_hmm

Format

An object of class list of length 10.


reference expression magnitudes from HCA

Description

reference expression magnitudes from HCA

Usage

ref_hca

Format

An object of class matrix (inherits from array) with 24756 rows and 12 columns.


reference expression counts from HCA

Description

reference expression counts from HCA

Usage

ref_hca_counts

Format

An object of class matrix (inherits from array) with 24857 rows and 12 columns.


Run a 3-state allele HMM

Description

Run a 3-state allele HMM

Usage

run_allele_hmm_s3(
  pAD,
  DP,
  R,
  p_s,
  t = 1e-05,
  theta_min = 0.08,
  gamma = 20,
  r = 0.015
)

Arguments

pAD

integer vector Paternal allele counts

DP

integer vector Total alelle counts

R

numeric vector Variant mapping bias direction

p_s

numeric vector Phase switch probabilities

t

numeric Transition probability between copy number states

theta_min

numeric Minimum haplotype frequency deviation threshold

gamma

numeric Overdispersion in the allele-specific expression

Value

character vector Decoded states


Run a 5-state allele-only HMM - two theta levels

Description

Run a 5-state allele-only HMM - two theta levels

Usage

run_allele_hmm_s5(
  pAD,
  DP,
  p_s,
  t = 1e-05,
  theta_min = 0.08,
  gamma = 20,
  prior = NULL,
  ...
)

Arguments

pAD

integer vector Paternal allele counts

DP

integer vector Total alelle counts

p_s

numeric vector Phase switch probabilities

t

numeric Transition probability between copy number states

theta_min

numeric Minimum haplotype frequency deviation threshold

gamma

numeric Overdispersion in the allele-specific expression

prior

numeric vector Prior probabilities for each state

...

Additional parameters

Value

character vector Decoded states

Examples

with(bulk_example, {
    run_allele_hmm_s5(pAD = pAD, DP = DP, R = R, p_s = p_s, theta_min = 0.08, gamma = 30)
})

Run 15-state joint HMM on a pseudobulk profile

Description

Run 15-state joint HMM on a pseudobulk profile

Usage

run_joint_hmm_s15(
  pAD,
  DP,
  p_s,
  Y_obs = 0,
  lambda_ref = 0,
  d_total = 0,
  theta_min = 0.08,
  theta_neu = 0,
  bal_cnv = TRUE,
  phi_del = 2^(-0.25),
  phi_amp = 2^(0.25),
  phi_bamp = phi_amp,
  phi_bdel = phi_del,
  mu = 0,
  sig = 1,
  r = 0.015,
  t = 1e-05,
  gamma = 18,
  prior = NULL,
  exp_only = FALSE,
  allele_only = FALSE,
  classify_allele = FALSE,
  debug = FALSE,
  ...
)

Arguments

pAD

integer vector Paternal allele counts

DP

integer vector Total alelle counts

p_s

numeric vector Phase switch probabilities

Y_obs

numeric vector Observed gene counts

lambda_ref

numeric vector Reference expression rates

d_total

integer Total library size for expression counts

theta_min

numeric Minimum haplotype imbalance threshold

theta_neu

numeric Haplotype imbalance threshold for neutral state

bal_cnv

logical Whether to include balanced CNV states

phi_del

numeric Expected fold change for deletion

phi_amp

numeric Expected fold change for amplification

phi_bamp

numeric Expected fold change for balanced amplification

phi_bdel

numeric Expected fold change for balanced deletion

mu

numeric Global expression bias

sig

numeric Global expression variance

r

numeric Variant mapping bias

t

numeric Transition probability between copy number states

gamma

numeric Overdispersion in the allele-specific expression

prior

numeric vector Prior probabilities for each state

exp_only

logical Whether to only use expression data

allele_only

logical Whether to only use allele data

classify_allele

logical Whether to classify allele states

debug

logical Whether to print debug messages

...

Additional parameters

Value

character vector Decoded states

Examples

with(bulk_example, {
    run_joint_hmm_s15(pAD = pAD, DP = DP, p_s = p_s, Y_obs = Y_obs, lambda_ref = lambda_ref, 
    d_total = na.omit(unique(d_obs)), mu = mu, sig = sig, t = 1e-5, gamma = 30, theta_min = 0.08)
})

Run 7-state joint HMM on a pseudobulk profile

Description

Run 7-state joint HMM on a pseudobulk profile

Usage

run_joint_hmm_s7(
  pAD,
  DP,
  R,
  p_s,
  Y_obs,
  lambda_ref,
  d_total,
  theta_min = 0.08,
  phi_del = 2^(-0.25),
  phi_amp = 2^(0.25),
  t = 1e-05,
  mu = 0,
  sig = 1,
  gamma = 20,
  r = 0.015,
  debug = FALSE
)

Arguments

pAD

integer vector Paternal allele counts

DP

integer vector Total alelle counts

R

numeric vector Variant mapping bias direction

p_s

numeric vector Phase switch probabilities

Y_obs

numeric vector Observed gene counts

lambda_ref

numeric vector Reference expression rates

d_total

integer Total library size for expression counts

theta_min

numeric Minimum haplotype imbalance threshold

phi_del

numeric Expected fold change for deletion

phi_amp

numeric Expected fold change for amplification

t

numeric Transition probability between copy number states

mu

numeric Global expression bias

sig

numeric Global expression variance

gamma

numeric Overdispersion in the allele-specific expression

r

numeric Variant mapping bias

debug

logical Whether to print debug messages

Value

character vector Decoded states


example CNV segments dataframe

Description

example CNV segments dataframe

Usage

segs_example

Format

An object of class data.table (inherits from data.frame) with 27 rows and 30 columns.


Smooth the segments after HMM decoding

Description

Smooth the segments after HMM decoding

Usage

smooth_segs(bulk, min_genes = 10)

Arguments

bulk

dataframe Pseudobulk profile

min_genes

integer Minimum number of genes to call a segment

Value

dataframe Pseudobulk profile


predict phase switch probablity as a function of genetic distance

Description

predict phase switch probablity as a function of genetic distance

Usage

switch_prob_cm(d, nu = 1, min_p = 1e-10)

Arguments

d

numeric vector Genetic distance in cM

nu

numeric Phase switch rate

min_p

numeric Minimum phase switch probability

Value

numeric vector Phase switch probability


Estimate of imbalance level theta in a segment

Description

Estimate of imbalance level theta in a segment

Usage

theta_hat_seg(major_count, minor_count)

Arguments

major_count

vector of major allele count

minor_count

vector of minor allele count

Value

estimate of theta


example VCF header

Description

example VCF header

Usage

vcf_meta

Format

An object of class character of length 65.


Viterbi algorithm for allele HMM

Description

Viterbi algorithm for allele HMM

Usage

viterbi_allele(hmm)

Arguments

hmm

HMM object; expect variables x (allele depth), d (total depth), logPi (log transition prob matrix), delta (prior for each state), alpha (alpha for each state), beta (beta for each state), states (states), p_s (phase switch probs)

Value

character vector Decoded states


Viterbi algorithm for joint HMM, can only handle one set of observations

Description

Viterbi algorithm for joint HMM, can only handle one set of observations

Usage

viterbi_joint(hmm)

Arguments

hmm

HMM object; expect variables x (allele depth), d (total depth), y (expression count), l (cell total library size), lambda (reference expression rate), mu (global expression mean), sig (global expression standard deviation), logPi (log transition prob matrix), phi (expression fold change for each state), delta (prior for each state), alpha (alpha for each state), beta (beta for each state), states (states), p_s (phase switch probs)

Value

character vector Decoded states


Generalized viterbi algorithm for joint HMM, can handle multiple sets of observations

Description

Generalized viterbi algorithm for joint HMM, can handle multiple sets of observations

Usage

viterbi_joint_mat(hmm)

Arguments

hmm

HMM object; expect variables x (allele depth), d (total depth), y (expression count), l (cell total library size), lambda (reference expression rate), mu (global expression mean), sig (global expression standard deviation), logPi (log transition prob matrix), phi (expression fold change for each state), delta (prior for each state), alpha (alpha for each state), beta (beta for each state), states (states), p_s (phase switch probs)

Value

character vector; decoded states