Title: | Bioinformatics Library for Integrated Tools |
Version: | 0.2.0 |
Description: | An all-encompassing R toolkit designed to streamline the process of calling various bioinformatics software and then performing data analysis and visualization in R. With 'blit', users can easily integrate a wide array of bioinformatics command line tools into their workflows, leveraging the power of R for sophisticated data manipulation and graphical representation. |
License: | GPL (≥ 3) |
Imports: | cli, processx, R6 (≥ 2.4.0), rlang (≥ 1.1.0), utils |
Suggests: | testthat (≥ 3.0.0) |
Config/testthat/edition: | 3 |
Encoding: | UTF-8 |
RoxygenNote: | 7.3.2 |
URL: | https://github.com/WangLabCSU/blit |
BugReports: | https://github.com/WangLabCSU/blit/issues |
NeedsCompilation: | no |
Packaged: | 2025-03-29 11:40:40 UTC; yun |
Author: | Yun Peng |
Maintainer: | Yun Peng <yunyunp96@163.com> |
Repository: | CRAN |
Date/Publication: | 2025-03-29 12:00:02 UTC |
blit: Bioinformatics Library for Integrated Tools
Description
An all-encompassing R toolkit designed to streamline the process of calling various bioinformatics software and then performing data analysis and visualization in R. With 'blit', users can easily integrate a wide array of bioinformatics command line tools into their workflows, leveraging the power of R for sophisticated data manipulation and graphical representation.
Author(s)
Maintainer: Yun Peng yunyunp96@163.com (ORCID)
Authors:
Shixiang Wang w_shixiang@163.com (ORCID)
Other contributors:
Jennifer Lu jennifer.lu717@gmail.com (Author of the included scripts from Kraken2 and KrakenTools libraries) [copyright holder]
Li Song Li.Song@dartmouth.edu (Author of included scripts from TRUST4 library) [copyright holder]
X. Shirley Liu xsliu@ds.dfci.harvard.edu (Author of included scripts from TRUST4 library) [copyright holder]
See Also
Useful links:
https://github.com/WangLabCSU/blit
Report bugs at https://github.com/WangLabCSU/blit/issues
R6 Class to prepare command parameters.
Description
Command is an R6 class used by developers to create new commands. It should not be used by end users.
Methods
Public methods
Method new()
Create a new Command object.
Usage
Command$new(...)
Arguments
...
Additional arguments passed to the command.
Method build_command()
Build the command line
Usage
Command$build_command(help = FALSE, verbose = TRUE)
Arguments
help
A boolean value indicating whether to build parameters for help document or not.
verbose
A boolean value indicating whether the command execution should be verbose.
envir
An environment used to execute the command.
Returns
An atomic character vector combining the command and its parameters.
Method get_on_start()
Get the command startup code
Usage
Command$get_on_start()
Returns
A list of quosures.
Method get_on_exit()
Get the command exit code
Usage
Command$get_on_exit()
Returns
A list of quosures.
Method get_on_fail()
Get the command failure code
Usage
Command$get_on_fail()
Returns
A list of quosures.
Method get_on_succeed()
Get the command success code
Usage
Command$get_on_succeed()
Returns
A list of quosures.
Method print()
Build parameters to run command.
Usage
Command$print(indent = NULL)
Arguments
indent
A single integer giving the number of spaces to indent.
Returns
The object itself.
Method clone()
The objects of this class are cloneable with this method.
Usage
Command$clone(deep = FALSE)
Arguments
deep
Whether to make a deep clone.
See Also
make_command
Run alleleCount
Description
The alleleCount program primarily exists to prevent code duplication between some other projects, specifically AscatNGS and Battenberg.
Usage
allele_counter(
hts_file,
loci_file,
ofile,
...,
odir = getwd(),
alleleCounter = NULL
)
Arguments
hts_file |
A string of path to sample HTS file. |
loci_file |
A string of path to loci file. |
ofile |
A string of path to the output file. |
... |
<dynamic dots> Additional arguments passed to |
odir |
A string of path to the output directory. |
alleleCounter |
A string of path to |
Value
A command object.
See Also
Other commands: cellranger(), conda(), fastq_pair(), gistic2(), kraken2(), kraken_tools(), perl(), pyscenic(), python(), samtools(), seqkit(), trust4()
Manage Environment with micromamba
Description
Manage Environment with micromamba
Usage
appmamba(...)
install_appmamba(force = FALSE)
uninstall_appmamba()
appmamba_rc(edit = FALSE)
Arguments
... |
<dynamic dots> Additional arguments passed to
|
force |
A logical value indicating whether to reinstall |
edit |
A logical value indicating whether to open the config file for editing. |
Functions
- appmamba(): blit utilizes micromamba to manage environments. This function simply executes the specified micromamba command.
- install_appmamba(): Install appmamba (micromamba).
- uninstall_appmamba(): Remove appmamba (micromamba).
- appmamba_rc(): Get the run commands config file of micromamba.
Examples
install_appmamba()
appmamba()
appmamba("env", "list")
# uninstall_appmamba() # Uninstall the `micromamba`
Deliver arguments of command
Description
arg() is intended for user use, while arg0() is for developers and does not perform argument validation.
Usage
arg(tag, value, indicator = FALSE, lgl2int = FALSE, format = "%s", sep = " ")
arg0(
tag,
value,
indicator = FALSE,
lgl2int = FALSE,
format = "%s",
sep = " ",
allow_null = FALSE,
arg = caller_arg(value),
call = caller_call()
)
Arguments
tag |
A string specifying argument tag, like "-i", "-o". |
value |
Value passed to the argument. |
indicator |
A logical value specifying whether value should be an
indicator of tag. If |
lgl2int |
A logical value indicating whether to transform value |
format |
The format of the value, details see |
sep |
A character string used to separate |
allow_null |
A single logical value indicates whether |
arg |
An argument name as a string. This argument will be mentioned in error messages as the input that is at the origin of a problem. |
call |
The execution environment of a currently running function. |
Value
A string.
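As a rough illustration of the arguments described above, the following sketch builds two arguments with arg(). It assumes the blit package is installed; the tags `-o` and `--verbose` and the file name `out.txt` are placeholders chosen for illustration, not options of any particular tool.

```r
# Sketch only: runs the blit calls only when the package is available.
out_arg <- NULL
flag_arg <- NULL
if (requireNamespace("blit", quietly = TRUE)) {
  # A tag followed by its value, using the default format ("%s") and sep (" ")
  out_arg <- blit::arg("-o", "out.txt")
  # With indicator = TRUE, the logical value is treated as an indicator of the tag
  flag_arg <- blit::arg("--verbose", TRUE, indicator = TRUE)
  print(out_arg)
  print(flag_arg)
}
```

The resulting values can then be supplied through the `...` of a command function such as exec().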
Run cellranger
Description
Run cellranger
Usage
cellranger(subcmd = NULL, ..., cellranger = NULL)
Arguments
subcmd |
Sub-Command of cellranger. |
... |
<dynamic dots> Additional arguments passed to |
cellranger |
A string of path to |
Value
A command object.
See Also
Other commands: allele_counter(), conda(), fastq_pair(), gistic2(), kraken2(), kraken_tools(), perl(), pyscenic(), python(), samtools(), seqkit(), trust4()
Schedule expressions to run
Description
Schedule expressions to run
Usage
cmd_on_start(command, ...)
cmd_on_exit(command, ...)
cmd_on_fail(command, ...)
cmd_on_succeed(command, ...)
Arguments
command |
A |
... |
The expressions input will be captured with
|
Value
- cmd_on_start: The command object itself, with the start code updated.
- cmd_on_exit: The command object itself, with the exit code updated.
- cmd_on_fail: The command object itself, with the failure code updated.
- cmd_on_succeed: The command object itself, with the success code updated.
Functions
- cmd_on_start(): define the startup code of the command.
- cmd_on_exit(): define the exit code of the command.
- cmd_on_fail(): define the failure code of the command.
- cmd_on_succeed(): define the success code of the command.
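The four scheduling helpers can be chained on one command object. The sketch below is a minimal illustration under two assumptions: the blit package is installed, and a Unix-like `echo` is available. The messages are arbitrary examples.

```r
# Sketch only: attach expressions to the different stages of a command.
status <- NULL
if (requireNamespace("blit", quietly = TRUE) && .Platform$OS.type == "unix") {
  cmd <- blit::exec("echo", "hello")
  cmd <- blit::cmd_on_start(cmd, message("starting"))
  cmd <- blit::cmd_on_succeed(cmd, message("finished without error"))
  cmd <- blit::cmd_on_fail(cmd, message("command failed"))
  cmd <- blit::cmd_on_exit(cmd, message("cleaning up"))
  # Each helper returns the updated command object, so the result can be run directly
  status <- blit::cmd_run(cmd)
}
```

Because every helper returns the command object itself, the same steps can also be written as a pipeline.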
Execute a list of commands
Description
Execute a list of commands
Usage
cmd_parallel(
...,
stdouts = FALSE,
stderrs = FALSE,
stdins = NULL,
stdout_callbacks = NULL,
stderr_callbacks = NULL,
timeouts = NULL,
threads = NULL,
verbose = TRUE
)
Arguments
... |
A list of |
stdouts , stderrs |
Specifies how the output/error streams of the child process are handled. One of or a list of following values:
For When a single file path is specified, the stdout/stderr of all commands will be merged into this single file. |
stdins |
should the input be diverted? One of or a list of following values:
|
stdout_callbacks , stderr_callbacks |
One of or a list of following values:
|
timeouts |
Timeout in seconds. Can be a single value or a list, specifying the maximum elapsed time for running the command in the separate process. |
threads |
Number of threads to use. |
verbose |
A single boolean value indicating whether the command execution should be verbose. |
Value
A list of exit statuses, invisibly.
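A minimal sketch of running two commands side by side is shown here. It assumes the blit package is installed, that `echo` exists on a Unix-like system, and that command objects are passed individually through `...`.

```r
# Sketch only: run two small commands in parallel with two threads.
statuses <- NULL
if (requireNamespace("blit", quietly = TRUE) && .Platform$OS.type == "unix") {
  first <- blit::exec("echo", "first")
  second <- blit::exec("echo", "second")
  # The result is returned invisibly, so assign it to inspect the exit statuses
  statuses <- blit::cmd_parallel(first, second, threads = 2)
  print(statuses)
}
```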
Execute command
Description
- cmd_run: Run the command.
- cmd_help: Print the help document for this command.
- cmd_background: Run the command in the background. This function is provided for completeness. Instead of using this function, we recommend using cmd_parallel(), which can run multiple commands in the background while ensuring that all processes are properly cleaned up when the process exits.
Usage
cmd_run(
command,
stdout = TRUE,
stderr = TRUE,
stdin = TRUE,
stdout_callback = NULL,
stderr_callback = NULL,
timeout = NULL,
spinner = FALSE,
verbose = TRUE
)
cmd_help(
command,
stdout = TRUE,
stderr = TRUE,
stdout_callback = NULL,
stderr_callback = NULL,
verbose = TRUE
)
cmd_background(
command,
stdout = FALSE,
stderr = FALSE,
stdin = NULL,
verbose = TRUE
)
Arguments
command |
A |
stdout , stderr |
Specifies how the output/error streams of the child process are handled. Possible values include:
For For For When using a |
stdin |
should the input be diverted? Possible values include:
|
stdout_callback , stderr_callback |
Possible values include:
|
timeout |
Timeout in seconds. This is a limit for the elapsed time running command in the separate process. |
spinner |
Whether to show a reassuring spinner while the process is running. |
verbose |
A single boolean value indicating whether the command execution should be verbose. |
Value
- cmd_run: Exit status, invisibly.
- cmd_help: The input command, invisibly.
- cmd_background: A process object.
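Since cmd_run() returns the exit status invisibly, the value has to be assigned to be inspected. The sketch below assumes the blit package is installed and that `sleep` is available on a Unix-like system; the 10 second timeout is an arbitrary choice.

```r
# Sketch only: run a short command with a timeout and keep its exit status.
status <- NULL
if (requireNamespace("blit", quietly = TRUE) && .Platform$OS.type == "unix") {
  cmd <- blit::exec("sleep", "1")
  # timeout limits the elapsed time of the separate process (in seconds)
  status <- blit::cmd_run(cmd, timeout = 10, verbose = FALSE)
  print(status)
}
```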
Setup the context for the command
Description
Setup the context for the command
Usage
cmd_wd(command, wd = NULL)
cmd_envvar(command, ..., action = "replace", sep = NULL)
cmd_envpath(command, ..., action = "prefix", name = "PATH")
cmd_conda(command, ..., root = NULL, action = "prefix")
Arguments
command |
A |
wd |
A string or |
... |
<dynamic dots>:
|
action |
Should the new values |
sep |
A string to separate new and old value when |
name |
A string define the PATH environment variable name. You
can use this to define other |
root |
A string specifying the path to the conda root prefix. By
default, it utilizes the environment variable
|
Value
- cmd_wd: The command object itself, with working directory updated.
- cmd_envvar: The command object itself, with running environment variable updated.
- cmd_envpath: The command object itself, with running environment variable specified in name updated.
- cmd_conda: The command object itself, with running environment variable PATH updated.
Functions
- cmd_wd(): define the working directory.
- cmd_envvar(): define the environment variables.
- cmd_envpath(): define the PATH-like environment variables.
- cmd_conda(): Set conda-like path to the PATH environment variables.
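The context helpers can be combined before a command is run. The following sketch assumes the blit package is installed, that `printenv` exists on a Unix-like system, and that cmd_envvar() takes named values in `...`; the variable name `BLIT_DEMO_GREETING` is made up for this example.

```r
# Sketch only: set an environment variable and a working directory, then run.
status <- NULL
if (requireNamespace("blit", quietly = TRUE) && .Platform$OS.type == "unix") {
  cmd <- blit::exec("printenv", "BLIT_DEMO_GREETING")
  # Assumption: environment variables are given as name = value pairs
  cmd <- blit::cmd_envvar(cmd, BLIT_DEMO_GREETING = "hello")
  # Run the command from the session's temporary directory
  cmd <- blit::cmd_wd(cmd, tempdir())
  status <- blit::cmd_run(cmd)
}
```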
Run conda
Description
Run conda
Usage
conda(subcmd = NULL, ..., conda = NULL)
Arguments
subcmd |
Sub-Command of conda. |
... |
<dynamic dots> Additional arguments passed to |
conda |
A string of path to |
Value
A command object.
See Also
Other commands: allele_counter(), cellranger(), fastq_pair(), gistic2(), kraken2(), kraken_tools(), perl(), pyscenic(), python(), samtools(), seqkit(), trust4()
Invoke a System Command
Description
Invoke a System Command
Usage
exec(cmd, ...)
Arguments
cmd |
Command to be invoked, as a character string. |
... |
<dynamic dots> Additional arguments passed to |
Value
A command object.
Examples
cmd_run(exec("echo", "$PATH"))
FASTQ PAIR
Description
Rewrite paired end fastq files to make sure that all reads have a mate and to separate out singletons.
Usually when you get paired end read files you have two files with a /1 sequence in one and a /2 sequence in the other (or a /f and /r or just two reads with the same ID). However, often when working with files from a third party source (e.g. the SRA) there are different numbers of reads in each file (because some reads fail QC). Spades, bowtie2 and other tools break because they demand paired end files have the same number of reads.
Usage
fastq_pair(
fq1,
fq2,
...,
hash_table_size = NULL,
max_hash_table_size = NULL,
fastq_pair = NULL
)
fastq_read_pair(fastq_files)
Arguments
fq1 , fq2 |
A string of fastq file path. |
... |
<dynamic dots> Additional arguments passed to |
hash_table_size |
Size of hash table to use. |
max_hash_table_size |
Maximal hash table size to use. |
fastq_pair |
A string of path to |
fastq_files |
A character vector of fastq file paths. |
Value
A command object.
See Also
Other commands: allele_counter(), cellranger(), conda(), gistic2(), kraken2(), kraken_tools(), perl(), pyscenic(), python(), samtools(), seqkit(), trust4()
Run GISTIC2
Description
The GISTIC module identifies regions of the genome that are significantly amplified or deleted across a set of samples. Each aberration is assigned a G-score that considers the amplitude of the aberration as well as the frequency of its occurrence across samples. False Discovery Rate q-values are then calculated for the aberrant regions, and regions with q-values below a user-defined threshold are considered significant. For each significant region, a "peak region" is identified, which is the part of the aberrant region with greatest amplitude and frequency of alteration. In addition, a "wide peak" is determined using a leave-one-out algorithm to allow for errors in the boundaries in a single sample. The "wide peak" boundaries are more robust for identifying the most likely gene targets in the region. Each significantly aberrant region is also tested to determine whether it results primarily from broad events (longer than half a chromosome arm), focal events, or significant levels of both. The GISTIC module reports the genomic locations and calculated q-values for the aberrant regions. It identifies the samples that exhibit each significant amplification or deletion, and it lists genes found in each "wide peak" region.
Usage
gistic2(seg, refgene, ..., odir = getwd(), gistic2 = NULL)
Arguments
seg |
A data.frame of segmented data. |
refgene |
Path to reference genome data input file (REQUIRED, see below for file description). |
... |
<dynamic dots> Additional arguments passed to |
odir |
A string of path to the output directory. |
gistic2 |
A string of path to |
Value
A command object.
See Also
Other commands: allele_counter(), cellranger(), conda(), fastq_pair(), kraken2(), kraken_tools(), perl(), pyscenic(), python(), samtools(), seqkit(), trust4()
Running Kraken2
Description
Kraken is a taxonomic sequence classifier that assigns taxonomic labels to DNA sequences. Kraken examines the k-mers within a query sequence and uses the information within those k-mers to query a database. That database maps k-mers to the lowest common ancestor (LCA) of all genomes known to contain a given k-mer.
Usage
kraken2(
fq1,
...,
fq2 = NULL,
ofile = "kraken_output.txt",
report = "kraken_report.txt",
classified_out = NULL,
unclassified_out = NULL,
odir = getwd(),
kraken2 = NULL
)
Arguments
fq1 , fq2 |
A string of fastq file path. |
... |
<dynamic dots> Additional arguments passed to |
ofile |
A string of path to save kraken2 output. |
report |
A string of path to save kraken2 report. |
classified_out |
A string of path to save classified sequences, which should be a fastq file. |
unclassified_out |
A string of path to save unclassified sequences, which should be a fastq file. |
odir |
A string of path to the output directory. |
kraken2 |
A string of path to |
Value
A command object.
See Also
Other commands: allele_counter(), cellranger(), conda(), fastq_pair(), gistic2(), kraken_tools(), perl(), pyscenic(), python(), samtools(), seqkit(), trust4()
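As a hedged sketch of how kraken2() might be wired into a workflow, the helper below wraps the call in a function so that nothing is executed until real files are supplied. The helper name `classify_reads`, its `db` parameter, and the file names in the comment are hypothetical; `--db` is the Kraken 2 option for the database location and is forwarded through `...`. The blit package and a working kraken2 installation are assumed.

```r
# Sketch only: a hypothetical wrapper; defining it does not run kraken2.
classify_reads <- function(fq1, db, fq2 = NULL, odir = getwd()) {
  cmd <- blit::kraken2(
    fq1,
    "--db", db,   # extra arguments passed on to the kraken2 command line
    fq2 = fq2,
    odir = odir   # ofile and report keep their documented defaults
  )
  blit::cmd_run(cmd)
}

# Example call with placeholder paths (not run):
# classify_reads("sample_R1.fastq", db = "path/to/kraken_db", fq2 = "sample_R2.fastq")
```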
KrakenTools is a suite of scripts to be used alongside the Kraken, KrakenUniq, Kraken 2, or Bracken programs.
Description
These scripts are designed to help Kraken users with downstream analysis of Kraken results.
Usage
kraken_tools(script, ..., python = NULL)
Arguments
script |
Name of the kraken2 script. One of
|
... |
<dynamic dots> Additional arguments passed to |
python |
A string of path to |
Value
A command object.
See Also
Other commands: allele_counter(), cellranger(), conda(), fastq_pair(), gistic2(), kraken2(), perl(), pyscenic(), python(), samtools(), seqkit(), trust4()
Helper function to create new command.
Description
make_command is a helper function used by developers to create a function for a new Command object. It should not be used by end users.
Usage
make_command(name, fun, envir = caller_env())
Arguments
name |
A string of the function name. |
fun |
A function used to initialize the |
envir |
An environment used to bind the created function. |
Value
A function.
Perl is a highly capable, feature-rich programming language with over 36 years of development.
Description
Perl is a highly capable, feature-rich programming language with over 36 years of development.
Usage
perl(..., perl = NULL)
Arguments
... |
<dynamic dots> Additional arguments passed to |
perl |
A string of path to |
Value
A command object.
See Also
Other commands: allele_counter(), cellranger(), conda(), fastq_pair(), gistic2(), kraken2(), kraken_tools(), pyscenic(), python(), samtools(), seqkit(), trust4()
Run pyscenic
Description
Run pyscenic
Usage
pyscenic(subcmd = NULL, ..., pyscenic = NULL)
Arguments
subcmd |
Sub-Command of pyscenic. |
... |
<dynamic dots> Additional arguments passed to |
pyscenic |
A string of path to |
Value
A command object.
See Also
Other commands: allele_counter(), cellranger(), conda(), fastq_pair(), gistic2(), kraken2(), kraken_tools(), perl(), python(), samtools(), seqkit(), trust4()
Python is a programming language that lets you work quickly and integrate systems more effectively.
Description
Python is a programming language that lets you work quickly and integrate systems more effectively.
Usage
python(..., python = NULL)
Arguments
... |
<dynamic dots> Additional arguments passed to |
python |
A string of path to |
Value
A command object.
See Also
Other commands: allele_counter(), cellranger(), conda(), fastq_pair(), gistic2(), kraken2(), kraken_tools(), perl(), pyscenic(), samtools(), seqkit(), trust4()
Run samtools
Description
Run samtools
Usage
samtools(subcmd = NULL, ..., samtools = NULL)
Arguments
subcmd |
Sub-Command of samtools. Details see: |
... |
<dynamic dots> Additional arguments passed to |
samtools |
A string of path to |
Value
A command object.
See Also
Other commands: allele_counter(), cellranger(), conda(), fastq_pair(), gistic2(), kraken2(), kraken_tools(), perl(), pyscenic(), python(), seqkit(), trust4()
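The sub-command interface can be sketched as follows. The helper `bam_flagstat` and the file name in the comment are hypothetical; `flagstat` is a standard samtools sub-command. The blit package and a samtools installation are assumed, and the helper is only defined here, not called.

```r
# Sketch only: a hypothetical wrapper around a samtools sub-command.
bam_flagstat <- function(bam) {
  # subcmd comes first; remaining arguments are passed on through `...`
  cmd <- blit::samtools("flagstat", bam)
  blit::cmd_run(cmd)
}

# Example call with a placeholder path (not run):
# bam_flagstat("sample.bam")
```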
Run seqkit
Description
Run seqkit
Usage
seqkit(subcmd = NULL, ..., seqkit = NULL)
Arguments
subcmd |
Sub-Command of seqkit. |
... |
<dynamic dots> Additional arguments passed to |
seqkit |
A string of path to |
Value
A command object.
See Also
Other commands: allele_counter(), cellranger(), conda(), fastq_pair(), gistic2(), kraken2(), kraken_tools(), perl(), pyscenic(), python(), samtools(), trust4()
TRUST4: immune repertoire reconstruction from bulk and single-cell RNA-seq data
Description
TRUST4: immune repertoire reconstruction from bulk and single-cell RNA-seq data
Usage
trust4(
file1,
ref_coordinate,
...,
file2 = NULL,
mode = NULL,
ref_annot = NULL,
ofile = NULL,
odir = getwd(),
trust4 = NULL
)
trust4_imgt_annot(
species = "Homo_sapien",
...,
ofile = "IMGT+C.fa",
odir = getwd(),
perl = NULL
)
trust4_gene_names(imgt_annot, ofile = "bcr_tcr_gene_name.txt", odir = getwd())
Arguments
file1 |
Path to bam file or fastq file. |
ref_coordinate |
Path to the fasta file coordinate and sequence of V/D/J/C genes. |
... |
|
file2 |
Path to the second paired-end read fastq file, only used for
|
mode |
One of "bam" or "fastq". If |
ref_annot |
Path to detailed V/D/J/C gene reference file, such as from IMGT database. (default: not used). (recommended). |
ofile |
|
odir |
A string of path to the output directory. |
trust4 |
A string of path to |
species |
Species to extract IMGT annotation, details see https://www.imgt.org//download/V-QUEST/IMGT_V-QUEST_reference_directory/. |
perl |
A string of path to |
imgt_annot |
Path of IMGT annotation file, created via
|
Value
A command object.
See Also
Other commands: allele_counter(), cellranger(), conda(), fastq_pair(), gistic2(), kraken2(), kraken_tools(), perl(), pyscenic(), python(), samtools(), seqkit()
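A possible way to call trust4() on an aligned BAM file is sketched below. The helper name `run_trust4` and the paths in the comment are hypothetical, and the sketch assumes the blit package and TRUST4 are installed. Only arguments listed in the usage above are used.

```r
# Sketch only: a hypothetical wrapper; defining it does not run TRUST4.
run_trust4 <- function(bam, ref_coordinate, ref_annot = NULL, odir = getwd()) {
  cmd <- blit::trust4(
    bam,
    ref_coordinate,
    mode = "bam",          # one of "bam" or "fastq"
    ref_annot = ref_annot, # optional detailed V/D/J/C reference, e.g. from IMGT
    odir = odir
  )
  blit::cmd_run(cmd)
}

# Example call with placeholder paths (not run):
# run_trust4("sample.bam", "bcrtcr.fa", ref_annot = "IMGT+C.fa")
```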