Title: | Detect Elevations and Gaps in Mapped Sequencing Read Coverage |
Version: | 0.1.0 |
Maintainer: | Jessie Maier <jlmaier@ncsu.edu> |
Description: | Automate the detection of gaps and elevations in mapped sequencing read coverage using a 2D pattern-matching algorithm. 'ProActive' detects, characterizes and visualizes read coverage patterns in both genomes and metagenomes. Optionally, users may provide gene annotations associated with their genome or metagenome in the form of a .gff file. In this case, 'ProActive' will generate an additional output table containing the gene annotations found within the detected regions of gapped and elevated read coverage. Additionally, users can search for gene annotations of interest in the output read coverage plots. |
License: | GPL-2 |
Encoding: | UTF-8 |
LazyData: | true |
RoxygenNote: | 7.3.2 |
URL: | https://github.com/jlmaier12/ProActive, https://jlmaier12.github.io/ProActive/ |
BugReports: | https://github.com/jlmaier12/ProActive/issues |
Imports: | utils, stats, dplyr, ggplot2, stringr |
Suggests: | knitr, rmarkdown, testthat (≥ 3.0.0), kableExtra |
VignetteBuilder: | knitr |
Depends: | R (≥ 4.2.0) |
Config/testthat/edition: | 3 |
NeedsCompilation: | no |
Packaged: | 2025-01-20 20:39:54 UTC; jlmaier |
Author: | Jessie Maier |
Repository: | CRAN |
Date/Publication: | 2025-01-21 08:00:02 UTC |
ProActive
Description
'ProActive' automatically detects regions of gapped and elevated read coverage using a 2D pattern-matching algorithm. 'ProActive' detects, characterizes and visualizes read coverage patterns in both genomes and metagenomes. Optionally, users may provide gene annotations associated with their genome or metagenome in the form of a .gff file. In this case, 'ProActive' will generate an additional output table containing the gene annotations found within the detected regions of gapped and elevated read coverage. Additionally, users can search for gene annotations of interest in the output read coverage plots.
Details
The three main functions in 'ProActive' are:
- ProActiveDetect() performs the pattern-matching and characterization of read coverage patterns.
- plotProActiveResults() plots the results from ProActiveDetect().
- geneAnnotationSearch() searches classified contigs/chunks for gene annotations that match user-provided keywords.
Author(s)
Jessie Maier jlmaier@ncsu.edu
See Also
Useful links:
Report bugs at https://github.com/jlmaier12/ProActive/issues
Detect gene predictions in elevations and gaps
Description
Extracts subsets of the gffTSV associated with gene predictions that fall within regions of detected gapped or elevated read coverage.
Usage
GPsInElevGaps(
elevGapSummList,
windowSize,
gffTSV,
mode,
chunkContigs,
chunkSize
)
Arguments
elevGapSummList |
A list containing pattern-match information associated with all elevation and gap classifications. (i.e. no NoPattern classifications) |
windowSize |
The number of basepairs to average read coverage values over. Options are 100, 200, 500, 1000 ONLY. Default is 1000. |
gffTSV |
Optional, a .gff file (TSV) containing gene predictions associated with the .fasta file used to generate the pileup. |
mode |
Either "genome" or "metagenome" |
chunkContigs |
TRUE or FALSE. If TRUE and 'mode'="metagenome", contigs longer than the 'chunkSize' will be 'chunked' into smaller subsets and pattern-matching will be performed on each subset. Default is FALSE. |
chunkSize |
If 'mode'="genome" OR if 'mode'="metagenome" and 'chunkContigs'=TRUE, chunk the genome or contigs, respectively, into smaller subsets for pattern-matching. |
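The subsetting step can be illustrated with a minimal base R sketch. It assumes the V1 to V9 gff column layout documented for the sample datasets and keeps only predictions that lie entirely inside the detected region; that containment rule, the toy data, and the helper name `genesInRegion` are assumptions for illustration, not the package's internal code.

```r
# Toy GFF-like table using the V1-V9 column layout (only a few columns shown).
gff <- data.frame(
  V1 = c("NODE_1", "NODE_1", "NODE_1", "NODE_2"),  # seqid
  V3 = "CDS",                                      # type
  V4 = c(1000, 12000, 30000, 15000),               # start
  V5 = c(2000, 13500, 31000, 16000),               # end
  V9 = c("product=A", "product=B", "product=C", "product=D")
)

# Hypothetical helper: keep predictions fully inside [regionStart, regionEnd]
# on the given contig.
genesInRegion <- function(gff, contig, regionStart, regionEnd) {
  keep <- gff$V1 == contig & gff$V4 >= regionStart & gff$V5 <= regionEnd
  gff[keep, , drop = FALSE]
}

hits <- genesInRegion(gff, "NODE_1", regionStart = 10000, regionEnd = 20000)
```

A looser rule, such as keeping any prediction that overlaps the region, would only change the two comparisons in `keep`.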
Detect elevations and gaps in mapped read coverage patterns.
Description
Performs read coverage pattern-matching and summarizes the results into a list. The first list item summarizes the pattern-matching results. The second list item is the 'cleaned' version of the summary table with all the 'noPattern' classifications removed (i.e. only gap and elevation classifications remain). The third list item contains the pattern-match information needed for pattern-match visualization with 'plotProActiveResults()'. The fourth list item is a table containing all the contigs that were filtered out prior to pattern-matching. The fifth list item contains the arguments used during pattern-matching (windowSize, mode, chunkSize, chunkContigs). If the user provides a gffTSV file, the last list item is a table of the ORFs found within the detected gaps and elevations in read coverage.
Usage
ProActiveDetect(
pileup,
mode,
gffTSV,
windowSize = 1000,
chunkContigs = FALSE,
minSize = 10000,
maxSize = Inf,
minContigLength = 30000,
chunkSize = 1e+05,
IncludeNoPatterns = FALSE,
verbose = TRUE,
saveFilesTo
)
Arguments
pileup |
A .txt file containing mapped sequencing read coverages averaged over 100 bp windows/bins. |
mode |
Either "genome" or "metagenome" |
gffTSV |
Optional, a .gff file (TSV) containing gene predictions associated with the .fasta file used to generate the pileup. |
windowSize |
The number of basepairs to average read coverage values over. Options are 100, 200, 500, 1000 ONLY. Default is 1000. |
chunkContigs |
TRUE or FALSE. If TRUE and 'mode'="metagenome", contigs longer than the 'chunkSize' will be 'chunked' into smaller subsets and pattern-matching will be performed on each subset. Default is FALSE. |
minSize |
The minimum size (in bp) of elevation or gap patterns. Default is 10000. |
maxSize |
The maximum size (in bp) of elevation or gap patterns. Default is Inf (i.e. no maximum). |
minContigLength |
The minimum contig/chunk size (in bp) to perform pattern-matching on. Default is 30000. |
chunkSize |
If 'mode'="genome" OR if 'mode'="metagenome" and 'chunkContigs'=TRUE, chunk the genome or contigs, respectively, into smaller subsets for pattern-matching. 'chunkSize' determines the size (in bp) of each 'chunk'. Default is 100000. |
IncludeNoPatterns |
TRUE or FALSE. If TRUE, the noPattern pattern-matches will be included in the ProActive PatternMatches output list. Set this to TRUE if you would like to visualize the noPattern pattern-matches in 'plotProActiveResults()'. Default is FALSE. |
verbose |
TRUE or FALSE. Print progress messages to console. Default is TRUE. |
saveFilesTo |
Optional, Provide a path to the directory you wish to save output to. A folder will be made within the provided directory to store results. |
Value
A list containing 6 objects described in the function description.
Examples
metagenome_results <- ProActiveDetect(
pileup = sampleMetagenomePileup,
mode = "metagenome",
gffTSV = sampleMetagenomegffTSV
)
Change the pileup window size
Description
Re-averages windows of pileup files with 100bp windows to reduce pileup size.
Usage
changewindowSize(pileupSubset, windowSize, mode)
Arguments
pileupSubset |
A subset of the pileup that pertains only to the contig/chunk currently being assessed. |
windowSize |
The number of basepairs to average read coverage values over. Options are 100, 200, 500, 1000 ONLY. Default is 1000. |
mode |
Either "genome" or "metagenome" |
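The re-averaging idea can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's implementation: the helper name `reaverageWindows` and the column names `coverage` and `position` are hypothetical, and the input is assumed to hold one row per 100 bp window.

```r
# Hypothetical helper: average consecutive 100 bp windows into larger windows.
reaverageWindows <- function(pileupSubset, windowSize = 1000) {
  stopifnot(windowSize %in% c(100, 200, 500, 1000))
  binsPerWindow <- windowSize / 100
  # Assign each 100 bp row to a larger window: 0, 0, ..., 1, 1, ...
  group <- (seq_len(nrow(pileupSubset)) - 1) %/% binsPerWindow
  data.frame(
    coverage = as.numeric(tapply(pileupSubset$coverage, group, mean)),
    position = as.numeric(tapply(pileupSubset$position, group, min))
  )
}

# Toy pileup: 20 windows of 100 bp, coverage 10 then 30.
toyPileup <- data.frame(
  coverage = c(rep(10, 10), rep(30, 10)),
  position = seq(0, 1900, by = 100)
)
reaveraged <- reaverageWindows(toyPileup, windowSize = 1000)
```

With `windowSize = 1000`, every ten input rows collapse into one, so the toy pileup shrinks from 20 rows to 2.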
Summarizes pattern-matching results
Description
Summarizes the list of pattern-matching classifications into a table.
Usage
classifSumm(pileup, bestMatchList, windowSize, mode, chunkSize)
Arguments
pileup |
A .txt file containing mapped sequencing read coverages averaged over 100 bp windows/bins. |
bestMatchList |
A list containing pattern-match information associated with all contigs/chunks classified by 'ProActiveDetect()' pattern-matching |
windowSize |
The number of basepairs to average read coverage values over. Options are 100, 200, 500, 1000 ONLY. Default is 1000. |
mode |
Either "genome" or "metagenome" |
chunkSize |
If 'mode'="genome" OR if 'mode'="metagenome" and 'chunkContigs'=TRUE, chunk the genome or contigs, respectively, into smaller subsets for pattern-matching. |
Collect information regarding the pattern-match
Description
Make a list containing the match-score, min and max pattern-match values, the start and stop positions of the elevated or gapped region, the elevation ratio and the classification
Usage
collectBestMatchInfo(pattern, pileupSubset, elevOrGap, leftRightFull)
Arguments
pattern |
A vector containing the values associated with the pattern-match |
pileupSubset |
A subset of the pileup that pertains only to the contig/chunk currently being assessed. |
elevOrGap |
Pattern-matching on 'elevation' or 'gap' pattern. |
leftRightFull |
'Left' or 'Right' partial gap/elevation pattern or full elevation/gap pattern. |
'chunk' long contigs
Description
Subset long contigs in metagenome pileup into chunks for pattern-matching
Usage
contigChunks(pileup, chunkSize)
Arguments
pileup |
A .txt file containing mapped sequencing read coverages averaged over 100 bp windows/bins. |
chunkSize |
If 'mode'="genome" OR if 'mode'="metagenome" and 'chunkContigs'=TRUE, chunk the genome or contigs, respectively, into smaller subsets for pattern-matching. 'chunkSize' determines the size (in bp) of each 'chunk'. Default is 100000. |
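As a rough illustration of chunking, the sketch below labels each 100 bp window of a single contig with a chunk index derived from its start position. The helper name `assignChunks` is hypothetical, and the way the package names or handles the final, shorter chunk is not shown here.

```r
# Hypothetical helper: label each window with the chunk it belongs to.
assignChunks <- function(position, chunkSize = 100000) {
  # Windows starting at 0..(chunkSize - 1) fall in chunk 1, and so on.
  position %/% chunkSize + 1
}

positions <- seq(0, 249900, by = 100)  # a 250 kb contig in 100 bp windows
chunkId <- assignChunks(positions, chunkSize = 100000)
chunkSizes <- table(chunkId)
```

Here the 250 kb contig yields two full 100 kb chunks and one trailing 50 kb chunk.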
Classifies partial elevation/gap pattern-matches
Description
Classify the contig/chunk as 'elevation' if the elevated region is less than 50% of the length of the contig/chunk, and otherwise classify it as 'gap'.
Usage
elevOrGapClassif(bestMatchList, pileupSubset)
Arguments
bestMatchList |
A list containing pattern-match information associated with all contigs/chunks classified by 'ProActiveDetect()' pattern-matching |
pileupSubset |
A subset of the pileup that pertains only to the contig/chunk currently being assessed. |
Controller function for full elevation/gap pattern-matching
Description
Builds full elevation/gap pattern-matches, shrinks the width, and collects best match information
Usage
fullElevGap(pileupSubset, windowSize, minSize, maxSize, elevOrGap)
Arguments
pileupSubset |
A subset of the pileup that pertains only to the contig/chunk currently being assessed. |
windowSize |
The number of basepairs to average read coverage values over. Options are 100, 200, 500, 1000 ONLY. Default is 1000. |
minSize |
The minimum size (in bp) of elevation or gap patterns. Default is 10000. |
maxSize |
The maximum size (in bp) of elevation or gap patterns. Default is Inf (i.e. no maximum). |
elevOrGap |
Pattern-matching on 'elevation' or 'gap' pattern. |
Shrink the width of full elevation and gap patterns
Description
Remove values from gapped/elevated region in the pattern-match vector until it reaches the 'minSize'
Usage
fullElevGapShrink(
minCov,
windowSize,
maxCov,
elevLength,
nonElev,
bestMatchInfo,
pileupSubset,
minSize,
elevOrGap
)
Arguments
minCov |
The minimum value of the pattern-match vector. |
windowSize |
The number of basepairs to average read coverage values over. Options are 100, 200, 500, 1000 ONLY. Default is 1000. |
maxCov |
The maximum value of the pattern-match vector. |
elevLength |
Length of the elevated/gapped pattern-match region. |
nonElev |
Length of the non-elevated/gapped pattern-match region. |
bestMatchInfo |
The information associated with the current best pattern-match for the contig/chunk being assessed. |
pileupSubset |
A subset of the pileup that pertains only to the contig/chunk currently being assessed. |
minSize |
The minimum size (in bp) of elevation or gap patterns. Default is 10000. |
elevOrGap |
Pattern-matching on 'elevation' or 'gap' pattern. |
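The shrinking loop can be pictured as a sequence of candidate region widths, from the starting width down to the smallest width allowed by 'minSize'. The sketch below only enumerates those widths, measured in windows; the helper name `shrinkWidths` is hypothetical, and scoring each candidate against the read coverage is left out.

```r
# Hypothetical helper: candidate widths (in windows) tried while shrinking.
shrinkWidths <- function(elevLength, minSize, windowSize = 1000) {
  minWindows <- ceiling(minSize / windowSize)
  if (elevLength < minWindows) {
    return(numeric(0))  # region is already narrower than minSize
  }
  seq(elevLength, minWindows, by = -1)
}

# A 15-window region with minSize = 10000 bp and 1000 bp windows.
widths <- shrinkWidths(elevLength = 15, minSize = 10000, windowSize = 1000)
```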
Gene annotation plot
Description
Plot read coverage and location of gene annotations that match the keywords and search criteria for contig/chunk currently being assessed
Usage
geneAnnotationPlot(
geneAnnotSubset,
keywords,
pileupSubset,
colIdx,
startbpRange,
endbpRange,
elevRatio,
pattern,
windowSize,
chunkSize,
mode
)
Arguments
geneAnnotSubset |
Subset of gene annotations to be plotted |
keywords |
The key-word(s) used for the search. |
pileupSubset |
A subset of the pileup associated with the contig/chunk being assessed |
colIdx |
The column index of the 'gene' or 'product' column |
startbpRange |
The basepair at which the search is started if a 'specific' search is used |
endbpRange |
The basepair at which the search is ended if a 'specific' search is used |
elevRatio |
The elevation ratio (maximum/minimum value) of the pattern-match |
pattern |
The pattern-match information associated with the contig/chunk being assessed |
windowSize |
The number of basepairs to average read coverage values over. |
Search for gene annotations on classified contigs/chunks
Description
Search contigs classified with ProActive for gene-annotations that match a provided key-word(s). Outputs read coverage plots for contigs/chunks with matching annotations.
Usage
geneAnnotationSearch(
ProActiveResults,
pileup,
gffTSV,
geneOrProduct,
keyWords,
inGapOrElev = FALSE,
bpRange = 0,
elevFilter,
saveFilesTo,
verbose = TRUE
)
Arguments
ProActiveResults |
The output from 'ProActiveDetect()'. |
pileup |
A .txt file containing mapped sequencing read coverages averaged over 100 bp windows/bins. |
gffTSV |
A .gff file (TSV) containing gene predictions associated with the .fasta file used to generate the pileup. |
geneOrProduct |
"gene" or "product". Search for keyWords associated with genes or gene products. |
keyWords |
The keyWord(s) to search for. Case independent. Searches will return the string that contains the matching keyWord. KeyWord(s) must be in quotes, comma-separated, and surrounded by c(), e.g. c("antibiotic", "resistance", "drug") |
inGapOrElev |
TRUE or FALSE. If TRUE, only search for gene-annotations in the gap/elevation region of the pattern-match. Default is FALSE (i.e. search the entire contig/chunk for the gene annotation key-words) |
bpRange |
If 'inGapOrElev' = TRUE, the user may specify the region (in base pairs) that should be searched to the left and right of the gap/elevation region. Default is 0. |
elevFilter |
Optional, only plot results with pattern-matches that achieved an elevation ratio (max/min) greater than the specified value. Default is no filter. |
saveFilesTo |
Optional, Provide a path to the directory you wish to save output to. A folder will be made within the provided directory to store results. |
verbose |
TRUE or FALSE. Print progress messages to console. Default is TRUE. |
Value
A list of ggplot objects
Examples
geneAnnotMatches <- geneAnnotationSearch(sampleMetagenomeResults, sampleMetagenomePileup,
sampleMetagenomegffTSV, geneOrProduct="product",
keyWords=c("toxin", "drug", "resistance", "phage"))
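Case-independent keyword matching of this kind can be sketched in base R with `grepl()`. The attribute strings below are made up for illustration, and this is not necessarily how the package performs the search internally.

```r
# Made-up gff attribute strings for three gene predictions.
attributes <- c(
  "ID=g1;product=Phage tail protein",
  "ID=g2;product=hypothetical protein",
  "ID=g3;product=Multidrug RESISTANCE protein"
)
keyWords <- c("phage", "resistance")

# Collapse the keywords into one alternation pattern and match
# case-insensitively; the full matching strings are returned.
keywordPattern <- paste(keyWords, collapse = "|")
matched <- attributes[grepl(keywordPattern, attributes, ignore.case = TRUE)]
```

Keywords are treated as regular expressions in this sketch, so a keyword containing characters such as `(` or `.` would need escaping, or one `grepl(..., fixed = TRUE)` call per keyword.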
'chunk' genomes
Description
Subset genome pileup into chunks for pattern-matching
Usage
genomeChunks(pileup, chunkSize)
Arguments
pileup |
A .txt file containing mapped sequencing read coverages averaged over 100 bp windows/bins. |
chunkSize |
If 'mode'="genome" OR if 'mode'="metagenome" and 'chunkContigs'=TRUE, chunk the genome or contigs, respectively, into smaller subsets for pattern-matching. 'chunkSize' determines the size (in bp) of each 'chunk'. Default is 100000. |
Link pattern-matches on contig/genome chunks
Description
Detect partial gap/elevation pattern matches that fall on the edges of chunked genomes/contigs that may be part of the same pattern prior to chunking
Usage
linkChunks(bestMatchList, pileup, windowSize, mode, verbose)
Arguments
bestMatchList |
A list containing pattern-match information associated with all contigs/chunks classified by 'ProActiveDetect()' pattern-matching |
pileup |
A .txt file containing mapped sequencing read coverages averaged over 100 bp windows/bins. |
windowSize |
The number of basepairs to average read coverage values over. |
mode |
Either "genome" or "metagenome" |
verbose |
TRUE or FALSE. Print progress messages to console. Default is TRUE. |
No read coverage pattern
Description
Assess whether a contig/chunk does not have an elevated/gapped read coverage pattern. A horizontal line at the mean or median coverage should be the best match if the contig/chunk read coverage is not gapped or elevated.
Usage
noPattern(pileupSubset)
Arguments
pileupSubset |
A subset of the read coverage dataset that pertains only to the contig currently being assessed |
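To make the flat-line idea concrete, the sketch below scores a horizontal line at the mean or median against a coverage vector using the mean absolute difference. Using the mean absolute difference here is an assumption based on the match-score described for the pattern-matching controller, and the helper name `flatLineScore` is hypothetical.

```r
# Hypothetical helper: mean absolute difference between coverage and a
# horizontal line placed at the mean or median coverage.
flatLineScore <- function(coverage, centre = c("mean", "median")) {
  centre <- match.arg(centre)
  level <- if (centre == "mean") mean(coverage) else median(coverage)
  mean(abs(coverage - level))
}

flatCoverage <- c(10, 11, 9, 10, 10, 10)      # roughly even coverage
steppedCoverage <- c(10, 10, 10, 40, 40, 40)  # clear step in coverage
flatScore <- flatLineScore(flatCoverage, "median")
steppedScore <- flatLineScore(steppedCoverage, "mean")
```

The flat line fits the even coverage far better (lower score) than the stepped coverage, which is the situation in which a gap or elevation pattern would be expected to win instead.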
Controller function for partial elevation/gap pattern-matching
Description
Builds partial elevation/gap pattern-match for patterns going off both the left and right sides of the contig/chunk, shrinks the width, and collects best match information
Usage
partialElevGap(pileupSubset, windowSize, minSize, maxSize)
Arguments
pileupSubset |
A subset of the pileup that pertains only to the contig/chunk currently being assessed. |
windowSize |
The number of basepairs to average read coverage values over. Options are 100, 200, 500, 1000 ONLY. Default is 1000. |
minSize |
The minimum size (in bp) of elevation or gap patterns. Default is 10000. |
maxSize |
The maximum size (in bp) of elevation or gap patterns. Default is Inf (i.e. no maximum). |
Shrink the width of partial elevation and gap patterns
Description
Remove values from gapped/elevated region in the pattern-match vector until it reaches the 'minSize'.
Usage
partialElevGapShrink(
minCov,
windowSize,
maxCov,
elevLength,
nonElev,
bestMatchInfo,
pileupSubset,
minSize,
leftOrRight
)
Arguments
minCov |
The minimum value of the pattern-match vector. |
windowSize |
The number of basepairs to average read coverage values over. Options are 100, 200, 500, 1000 ONLY. Default is 1000. |
maxCov |
The maximum value of the pattern-match vector. |
elevLength |
Length of the elevated/gapped pattern-match region. |
nonElev |
Length of the non-elevated/gapped pattern-match region. |
bestMatchInfo |
The information associated with the current best pattern-match for the contig/chunk being assessed. |
pileupSubset |
A subset of the pileup that pertains only to the contig/chunk currently being assessed. |
minSize |
The minimum size (in bp) of elevation or gap patterns. Default is 10000. |
leftOrRight |
'Left' or 'Right' partial gap/elevation pattern. |
Builds pattern-match vectors
Description
Builds the pattern-match (vector) associated with each contig/chunk for visualization.
Usage
patternBuilder(pileupSubset, bestMatchInfo)
Arguments
pileupSubset |
A subset of the pileup that pertains only to the contig/chunk currently being assessed. |
bestMatchInfo |
The information associated with the current best pattern-match for the contig/chunk being assessed. |
Controller function for pattern-matching
Description
Creates the pileupSubset, representative of one contig/chunk, used as input for each individual pattern-matching function. After the information associated with the best match for each pattern is obtained, the pattern-match with the lowest mean absolute difference (match-score) is used for classification.
Usage
patternMatcher(
pileup,
windowSize,
minSize,
maxSize,
mode,
minContigLength,
verbose
)
Arguments
pileup |
A .txt file containing mapped sequencing read coverages averaged over 100 bp windows/bins. |
windowSize |
The number of basepairs to average read coverage values over. |
minSize |
The minimum size (in bp) of elevation or gap patterns. Default is 10000. |
maxSize |
The maximum size (in bp) of elevation or gap patterns. Default is Inf (i.e. no maximum). |
mode |
Either "genome" or "metagenome". |
minContigLength |
The minimum contig/chunk size (in bp) to perform pattern-matching on. Default is 30000. |
verbose |
TRUE or FALSE. Print progress messages to console. Default is TRUE. |
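The selection rule in the description (the pattern-match with the lowest match-score wins) can be illustrated with a few lines of base R. The candidate list, its field names, and the scores are invented for this sketch and do not reflect the package's internal data structures.

```r
# Hypothetical best matches for one contig, one per pattern type.
candidates <- list(
  list(classification = "NoPattern", matchScore = 12.4),
  list(classification = "Elevation", matchScore = 3.1),
  list(classification = "Gap", matchScore = 9.8)
)

# Pick the candidate with the lowest mean absolute difference (match-score).
scores <- vapply(candidates, function(m) m$matchScore, numeric(1))
bestMatch <- candidates[[which.min(scores)]]
```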
Full elevation/gap pattern translator
Description
Translates full elevation/gap patterns across contigs/chunks 1000bp at a time. Translation stops when the elevation pattern is 5000bp from the end of the contig/chunk.
Usage
patternTranslator(contigCov, bestMatchInfo, windowSize, pattern, elevOrGap)
Arguments
contigCov |
The read coverages that pertain to the pileupSubset |
bestMatchInfo |
The information associated with the current best pattern-match for the contig/chunk being assessed. |
windowSize |
The number of basepairs to average read coverage values over. Options are 100, 200, 500, 1000 ONLY. Default is 1000. |
pattern |
A vector containing the values associated with the pattern-match |
elevOrGap |
Pattern-matching on 'elevation' or 'gap' pattern. |
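Translation of a pattern across a contig can be sketched as sliding a fixed-width block one window at a time and scoring each position. In this simplified illustration one step equals one window (1000 bp when 'windowSize' is 1000), the block is allowed to slide to the very end of the contig rather than stopping 5000 bp short, and the helper name `slideBlock` is hypothetical.

```r
# Hypothetical helper: slide an elevated block across a contig and keep the
# start position with the lowest mean absolute difference.
slideBlock <- function(coverage, blockWidth, low, high, step = 1) {
  lastStart <- length(coverage) - blockWidth + 1
  starts <- seq(1, lastStart, by = step)
  scores <- vapply(starts, function(s) {
    pattern <- rep(low, length(coverage))
    pattern[s:(s + blockWidth - 1)] <- high
    mean(abs(coverage - pattern))
  }, numeric(1))
  list(start = starts[which.min(scores)], score = min(scores))
}

# 50 windows with an elevated block in windows 21 to 30.
coverage <- c(rep(5, 20), rep(50, 10), rep(5, 20))
best <- slideBlock(coverage, blockWidth = 10, low = 5, high = 50)
```

For a gap pattern the same loop applies with `low` and `high` swapped.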
Reformat input pileup file
Description
Place columns in correct order, clean accessions by removing text after white space, and name columns
Usage
pileupFormatter(pileup, mode)
Arguments
pileup |
A .txt file containing mapped sequencing read coverages averaged over 100 bp windows/bins. |
mode |
Either "genome" or "metagenome" |
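The accession-cleaning step (removing text after white space) can be expressed with a single regular expression in base R. The accession strings below are made-up examples, and the package may implement this step differently.

```r
# Made-up accessions, two of which carry trailing descriptions.
rawAccessions <- c(
  "NODE_1_length_5000 flag=1 multi=2",
  "NC_003197.2 Salmonella enterica",
  "contig7"
)

# Drop everything from the first whitespace character onwards.
cleanAccessions <- sub("\\s.*$", "", rawAccessions)
```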
Plot results of 'ProActiveDetect()' pattern-matching
Description
Plot read coverage of contigs/chunks with detected gaps and elevations and their associated pattern-match.
Usage
plotProActiveResults(pileup, ProActiveResults, elevFilter, saveFilesTo)
Arguments
pileup |
A .txt file containing mapped sequencing read coverages averaged over 100 bp windows/bins. |
ProActiveResults |
The output from 'ProActiveDetect()'. |
elevFilter |
Optional, only plot results with pattern-matches that achieved an elevation ratio (max/min) greater than the specified value. Default is no filter. |
saveFilesTo |
Optional, Provide a path to the directory you wish to save output to. A folder will be made within the provided directory to store results. |
Value
A list containing ggplot objects
Examples
ProActivePlots <- plotProActiveResults(sampleMetagenomePileup,
sampleMetagenomeResults)
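The effect of 'elevFilter' can be illustrated on a toy table of pattern-matches: compute the elevation ratio (max/min) and keep rows above the threshold. The table and its column names are invented for this sketch and are not the structure of the package's output.

```r
# Toy pattern-match table with hypothetical column names.
matches <- data.frame(
  contig = c("a", "b", "c"),
  maxCov = c(100, 30, 55),
  minCov = c(10, 20, 5)
)

# Elevation ratio is the maximum over the minimum pattern-match value.
matches$elevRatio <- matches$maxCov / matches$minCov

# Keep only matches whose ratio exceeds the filter value.
elevFilter <- 5
kept <- matches[matches$elevRatio > elevFilter, ]
```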
Removes 'NoPattern' classifications from best match list
Description
Removes 'NoPattern' classifications from the list of pattern-match information associated with the best pattern-matches for each contig/chunk
Usage
removeNoPatterns(bestMatchList)
Arguments
bestMatchList |
A list containing pattern-match information associated with all contigs/chunks classified by 'ProActiveDetect()' pattern-matching |
sampleGenomePileup
Description
A pileup file generated during read mapping to the *Salmonella enterica* LT2 genome.
Usage
sampleGenomePileup
Format
A data frame with 48,575 rows and 4 columns:
- V1
Accession
- V2
Mapped read coverage averaged over a 100 bp window size
- V3
Starting position (bp) of each 100 bp window. Starts from 100.
- V4
Starting position (bp) of each 100 bp window. Starts from 0.
Details
This dataset was generated by extracting DNA from a culture of *Salmonella enterica* LT2 (LT2) infected with phage P22. The DNA was shotgun sequenced with Illumina (paired-end mode, 150 bp reads). The sequencing reads were mapped to the LT2 reference genome (NCBI RefSeq NC_003197.2). The bbmap.sh bincov parameter with covbinsize=100 was used to create a pileup file with 100 bp windows.
Source
<https://pubmed.ncbi.nlm.nih.gov/25608871/>
sampleGenomegffTSV
Description
Gene annotations associated with the genome in the sampleGenomePileup.
Usage
sampleGenomegffTSV
Format
A data frame with 85,575 rows and 9 columns:
- V1
seqid
- V2
source
- V3
type
- V4
start
- V5
end
- V6
score
- V7
strand
- V8
phase
- V9
attributes
Details
This is a standard .gff file format. The .gff file was generated by running PROKKA with default parameters on the *Salmonella enterica* LT2 genome sequence (NCBI RefSeq NC_003197.2) associated with the sampleGenomePileup in the ProActive package.
sampleMetagenomePileup
Description
A subset of contigs from the raw whole-community fraction read coverage pileup file generated during read mapping.
Usage
sampleMetagenomePileup
Format
A data frame with 4,604 rows and 4 columns:
- V1
Contig accession
- V2
Mapped read coverage averaged over a 100 bp window size
- V3
Starting position (bp) of each 100 bp window. Restarts from 0 at the start of each new contig.
- V4
Starting position (bp) of each 100 bp window. Does NOT restart at the start of each new contig.
Details
This dataset was generated from a conventional mouse fecal homogenate. The whole-community extracted DNA was sequenced with Illumina (paired-end mode, 150 bp reads), after which the metagenome was assembled. The sequencing reads were mapped to the assembled contigs using BBMap. The bbmap.sh bincov parameter with covbinsize=100 was used to create a pileup file with 100 bp windows. A subset of 7 contigs from the pileup file was selected for this sample dataset. The contigs were chosen because their associated read coverage patterns exemplify ProActive's pattern-matching and characterization functionality across classifications:
- NODE_1911: elevation off left
- NODE_1583: elevation off right
- NODE_1884: gap off right
- NODE_1255: gap off left
- NODE_368: full gap
- NODE_617: elevation full
- NODE_1625: no pattern
Source
<https://microbiomejournal.biomedcentral.com/articles/10.1186/s40168-020-00935-5>
sampleMetagenomeResults
Description
Output of 'ProActiveDetect()'.
Usage
sampleMetagenomeResults
Format
A list with 6 objects:
- SummaryTable
A table containing all pattern-matching classifications
- CleanSummaryTable
A table containing only gap and elevation pattern-match classifications (i.e. noPattern classifications removed)
- PatternMatches
A list object containing information needed to visualize the pattern-matches in 'plotProActiveResults()'
- FilteredOut
A table containing contigs/chunks that were filtered out for being too small or having too low read coverage
- Arguments
A list object containing arguments used for pattern-matching (windowSize, mode, chunkSize, chunkContigs)
- GeneAnnotTable
A table containing gene predictions associated with elevated or gapped regions in pattern-matches
Details
This data was generated by running 'ProActiveDetect()' on the sampleMetagenomePileup and sampleMetagenomegffTSV with default parameters.
sampleMetagenomegffTSV
Description
A subset of gene annotations associated with the metagenome in the sampleMetagenomePileup.
Usage
sampleMetagenomegffTSV
Format
A data frame with 467 rows and 9 columns:
- V1
seqid
- V2
source
- V3
type
- V4
start
- V5
end
- V6
score
- V7
strand
- V8
phase
- V9
attributes
Details
This is a standard .gff file format. The .gff file was generated by running PROKKA with default parameters on the metagenome assembly associated with the sampleMetagenomePileup in the ProActive package. The gff was subset to only include the data associated with the contigs in the sample data subset.
Source
<https://microbiomejournal.biomedcentral.com/articles/10.1186/s40168-020-00935-5>