Compressing GFF, GVF or GTF files
At a glance
​
Compressing
​
$ genozip myfile.gff3
genozip myfile.gff3 : Done (4 seconds, GFF3 compression ratio: 6.6)
​
$ ls -lh myfile.gff3*
-rwxrwxrwx 1 divon divon 26M Aug 2 22:48 myfile.gff3
-rwxrwxrwx 1 divon divon 3.9M Aug 2 22:49 myfile.gff3.genozip
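
The ratio reported above appears to be the original size divided by the compressed size: 26M / 3.9M is roughly 6.7, close to the reported 6.6 once the rounding of ls -lh is taken into account. The following is a minimal sketch for computing the same figure from the two files on disk. compression_ratio is a hypothetical helper written for this illustration and is not part of Genozip.

```shell
# Hypothetical helper (not a Genozip command): print original_size / compressed_size
# with one decimal place, using the byte counts of the two files.
compression_ratio() {
    orig_bytes=$(wc -c < "$1")   # size of the uncompressed file, in bytes
    comp_bytes=$(wc -c < "$2")   # size of the compressed file, in bytes
    awk -v o="$orig_bytes" -v c="$comp_bytes" 'BEGIN { printf "%.1f\n", o / c }'
}
```

Usage, assuming both files from the listing above are present: compression_ratio myfile.gff3 myfile.gff3.genozip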
​
Uncompressing
​
$ genounzip myfile.gff3.genozip
​
Viewing
​
$ genocat myfile.gff3.genozip
​
Slicing & dicing your data with genocat
​
Here's a summary of the filtering and subsetting options available for GFF3 / GVF files. See genocat for more information.
​
Option                Effect
--downsample          Show only one in every X lines
--regions, -r         Include or exclude certain genomic regions
--regions-file, -R    Like --regions, but the list of regions is specified in a file
--grep                Show only lines containing the specified string
--grep-w, -g          Like --grep, but match whole words only
--lines, -n           Show only lines from a given range of line numbers
--head                Show only a certain number of lines from the start of the file
--tail                Show only a certain number of lines from the end of the file
--no-header           Drop the GFF3 header lines
--header-only         Show only the GFF3 header lines
​
Example: display the lines containing “rs1357314184” as a whole word (exact match only):
​
genocat --grep-w rs1357314184 myfile.gff3.genozip
​
Example: display the lines containing “Dbxref=dbSNP_152:rs” (possibly as part of a longer string):
​
genocat --grep Dbxref=dbSNP_152:rs myfile.gff3.genozip
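
The difference between the two options can be illustrated on uncompressed text with standard grep. This is an analogy only: it assumes --grep-w behaves like grep's whole-word option (-w) and --grep like a plain fixed-string search, and it says nothing about how genocat is implemented. The two sample lines are made up for the illustration.

```shell
# Two made-up GFF3-like lines: the first has the exact ID, the second a longer ID
# that merely starts with the same characters.
sample=$(printf 'chr1\tdbSNP\tSNV\t100\t100\t.\t+\t.\tID=rs1357314184\nchr1\tdbSNP\tSNV\t200\t200\t.\t+\t.\tID=rs13573141849\n')

# Substring search (analogous to --grep): both lines contain the string.
substring_hits=$(printf '%s\n' "$sample" | grep -cF 'rs1357314184')

# Whole-word search (analogous to --grep-w): only the first line matches,
# because in the second line the string is followed by another digit.
word_hits=$(printf '%s\n' "$sample" | grep -cwF 'rs1357314184')

echo "substring: $substring_hits, whole word: $word_hits"
# prints: substring: 2, whole word: 1
```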
​
Example: display the lines at positions 1000 to 2000 on contig 22:
​
genocat myfile.gff3.genozip -r 22:1000-2000
​
Supported formats & limitations
​
Genozip can compress the closely related formats GFF2, GFF3, GTF and GVF. It does not support compressing GFF3 files that include a ##FASTA section, and it may not support some of the many other variations of the GFF format. If you have GFF data that Genozip fails to compress and you would like us to support it, please let us know!
​
Tip: if you need to compress a file whose format isn't currently supported by Genozip, you can always use --input generic.
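
One way to apply the limitation and the tip above together is to check for a ##FASTA line before compressing and fall back to --input generic when one is found. The sketch below is a hedged illustration: has_fasta_section and choose_genozip_cmd are hypothetical helper names, the check assumes the section starts with a line beginning with ##FASTA, and the second function only prints the command it would choose rather than running it.

```shell
# Hypothetical helper: succeeds (exit status 0) if the file has a line
# starting with ##FASTA, which is assumed here to mark the FASTA section.
has_fasta_section() {
    grep -q '^##FASTA' "$1"
}

# Hypothetical helper: print the genozip command this sketch would pick.
# Files with a ##FASTA section fall back to --input generic, as in the tip above.
choose_genozip_cmd() {
    if has_fasta_section "$1"; then
        echo "genozip --input generic $1"
    else
        echo "genozip $1"
    fi
}
```

Usage: choose_genozip_cmd myfile.gff3 prints the suggested command, which can then be reviewed and run by hand.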
​
Questions? support@genozip.com