Events2Join

Study on reference-based FASTQ genome sequences compression


Study on reference-based FASTQ genome sequences compression

We still have a lot of work to do in the field of specialized data compression of FASTQ files. This paper aims to explore a reference-based ...

Study on reference-based FASTQ genome sequences compression

This paper aims to explore a reference-based lossless compression algorithm for genome sequences in FASTQ format.

Light-weight reference-based compression of FASTQ data - PMC

The advance of next generation sequencing (NGS) has greatly promoted the research on genomics analysis, hereditary disease diagnosis, food security, etc. The ...

FastqZip: An Improved Reference-Based Genome Sequence Lossy ...

Reference-based compression algorithms are effective in dealing with these data. Our work focuses on compressing FASTQ format files with an ...

Study on reference-based FASTQ genome sequences compression

Request PDF | On Jan 21, 2022, Wenlong Li and others published Study on reference-based FASTQ genome sequences compression | Find, read and cite all the ...

SPRING: a next-generation compressor for FASTQ data - PMC

High-Throughput Sequencing technologies produce huge amounts of data in the form of short genomic reads, associated quality values and read identifiers.

FastqZip: An Improved Reference-Based Genome Sequence Lossy ...

Our work focuses on compressing FASTQ format files with an improved reference-based compression algorithm to achieve a higher compression ratio ...

PMFFRC: a large-scale genomic short reads compression optimizer ...

By cascading PMFFRC, in 982 GB fastq format sequencing data, with 274 GB and 3.3 billion short reads, the state-of-the-art and reference-free ...

GeneSqueeze: A Novel Lossless, Reference-Free Compression ...

Nucleotide-based omics data (i.e., genomics, transcriptomics, epigenomics) are commonly stored in FASTQ or FASTA text-based formats. FASTA is a ...

Efficient sequencing data compression and FPGA acceleration ...

... sequences to reference genome completely ... “FQZip: lossless reference-based compression of next generation sequencing data in FASTQ ...

A new efficient referential genome compression technique for FastQ ...

Hospitals and medical laboratories create a tremendous amount of genome sequence data every day for use in research, surgery, ...

Reference-free lossless compression of nanopore sequencing ...

A typical human whole genome sequencing experiment can produce 100s of GBs of data in FASTQ files. Given the high sequencing depth, there is ...

Reference-based genome compression using the longest matched ...

We proposed an algorithm (LMSRGC) based on reference genome sequences, which uses the suffix array (SA) and the longest common prefix (LCP) array to find the ...

High efficiency referential genome compression algorithm

They can further be divided into reference-based and reference-free compression algorithms according to whether reference sequences are exploited during the ...

GeneSqueeze: A Novel Lossless, Reference-Free Compression ...

... (i.e., genomics, transcriptomics, epigenomics) are commonly stored in FASTQ or FASTA text-based formats. FASTA is a simple format that stores only ...

Tackling the Challenges of FASTQ Referential Compression

Compression of genomic sequencing reads via hash-based reordering: algorithm and analysis. ... Light-weight reference based compression of FASTQ data. BMC ...

FastqCLS: a FASTQ compressor for long-read sequencing via read ...

Abstract. Motivation: Over the past decades, vast amounts of genome sequencing data have been produced, requiring an enormous level of storage ...

Tackling the Challenges of FASTQ Referential Compression

Efficient storage of high throughput DNA sequencing data using reference-based compression. Genome Res. 2011;21:734–740. http://www ...

FQZip: Lossless Reference-Based Compression of Next Generation ...

Particularly, the three components namely metadata, sequence reads, and quality scores in FASTQ files are compressed independently with specific coding schemes.

Design considerations and methodology of .ORA format to achieve ...

Learn how to save compute time and storage costs for genomic data by using DRAGEN ORA, a totally lossless compression technology for FASTQ ...
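
Several of the entries above, the FQZip one in particular, describe splitting a FASTQ file into its three components (metadata, sequence reads, quality scores) and compressing each independently. A minimal Python sketch of that stream separation follows. It is an illustration only, not the method of any listed tool: the function names are hypothetical, general-purpose zlib stands in for the specialized per-stream coders those papers use, and the sketch assumes four-line records whose "+" separator line carries no repeated identifier.

```python
import zlib


def split_fastq_streams(text):
    """Split FASTQ text into identifier, sequence, and quality streams."""
    lines = text.strip().split("\n")
    if len(lines) % 4 != 0:
        raise ValueError("expected 4 lines per FASTQ record")
    ids, seqs, quals = [], [], []
    for i in range(0, len(lines), 4):
        header, seq, plus, qual = lines[i:i + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed record starting at line {i + 1}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch at line {i + 1}")
        ids.append(header[1:])  # drop the leading '@'
        seqs.append(seq)
        quals.append(qual)
    return ids, seqs, quals


def compress_streams(text):
    """Compress each stream on its own (zlib is a placeholder coder)."""
    streams = split_fastq_streams(text)
    return tuple(
        zlib.compress("\n".join(stream).encode("ascii"), 9) for stream in streams
    )


def decompress_streams(blobs):
    """Rebuild FASTQ text from the three compressed streams."""
    ids, seqs, quals = (
        zlib.decompress(blob).decode("ascii").split("\n") for blob in blobs
    )
    # Assumption: the original '+' lines were bare, so they are regenerated.
    return "".join(f"@{h}\n{s}\n+\n{q}\n" for h, s, q in zip(ids, seqs, quals))


if __name__ == "__main__":
    sample = "@r1\nACGT\n+\nIIII\n@r2\nGGCA\n+\n!!#I\n"
    blobs = compress_streams(sample)
    assert decompress_streams(blobs) == sample
    print([len(b) for b in blobs])
```

Keeping the streams apart matters because each has different statistics: identifiers are highly repetitive text, reads use a four- or five-letter alphabet, and quality strings are the hardest to compress. A reference-based tool would replace the placeholder coder on the sequence stream with alignment positions and mismatches against a reference genome.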