Differences
This shows you the differences between two versions of the page.
| data:ont [2024/04/11 11:03] – Added information about pod5 tools and splitting the big file Richard Bowers | data:ont [2024/06/10 11:03] (current) – Richard Bowers | ||
|---|---|---|---|
| Line 15: | Line 15: | ||
| ==== Pod5 Files ==== | ==== Pod5 Files ==== | ||
| - | Pod5 is a proprietary format developed by Oxford Nanopore. It can be considered an intermediate format, but can also be reprocessed in a way that the Illumina intermediate files cannot. Thus we always deliver the Pod5 equivalent of the BAM or FASTQ from a sequencing run. | + | Pod5 is a proprietary format developed by Oxford Nanopore. It can be considered an intermediate format, but can also be reprocessed in a way that the Illumina intermediate files cannot. Thus we always deliver the Pod5 files as produced by the sequencer with the BAM or FASTQ data. |
| ===== File Naming ===== | ===== File Naming ===== | ||
| - | The files will be named using the same pattern as files from [[data: | + | The ONT data files have an additional component to them, here referred to as ''< |
| + | |||
| + | In all other respects, the files will be named using the same pattern as files from [[data: | ||
| < | < | ||
| - | < | + | < |
| - | < | + | < |
| - | < | + | < |
| - | < | + | < |
| - | < | + | |
| - | < | + | |
| - | < | + | |
| - | < | + | |
| </ | </ | ||
| Line 35: | Line 33: | ||
| < | < | ||
| - | < | + | < |
| - | < | + | < |
| - | <SLX>.NoIndex.< | + | </code> |
| - | <SLX>.NoIndex.< | + | |
| + | The Pod5 files are delivered in a TAR file. The structure inside this file is the directory structure of the run's //pod5// directory. | ||
| + | |||
| + | < | ||
| + | <SLX>.< | ||
| + | < | ||
| </ | </ | ||
| Line 58: | Line 61: | ||
| The library comes with some Python tools around the C++ core that allow you to manipulate the files. The tool set has one shortcoming, though: it offers no easy way to split a large Pod5 file into chunks of a fixed size (by number of reads). We have created a tool for this job, which is available at [[https:// | The library comes with some Python tools around the C++ core that allow you to manipulate the files. The tool set has one shortcoming, though: it offers no easy way to split a large Pod5 file into chunks of a fixed size (by number of reads). We have created a tool for this job, which is available at [[https:// | ||
| - | |||
| - | The PromethION creates many Pod5 files as it runs. Distributing such a large collection of files is impractical for us, so these small files are merged into one very big Pod5 file (using the '' | ||
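
The idea of splitting a large Pod5 file into chunks of a fixed size (by number of reads) can be illustrated with a minimal sketch. This is not the tool linked above, and it does not read or write Pod5 data: it only shows the batching logic, assuming the read IDs have already been obtained from the file by some other means. The function names ''chunk_read_ids'' and ''plan_split'' and the output file name pattern are hypothetical, chosen for illustration only.

```python
from typing import Dict, Iterable, Iterator, List


def chunk_read_ids(read_ids: Iterable[str], chunk_size: int) -> Iterator[List[str]]:
    """Yield successive lists of at most chunk_size read IDs, preserving order."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    chunk: List[str] = []
    for read_id in read_ids:
        chunk.append(read_id)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    # The final chunk may hold fewer reads than chunk_size.
    if chunk:
        yield chunk


def plan_split(
    read_ids: Iterable[str], chunk_size: int, prefix: str = "chunk"
) -> Dict[str, List[str]]:
    """Map hypothetical output file names to the read IDs each would contain."""
    return {
        f"{prefix}_{index:04d}.pod5": ids
        for index, ids in enumerate(chunk_read_ids(read_ids, chunk_size))
    }


if __name__ == "__main__":
    # Made-up read IDs standing in for those of a real Pod5 file.
    example_ids = [f"read-{n}" for n in range(10)]
    for name, ids in plan_split(example_ids, 4).items():
        print(name, len(ids))
```

Each list of IDs produced this way could then be handed to whatever tool extracts reads by ID into a new file. Working from IDs rather than raw signal keeps the planning step cheap, since only the identifiers need to be held in memory.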