The objective of NanoPAL is to introduce an out-of-core approach for porting Nanopore analytics to mobile devices such as tablets and smartphones, which are often used in extreme experimental settings that demand particular ergonomics and ease of sterilization.
Mobile (third-generation) sequencing technologies, including Oxford Nanopore’s MinION and SmidgION, output long sequence reads (up to hundreds of thousands of bases) in a portable form factor. These sequencing devices fit in the palm of a hand and require only a USB port. Unfortunately, the development of data analysis tools for these technologies is at a nascent stage, which limits the practical portability of these devices.
NanoPAL is a serial k-mer parser/counter for FAST5 files and a de Bruijn graph construction tool that can run on a hand-held device. To achieve this portability, we developed novel cache-oblivious data structures and out-of-core chunked processing algorithms. Our methods, which we refer to as the Nanopore Portable Analytics Library (NanoPAL), were implemented in ISO C++14 and compiled for Android devices.
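The chunked k-mer counting and de Bruijn edge derivation described above can be illustrated with a minimal C++14 sketch. It is not NanoPAL's implementation: the names `count_kmers_chunked`, `count_kmers_in_string`, `de_bruijn_edges`, and `Edge` are hypothetical, the input is assumed to be an already base-called stream of plain bases rather than a FAST5 file, and counts are accumulated in an in-memory `std::map`, whereas a true out-of-core version would presumably spill sorted runs to storage and merge them. The sketch only shows how carrying k-1 bases between chunks keeps boundary-spanning k-mers from being lost or counted twice.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Sorted k-mer -> occurrence count (std::map keeps keys in lexicographic order).
using KmerCounts = std::map<std::string, std::size_t>;

// Count k-mers from a stream of bases, reading at most chunk_size characters
// at a time. The last k-1 bases of each window are carried into the next one,
// so every k-mer that spans a chunk boundary is counted exactly once.
inline KmerCounts count_kmers_chunked(std::istream& in, std::size_t k,
                                      std::size_t chunk_size) {
    assert(k > 0 && chunk_size > 0);
    KmerCounts counts;
    std::string carry;                  // tail of the previous window
    std::vector<char> buf(chunk_size);  // fixed-size read buffer
    while (in) {
        in.read(buf.data(), static_cast<std::streamsize>(buf.size()));
        const std::streamsize got = in.gcount();
        if (got <= 0) break;
        std::string window = carry;
        window.append(buf.data(), static_cast<std::size_t>(got));
        for (std::size_t i = 0; i + k <= window.size(); ++i)
            ++counts[window.substr(i, k)];
        // Keep only the bases that can still start an uncounted k-mer.
        carry = window.size() >= k ? window.substr(window.size() - (k - 1))
                                   : window;
    }
    return counts;
}

// Convenience wrapper (hypothetical) for sequences already held in memory.
inline KmerCounts count_kmers_in_string(const std::string& seq, std::size_t k,
                                        std::size_t chunk_size) {
    std::istringstream in(seq);
    return count_kmers_chunked(in, k, chunk_size);
}

// One de Bruijn edge per distinct k-mer: (k-1)-prefix -> (k-1)-suffix.
struct Edge {
    std::string from;
    std::string to;
    std::size_t weight;  // number of times the k-mer was observed
};

inline std::vector<Edge> de_bruijn_edges(const KmerCounts& counts) {
    std::vector<Edge> edges;
    edges.reserve(counts.size());
    for (const auto& kv : counts) {
        const std::string& kmer = kv.first;
        if (kmer.size() < 2) continue;  // no (k-1)-mer nodes for k < 2
        edges.push_back({kmer.substr(0, kmer.size() - 1), kmer.substr(1), kv.second});
    }
    return edges;
}
```

Because the counts do not depend on `chunk_size`, the buffer can be sized to fit whatever RAM budget the device allows; only the number of read calls changes.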
Using MinION data (Zaire Ebolavirus species and others), we evaluate the time required to parse the reads and build the de Bruijn graph as a function of file size and RAM allocation. We compare these metrics to those of minimap/miniasm. On an LG Nexus 5 with 2 GB of RAM, 2 MB of L2 cache, and 16 GB of storage, the out-of-core NanoPAL processes FAST5 files at a rate of about 30 minutes per 0.5 GB, producing sorted k-mer and de Bruijn graph files. The recompiled minimap/miniasm tool cannot complete processing of FAST5 files larger than 170 MB. In conjunction with base calling and error correction, and with the addition of downstream assembly procedures, NanoPAL can be used effectively to analyze MinION/SmidgION data locally on a mobile device.
NanoPAL uses libseq, a C++11/C++14 library with facilities designed for Next Generation Sequencing (NGS) analysis. The libseq library relies heavily on templates, using static polymorphism and traits classes to improve runtime performance.
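To illustrate what static polymorphism and traits classes look like in this setting, the following C++14 sketch packs k-mers into 64-bit words. It does not reproduce libseq's API: `Dna`, `AlphabetTraits`, `PackerBase`, `ForwardPacker`, and `CanonicalPacker` are hypothetical names, and the 2-bit encoding (A=0, C=1, G=2, T=3) is an assumption made for the example. The traits class supplies the alphabet's encoding at compile time, and the curiously recurring template pattern (CRTP) lets the base class call the derived `finalize` without a virtual dispatch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical alphabet tag and traits class; not part of libseq.
struct Dna {};

template <typename Alphabet>
struct AlphabetTraits;  // specialized per alphabet

template <>
struct AlphabetTraits<Dna> {
    static constexpr unsigned bits = 2;  // bits per base
    // Assumed encoding: A=0, C=1, G=2; any other character is treated as T=3.
    static std::uint64_t encode(char c) {
        switch (c) {
            case 'A': return 0;
            case 'C': return 1;
            case 'G': return 2;
            default:  return 3;
        }
    }
};

// CRTP base: the call to Derived::finalize is resolved at compile time.
template <typename Derived, typename Alphabet>
class PackerBase {
public:
    std::uint64_t pack(const std::string& kmer) const {
        using Traits = AlphabetTraits<Alphabet>;
        assert(kmer.size() * Traits::bits <= 64);  // must fit in one word
        std::uint64_t word = 0;
        for (char c : kmer)
            word = (word << Traits::bits) | Traits::encode(c);
        return static_cast<const Derived&>(*this).finalize(word, kmer.size());
    }
};

// Returns the packed k-mer as read.
class ForwardPacker : public PackerBase<ForwardPacker, Dna> {
public:
    std::uint64_t finalize(std::uint64_t word, std::size_t) const { return word; }
};

// Returns the smaller of the k-mer and its reverse complement, so that both
// strands of the same locus map to a single key.
class CanonicalPacker : public PackerBase<CanonicalPacker, Dna> {
public:
    std::uint64_t finalize(std::uint64_t word, std::size_t k) const {
        std::uint64_t rc = 0;
        std::uint64_t w = word;
        for (std::size_t i = 0; i < k; ++i) {
            rc = (rc << 2) | (3u - (w & 3u));  // complement of the last base
            w >>= 2;                           // move to the previous base
        }
        return word < rc ? word : rc;
    }
};
```

With these definitions, `ForwardPacker{}.pack("ACG")` yields 6 and `ForwardPacker{}.pack("CGT")` yields 27, while `CanonicalPacker` maps both to 6 because CGT is the reverse complement of ACG. Since every call is bound at compile time, the compiler is free to inline the encoding loop, which is the usual motivation for this style over virtual functions.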