Identify pathogens from a stream of RNA sequences

Project duration: 2014-2015

Funding: BICRO POC 5

One of the main tasks of medical diagnostics is to determine whether pathogens are present in a patient. To do so, a trained professional needs to track the symptoms, and a sample of the infected tissue needs to be sent to a lab. This procedure can take a long time and still not result in a successful diagnosis. By quickly retrieving the complete genetic content of the infected cell, new RNA sequencing technology offers what is likely to become a universal method of pathogen detection and identification. We propose a computer-based pathogen identification method. By comparing the sequenced transcriptome of the infected cell with the genetic data available in various biological databases, the proposed method will be able to identify not only the organisms present, but also the genes active at the time of sequencing. This is to be achieved by mapping the pieces of genetic data from the sequencer (the reads) to the genes of the host cell and of probable pathogens, followed by a probabilistic estimation of, in simple terms, what goes where.
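The "what goes where" estimation can be illustrated with a small sketch. The description above does not name a specific algorithm, so the code below assumes a simple expectation-maximization (EM) scheme, one common way to distribute ambiguously mapped reads among candidate references. The function name `estimate_abundances`, the input layout, and the reference names in the example are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def estimate_abundances(read_hits, n_iter=50):
    """Estimate relative abundances of references from read mappings.

    read_hits: list of lists; each inner list holds the names of the
    references (host genes or pathogen genes) that one read maps to.
    Assumption: every mapping of a read is equally good, so only the
    current abundance estimates decide how a read is split.
    """
    refs = sorted({r for hits in read_hits for r in hits})
    if not refs:
        return {}
    # Start from a uniform guess over all references that received a hit.
    theta = {r: 1.0 / len(refs) for r in refs}
    for _ in range(n_iter):
        counts = defaultdict(float)
        # E-step: split each read among its candidate references
        # in proportion to the current abundance estimates.
        for hits in read_hits:
            if not hits:
                continue  # unmapped read, carries no information here
            total = sum(theta[r] for r in hits)
            if total == 0.0:
                continue  # guard against numerical underflow
            for r in hits:
                counts[r] += theta[r] / total
        # M-step: renormalize the fractional counts into abundances.
        n = sum(counts.values())
        theta = {r: counts[r] / n for r in refs}
    return theta


# Hypothetical toy input: five reads, two of them ambiguous.
example_hits = [
    ["host_geneA"],
    ["host_geneA"],
    ["host_geneA", "virus_X"],
    ["virus_X"],
    ["virus_X", "bacterium_Y"],
]
abundances = estimate_abundances(example_hits)
for name, value in sorted(abundances.items()):
    print(f"{name}: {value:.3f}")
```

In this toy input, bacterium_Y is supported only by a read that also maps to virus_X, which additionally has a read of its own, so the estimate for bacterium_Y shrinks with each iteration. A real pipeline would also weight each mapping by alignment quality and reference length, which this sketch leaves out.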