Analysis of the Epstein-Barr Virus Lytic Transcriptome Using High-Throughput Sequencing Methods
The lytic replicative cascade in Epstein-Barr virus (EBV) and other herpesviruses has traditionally been understood to involve the ordered expression of sets of protein-coding genes. Recent experiments using tiling microarrays and next-generation sequencing, however, have indicated extensive transcription outside of known coding regions. In this study, strand-specific Illumina RNA-Seq reveals abundant antisense and intergenic transcription of the EBV genome during lytic replication. Both polyadenylated and non-polyadenylated transcripts are shown to arise from nearly the entire genome. However, the complex and overlapping nature of these transcripts confounds attempts to resolve their structures with short-read RNA-Seq alone. To resolve the structures of polyadenylated transcripts on a global level, the Transcriptome Resolution by Integration of Multi-platform Data (TRIMD) method was developed. This method combines data from Pacific Biosciences long-read SMRT sequencing (Iso-Seq protocol) with data from Illumina RNA-Seq and deepCAGE. Using TRIMD, we identify nearly 300 unannotated transcripts in replicating Epstein-Barr virus. These transcripts illustrate multiple strategies by which the virus achieves its remarkable level of transcript diversity, including alternative promoter usage, alternative splicing, readthrough transcription and intergenic splicing. The TRIMD method is simple and flexible, and scripts have been developed to facilitate its application to other genomes.