Diverse cis factors controlling Alu retrotransposition: What causes Alu elements to die?
The human genome contains nearly 1.1 million Alu elements (11% of its total DNA content). Alu elements use a copy-and-paste retrotransposition mechanism that can result in de novo disease insertion alleles. However, the majority of these copies are inactive. There are nearly 900,000 old Alu elements from subfamilies S and J and about 200,000 from subfamily Y or younger, including a few thousand copies of the Ya5 subfamily, which accounts for the majority of current activity. Only Alu elements belonging to the younger subfamilies are currently amplifying in the human genome. Given the much higher copy number of the older Alu subfamilies, it is not known why the active Alu elements belong to the younger subfamilies. We present a systematic analysis evaluating the impact on retrotransposition of the sequence variation observed in the different sections of an Alu. We have identified several sequence regions of the Alu that contribute to the difference in activity between old and young elements. We have determined that the length of the longest uninterrupted run of adenines in the A-tail, the degree of A-tail heterogeneity, the length of the 3' unique end after the A-tail and before the RNA polymerase III terminator, and random mutations found in the right monomer all modulate retrotransposition efficiency. These changes occur over different time frames: some contribute to the inactivation of an Alu soon after insertion, whereas others are likely to occur at a later evolutionary time point. The combined impact of sequence changes in all of these regions allows us to explain why young Alus are currently causing disease through retrotransposition, whereas old Alus have lost their ability to retrotranspose. We present a predictive model to evaluate the retrotransposition capability of individual Alu elements and have successfully applied it to identify the first putative source element for a disease-causing Alu insertion in a patient with cystic fibrosis.
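
As an illustration only, three of the sequence features named above (longest uninterrupted adenine run, A-tail heterogeneity, and length of the 3' unique end) could be measured along the following lines. This is a minimal sketch and not the predictive model itself. The function names, the convention that the caller supplies the A-tail and the downstream flank as separate strings, the use of the fraction of non-A bases as a heterogeneity measure, and the treatment of a run of four or more thymines as the RNA polymerase III terminator are all assumptions made for the example.

```python
import re
from typing import Optional


def longest_a_run(a_tail: str) -> int:
    """Length of the longest uninterrupted run of adenines in the A-tail."""
    runs = re.findall(r"A+", a_tail.upper())
    return max((len(run) for run in runs), default=0)


def a_tail_heterogeneity(a_tail: str) -> float:
    """Fraction of non-A bases in the A-tail (an assumed, simple measure)."""
    tail = a_tail.upper()
    if not tail:
        return 0.0
    return sum(base != "A" for base in tail) / len(tail)


def unique_3prime_length(flank: str, min_t_run: int = 4) -> Optional[int]:
    """Bases between the end of the A-tail and the first run of at least
    min_t_run thymines (assumed here to mark the Pol III terminator).
    Returns None when no such run is present in the supplied flank."""
    match = re.search("T{%d,}" % min_t_run, flank.upper())
    return match.start() if match else None


# Hypothetical sequences, invented for the example (not taken from the study).
example_tail = "AAAAAAAAAATAAAAACAAA"
example_flank = "GCTAGCTAGGTTTTGCA"

print(longest_a_run(example_tail))          # 10
print(a_tail_heterogeneity(example_tail))   # 0.1
print(unique_3prime_length(example_flank))  # 10
```

Under these assumptions, an element with a long homogeneous A-tail and a short 3' unique end would score as a more plausible active candidate, but how such features are weighted and combined is not specified here and would have to come from the full model.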