Phylogenetic Analysis of Three Long Non-coding RNA Genes: AK082072, AK043754 and AK082467

Document Type: Research Article


1 Department of Biology, Faculty of Science, University of Isfahan, Isfahan, Iran

2 Department of Plant Breeding, Yazd Branch, Islamic Azad University, Yazd, Iran


It is now clear that proteins are only one of the functional products of the eukaryotic genome. Indeed, more of the human genome is transcribed into non-coding sequences than into protein-coding sequences. In this study, we selected three long non-coding RNAs (lncRNAs), namely AK082072, AK043754 and AK082467, which are expressed in the brain and contain locally conserved regions among vertebrates. The sequences of these genes are therefore appropriate for phylogenetic analysis. To evaluate the evolutionary and molecular trends of lncRNAs in vertebrates, we performed phylogenetic analysis and assessed the natural selection acting on these genes during evolution. The nucleotide sequences of the selected lncRNAs from different vertebrates were aligned, and phylogenetic trees were constructed using the Neighbor-Joining method with a maximum sequence difference of 0.75. Our search for closely related organisms with high sequence similarity, carried out with NCBI-BLAST tools and MEGA7, showed that the selected AK082072 sequences of human and M. fascicularis (macaque) were placed in the same cluster and may originate from a common ancestor. In addition, the human sequences of AK082467 and AK043754 were most similar to those of cow. Bioinformatic analysis also showed that the ratio of non-synonymous to synonymous substitution rates (dN/dS) is lower than 1 for all three genes, which indicates purifying selection on the longest predicted ORF of each lncRNA. Together, these results suggest that these lncRNAs act as regulatory genes with important roles in development.
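The two quantitative criteria mentioned above, a maximum sequence difference of 0.75 and a dN/dS ratio below 1, can be illustrated with a minimal Python sketch. It is not the MEGA7 or NCBI-BLAST workflow used in the study. It assumes an uncorrected p-distance as the measure of sequence difference, and the function names, the toy alignment and the dN and dS values are all hypothetical and chosen only for illustration.

```python
from itertools import combinations

GAP = "-"


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap are skipped (pairwise deletion).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == GAP or b == GAP:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differing / compared


def pairwise_distances(alignment):
    """Return {(name_x, name_y): p-distance} for every pair of sequences."""
    return {
        (x, y): p_distance(alignment[x], alignment[y])
        for x, y in combinations(sorted(alignment), 2)
    }


def within_max_difference(distances, max_diff=0.75):
    """Keep only the pairs whose distance does not exceed max_diff."""
    return {pair: d for pair, d in distances.items() if d <= max_diff}


def selection_regime(dn, ds):
    """Classify a dN/dS ratio: below 1 purifying, above 1 positive, else neutral."""
    if ds <= 0:
        raise ValueError("dS must be positive to form the ratio")
    omega = dn / ds
    if omega < 1:
        return "purifying"
    if omega > 1:
        return "positive"
    return "neutral"


# Toy alignment, invented for illustration; not sequences from the study.
toy_alignment = {
    "human": "ATGCCGTA-GCT",
    "macaque": "ATGCCGTAAGCT",
    "cow": "ATGACGTT-GCA",
}

distances = pairwise_distances(toy_alignment)
kept = within_max_difference(distances, max_diff=0.75)

for pair, d in sorted(kept.items()):
    print(pair, round(d, 3))

# Hypothetical dN and dS values, used only to show the classification.
print(selection_regime(dn=0.1, ds=0.5))
```

A distance matrix of this kind is the input that a Neighbor-Joining implementation would cluster into a tree. In practice the distances are usually corrected with a substitution model rather than taken as raw proportions.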

