Descriptive mitogenomics

Avian mitogenome organisation

In the avian ground pattern, the mitogenome contains two rRNA genes, 22 tRNA genes, 13 protein-coding genes, an elongate non-coding control region, and possibly an extended tandem duplication. Apart from the duplication, this gene content is typical of almost all metazoans (Bernt et al., 2014).

Compared to other vertebrates, in avian mitogenomes the positions of the adjacent gene clusters [CYB:T:P] and [ND6:E] are interchanged, with the derived gene order [CYB:T:P:ND6:E] representing an avian ground-pattern apomorphy (Montaña-Lozano et al., 2022).

Mitochondrial gene map, depicting the putative ancestral avian pattern (not to scale). The tandem duplication (TD), extending between the non-coding control region (CR) and gene F, is shown separately as it is unclear whether it pertains to the avian ground pattern. When fully developed, the TD contains a pseudogene Ψ (a degenerate copy of CYB), four functional genes (T, P, ND6 and E), and an extended control region (Urantówka et al., 2020). Note: tRNA genes are depicted by their one-letter amino-acid code; red colour indicates genes that are encoded on the secondary (-) strand; spacers and overlaps are not considered.

Avian tandem duplication

Most avian mitogenomes are distinguished from typical vertebrate mitogenomes by the presence of a large tandem duplication comprising the control region and several adjacent genes (Urantówka et al., 2018, 2020, 2021; Mackiewicz et al., 2019). Since Haring et al. (1999), this region has also been referred to as the “pseudo-control region”. Duplicated genes are often almost identical to their counterparts, a phenomenon referred to as “sequence homogenisation” or “concerted evolution” (Cadahía et al., 2009; Eberhard et al., 2001; Kim et al., 2021; Morris-Pocock et al., 2010; Urantówka et al., 2021). The molecular mechanism underlying this phenomenon is unknown.

In Galloanserae, tandem duplications are absent throughout. It is unclear, however, whether this absence is primary or secondary. Although the presence of a tandem duplication is a ground-pattern trait of most avian orders (Mackiewicz et al., 2019), there is a considerable amount of homoplasy in the observed configurations. Some authors refer to pseudogenes simply as long intergenic spacers (e.g. Bai et al., 2023).

Various types of tandem duplications may be distinguished: 

Schematic map of the original tandem duplication (type 0) and variously derived configurations (types 1-7). The control region of copy 1 was lost in moa (Dinornithiformes), recently extinct palaeognaths from New Zealand (type 7).

ATP6

  • ATP8 connection: 10 bp overlap (ATG.AAC.CTA.A, sometimes T)
  • Start codon: ATG
  • Length: 684 bp (227 amino acids)
  • Indels: none
  • Stop codon: TAA
  • CO3 connection: 1 bp overlap (A)

ATP8

  • tRNA Lys (K) connection: none
  • Start codon: ATG
  • Length: 165 bp or 168 bp (54 or 55 amino acids)
  • Indels: 3 bp between 132 and 133 (codons #44 and #45)
  • Stop codon: TAA
  • ATP6 connection: 10 bp overlap (A.TGA.ACC.TAA, sometimes T)
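
The two notations given for the ATP8/ATP6 overlap describe the same 10 bases read in two different frames. The following minimal Python sketch illustrates this; the helper name `split_codons` is hypothetical and introduced only for this example.

```python
def split_codons(seq: str, offset: int) -> str:
    """Split seq into dot-separated triplets, with the reading frame
    starting at 0-based `offset`; any bases before the offset are
    kept as a leading partial codon."""
    head = [seq[:offset]] if offset else []
    body = [seq[i:i + 3] for i in range(offset, len(seq), 3)]
    return ".".join(head + body)


overlap = "ATGAACCTAA"  # the 10 bp shared by ATP8 and ATP6 (as listed above)

print(split_codons(overlap, 0))  # ATP6 frame, beginning with the start codon ATG
print(split_codons(overlap, 1))  # ATP8 frame, ending with the stop codon TAA
```

The same helper can be applied to the 7 bp ND4L/ND4 overlap (ATGCTAA), which likewise reads as a start codon in one frame and a stop codon in the other.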

CO1

  • tRNA Tyr (Y) connection: none
  • Start codon: GTG (but ATG e.g. in Accipitridae, Jacanidae, Meropidae, Strigidae)
  • Length: 1551 bp (516 amino acids)
  • Indels: none
  • Stop codon: AGG
  • tRNA Ser-UCN (S1) connection: 9 bp overlap (CAA.GAA.AGG, sometimes G)

CO2

  • tRNA Asp (D) connection: 1 bp spacer (C/T)
  • Start codon: ATG/GTG
  • Length: 684 bp or 687 bp (227 or 228 amino acids)
  • Indels: 3 bp between 678 and 682 (codons #226 and #228)
  • Stop codon: TAA
  • tRNA Lys (K) connection: 1 bp spacer (C/T, rarely G)

CO3

  • ATP6 connection: 1 bp overlap (A)
  • Start codon: ATG
  • Length: 784 bp (261 amino acids)
  • Indels: none
  • Stop codon: T
  • tRNA Gly (G) connection: 1 bp spacer (G)

CYB

  • ND5 connection: none
  • Start codon: ATG
  • Length: 1143 bp (380 amino acids)
  • Indels: none
  • Stop codon: TAA (sometimes TAG)
  • tRNA Thr (T) connection: none

ND1

  • tRNA Leu-UUR (L1) connection: variable (6-15 bp spacer)
  • Start codon: ATG
  • Length: 978 bp (325 amino acids), sometimes 975 bp (324 amino acids)
  • Indels: variable (3 bp deletion at positions 4-6; 3 bp deletion at 10-12; 1 bp deletion at 973, creating new stop codon)
  • Stop codon: AGG, sometimes TAA
  • tRNA Ile (I) connection: usually 2 bp overlap (GG), but none when stop codon TAA

ND2

  • tRNA Met (M) connection: none (rarely 1 bp spacer)
  • Start codon: ATG
  • Length: 1041 bp (346 amino acids)
  • Indels: none
  • Stop codon: TAG, sometimes TAA
  • tRNA Trp (W) connection: 2 bp overlap, usually (AG), sometimes (AA)

ND3

  • tRNA Gly (G) connection: none
  • Start codon: ATG, sometimes GTG
  • Length: 352 bp (116 amino acids)
  • Indels: untranslated 1 bp insertion (mostly C) at 174, sometimes absent
  • Stop codon: TAA, sometimes TAG
  • tRNA Arg (R) connection: none

Comment: the protein-coding gene ND3 is peculiar in having an extra nucleotide (mostly cytosine) at position 174. The insertion probably pertains to the avian ground pattern but has been lost many times during avian evolution (Jing et al., 2020, suppl. 12). The extra base, however, appears not to be processed during translation as the downstream reading frame and amino-acid sequence are conserved due to a translational (+1)-frameshift (Mindell et al., 1998b; Al-Arab et al., 2017; Andreu-Sánchez et al., 2020). 
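
For annotation purposes, the untranslated nucleotide has to be skipped before the reading frame is evaluated. The sketch below shows this on a synthetic placeholder sequence (not a real ND3 sequence); the function name and the default position of 174 are assumptions taken from the description above.

```python
def remove_untranslated_base(cds: str, position: int = 174) -> str:
    """Return the coding sequence without the base at 1-based `position`."""
    if not 1 <= position <= len(cds):
        raise ValueError("position lies outside the sequence")
    return cds[:position - 1] + cds[position:]


# Synthetic 351 bp in-frame placeholder: start codon, 115 sense codons, stop codon.
in_frame = "ATG" + "GCA" * 115 + "TAA"

# Insert an extra C so that it occupies position 174 (1-based), giving 352 bp.
with_extra = in_frame[:173] + "C" + in_frame[173:]

restored = remove_untranslated_base(with_extra)
print(len(with_extra), len(restored), len(restored) % 3)
```

With the extra base removed, the 352 bp gene becomes a 351 bp reading frame, i.e. 116 sense codons plus the stop codon.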

ND4

  • ND4L connection: 7 bp overlap (ATG.CTA.A, sometimes T)
  • Start codon: ATG
  • Length: 1378 bp (459 amino acids)
  • Indels: none
  • Stop codon: T
  • tRNA His (H) connection: none

Comment: it remains unclear whether the three 1 bp insertions (at positions 180/181, 318/319, 390/391) reported to occur in the non-annotated ND4 gene of Stictonetta naevosa (Anatidae, Oxyurinae) are reliable or whether they are due to sequencing errors (GenBank accession number CM021835).

ND4L

  • tRNA Arg (R) connection: none
  • Start codon: ATG, rarely GTG
  • Length: 297 bp (98 amino acids)
  • Indels: none
  • Stop codon: TAA
  • ND4 connection: 7 bp overlap (mostly A.TGC.TAA, sometimes T)

ND5

  • tRNA Leu-CUN (L2) connection: none
  • Start codon: ATG/GTG
  • Length: variable (1806, 1809, 1815, 1818, 1821, 1824 or 1827 bp)
  • Indels: variable (3 bp insertion between 9 and 10; 3 bp deletion at 13-15; 3 bp deletion at 91-93; 3 bp deletion at 619-621; 3 bp deletion at 1528-30; 3 bp deletion at 1531-33; 3 bp deletion at 1534-36; 3 bp deletion at 1540-42; 3 bp deletion at 1810-12; 2 bp deletion at 1816/1817)
  • Stop codon: AGA/TAA
  • CYB connection: variable (from 4-12 bp spacers to 1 bp overlap)

ND6

  • tRNA Pro (P) connection: none
  • Start codon: ATG
  • Length: 522 bp (173 amino acids)
  • Indels: none
  • Stop codon: TAG, sometimes AGG or TAA
  • tRNA Glu (E) connection: none

Comment: ND6 is the only protein-coding gene that is encoded on the secondary (-) strand.
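
The lengths and amino-acid counts listed for the protein-coding genes follow simple arithmetic: gene length minus the stop codon (3 bp, or 1 bp where only T is listed) minus any untranslated base, divided by three. The following sketch checks a few of the figures given above; the function name and its parameters are hypothetical.

```python
def amino_acid_count(length_bp: int, stop_bp: int = 3, untranslated_bp: int = 0) -> int:
    """Number of sense codons implied by a gene length.

    stop_bp is 3 for a complete stop codon and 1 for an incomplete one (T);
    untranslated_bp covers cases such as the extra nucleotide in ND3.
    """
    coding = length_bp - stop_bp - untranslated_bp
    if coding < 0 or coding % 3:
        raise ValueError(f"{length_bp} bp does not give a whole number of codons")
    return coding // 3


# (gene, length in bp, stop codon length, untranslated bases), values from the lists above
examples = [
    ("ATP6", 684, 3, 0),
    ("CO1", 1551, 3, 0),
    ("CO3", 784, 1, 0),   # incomplete stop codon T
    ("ND3", 352, 3, 1),   # one untranslated nucleotide
    ("ND4", 1378, 1, 0),  # incomplete stop codon T
    ("ND6", 522, 3, 0),
]

for gene, length, stop, extra in examples:
    print(gene, amino_acid_count(length, stop, extra))
```

A length that does not resolve to a whole number of codons raises an error, which may flag an undetected indel or an annotation problem.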

RNR1

  • tRNA Phe (F) connection: 1 bp overlap (A)
  • Length: variable (965-995 bp)
  • Indels: numerous, mostly in hypervariable regions (HV1-15)
  • tRNA Val (V) connection: usually none, sometimes 1 bp spacer (A, T, C)

Comment: RNR1 exhibits numerous hypervariable regions of varying length. Due to the occurrence of indels, these regions cannot be reliably aligned and therefore must be excluded from phylogenetic analyses. The sequences of Aix galericulata (GenBank accession number KF437906) are quite different from the remaining sequences and need to be verified.

RNR2

  • tRNA Val (V) connection: 0-2 bp overlaps, or 0-2 bp spacers (?)
  • Length: variable (1593-1625 bp)
  • Indels: numerous, mostly in hypervariable regions (HV1-23)
  • tRNA Leu-UUR (L1) connection: 0-2 bp spacer (?)

Comment: RNR2 exhibits numerous hypervariable regions of varying length. Due to the occurrence of indels, these regions cannot be reliably aligned and therefore must be excluded from phylogenetic analyses. 
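
As a rough illustration of excluding unalignable positions, the sketch below removes every alignment column that contains a gap. This is only a crude proxy, assumed here for demonstration: the hypervariable regions referred to above are delimited by inspection of the alignment and are not simply the gapped columns. The function name is hypothetical.

```python
def drop_gapped_columns(alignment: list[str], gap: str = "-") -> list[str]:
    """Return the alignment with all columns containing a gap removed."""
    if not alignment:
        return []
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    keep = [
        i for i in range(len(alignment[0]))
        if all(row[i] != gap for row in alignment)
    ]
    return ["".join(row[i] for i in keep) for row in alignment]


# Toy alignment (invented sequences, for illustration only)
toy = ["AC-GT", "ACTGT", "A--GT"]
print(drop_gapped_columns(toy))
```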

tRNA Leu-UUR

The D-arm of tRNA Leu-UUR contains a conserved motif (5'-TGGCAGAGCCCGG-3') that may be involved in regulating the transcription of rRNA genes (Valverde et al., 1994; Guo et al., 2022).
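
A motif of this kind can be located by a plain string search on both strands. The sketch below is a minimal example using exact matching only (no tolerance for substitutions); the function names are hypothetical and the test sequence is invented.

```python
MOTIF = "TGGCAGAGCCCGG"  # motif as quoted above


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_motif(seq: str, motif: str = MOTIF) -> list[tuple[int, str]]:
    """Return (1-based start, strand) for every exact motif occurrence."""
    seq = seq.upper()
    hits = []
    for strand, target in (("+", motif), ("-", reverse_complement(motif))):
        start = seq.find(target)
        while start != -1:
            hits.append((start + 1, strand))
            start = seq.find(target, start + 1)
    return hits


print(find_motif("AAAA" + MOTIF + "TT"))
```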

Control region

The control region, which typically has a length of about 1,150 bp, is the only extended non-coding region of the mitogenome. This region is also referred to as the ‘D-loop’, although the true D-loop neither spans the entire control region nor is present in all mtDNA molecules at any given time (Pereira et al., 2008; Nicholls & Minczuk, 2014).

For descriptive purposes, Brown et al. (1986) first divided the control region into three domains, with a conserved central domain being flanked by highly variable domains on either side. The authors did not, however, define an exact boundary to separate domains 1 and 2 from each other. 

Avian mitochondrial codon translation code

Avian mitochondrial codon translation code (according to the Vertebrate Mitochondrial Code of NCBI Taxonomy).
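
For readers who wish to translate sequences programmatically, the Vertebrate Mitochondrial Code (NCBI translation table 2) can be written out compactly. The following sketch builds the codon table from the amino-acid string in NCBI's TCAG codon order; the function name `translate` is hypothetical, and the example sequence is invented.

```python
import itertools

BASES = "TCAG"
# Amino acids for all 64 codons in TCAG order (vertebrate mitochondrial code):
# AGA and AGG are stop codons, ATA codes for Met, TGA codes for Trp.
AMINO_ACIDS = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(itertools.product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence up to the first stop codon.

    A trailing incomplete codon (e.g. a single T) is ignored.
    """
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE[cds[i:i + 3].upper()]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


print(translate("ATGTGAATATAA"))
```

The stop codons AGG and AGA listed for some of the genes above (e.g. CO1 and ND5) are recognised as stops by this table, whereas they would be read as arginine under the standard code.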

Critical comments

In mitogenomics, there is an obvious lack of conventions (Alexeyev, 2020), e.g.:

  • There is no unequivocal assignment of strands. Historically, the strand with the lower G+T content was referred to as the (L)-strand (Anderson et al., 1981). However, present-day strand assignments in vertebrates do not comply with the original definition (Barroso Lima & Prosdocimi, 2017; Alexeyev, 2020). Recommendation: to avoid confusion, the strands should be distinguished by the relative number of genes contained, with the (+) strand being the primary or main coding strand containing more genes than the secondary (-) strand (Taanman, 1999, fig. 1; Gissi et al., 2008; Barroso Lima & Prosdocimi, 2017). [In insects, the complementary strands are referred to as minority and majority strand, respectively].
  • Circular maps may be oriented with functional elements (i.e. genes and control region) arranged either clockwise or counter-clockwise (Alexeyev, 2020). Recommendation: circular maps should always be displayed in clockwise orientation.
  • Circular maps may display any gene or the control region at the top (12 o’clock) position. Recommendation: circular maps should display gene F (the DNA template for tRNA Phe) at the top centre.
  • Annotations may start with any gene or the control region; they can even start within the control region (e.g. in human mitochondrial annotations). Recommendation: vertebrate annotations should start with gene F and have the control region at the end (see e.g. Montaña-Lozano et al., 2022, Fig. 3). 
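
The recommended starting point can be enforced by rotating the circular gene order. The sketch below assumes the annotation is available as a simple list of gene names; the function name is hypothetical, and the list is an abbreviated subset used for illustration only, not a complete mitogenome.

```python
def rotate_to_start(genes: list[str], start: str = "F") -> list[str]:
    """Rotate a circular gene order so that it begins with `start`."""
    i = genes.index(start)  # raises ValueError if `start` is absent
    return genes[i:] + genes[:i]


# Abbreviated gene order beginning at an arbitrary point of the circle
annotation = ["CYB", "T", "P", "ND6", "E", "CR", "F", "RNR1", "V", "RNR2"]

rotated = rotate_to_start(annotation)
print(rotated)
```

Because the control region directly precedes gene F on the circle, starting at F automatically places the control region at the end of the list.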

References

Al-Arab M, Höner zu Siederdissen C, Tout K, and Sahyoun AH (2017), Accurate annotation of protein-coding genes in mitochondrial genomes, Mol. Phylogenet. Evol. 106, 209-216. (abstract)

Aleix-Mata G, Ruiz-Ruano FJ, Perez JM, Sarasa M, and Sanchez A (2019), Complete mitochondrial genomes of the Western Capercaillie Tetrao urogallus (Phasianidae, Tetraoninae), Zootaxa 4550, 585-593. (pdf)

Alexeyev M (2020), Mitochondrial DNA: the common confusions, Mitochondrial DNA A 31, 45-47. (pdf)

Anderson S, Bankier AT, Barrell BG, de Bruijn MH, Coulson AR, Drouin J, Eperon IC, Nierlich DP, Roe BA, and Sanger F (1981), Sequence and organization of the human mitochondrial genome, Nature 290, 457-465. (abstract)

Andreu-Sánchez S, Chen W, Stiller J, and Zhang G (2020), Multiple origins of a frame shift insertion in a mitochondrial gene in birds and turtles, GigaScience 10, 1-11. (free pdf)

Bai G, Yuan Q, Guo Q, and Duan Y (2023), Identification and phylogenetic analysis in Pterorhinus chinensis (Aves, Passeriformes, Leiotrichidae) based on complete mitogenome, ZooKeys 1172, 15-30. (free pdf)

Barroso Lima NCB, and Prosdocimi F (2017), The heavy strand dilemma of vertebrate mitochondria on genome sequencing age: number of encoded genes or G+T content?, Mitochondrial DNA A 29, 300-302. (abstract)

Brown GG, Gadeleta G, Pepe G, Saccone C, and Sbisá E (1986), Structural conservation and variation in the D-loop containing region of vertebrate mitochondrial DNA, J. Mol. Biol. 192, 503-511. (abstract)

Cadahía L, Pinsker W, Negro JJ, Pavlicev M, Urios V, and Haring E (2009), Repeated sequence homogenisation between the control and pseudo-control regions in the mitochondrial genomes of the subfamily Aquilinae, J. Exp. Zool. 312B, 171-185. (abstract)

Caparroz R, Rocha AV, Cabanne GS, Tubaro P, Aleixo A, Lemmon EM, and Lemmon AR (2018), Mitogenomes of two neotropical bird species and the multiple independent origin of mitochondrial gene orders in Passeriformes, Mol. Biol. Rep. 45, 279-285. (abstract)

Dey P, Sharma SK, Sarkar I, Ray SD, Pramod P, Kochiganti VHS, Quadros G, Rathore SS, Singh V, and Singh RP (2023), Complete mitochondrial genome of endemic plum-headed parakeet Psittacula cyanocephala - characterization and phylogenetic analysis, PLoS ONE 16, e:0241098. (free pdf)

D’Souza AR, and Minczuk M (2018), Mitochondrial transcription and translation: overview, Essays Biochem. 62, 309-320. (pdf)

Eberhard JR, Wright TF, and Bermingham E (2001), Duplication and concerted evolution of the mitochondrial control region in the parrot genus Amazona, Mol. Biol. Evol. 18, 1330-42. (free pdf)

Gissi C, Iannelli F, and Pesole G (2008), Evolution of the mitochondrial genome of Metazoa as exemplified by comparison of congeneric species, Heredity 101, 301-320. (free pdf)

Guo ZL, Zhang Y, Yang H, Wang TS, Wang WW, Zeng SS, Guo Y, Ye L, Du A, Wang ZW, Zeng SM, Tuan J, and Wang L (2022), Sequencing and structural characteristic analysis of mitochondrial genome in Zhijin White Goose (Anser cygnoides), Res. Square (pdf)

Haring E, Riesing MJ, Pinsker W, and Gamauf A (1999), Evolution of a pseudo-control region in the mitochondrial genome of Palearctic buzzards (genus Buteo), J. Zool. Syst. Evol. Res. 37, 185-194. (abstract)

Haring E, Kruckenhauser L, Gamauf A, Riesing MJ, and Pinsker W (2001), The complete sequence of the mitochondrial genome of Buteo buteo (Aves, Accipitridae) indicates an early split in the phylogeny of raptors, Mol. Biol. Evol. 18, 1892-1904. (free pdf)

Jiang C, Kang H, Zhou Y, Zhu W, Zhao X, Mohamed N, and Li B (2024), Selected lark mitochondrial genomes provide insight into the evolution of second control region with tandem repeats in Alaudidae (Aves, Passeriformes), Life 14, e:881. (free pdf)

Jiang L, Chen J, Wang P, Ren Q, Yuan J, Qian C, Hua X, Guo Z, Zhang L, Yang J, Wang Y, Zhang Q, Ding H, Bi D, Zhang Z, Wang Q, Chen D, and Kan X (2015), The mitochondrial genomes of Aquila fasciata and Buteo lagopus (Aves, Accipitriformes): sequence, structure, and phylogenetic analyses, PLoS ONE 10, e:0136297. (pdf) 

Jing M, Yang H, Li K, and Huang L (2020), Characterization of three new mitochondrial genomes of Coraciiformes (Megaceryle lugubris, Alcedo atthis, Halcyon smyrnensis) and insights into their phylogenetics, Genet. Mol. Biol. 43, e:20190392. (pdf)

Kang H, Li B, Ma X, and Xu Y (2018), Evolutionary progression of mitochondrial gene rearrangements and phylogenetic relationships in Strigidae (Strigiformes), Gene 674, 8-14. (abstract)

Kim JI, Do TD, Choi Y, Yeo Y, and Kim CB (2021), Characterization and comparative analysis of complete mitogenomes of three Cacatua parrots (Psittaciformes: Cacatuidae), Genes 12, e:209. (pdf)

Krzeminska U, Wilson R, Rahman S, Song BK, Seneviratne S, Gan HM, and Austin CM (2016), Mitochondrial genomes of the jungle crow Corvus macrorhynchos (Passeriformes: Corvidae) from shed feathers and a phylogenetic analysis of genus Corvus using mitochondrial protein-coding genes, Mitochondrial DNA 27, 2668-70. (abstract)

Kundu S, Alam I, Maheswaran G, Tyagi K, and Kumar V (2022), Complete mitochondrial genome of great frigatebird (Fregata minor): phylogenetic position and gene rearrangement, Biochem. Genet. 60, 1177-88. (abstract)

Lan G, Yu J, Liu J, Zhang Y, Ma R, Zhou Y, Zhu B, Wei W, Liu J, and Qi G (2024), Complete mitochondrial genome and phylogenetic analysis of Tarsiger indicus (Aves: Passeriformes: Muscicapidae), Genes 15, e:90. (free pdf)

Mackiewicz P, Urantówka AD, Kroczak A, and Mackiewicz D (2019), Resolving phylogenetic relationships within Passeriformes based on mitochondrial genes and inferring the evolution of their mitogenomes in terms of duplications, Genome Biol. Evol. 11, 2824-49. (free pdf)

Mindell DP, Sorenson MD, and Dimcheff DE (1998b), An extra nucleotide is not translated in mitochondrial ND3 of some birds and turtles, Mol. Biol. Evol. 15, 1568-71. (free pdf)

Montaña-Lozano P, Moreno-Carmona M, Ochoa-Capera M, Medina NS, Boore JL, and Prada CF (2022), Comparative genomic analysis of vertebrate mitochondria reveals a differential of rearrangement rate between taxonomic class, Sci. Rep. 12, e:5479. (pdf)

Morris-Pocock JA, Taylor SA, Birt TP, and Friesen VL (2010), Concerted evolution of duplicated mitochondrial control regions in three related seabird species, BMC Evol. Biol. 10, e:14. (pdf)

Nicholls TJ, and Minczuk M (2014), In D-loop: 40 years of mitochondrial 7S DNA, Exp. Geront. 56, 175-181. (abstract)

Pacheco MA, Battistuzzi FU, Lentino M, Aguilar RF, Kumar S, and Escalante AA (2011), Evolution of modern birds revealed by mitogenomics: timing the radiation and origin of major orders, Mol. Biol. Evol. 28, 1927-42. (free pdf)

Pereira F, Soares P, Carneiro J, Pereira L, Richards MB, Samuels DC, and Amorim A (2008), Evidence for variable selective pressures at a large secondary structure of the human mitochondrial DNA control region, Mol. Biol. Evol. 25, 2759-70. (free pdf)

Slack KE, Janke A, Penny D, and Arnason U (2003), Two new avian mitochondrial genomes (penguin and goose) and a summary of bird and reptile mitogenomic features, Gene 302, 43-52. (abstract)

Taanman JW (1999), The mitochondrial genome: structure, transcription, translation and replication, Biochim. Biophys. Acta 1410, 103-123. (free reading)

Urantówka AD, Kroczak A, Silva T, Padrón RZ, Gallardo NF, Blanch J, Blanch B, and Mackiewicz P (2018), New insight into parrot’s mitogenomes indicates that their ancestor contained a duplicated region, Mol. Biol. Evol. 35, 2989-3008. (free pdf)

Urantówka AD, Kroczak A, and Mackiewicz P (2020), New view on the organization and evolution of Palaeognathae mitogenomes poses the question on the ancestral rearrangement in Aves, BMC Genomics 21, e:874. (pdf)

Urantówka AD, Kroczak A, Strzala T, Zaniewicz G, Kurkowski M, and Mackiewicz P (2021), Mitogenomes of Accipitriformes and Cathartiformes were subjected to ancestral and recent duplications followed by gradual degradation, Genome Biol. Evol. 13, e:evab193. (pdf)

Valverde JR, Marco R, and Garesse R (1994), A conserved heptamer motif for ribosomal RNA transcription termination in animal mitochondria, Proc. Natl. Acad. Sci. USA 91, 5368-71. (pdf)

Wang X, Huang Y, Liu N, Yang J, and Lei F (2015), Seven complete genome sequences of bushtits (Passeriformes, Aegithalidae, Aegithalos): the evolution pattern in duplicated control regions, Mitochondrial DNA 26, 350-356. (abstract)

Yang C, Du X, Liu Y, Yuan H, Wang Q, Hou X, Gong H, Wang Y, Huang Y, Li C, and Ye H (2022), Comparative mitogenomics of the genus Motacilla (Aves, Passeriformes) and its phylogenetic implications, ZooKeys 1109, 49-65. (free pdf)

Yuan Q, Sha J, and Duan Y (2022), The new mitogenome of Erpornis zantholeuca (Aves: Passeriformes): sequence, structure, and phylogenetic analyses, Cytogenet. Genome Res. 162, 250-261. (abstract)

Yuan Q, Guo Q, Cao J, Luo X, and Duan Y (2023), Mitochondrial genomes of Sitta (S. himalayensis, S. nagaensis, and S. yunnanensis) and phylogenetic relationship (Aves: Sittidae), Genes 14, e:589. (free pdf)

Zhao Z, Alimo Z, Zhao X, Qin H, Dayananda B, Jiang L, and Chen W (2022), Complete mitochondrial genome of Turdus merula (Aves: Passeriformes: Turdidae) and related species: genome characteristics and phylogenetic relationships, Pakistan J. Zool. 2022, 1-17. (pdf)