New papillomaviruses identified in Malayan and Chinese pangolins

Papillomaviruses are non-enveloped, double-stranded DNA (dsDNA) viruses with a circular genome. In the vertebrate host, infection with these viruses can range from subclinical infection through cutaneous and mucosal warts to cancerous lesions.

A new study published in the journal Biology Letters identifies two new lineages of papillomaviruses by mining genome data from Chinese and Malayan pangolins.

Study: Discovery of new papillomaviruses in critically endangered Malayan and Chinese pangolins.  Image credit: 2630ben /

What are pangolins?

Pangolins are nocturnal, scaly, insectivorous mammals belonging to the order Pholidota. The genera Phataginus and Smutsia are found in Africa, while the genus Manis is found in Asia.

Four species of pangolin live in Asia: the Philippine pangolin (Manis culionensis), the Indian pangolin (Manis crassicaudata), the Malayan pangolin (Manis javanica), and the Chinese pangolin (Manis pentadactyla). The International Union for Conservation of Nature (IUCN) has designated the Philippine, Malayan and Chinese pangolins as Critically Endangered because their populations are declining as a result of overexploitation and trade in their scales.

Recently, pangolins have emerged as potential hosts for viral diseases after severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2)-related coronaviruses were reported in Malayan pangolins. In addition, pangolins have also been implicated as hosts for other ribonucleic acid (RNA) viruses such as picornaviruses, pneumoviruses, reoviruses, flaviviruses, and canine distemper virus (Paramyxoviridae).

The diversity of DNA viruses in pangolins and their association with disease remain unclear.

About the study

The researchers of the current study discovered an unidentified papillomavirus contig in pangolins and then screened this sample using the tblastn algorithm. The 7307-bp contig, annotated using the PuMA pipeline, corresponded to the full genome of a papillomavirus encoding the proteins E1, E2, E6, E7, L1 and L2, together with two spliced products.

Subsequently, the short-read data of the re-sequenced genomes of 22 Chinese pangolins and 72 Malayan pangolins were screened. Short-read data from samples with more than 100 significant matches were downloaded first and then assembled de novo.
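The selection step described above, keeping only samples with more than 100 significant matches, can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: the function name `select_for_assembly`, the sample identifiers and the hit counts are all hypothetical, and the article does not state how hits were tallied per sample.

```python
from typing import Dict, List

def select_for_assembly(hit_counts: Dict[str, int], min_hits: int = 100) -> List[str]:
    """Return the IDs of samples with MORE than `min_hits` significant hits.

    `hit_counts` maps a sample ID to its number of significant matches.
    The threshold of 100 follows the article; everything else is assumed.
    """
    return sorted(sample for sample, hits in hit_counts.items() if hits > min_hits)

# Hypothetical counts, for illustration only (not data from the study).
example_counts = {"sample_A": 250, "sample_B": 100, "sample_C": 3}
selected = select_for_assembly(example_counts)
print(selected)
```

Note that the comparison is strict, so a sample with exactly 100 hits is not selected, matching the "more than 100" wording.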

Sequence reads were classified as forward, reverse or orphan reads. Finally, the complete viral genome contigs were annotated using PuMA.

Examination of the L1 gene contigs revealed an intact L1 open reading frame. Estimation of the isoelectric point and molecular weight of the predicted protein products was performed using ExPASy. Magic-BLAST was used to calculate the coverage of the assembled sequences.
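To make the idea of an "intact open reading frame" concrete, the sketch below checks whether a nucleotide sequence forms a single uninterrupted reading frame. It is an illustration under stated assumptions, not the study's procedure: it assumes the standard genetic code, an ATG start codon and a sequence already trimmed to the candidate coding region, and the function name `has_intact_orf` is hypothetical.

```python
# Stop codons of the standard genetic code (an assumption of this sketch).
STOP_CODONS = {"TAA", "TAG", "TGA"}

def has_intact_orf(seq: str) -> bool:
    """Return True if `seq` is one uninterrupted open reading frame.

    Criteria used here: length is a multiple of three, the first codon is
    ATG, the last codon is a stop codon, and no stop codon occurs in between.
    """
    seq = seq.upper()
    if len(seq) < 6 or len(seq) % 3 != 0:
        return False
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    if codons[0] != "ATG" or codons[-1] not in STOP_CODONS:
        return False
    # Any stop codon before the final one would truncate the protein.
    return not any(codon in STOP_CODONS for codon in codons[1:-1])

print(has_intact_orf("ATGAAATTTTAA"))  # intact toy ORF
print(has_intact_orf("ATGTAAAAATAA"))  # premature stop codon
```

A premature stop codon or a frameshift in an assembled contig would fail this kind of check, which is why confirming an intact L1 frame is a useful sanity test on an assembly.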

A Bayesian phylogeny of the E1 and L1 proteins was inferred to study the systematics of these viruses, using the ICTV-recognized papillomaviruses Tupaia belangeri papillomavirus 1 (TbelPV1) and Tupaia belangeri papillomavirus 2 (TbelPV2).

Study findings

Significant papillomavirus hits were identified in 11 of 22 Chinese pangolin samples and 36 of 73 Malayan pangolin samples. All samples with a known geographic origin came from Yunnan Province, China.

Specifically, one Chinese pangolin sample and ten Malayan pangolin samples had more than 100 significant hits. These samples were selected for genome assembly, which yielded two new papillomavirus genomes along with a reassembled genome for the reference contig.

The complete L1 gene was assembled for three of these samples. Moreover, three different L1 gene contigs were assembled from a single Chinese pangolin.

The assembled genomes were between 7253 and 7437 base pairs (bp) long, with a guanine-cytosine (GC) content between 39.84% and 40.09%.
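GC content is simply the percentage of bases in a sequence that are guanine or cytosine. The short sketch below shows the calculation; the function name `gc_content` and the toy sequence are illustrative assumptions, and ambiguous bases such as N are ignored here, which may differ from how the authors handled them.

```python
def gc_content(seq: str) -> float:
    """Return the GC content of `seq` as a percentage.

    Only unambiguous bases (A, C, G, T) are counted; other characters,
    such as N, are skipped. This handling is an assumption of the sketch.
    """
    bases = [base for base in seq.upper() if base in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc_count = sum(base in "GC" for base in bases)
    return 100.0 * gc_count / len(bases)

# Toy sequence for illustration only (not from the study).
print(gc_content("ATGC"))
```

Applied to a full assembled genome, the same calculation would give a single percentage comparable to the values reported above.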

All three genomes encoded the proteins E1, E2, E6, E7, L1 and L2, as well as two spliced products. In addition, seven of the eight L1 sequences formed a monophyletic group that clustered as a sister lineage to TbelPV1. The E1 clade was also placed as sister to TbelPV1.

Only one protein sequence from the Chinese pangolin fell outside this group, clustering instead with a different clade that includes Dyodeltapapillomavirus, Alphapapillomavirus and Omegapapillomavirus.


The current study identifies Southeast Asian pangolins as hosts of widespread and diverse papillomaviruses. These findings highlight the importance of in silico mining of host sequencing data to screen for novel viruses. Further research is needed to better understand the impact of papillomaviruses on pangolins and to develop conservation strategies for these animals.

Journal reference:

  • Barreat, JGN, Kamada, AJ, Reuben de Souza, C., et al. (2023). Discovery of new papillomaviruses in critically endangered Malayan and Chinese pangolins. Biology Letters. doi:10.1098/rsbl.2022.0464.
