

Whole Genome Shotgun (WGS) sequencing reveals significant microbiome shifts in response to helminth infections
Bruce Rosa, John Martin and Makedonka Mitreva in collaboration with Yenny Djuardi and Maria Yazdanbakhsh (Unpublished)

Project Summary
WGS sequencing was performed on fecal samples from 10 individuals from Indonesia, who were sampled in both 2008 and 2010. Changes in helminth infection status and shifts in their microbiomes were quantified and statistically analyzed. N. americanus infections were found to cause significant shifts in bacterial presence and abundance in the human host, but these shifts are not reversed by subsequent removal of N. americanus and do not result in significant shifts in the overall diversity level of the community. N. americanus infections induced similar significant changes in bacterial community structure in different individuals, which coincided with a shift in the bacteria defining major enterotypes for these individuals. Finally, KEGG enrichment results suggest that N. americanus infections may result in a change in the bacterial-affected mucous layer of the intestine, either as a strategy to better reach sources of blood or as a consequence of blood feeding, while shifts in bacterial genes coinciding with A. lumbricoides and T. trichiura infection may be related to the anti-inflammatory activity previously observed for these species.

The full dataset for all 20 samples, containing metadata and depths of coverage (for all of the genomes examined in this study), can be downloaded here:


Selected Results from the Study
Quantification of helminth and bacterial abundance in 20 metagenomic shotgun samples

An average of 154 million whole genome shotgun (WGS) reads were captured from metagenomic DNA collected from fecal samples from 10 individuals from Indonesia (sampled in both 2008 and 2010). The reads were mapped to nematode genomes for species endemic in the region and to the NR bacterial database [1]. Five of the individuals were treated with albendazole and the other five were treated with placebos. Among the 20 samples, only three helminths were detected (Ascaris lumbricoides, Necator americanus, and Trichuris trichiura). A. lumbricoides was detected in 19 of the samples, and all 15 samples containing T. trichiura and all 6 samples containing N. americanus also contained A. lumbricoides (Figure 1A).

Figure 1: (A) The presence of helminths among 20 metagenome samples. (B) The abundance of each helminth species from the same individual in 2008 and 2010.

The relative abundance of each of the helminth species present in each sample was quantified (Figure 1B). A. lumbricoides was detected in 19 of the 20 samples, with only one individual being infected with it in 2008 and then not in 2010. For N. americanus, 3 of the 5 individuals who received albendazole treatment were cured of their infections between 2008 and 2010, 3 of the 5 individuals who received a placebo were newly infected between 2008 and 2010, and four individuals were not infected in either year. Six of the ten individuals were infected with T. trichiura in both years (including 3 who were treated with albendazole), two were treated with albendazole and were cured of the infection between 2008 and 2010, and one who was treated with placebo was newly infected. All 3 individuals who were cured of a helminth infection were treated with albendazole, and all 3 individuals who were newly infected with a helminth between 2008 and 2010 were treated with placebos.
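The status categories used above (cured, newly infected, infected in both years, never infected) can be sketched as a small classification over per-year presence calls. This is only an illustration: the function name, the labels, and the individual IDs are hypothetical, and the presence values are made up rather than taken from the study's dataset.

```python
from collections import Counter

def classify_change(infected_2008: bool, infected_2010: bool) -> str:
    """Label how one individual's infection status changed between samplings."""
    if infected_2008 and infected_2010:
        return "persistent"
    if infected_2008:
        return "cured"
    if infected_2010:
        return "newly infected"
    return "never infected"

# Hypothetical presence calls (individual -> (2008, 2010)); NOT the study's data.
presence = {
    "ind01": (True, False),
    "ind02": (False, True),
    "ind03": (False, False),
    "ind04": (True, True),
}

labels = {ind: classify_change(*calls) for ind, calls in presence.items()}
counts = Counter(labels.values())
print(labels)
print(counts)
```

Running the same classification once per helminth species would give the per-species tallies of the kind reported in the paragraph above.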

Clustering analysis suggests that N. americanus influences bacterial composition

The correlation of the abundance across all bacterial strains was compared between 2008 and 2010 for each individual (2008:2010 correlation; Figure 2). Individuals who were not infected with N. americanus in 2008 but were infected in 2010 had a lower average 2008:2010 correlation value (0.29) compared to individuals who were never infected with N. americanus (0.72; P = 0.012 according to a two-tailed t-test with unequal variance) and to individuals who were infected in 2008 but not in 2010 (i.e., cured by the albendazole treatment; 0.68; P = 0.015). These results suggest that N. americanus infections cause significant shifts in bacterial presence and abundance in the human host, but that these shifts are not reversed by subsequent removal of N. americanus. There were not enough cured or newly infected samples for T. trichiura or A. lumbricoides to perform similar statistical testing for these species.
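The comparison above has two steps: a per-individual correlation between the 2008 and 2010 strain abundance vectors, and an unequal-variance (Welch) t-test between groups of those correlation values. A minimal sketch follows. It assumes a Pearson correlation (the text does not say which coefficient was used), and all numbers are hypothetical toy values, not the study's data.

```python
import math
from statistics import mean, variance

def pearson(x, y):
    """Pearson correlation between two equal-length abundance vectors."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical strain abundances for one individual in each year.
abund_2008 = [12.0, 0.0, 3.5, 7.0, 1.0]
abund_2010 = [10.0, 0.5, 4.0, 6.0, 2.0]
print("2008:2010 correlation:", pearson(abund_2008, abund_2010))

# Hypothetical per-individual 2008:2010 correlations for two groups.
newly_infected = [0.2, 0.3, 0.4]
never_infected = [0.6, 0.7, 0.8, 0.8]
t, df = welch_t(newly_infected, never_infected)
print("t =", t, "df =", df)
```

Turning t and df into a two-tailed P value requires the Student's t distribution, which the Python standard library does not provide; a statistics package such as SciPy (`scipy.stats.ttest_ind` with `equal_var=False`) would do both steps in one call.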

Figure 2: Hierarchical clustering of individuals and bacterial strains based on differential bacterial abundance between 2008 and 2010.

Bacterial strain diversity was quantified using richness (the number of bacterial strains present) and the Shannon index [2] for each of the 20 samples. Strain richness ranged from 241 to 422 strains per sample, but no significant association between helminth presence (or change in helminth presence) and strain richness was found. Samples containing T. trichiura had a higher average Shannon index (3.41) than samples without (2.92; P = 0.047), but no such difference was found for N. americanus (P = 0.32), suggesting that N. americanus infections cause significant shifts in bacterial abundance patterns without affecting the overall diversity level of the community.
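For reference, the two diversity measures can be computed from a per-sample vector of strain abundances as sketched below. The Shannon index here is H = -sum(p_i * ln p_i) over strains with non-zero abundance; the natural logarithm is an assumption, since the text does not state the log base, and the abundance values are hypothetical.

```python
import math

def richness(abundances):
    """Number of strains with non-zero abundance in a sample."""
    return sum(1 for a in abundances if a > 0)

def shannon_index(abundances):
    """Shannon index H = -sum(p * ln p) over strains that are present."""
    total = sum(abundances)
    props = [a / total for a in abundances if a > 0]
    return -sum(p * math.log(p) for p in props)

# Hypothetical strain abundances for one sample; NOT the study's data.
sample = [40.0, 25.0, 0.0, 20.0, 10.0, 5.0]
print("richness:", richness(sample))
print("Shannon index:", shannon_index(sample))
```

Richness counts only presence, while the Shannon index also reflects how evenly abundance is spread across strains, which is why the two measures can disagree, as they do for T. trichiura above.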

Individuals were clustered according to the difference in abundance of each bacterial species between 2008 and 2010, using hierarchical clustering (and the Spearman rank correlation coefficient; XLSTAT-Pro version 2012.6.02, Addinsoft, Inc., Brooklyn, NY, USA; Fig. 2). The three individuals that clustered the most closely together in this analysis were all free of N. americanus infections in 2008 but infected with it in 2010. This suggests that N. americanus infections induced similar significant changes in bacterial community structure in different individuals (P = 0.002, for a comparison of similarity between newly infected N. americanus samples and other samples). No significant clustering pattern was found for individuals treated with albendazole or placebos, or for individuals from different villages.
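The study ran this clustering in XLSTAT. Purely as an illustration of the idea, the sketch below clusters individuals by their vectors of 2008-to-2010 abundance differences, using 1 minus the Spearman correlation as the distance. The average-linkage rule, the distance definition, and the toy data are all assumptions, not details taken from the study.

```python
import math

def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks.
    Undefined (division by zero) if either vector is constant."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

def average_linkage(profiles):
    """Naive agglomerative clustering; returns the merges in order."""
    names = list(profiles)
    dist = {frozenset((a, b)): 1 - spearman(profiles[a], profiles[b])
            for i, a in enumerate(names) for b in names[i + 1:]}
    clusters = [(n,) for n in names]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                pairs = [dist[frozenset((a, b))]
                         for a in clusters[i] for b in clusters[j]]
                d = sum(pairs) / len(pairs)  # mean pairwise distance
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges

# Hypothetical abundance differences (2010 minus 2008) per individual.
deltas = {
    "A": [-3.0, -2.0, 1.0, 4.0],
    "B": [-2.5, -1.0, 0.5, 3.0],
    "C": [2.0, 1.0, -1.0, -4.0],
}
merges = average_linkage(deltas)
for left, right, d in merges:
    print(left, "+", right, "at distance", round(d, 3))
```

In this toy example, individuals A and B change in the same rank order and merge first, which mirrors how the three newly infected individuals grouped together in the analysis above.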

Bacterial strains were also clustered according to each strain's abundance pattern across individuals, producing many distinct clustering regions (Fig. 2). The large cluster near the top of the diagram, which is frequently differentially abundant across all samples, consists primarily of strains of Escherichia coli, as is expected for fecal samples [3]. Another cluster, approximately three-fifths of the way down the figure, contains several strains which are all strongly reduced in abundance only in the newly infected N. americanus samples. This cluster contained 18 strains of Prevotella and 5 strains of Bacteroides (two of the three genera most commonly abundant in the gut microbiota, which are useful for classification into enterotypes 1 and 2 [4]). In contrast, the genus used to define the third major enterotype (Ruminococcus) was represented by 7 strains which did not cluster together. Of these, 6 strains increased in abundance in all individuals newly infected with N. americanus and one strain (Ruminococcus lactaris ATCC 29176) increased in abundance in 2 of these individuals. This suggests that N. americanus infection coincided with a shift in the bacteria defining major enterotypes for these individuals.

KEGG Orthology and Pathway Enrichment Testing

KEGG orthology (KO) [5,6] and pathway abundance quantification was performed on each of the 20 WGS bacterial samples using HUMAnN (The HMP Unified Metabolic Analysis Network) [7]. LDA Effect Size (LEfSe) [8] was used to determine significant differential representation of the detected KO groups and KEGG pathways among samples infected with each of the three nematodes (A. lumbricoides, N. americanus and T. trichiura).

Only one KO and two pathways were significantly differentially represented among the 19 samples infected with A. lumbricoides (Figure 3). One of the pathways (ko00532), found to be over-represented in A. lumbricoides-infected samples, represents the pathway for chondroitin sulfate, which is a well-described anti-inflammatory compound commonly used in osteoarthritis treatment [9]. This raises the possibility that some of the anti-inflammatory properties of A. lumbricoides infections [10] may be due to bacteria that co-infect with the worms and secrete anti-inflammatory factors.

Figure 3: LEfSe results for KO enrichment and depletion among A. lumbricoides-infected samples.

Among the 10 samples infected with N. americanus, there was an enrichment for many hydrolases, including many related to detoxification of xenobiotic compounds (Figure 4). Mucin-type O-glycans (which have an enriched biosynthesis term here) are the primary constituents of mucins that are expressed on various mucosal sites of the body, especially the bacteria-laden intestinal tract [11]. This suggests the possibility that N. americanus infections result in a change in the bacterial-affected mucous layer of the intestine, either as a strategy to better reach sources of blood or as a consequence of blood feeding.

Figure 4: LEfSe results for KO enrichment and depletion among N. americanus-infected samples.

A total of 24 KEGG Orthology terms (and 32 pathways) were enriched and 171 terms (and 20 pathways) were depleted among the 15 samples with T. trichiura infections. The large expansion of terms suggests that T. trichiura infections result in a significant shift in bacterial populations compared to infections with the other worms. Many of the top-enriched terms are related to transposon activity (important for antibiotic resistance) and to iron, which may be related to bleeding. The depleted terms include several related to hemolytic toxins, which cause an inflammatory host response; this reduction may be related to the reduced inflammatory response seen with T. trichiura infections [12].

This clustering analysis and significance testing will improve greatly with the addition of more individuals, since a relatively large number of strains (608) is being clustered here using the abundance across only 10 samples. Additional data here may allow for the identification of samples which coincide with particular helminth infections, and allow for better characterization of the N. americanus-induced shift in bacterial community structure observed here.


1    Pruitt, K. D., Tatusova, T. & Maglott, D. R. NCBI reference sequences (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic acids research 35, 27 (2007).
2    Li, K., Bihan, M., Yooseph, S. & Methé, B. A. Analyses of the Microbial Diversity across the Human Microbiome. PLoS One 7, e32118, doi:10.1371/journal.pone.0032118 (2012).
3    Carson, C. A., Shear, B. L., Ellersieck, M. R. & Asfaw, A. Identification of Fecal Escherichia coli from Humans and Animals by Ribotyping. Applied and Environmental Microbiology 67, 1503-1507, doi:10.1128/aem.67.4.1503-1507.2001 (2001).
4    Arumugam, M. et al. Enterotypes of the human gut microbiome. Nature 473, 174-180 (2011).
5    Kanehisa, M., Goto, S., Sato, Y., Furumichi, M. & Tanabe, M. KEGG for integration and interpretation of large-scale molecular data sets. Nucleic acids research 40, D109-114, doi:10.1093/nar/gkr988 (2012).
6    Moriya, Y., Itoh, M., Okuda, S., Yoshizawa, A. C. & Kanehisa, M. KAAS: an automatic genome annotation and pathway reconstruction server. Nucleic acids research 35, W182-185, doi:10.1093/nar/gkm321 (2007).
7    Abubucker, S. et al. Metabolic reconstruction for metagenomic data and its application to the human microbiome. PLoS Comput Biol 8, e1002358, doi:10.1371/journal.pcbi.1002358 (2012).
8    Segata, N. et al. Metagenomic biomarker discovery and explanation. Genome Biol 12, R60 (2011).
9    Uebelhart, D. Clinical review of chondroitin sulfate in osteoarthritis. Osteoarthritis Cartilage 16, 31 (2008).
10    McSharry, C., Xia, Y., Holland, C. V. & Kennedy, M. W. Natural immunity to Ascaris lumbricoides associated with immunoglobulin E antibody to ABA-1 allergen and inflammation indicators in children. Infect Immun 67, 484-489 (1999).
11    Bergstrom, K. S. & Xia, L. Mucin-type O-glycans and their roles in intestinal homeostasis. Glycobiology 23, 1026-1037 (2013).
12    Broadhurst, M. J. et al. IL-22+ CD4+ T cells are associated with therapeutic Trichuris trichiura infection in an ulcerative colitis patient. Sci Transl Med 2, 3001500 (2010).
The Genome Institute Washington University School of Medicine