NemaGene : nemagene search

Thanks to the attendees, sponsors and organizers for making the Bioinformatics Workshop for Helminth Genomics (2015) a big success!

NemaGene Database Search - Search our collection of genes and transcripts

Instructions specific to this page

NemaGene can be searched using IPR, GO and/or KO id filters. For datasets with original 454 isotig or Sanger EST contig transcript assemblies, it's also possible to search by stage or tissue. First click the [+] Expand label for the Species selection section and select one or more species to start your query. Note that if you select no species from the list, your query will be applied to all species in NemaGene (depending on the complexity of your query, this may take a long time to complete).

After selecting the species to focus on, open the sections below to set the specific filters you'd like to apply. You can request a specific gene name (or comma-delimited list of gene names), an orthologous group name if one is available, or a component read name for transcript assembly data (this will become less useful over time as transcript assemblies are replaced by full-fledged genesets). You can also filter by IPR id, GO term and/or KO id, and you may enter comma-delimited lists of any of those ids. Note that filtering on multiple ids of a single type will return genes/transcripts annotated by any of those ids. But if you set filters using two or more id types (e.g. IPR id + KO id), each gene or transcript returned will be required to have at least one id from each list you supplied.
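The way id filters combine (any id within one type, at least one id from every type supplied) can be illustrated with a minimal Python sketch. This is not NemaGene's actual implementation; the function name `matches`, the entry names and the annotation layout are assumptions made for illustration only.

```python
from typing import Dict, Set


def matches(annotations: Dict[str, Set[str]], filters: Dict[str, Set[str]]) -> bool:
    """Return True if an entry satisfies every non-empty filter list.

    Within one id type, a single shared id is enough (OR).
    Across id types, every supplied list must be satisfied (AND).
    """
    for id_type, wanted in filters.items():
        if not wanted:
            continue  # an empty list places no constraint
        if not (annotations.get(id_type, set()) & wanted):
            return False
    return True


# Hypothetical entries; names and annotation assignments are made up.
entries = {
    "gene_A": {"IPR": {"IPR006186"}, "KO": {"K06269"}},
    "gene_B": {"IPR": {"IPR006186"}, "KO": set()},
    "gene_C": {"IPR": set(), "GO": {"GO:0016787"}},
}

# Two IPR ids (either may match) plus one KO id (must also match).
filters = {"IPR": {"IPR006186", "IPR000001"}, "KO": {"K06269"}}

hits = sorted(name for name, ann in entries.items() if matches(ann, filters))
print(hits)  # ['gene_A']
```

Here gene_B is excluded even though it carries a requested IPR id, because it has no id from the KO list.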

You will then arrive at a page showing the slice of data you've retrieved from NemaGene. The Query Definition section now displays the query you made to extract the results below it. The Data Download section allows you to download the full FASTA for all the genes/transcripts you requested. Note that if the type of FASTA you request (nucleotide or protein) doesn't exist for any of the genes or transcripts in your list, the output file will still display the gene name(s) as headers, but the sequence record for those will be empty. The Results section lists all the genes and/or transcripts in your return set, organized by species, then by group if available (such as isogroups for 454 transcript assemblies, or clusters for Sanger EST transcript assemblies). Each gene or transcript name is a link that will take you to a final detail page showing the available annotations for that entity. From there you can also download that single entity, or forward its sequence into NemaBlast for further analysis. For more information on NemaGene please see our NemaGene FAQ.
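Because a downloaded file may contain headers with no sequence, it can be worth checking for empty records before further analysis. The following is a minimal sketch that assumes a plain multi-line FASTA layout; the helper names `parse_fasta` and `empty_records` and the example records are hypothetical.

```python
from typing import Dict, List


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA text into {name: sequence}; headers with no sequence map to ''."""
    records: Dict[str, str] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            parts = line[1:].split()
            name = parts[0] if parts else ""
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def empty_records(records: Dict[str, str]) -> List[str]:
    """Return the names whose sequence is empty."""
    return [name for name, seq in records.items() if not seq]


# Made-up example: gene_B has a header but no protein sequence.
example = ">gene_A\nMKT\nLLV\n>gene_B\n>gene_C\nMSS\n"
records = parse_fasta(example)
print(empty_records(records))  # ['gene_B']
```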

How to use this page

Access to the NemaGene database frequently comes from other tools within the site, such as the contig links from NemaPath, which jump directly to the detail pages that are the terminus of a NemaGene search, or from associated external sites (such as WormBase). But the NemaGene Database Search tool can also be used to extract custom slices of our database using available annotations. This tool is also very useful for retrieving the full protein or nucleotide FASTA of our genesets, or of the slice you define, using the appropriate Download links.

Query Definition
Please set up your query in the expandable sections below
Species selection
Use this table to select which species you want reported to you. Choice of species is the top level filter you will set. Other selections you make below will only report results from the species you selected. If you do not select a species, NemaGene will build results based on all available species.

Selecting many or all available species may cause this service to hang or crash, depending on server load. We strongly suggest limiting searches to 5 species or fewer. But note that applying ANY filter will drastically improve query speeds, so searches for small collections of IPR ids, GO terms and/or KO ids across all species should complete without issue.

Species listed in RED have stage or tissue specific gene expression annotation available. Expression values for a gene are provided in FPKM (Fragments Per Kilobase of transcript per Million mapped reads) per stage/tissue based on available RNAseq experiments.
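For readers unfamiliar with the unit, the standard FPKM definition can be written as a short Python sketch. The function name `fpkm` and the numbers below are illustrative assumptions; this page does not describe the exact pipeline used to compute NemaGene's expression values.

```python
def fpkm(fragments: int, transcript_length_bp: int, total_mapped_fragments: int) -> float:
    """Fragments Per Kilobase of transcript per Million mapped fragments.

    Equivalent to fragments / (length in kb) / (mapped fragments in millions).
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and total mapped fragments must be positive")
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Made-up numbers: 500 fragments on a 2,000 bp transcript,
# in a library with 10 million mapped fragments.
value = fpkm(500, 2000, 10_000_000)
print(value)  # 25.0
```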

(PC/UNIX/Linux users: use control- and/or shift-click to select multiple species. Mac users: use command- and/or shift-click.)
Stage and/or Tissue selection
This section allows you to set stage and/or tissue specific filters on the list of species you selected above. Only transcripts (or transcript groups) from the selected stage/tissue of the selected species will be reported to you.

Note that stage and/or tissue selection only applies to transcript datasets built from Sanger EST reads or 454 cDNA reads. Stage/tissue selection is based upon the presence of Sanger EST or 454 reads generated from stage- and/or tissue-specific libraries in the contigs or isotigs representing the transcripts. Genesets included in NemaGene have no association with component reads, but instead are annotations from finished (or near finished) genomic sequence. For this reason, they cannot be selected by stage or tissue. Thus setting a stage and/or tissue filter will result in NO returned transcripts from actual genesets! It will effectively filter out ALL members of species for which NemaGene only contains genecalls.

Note that only a single stage or tissue may be selected per query.

No selection
Infective larvae
Free Living

Filter selection
This section allows you to request a specific gene, group of genes, transcript (Sanger EST contig or 454 cDNA isotig) or group of transcripts (cluster or isogroup). It then generates a report on each requested entry, including primary sequence (protein and/or nucleotide) as well as available annotation. You can also select genes and transcripts by IPR, GO or KO annotation.

*note: When searching for GO terms, please be aware that NemaGene only tracks the highest resolution GO term assignment for genes and/or transcripts. You cannot choose a root term such as GO:0008150 (biological process) and have all GO terms under that root term returned to you.
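One possible workaround, not described on this page, is to expand a parent term into its descendant GO ids yourself and submit them as a comma-delimited list. The sketch below assumes you have a parent-to-children map taken from a Gene Ontology release; the small `toy_children` map is a hand-written illustration, not a real excerpt to rely on, and the function name `descendants` is hypothetical.

```python
from typing import Dict, List, Set


def descendants(term: str, children: Dict[str, List[str]]) -> Set[str]:
    """Collect every term reachable below `term` in a parent -> children map."""
    seen: Set[str] = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for child in children.get(current, []):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


# Toy hierarchy for illustration only; load the real ontology in practice.
toy_children = {
    "GO:0003674": ["GO:0003824"],
    "GO:0003824": ["GO:0016787"],
    "GO:0016787": ["GO:0016788"],
}

# Build a comma-delimited list suitable for pasting into the GO id field.
query = ",".join(sorted(descendants("GO:0003674", toy_children)))
print(query)  # GO:0003824,GO:0016787,GO:0016788
```

For a broad parent term the expanded list can be very large, so narrowing it to the branches of interest first is likely to keep the query fast.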

Gene or Transcript name: 
Group name: 
Component read name: 
IPR id: (e.g. IPR006186)
GO id: (e.g. GO:0016787)
KO id: (e.g. K06269)
The Genome Institute Washington University School of Medicine