Nomenclature tracking of filarial genome data
What does a name mean? Use this guide to decode filarial molecular nomenclature.
If you have analyzed filarial molecular data, odds are you were daunted by the seemingly endless terminology. This guide will help you make sense of the nomenclature associated with filarial transcriptome and genome data. Please remember to follow the F.U.N.K. guidelines if you wish to assign a name to a filarial gene or protein.
Filarial cDNA libraries
The Filarial Genome Project constructed cDNA libraries from a variety of life cycle stages and filarial nematode species. These libraries are available through the FR3 Smith College location, and can be requested through the Molecular Resources section. The libraries are named based on the filarial species, life cycle stage, laboratory and person that constructed the library, and year of construction.
- Example: Library SAW98MLW-OvMf was made in Steven A. Williams' lab (SAW), in 1998 (98), by Michelle Lizotte-Waniewski (MLW), from Onchocerca volvulus (Ov) microfilariae (Mf).
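The naming scheme above can be decoded mechanically. The following is a minimal sketch in Python, not an official FR3 tool: the function name `parse_library_name`, the `LibraryName` type, and the exact pattern (uppercase lab initials, a two-digit year, uppercase maker initials, a hyphen, a two-letter species code, then a stage code) are assumptions generalized from the example names in this guide. Libraries that deviate from this layout will not match.

```python
import re
from typing import NamedTuple, Optional


class LibraryName(NamedTuple):
    lab: str      # initials of the lab head, e.g. "SAW"
    year: str     # two-digit year of construction, e.g. "98"
    maker: str    # initials of the person who built the library
    species: str  # two-letter species code, e.g. "Ov"
    stage: str    # life cycle stage code, e.g. "Mf"


# Assumed layout, inferred from names such as SAW98MLW-OvMf.
LIBRARY_RE = re.compile(r"([A-Z]+)(\d{2})([A-Z]+)-([A-Z][a-z])([A-Za-z]+)")


def parse_library_name(name: str) -> Optional[LibraryName]:
    """Split a library name into its parts, or return None if it does not fit."""
    match = LIBRARY_RE.fullmatch(name)
    return LibraryName(*match.groups()) if match else None


print(parse_library_name("SAW98MLW-OvMf"))
```

A name that does not follow the assumed layout returns `None` rather than a partial result, so unexpected formats are easy to spot.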
Filarial expressed sequence tags (ESTs)
Filarial ESTs are deposited in NCBI dbEST, Nembase, and Nematode.net. Each EST has multiple identifiers that were assigned to enable database searching. Here is an example from dbEST:

- BG354745 = NCBI Genbank accession number
- MBAFCAF5H04SK = Filarial Genome Project identifier. This EST was sequenced by Mark Blaxter (MB) from a conventional adult female cDNA library (AFC). The clone code, which is unique to that clone, is AF (adult female) 5H04. Finally, this clone was sequenced with the SK primer.
- SAW96MLW-BmAF = name of B. malayi adult female library
- 13198931 = Genbank gi number. Genbank assigns this number when a sequence is submitted, and gi numbers are issued sequentially. After Genbank verifies a sequence, it assigns an accession number.
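The Filarial Genome Project identifier in the list above can be split into its parts with a short script. This is only a sketch built from the single example MBAFCAF5H04SK: the fixed field widths (two-letter sequencer initials, three-letter library code, a clone code of two letters plus a plate and well position, two-character primer code) and the names `parse_est_identifier` and `EstIdentifier` are assumptions, and real identifiers may use other widths.

```python
import re
from typing import NamedTuple, Optional


class EstIdentifier(NamedTuple):
    sequencer: str  # initials of the person who sequenced the EST, e.g. "MB"
    library: str    # library code, e.g. "AFC" (conventional adult female)
    clone: str      # clone code, e.g. "AF5H04"
    primer: str     # sequencing primer, e.g. "SK"


# Assumed layout, inferred from the one example given in this guide.
EST_RE = re.compile(r"([A-Z]{2})([A-Z]{3})([A-Z]{2}\d+[A-Z]\d{2})([A-Z0-9]{2})")


def parse_est_identifier(identifier: str) -> Optional[EstIdentifier]:
    """Split an FGP EST identifier into its parts, or return None if it does not fit."""
    match = EST_RE.fullmatch(identifier)
    return EstIdentifier(*match.groups()) if match else None


print(parse_est_identifier("MBAFCAF5H04SK"))
```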
Brugia malayi Gene Index
EST information was used to build the B. malayi gene index in 2008. This was done by extracting B. malayi EST information from Genbank, clustering the ESTs, and assembling them into Tentative Consensus (TC) sequences. The names of these sequences start with the identifier 'TC' (example TC10344). The annotated B. malayi Gene Index is available on the FR3 website, and the searchable database is available at The Gene Index Project.
Brugia malayi genome data
Preliminary B. malayi genome BAC data was generated by the New England Biolabs FGP site. The entire genome was later sequenced by whole genome shotgun sequencing at The Institute for Genomic Research (TIGR). Although TIGR has now been merged with the J. Craig Venter Institute, the original draft assembly with preliminary annotation is still available at http://www.tigr.org/tdb/e2k1/bma1/. The genome is currently being curated by Elodie Ghedin, University of Pittsburgh, who maintains the most current information on B. malayi genome sequence closure, assembly and annotation. B. malayi genome data is also maintained at NCBI Genbank, where it can be directly searched using BLAST. A new assembly is presently in progress and will be made available in early 2010.
- TIGR nomenclature: fragments of sequenced DNA were assembled into contigs. Contig names consist of seven digits preceded by 'CA:' (example CA:1381300); the ‘CA’ stands for ‘Celera Assembler’, the method used to assemble the genome. The contigs were then further assembled into scaffolds. Open reading frames within these contigs are assigned gene identifiers (or gene models) that consist of the five-digit number of the scaffold to which the contigs were assigned, followed by .mXXXXX, the number of the model on that scaffold (example 12596.m00124). Each model also has a second identifier, called a public locus, that is numbered independently of the scaffold. The locus name is Bm1_XXXXX (example Bm1_01660). This allows us to reassemble the genome into new scaffolds without changing the unique identifier for the gene model, which is simply re-mapped onto the new assembly.
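When working with mixed lists of genome identifiers, it can help to sort them by type automatically. The sketch below is illustrative only: the function names `classify_identifier` and `scaffold_of` are hypothetical, and the patterns are taken directly from the three formats described above (CA: plus seven digits, a five-digit scaffold number plus .mXXXXX, and Bm1_XXXXX).

```python
import re
from typing import Optional

# One pattern per identifier type described in the TIGR nomenclature.
PATTERNS = {
    "contig": re.compile(r"CA:\d{7}"),              # e.g. CA:1381300
    "gene_model": re.compile(r"(\d{5})\.m(\d{5})"),  # e.g. 12596.m00124
    "public_locus": re.compile(r"Bm1_\d{5}"),       # e.g. Bm1_01660
}


def classify_identifier(identifier: str) -> str:
    """Return the identifier type, or "unknown" if no pattern fits."""
    for kind, pattern in PATTERNS.items():
        if pattern.fullmatch(identifier):
            return kind
    return "unknown"


def scaffold_of(gene_model: str) -> Optional[str]:
    """Return the scaffold number embedded in a gene model identifier."""
    match = PATTERNS["gene_model"].fullmatch(gene_model)
    return match.group(1) if match else None


for name in ("CA:1381300", "12596.m00124", "Bm1_01660"):
    print(name, "->", classify_identifier(name))
```

Only the gene model identifier carries the scaffold number, which is why the public locus name is the one that stays stable when the genome is reassembled.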
Brugia malayi version 2 oligonucleotide microarray
The B. malayi microarray was produced at the Washington University Genome Center by a consortium of filarial researchers; it is now in its second version and can be ordered through the FR3 website. The oligonucleotides represent ESTs and predicted genes from B. malayi, B. malayi Wolbachia, Onchocerca volvulus and Wuchereria bancrofti; each oligo identifier is preceded by 'bm.' (from first version array) or 'BMX' (second version array). The version 1 sequence clusters used for oligo design start with the identifier 'BMC'; version 2 sequence clusters start with 'BMW' or 'BMB'. Complete lists of the oligonucleotides, consensus sequences, and annotation can be found in the Filarial Genomics and Bioinformatics section.
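The prefixes above are enough to tell which array version an identifier belongs to. Here is a small sketch of that lookup; the function name `describe_array_identifier` is hypothetical, and the identifiers used in the example loop are made up for illustration, since only the prefixes are specified in this guide. Matching is case-sensitive because 'bm.' and 'BMX' differ in case.

```python
# Prefix -> meaning, as described for the version 1 and version 2 arrays.
PREFIXES = (
    ("bm.", "version 1 oligo"),
    ("BMX", "version 2 oligo"),
    ("BMC", "version 1 sequence cluster"),
    ("BMW", "version 2 sequence cluster"),
    ("BMB", "version 2 sequence cluster"),
)


def describe_array_identifier(identifier: str) -> str:
    """Return what an array identifier represents, based on its prefix."""
    for prefix, description in PREFIXES:
        if identifier.startswith(prefix):
            return description
    return "unknown"


# Hypothetical identifiers, for illustration only.
for name in ("bm.00001", "BMX00001", "BMC00001"):
    print(name, "->", describe_array_identifier(name))
```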

