Mutagenesis, Vol. 15, No. 5, 411-414, September 2000
© 2000 UK Environmental Mutagen Society/Oxford University Press
Bioinformatics
The Mammalian Gene Mutation Database
Centre for Molecular Genetics and Toxicology, School of Biological Sciences, University of Wales Swansea, Swansea SA2 8PP, UK
Abstract
The Mammalian Gene Mutation Database (MGMD) is a comprehensive collection of published mutation data from the open literature on mammalian cell-based gene model mutation detection systems. The database currently contains approximately 30000 comprehensively described mutant spectra records and is maintained and updated on a daily basis. The major objectives of the MGMD were (i) to provide an Internet-accessible database (http://lisntweb.swan.ac.uk/cmgt/index.htm) for chemically induced and spontaneous mutation types and spectra in selected genes; (ii) to standardize the reporting of mutations within different genes where ambiguity exists in the literature; and (iii) to provide interactive and user-friendly access to the information. A multi-option search facility has been included that allows the user to search the database by parameters such as mutagen, gene or cell type of interest. The structure of the database permits easy retrieval of specific mutation data for further analysis. Thus, the MGMD should become a useful reference source and analysis tool for genetic toxicologists.
Introduction
Whilst it has generally been assumed that many cancers are the result of a series of genetic changes produced by exposure to exogenous agents, establishing causal relationships between chemical exposures and the induction of specific tumour types has proved elusive. It is now possible to examine such relationships owing to advances in molecular biology which have enabled the analysis both of mutations in specific target genes associated with disease, such as oncogenes, and of mutations induced by specific chemical or physical treatments in model systems (Olsen et al., 1996; Waters et al., 1999; Perera and Weinstein, 2000). These advances have led to a remarkable increase in the amount of DNA sequence information in the literature regarding both mutation types (profiles) and spectra (the frequencies of mutations observed at each nucleotide position within a gene). This, in turn, has led to the requirement for computerized databases to keep pace with the information increase.
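The distinction between a profile and a spectrum can be made concrete with a minimal Python sketch. The function names, the `(position, change)` tuple layout and the example values are illustrative placeholders, not MGMD data or code; the sketch reports raw counts, which can be divided by the total number of mutants to give frequencies.

```python
from collections import Counter

def mutation_profile(mutants):
    """Count mutants by type of base change (e.g. 'G>A')."""
    return Counter(change for _, change in mutants)

def mutation_spectrum(mutants):
    """Count mutants observed at each nucleotide position."""
    return Counter(position for position, _ in mutants)

# Placeholder mutants as (nucleotide position, base change); not real data.
example_mutants = [(101, "G>A"), (101, "G>T"), (115, "G>A"), (150, "C>T")]

profile = mutation_profile(example_mutants)    # counts per mutation type
spectrum = mutation_spectrum(example_mutants)  # counts per position
```

In this placeholder set, position 101 carries two mutants in the spectrum, while the profile records two G>A changes regardless of where they occurred.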
The Mammalian Gene Mutation Database (MGMD) is a project designed to collate the profiles and spectra of published mutagen-induced and spontaneous gene mutations from model mammalian cell types following analysis of the literature. Chemically induced mutation profiles and spectra can be used to determine the specificity of mutagen interaction within a target nucleotide sequence (Coller and Thilly, 1994). Knowledge of how a mutagen interacts with specific DNA sequences may provide a powerful tool in determining the causes of mutation in tumours. One of the rationales of the MGMD was to build a collection of chemical mutation spectra from different mammalian cell types to highlight potential differences in metabolism and mechanisms of mutagenesis. It was also realized that there was a need to overcome the ambiguities in the way mutations are reported for different genes and to present the data in a uniform way. The MGMD is an Internet-accessible database which aims to be an up-to-date and complete reference source for mutagen-induced and spontaneous mutational spectra and which will be of considerable benefit to researchers in genetic toxicology. Furthermore, during development of such an extensive database it became apparent that access would have to be 'user friendly'. Therefore, a simple to use, yet comprehensive, search page has been developed that allows rapid and specific retrieval of mutant data.
Data coverage
The MGMD currently contains entries for nearly 30000 mutants for the genes listed in Table I. Data have been collected for those genes used extensively in mutagenesis studies in cells derived from a variety of mammalian species. The quantity of data for the supF gene alone is extensive. For example, mutant sequence data for supF are available for human, monkey, mouse (cell lines and transgenic animals) and rat, covering over 10 different tissue types and more than 40 different cell lines. In total, there is a diverse range of mutant data for the genes in Table I for over 115 chemical and physical mutagens. In addition, there is a substantial collection of spontaneous mutation data for most of the genes. Data entry into the MGMD is an ongoing process with new mutant sequences added daily, and it is envisaged that this task will be made easier by user submission of data via the Internet site.
Database structure
Before inclusion in the database each publication was evaluated to determine a number of key factors including: experimental method, cell type used, modifications of the cell type, including factors such as DNA repair status, chemical exposure regime, etc. Each individual mutant entry is also cross-referenced to ensure that duplication of entry does not occur. A major effort has been made to standardize the way in which the data are displayed for the user. For each entry into the database it has been important to classify each kind of mutation. In the MGMD, mutants are classed as single base substitution, tandem base substitution (two adjacent singles), multiple base substitution (non-adjacent singles), complex (mutations of more than one type), deletion, insertion and rearrangements.
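The mutant classes listed above can be expressed as a small rule set. The following Python sketch is only an illustration: the event representation, the names `MutationClass` and `classify_mutant`, and the treatment of three or more substitutions (all counted as 'multiple') are assumptions, not a description of how the MGMD itself assigns classes.

```python
from enum import Enum

class MutationClass(Enum):
    SINGLE = "single base substitution"
    TANDEM = "tandem base substitution"
    MULTIPLE = "multiple base substitution"
    COMPLEX = "complex"
    DELETION = "deletion"
    INSERTION = "insertion"
    REARRANGEMENT = "rearrangement"

# Hypothetical mapping for events that are not base substitutions.
_NON_SUBSTITUTION = {
    "deletion": MutationClass.DELETION,
    "insertion": MutationClass.INSERTION,
    "rearrangement": MutationClass.REARRANGEMENT,
}

def classify_mutant(events):
    """Classify one mutant from a list of (kind, position) events.

    kind is 'substitution', 'deletion', 'insertion' or 'rearrangement'.
    """
    if not events:
        raise ValueError("a mutant needs at least one event")
    kinds = {kind for kind, _ in events}
    if len(kinds) > 1:
        # Mutations of more than one type in the same mutant.
        return MutationClass.COMPLEX
    kind = next(iter(kinds))
    if kind in _NON_SUBSTITUTION:
        return _NON_SUBSTITUTION[kind]
    if kind != "substitution":
        raise ValueError(f"unknown event kind: {kind!r}")
    positions = sorted(position for _, position in events)
    if len(positions) == 1:
        return MutationClass.SINGLE
    if len(positions) == 2 and positions[1] - positions[0] == 1:
        # Two adjacent single substitutions.
        return MutationClass.TANDEM
    # Assumption: any other set of substitutions is classed as multiple.
    return MutationClass.MULTIPLE
```

For example, two substitutions at positions 100 and 101 would be classed as tandem, whereas substitutions at positions 100 and 140 would be classed as multiple.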
It was also important to clarify and standardize the position of the mutation within a particular gene where non-uniformity exists in the literature. In general, the principles recommended by Antonarakis and the Nomenclature Working Group (1998) were adopted. For practical reasons, different numbering systems have been adopted to describe mutations in the various genes of interest. In assigning a nucleotide number to the mutated base or bases, the following points were taken into consideration. Firstly, the numbering system used should be clear and unambiguous; secondly, it should conform to the numbering system generally accepted and used by workers in each gene system. Data entered into the MGMD have been edited, and the nucleotide numbering system standardized according to the following criteria: (i) supF, data are reported for the non-transcribed strand (or coding strand) as the genomic DNA position; (ii) APRT, data are reported for the non-transcribed strand, where the A of the AUG initiation codon = nucleotide position 1, using the nucleotide sequence for the entire gene with continuous numbering of both exons and introns; (iii) HPRT, data are reported for the non-transcribed strand, where the A of the AUG initiation codon = nucleotide position 1. For HPRT, mutations occurring in exons are numbered according to their cDNA position. Mutations occurring in introns are numbered relative to the intron boundaries, where I3 + 1 = first base of intron 3 and I3 - 1 = last base of intron 3. Amino acids are numbered where the Met initiation codon = amino acid 1.
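The HPRT conventions above reduce to simple arithmetic, sketched below in Python. The function names are hypothetical, and the extension of the intron notation beyond the two cases defined in the text (for example I3 + 2 for the second base of intron 3) is an assumption.

```python
def intron_label(intron: int, offset: int) -> str:
    """Label an intronic base relative to the intron boundaries.

    offset = +1 is the first base of the intron and offset = -1 the last
    base, as defined in the text. Larger offsets (+2, -2, ...) are an
    assumed extension of the same scheme.
    """
    if intron < 1:
        raise ValueError("intron numbers start at 1")
    if offset == 0:
        raise ValueError("offset must be non-zero")
    sign = "+" if offset > 0 else "-"
    return f"I{intron} {sign} {abs(offset)}"

def amino_acid_number(cdna_position: int) -> int:
    """Amino acid number for an exonic cDNA position.

    Position 1 is the A of the initiation codon, so positions 1-3 encode
    amino acid 1 (Met), positions 4-6 amino acid 2, and so on.
    """
    if cdna_position < 1:
        raise ValueError("cDNA positions start at 1")
    return (cdna_position - 1) // 3 + 1
```

With these definitions, `intron_label(3, 1)` gives the label used in the text for the first base of intron 3, and `amino_acid_number(4)` places cDNA position 4 in the second codon.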
All mutant entries are held within one Microsoft Access™ database book file. Each entry is subdivided into columns housing relevant mutation information, an organization that facilitates the multi-option search facility online.
Database access
The online database can be accessed by the user from the URL: http://lisntweb.swan.ac.uk/cmgt/index.htm
The URL leads to a home page allowing the user to select options to search the database, review the database instructions and nomenclature, submit mutant data, access links to other database web sites or view MGMD statistics. The 'user friendly' search page contains drop-down boxes for each searchable category (Figure 1). An enquiry can be made individually by mutagen, species, cell origin (tissue type), cell type, gene, mutation class (single substitution, deletion, etc.) and mutation type (transition, transversion, etc.). Alternatively, the user can refine a search by selecting options within different categories, thus allowing any combination of factors attributable to the mutants to be sought. This system has the advantage of eliminating awkward search terms and the many different ways in which researchers refer to compounds and cell lines. An additional option allows the user to search by author. The system, therefore, allows the user to obtain a list of only the mutation data that are required, using single or multiple choices via the options provided. In addition, the search page provides useful examples of how to obtain specific data.
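The multi-option search described above amounts to combining equality filters over the searchable categories. The Python sketch below illustrates the idea only: the field names, the `search` function and the records are hypothetical placeholders, and the real site runs its queries against the Access database rather than an in-memory list.

```python
# Searchable categories named in the text, as hypothetical field names.
SEARCH_FIELDS = (
    "mutagen", "species", "cell_origin", "cell_type",
    "gene", "mutation_class", "mutation_type", "author",
)

def search(records, **criteria):
    """Return the records that match every selected category.

    Categories left unselected are not filtered on, mirroring a
    drop-down box left at its default value.
    """
    unknown = set(criteria) - set(SEARCH_FIELDS)
    if unknown:
        raise ValueError(f"unknown search field(s): {sorted(unknown)}")
    return [
        record for record in records
        if all(record.get(field) == value for field, value in criteria.items())
    ]

# Placeholder records for illustration only; not MGMD data.
records = [
    {"mutagen": "mutagen A", "species": "human", "gene": "supF",
     "mutation_class": "single base substitution", "mutation_type": "transition"},
    {"mutagen": "mutagen A", "species": "mouse", "gene": "supF",
     "mutation_class": "deletion", "mutation_type": None},
    {"mutagen": "spontaneous", "species": "human", "gene": "HPRT",
     "mutation_class": "single base substitution", "mutation_type": "transversion"},
]

supf_hits = search(records, gene="supF")
refined_hits = search(records, gene="supF", species="human")
```

Because every option comes from a fixed list of values, a query never depends on how a particular author spelled a compound or cell line name.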
After submitting a search query, the results are displayed in a scrollable web page of mutation information (Figure 2).
Conclusions and future developments
The MGMD is currently the most comprehensive online database containing information on induced mutations in mammalian cells. We recognized the need for a continuously updated, 'user-friendly' interactive site, which we believe will make the MGMD a useful tool for genetic toxicologists. The database will also complement others dedicated to similar and different areas of mutation research, such as the Mutation Spectra Database at Yale University (Hutchinson and Donnellan, 1997), the Human Gene Mutation Database (Cooper et al., 1998) and the human p53, HPRT, lacI and lacZ databases developed by Cariello et al. (1998). Future additions to the MGMD will include pages dedicated to graphical mutation spectra and profiles generated from the data, which users will be able to download directly.
Acknowledgments
The development of the database has been made possible by the support from the European Union Environmental Programme, Otsuka Pharmaceuticals and BAT Ltd.
Notes
1 To whom correspondence should be addressed. Tel: +44 1792 205200; Fax: +44 1792 205200; Email: balewis{at}swan.ac.uk
References
Antonarakis,S.E. and the Nomenclature Working Group (1998) Recommendation for a nomenclature system for human gene mutations. Hum. Mutat., 11, 1–3.
Cariello,N.F., Douglas,G.R., Gorelick,N.J., Hart,D.W., Wilson,J.D. and Soussi,T. (1998) Databases and software for the analysis of mutations in the human p53 gene, human hprt gene and both the lacI and lacZ gene in transgenic rodents. Nucleic Acids Res., 26, 198–199.
Coller,H.A. and Thilly,W.G. (1994) Development and applications of mutational spectra technology. Environ. Sci. Technol., 11, 478–487.
Cooper,D.N., Ball,E.V. and Krawczak,M. (1998) The human gene mutation database. Nucleic Acids Res., 26, 285–287.
Hutchinson,F. and Donnellan,J.E.,Jr (1997) A mutation spectra database for bacterial and mammalian genes. Nucleic Acids Res., 25, 192–195.
Olsen,L.S., Nielson,L.R., Nexø,B.A. and Wasserman,K. (1996) Somatic mutation detection in human biomonitoring. Pharmacol. Toxicol., 78, 364–373.
Perera,F. and Weinstein,I.B. (2000) Molecular epidemiology: recent advances and future directions. Carcinogenesis, 21, 517–524.
Waters,M.D., Stack,H.F. and Jackson,M.A. (1999) Genetic toxicology data in the evaluation of potential human environmental carcinogens. Mutat. Res., 437, 21–49.
Received on May 30, 2000; accepted on June 20, 2000.