Mutagenesis, Vol. 17, No. 5, 361-364, September 2002
© 2002 UK Environmental Mutagen Society/Oxford University Press

The discovery and confirmation of single nucleotide polymorphisms in the human p53R2 gene by EST database analysis

Zheng Ye1 and James M. Parry

Centre for Molecular Genetics and Toxicology, School of Biological Sciences, University of Wales Swansea, Singleton Park, Swansea SA2 8PP, UK


    Abstract
The human expressed sequence tag (EST) database provides a wealth of resources that can be used to screen rapidly for potential polymorphisms in proteins of physiological interest. The human p53R2 gene, which encodes a recently identified ribonucleotide reductase, plays an important role in DNA repair and is involved in the pathway of p53 activity in response to DNA damage. On the basis of the alignment of human EST sequences, we identified three candidate polymorphisms at nt 2752, 2759 and 4696 in the 3'-untranslated region of the p53R2 gene. The presence of these polymorphisms was confirmed in a Caucasian population (n = 82) by allele-specific PCR and PCR/restriction fragment length polymorphism analyses. The rare allele frequency at position 4696 (15.5%) is higher than that at position 2752 or 2759 (6% each). Our results suggest that human EST data may serve as a valuable source for the rapid identification of genetic variation.


    Introduction
The human expressed sequence tag (EST) database consists of >3 700 000 entries of partial cDNA sequences. These sequences have been generated from many different tissues and are derived from a range of individuals. ESTs can reflect part or all of the transcribed sequence of a gene, which includes the coding sequence as well as the 5'- and 3'-untranslated regions (UTRs). Currently, the EST database is accessible online from the website of the National Center for Biotechnology Information (NCBI) (http://www.ncbi.nlm.nih.gov/dbEST/). Because ESTs have been collected from many different sources, the database provides investigators with overlapping sequences of the same region, thus potentially allowing the identification of new single nucleotide polymorphisms (SNPs). Over the past few years other investigators have taken advantage of bioinformatic searching strategies to identify many SNPs (Garg et al., 1999; Picoult-Newberg et al., 1999; Blackburn et al., 2000; Ulrich et al., 2000; Board et al., 2001). For example, Ulrich et al. (2000) identified a common polymorphism in the 3'-UTR of the thymidylate synthase gene by this approach. However, it would be particularly valuable if this approach could be employed to identify polymorphic variation in specific genes in pathways relevant to disease. Following identification of the p53R2 gene, which is involved in p53-induced DNA repair, and the suggestion of its possible role in cancer (Tanaka et al., 2000), our study demonstrates that the human EST database can be searched directly to identify novel polymorphisms in the p53R2 gene.

p53R2 is a recently identified ribonucleotide reductase that catalyses the conversion of ribonucleoside diphosphates to their corresponding deoxyribonucleotides to provide precursors for DNA synthesis (Tanaka et al., 2000). It is also part of the p53 pathway and is one of the genes that function in p53-induced DNA repair. In response to various levels of genotoxic stress, inhibition of p53R2 expression in cells with an intact p53-dependent DNA damage checkpoint has been shown to result in reduced levels of ribonucleotide reductase activity, DNA repair and cell survival (Tanaka et al., 2000). Thus, p53R2 plays an important role in the repair of DNA damage.

Since p53R2 plays a crucial role in DNA repair and is involved in the pathway of p53 activity in response to DNA damage, the discovery and understanding of any genetic variation in the p53R2 gene present in the human population would be valuable. To our knowledge, two studies from the same laboratory have identified three polymorphisms in the p53R2 5'-UTR (Smeds et al., 2001a,b). In this study we searched the EST database directly for previously unidentified genetic polymorphisms in the p53R2 gene, and identified and experimentally confirmed three SNPs in its 3'-UTR.


    Materials and methods
Materials
Oligonucleotides were synthesized using an automated DNA synthesizer (Cruachem, Glasgow, UK). Restriction enzymes and their reaction buffers were obtained from Promega Corp. (Southampton, UK) (HindIII) and New England BioLabs (Hertfordshire, UK) (Tsp509I). Agarose gel was obtained from Cambrex (Rockland, USA).

EST database screening
The EST database is available at the NCBI website (http://www.ncbi.nlm.nih.gov/dbEST/). Database screening was performed using gapped BLAST programs (Ulrich et al., 2000), which are obtainable from the home page of the NCBI (Altschul et al., 1997).

Primer design and restriction enzyme selection
Primers were designed for the candidate polymorphisms using the PRIMER program (obtained from the Whitehead Genome Center), which was used to evaluate primer melting temperature, annealing temperature and the likelihood of oligonucleotide self-priming. Restriction enzymes were selected for the polymorphic sites using the CUTTER 2.0 program (http://www.firstmarket.com/cutter/cut2.html), which can identify all the restriction enzyme sites in any given cDNA sequence.

Confirmation of p53R2 candidate polymorphisms in a Caucasian population
The presence of human p53R2 candidate polymorphisms was verified in a Caucasian population. DNA was extracted from peripheral blood lymphocytes by standard methods (Stratagene, UK). Eighty-two healthy individuals were randomly selected and included in this study (Gao et al., 1998). The average age of these individuals was 43.5 years (range 15–79); 47 were women and 35 were men.

The BLAST program was used to search the EST database to identify candidate polymorphisms in the p53R2 gene. No candidate polymorphism in the coding region was identified by multiple sequence alignment. Five p53R2 candidate polymorphisms were identified in the 3'-UTR by searching the EST database, three of which were experimentally confirmed in our study population (Figure 1). A PCR-based assay was used to verify potential polymorphisms. Primer sequences are presented in Table I. The first polymorphism, an A->C transversion at nucleotide position 2752, was detected by allele-specific PCR with one downstream (R2-3) and two upstream (R2-1 and R2-2) primers, the latter differing in the terminal base (A or C). Each sample was tested in two parallel reactions, with the same downstream primer and one of the upstream primers. For homozygous individuals, amplification takes place in only one of the tubes, the one containing the exactly matching upstream primer; for heterozygotes it takes place in both tubes. The reactions were performed in a total volume of 50 µl, containing 1.5 mM MgCl2, 10 mM Tris–HCl, pH 8.8, 100 µM each dNTP, 10 pmol each primer, 2.5 U Taq DNA polymerase (Promega Corp) and 0.5 µg template DNA. Amplification was performed in a PTC-225 Peltier Thermal Cycler (MJ Research) with 5 min of initial denaturation at 94°C, followed by 32 cycles of 94°C for 45 s, 57°C for 30 s and 72°C for 45 s, with a final extension at 72°C for 5 min.
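The read-out of the two parallel allele-specific reactions can be summarised as a small decision rule. The sketch below is illustrative only: the function name and the "no call" outcome are hypothetical, and because the text does not state which of R2-1 and R2-2 carries the terminal A or C, the arguments are named by allele rather than by primer.

```python
def call_genotype_2752(amplified_with_a_primer: bool,
                       amplified_with_c_primer: bool) -> str:
    """Infer the nt 2752 genotype from two parallel allele-specific PCRs.

    Each argument says whether a product was seen in the tube containing
    the upstream primer ending in A or in C, respectively.
    """
    if amplified_with_a_primer and amplified_with_c_primer:
        return "A/C"      # product in both tubes: heterozygote
    if amplified_with_a_primer:
        return "A/A"      # product only with the A-specific primer
    if amplified_with_c_primer:
        return "C/C"      # product only with the C-specific primer
    return "no call"      # neither tube amplified: assumed failed reaction


# Example read-outs for three hypothetical samples
for tubes in [(True, False), (True, True), (False, True)]:
    print(tubes, "->", call_genotype_2752(*tubes))
```

A sample with no product in either tube cannot be genotyped by this scheme, which is why the sketch returns a separate outcome for it rather than guessing.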



Fig. 1. Distribution of 100 aligned ESTs of p53R2 (GenBank accession no. AB036063). Solid line, complete alignment; dashed line, incomplete alignment.

 

Table I. List of PCR primers used in this study
 
The second polymorphism, an A->G transition at nucleotide position 2759, was investigated by the PCR/restriction fragment length polymorphism (PCR–RFLP) method. The A->G transition eliminates a recognition site for the restriction enzyme Tsp509I. The fragment containing the polymorphism was amplified by PCR using primers R2-4 and R2-5 in the same reaction mix as described above. The cycling conditions were 5 min of initial denaturation at 94°C, followed by 32 cycles of 94°C for 25 s, 57°C for 30 s and 72°C for 45 s, with a final extension at 72°C for 5 min. The amplification product was incubated overnight with Tsp509I restriction enzyme at 65°C and the digested fragments were visualized on a 2.5% agarose gel with ethidium bromide staining.

The third polymorphism identified is a G->C transversion at nucleotide position 4696, which creates a recognition site for the HindIII restriction enzyme. The fragment containing the polymorphism was amplified by PCR using primers R2-6 and R2-7 in the same reaction mix as described above. The cycling conditions were 5 min of initial denaturation at 94°C, followed by 32 cycles of 94°C for 10 s, 57°C for 20 s and 72°C for 45 s, with a final extension at 72°C for 5 min. The amplification product was digested overnight with HindIII restriction enzyme at 37°C and the digested fragments were analyzed on a 2.5% agarose gel with ethidium bromide staining.
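Whether a base substitution creates or eliminates a restriction site, which is the basis of the two PCR–RFLP assays above, can be checked with a few lines of code. This is a minimal sketch and not the CUTTER program used in the study. The recognition sequences (AATT for Tsp509I, AAGCTT for HindIII) are the standard ones, but the flanking sequences in the example are hypothetical placeholders, since the actual p53R2 sequence context is not reproduced in the text.

```python
# Standard recognition sequences; both are palindromic, so scanning one
# strand is sufficient.
RECOGNITION_SITES = {"Tsp509I": "AATT", "HindIII": "AAGCTT"}


def has_site(sequence: str, enzyme: str) -> bool:
    """Return True if the enzyme's recognition sequence occurs in sequence."""
    return RECOGNITION_SITES[enzyme] in sequence.upper()


def site_change(reference: str, variant: str, enzyme: str) -> str:
    """Describe how a variant allele alters the enzyme's site."""
    before, after = has_site(reference, enzyme), has_site(variant, enzyme)
    if before and not after:
        return "eliminated"
    if after and not before:
        return "created"
    return "unchanged"


# Hypothetical sequence contexts, for illustration only
print(site_change("GCAATTCG", "GCAGTTCG", "Tsp509I"))      # A->G inside AATT
print(site_change("TTAAGGTTAC", "TTAAGCTTAC", "HindIII"))  # G->C giving AAGCTT
```

On a gel, a site that is present yields two shorter fragments after digestion, whereas an allele lacking the site leaves the PCR product uncut.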

After completion of the EST database screening, the potential polymorphisms in the p53R2 gene were checked against the SNP consortium database (http://snp.cshl.org/), the NCBI SNP database (http://www.ncbi.nlm.nih.gov/SNP/index.html) and the Human Genome Variation database, HGVbase (http://hgvbase.cgb.ki.se/) (Fredman et al., 2002). The potential polymorphisms identified in this study had not been reported in these databases. Once potential polymorphisms have been experimentally confirmed, the newly identified polymorphisms can be submitted to the SNP database.

Statistical analysis
χ² analysis was used to test for agreement with the Hardy–Weinberg equilibrium in our study population and for linkage disequilibrium between the two polymorphic sites. All P values shown are two-sided and P < 0.05 was judged statistically significant.
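A Hardy–Weinberg test of this kind compares observed genotype counts with those expected from the allele frequencies. The sketch below shows one common way to compute the χ² statistic. It is not taken from the study: the function name is hypothetical and the genotype counts in the example are invented for illustration, because the counts from Table II are not reproduced in the text.

```python
def hardy_weinberg_chi2(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Chi-square statistic (1 degree of freedom) for Hardy-Weinberg
    agreement, given counts of the two homozygotes and the heterozygote."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)      # frequency of allele a
    q = 1.0 - p                          # frequency of allele b
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    # Skip classes with zero expectation (monomorphic sample)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


# Invented genotype counts for a sample of 82 individuals
chi2 = hardy_weinberg_chi2(59, 21, 2)
# 3.84 is the critical value for 1 degree of freedom at P = 0.05
print("chi2 = %.3f, departs from HWE: %s" % (chi2, chi2 > 3.84))
```

With small expected counts for the rare homozygote, as is likely for alleles near 6% in 82 individuals, an exact test may be preferable to the χ² approximation.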


    Results
EST database screening
When the complete cDNA sequence of p53R2 (GenBank accession no. AB036063) was used as the query sequence, a BLAST search (Altschul et al., 1997) identified 100 matching sequences in the human EST database. Alignment of the EST sequences with the complete p53R2 cDNA sequence is shown in Figure 1.

On the basis of this BLAST searching approach, a number of sequence variants were observed. Since ESTs are usually generated by single-pass automated sequencing, errors are quite common. Given that sequencing errors are random, the number of candidate polymorphic sites can be substantially reduced by retaining only those that show the same base substitution in more than one EST sequence. Figure 2 shows the alignment of human EST sequences obtained with the p53R2 cDNA sequence. In this alignment, variations at nt 2752, 2759 and 4696 in the 3'-UTR are frequent. The sequence variations at positions 2752 and 2759 occurred in eight and nine of 14 aligned ESTs, respectively, while the variation at position 4696 occurred in 16 of 17 aligned ESTs. No other position presented variations at these frequencies in the region. Moreover, a review of these aligned EST sequences showed that they were derived from different tissue sources and research communities. These observations therefore support the possibility that the three identified variations are due to allelic variation rather than random sequencing errors.
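The filtering step described above, keeping only substitutions that recur in more than one EST, can be sketched as a column-wise count over the alignment. This is an illustrative sketch rather than the authors' procedure. The function name, the toy reference and the toy EST rows are hypothetical; the row encoding (dots for identical bases, n for ambiguous bases, spaces where there is no sequence) follows the notation used in the alignment figure.

```python
from collections import Counter


def candidate_sites(reference, aligned_ests, min_support=2):
    """Return {position: (ref_base, alt_base, n_ests)} for substitutions
    seen in at least min_support ESTs (0-based positions).

    Each EST row has the same length as the reference and uses '.' for an
    identical base, 'n' for an ambiguous base and ' ' for no coverage.
    """
    candidates = {}
    for pos, ref_base in enumerate(reference):
        counts = Counter()
        for est in aligned_ests:
            base = est[pos]
            if base in ". nN":
                continue                  # identical, uncovered or ambiguous
            if base.upper() != ref_base.upper():
                counts[base.upper()] += 1
        if counts:
            alt, support = counts.most_common(1)[0]
            if support >= min_support:
                candidates[pos] = (ref_base, alt, support)
    return candidates


# Toy alignment: position 2 varies in three ESTs, position 0 in only one
reference = "ACGTA"
ests = ["..C..", "..C..", "T....", "  C.n"]
print(candidate_sites(reference, ests))
```

In the toy data the single mismatch at position 0 is discarded as a likely sequencing error, while the recurrent G to C change at position 2 is retained as a candidate.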




Fig. 2. The alignment of human EST sequences with a portion of the p53R2 cDNA. Nucleotides are indicated as follows: dots, identical; n, ambiguous; space, gaps or where there is no sequence information. The first column after p53R2 is EST accession numbers, the second and fourth columns are positions of cDNA sequences. (A) Potential polymorphisms are clearly identified at nt 2752 and 2759. (B) A polymorphism is clearly identified at nt 4696.

 
Identification of three new p53R2 candidate polymorphisms in a Caucasian population
We used a PCR-based assay to confirm the existence of three candidate polymorphisms in the 3'-UTR of the p53R2 gene in a Caucasian population. The results of the PCR-based assay are shown in Figure 3. The A->C transversion at position 2752 was detected by the presence or absence of a 270 bp fragment after allele-specific PCR. The A->G transition at position 2759 was characterized by the presence or absence of a Tsp509I restriction site in the 249 bp PCR product after PCR–RFLP analysis. Similarly, the G->C transversion at position 4696 was characterized by the presence or absence of a HindIII restriction site in the 228 bp PCR product after PCR–RFLP analysis.



Fig. 3. Detection of single nucleotide polymorphisms of the p53R2 gene in the 3'-UTR by PCR. Lane M, 100 bp DNA ladder. Sizes of PCR products are indicated on the right-hand side. (A) Allele-specific PCR for detection of the A2752->C transversion. Lanes 1 and 2, A/A; lanes 5 and 6, C/C; lanes 3 and 4, A/C. (B) PCR–RFLP analysis for detection of the A2759->G transition. Digestion with Tsp509I for nt 2759 determination. Lanes 1, 2, 5 and 6, A/G; lanes 3 and 4, A/A. (C) PCR–RFLP analysis for detection of the G4696->C transversion. Digestion with HindIII for nt 4696 determination. Lanes 1 and 2, C/C; lanes 3 and 4, G/C; lanes 5 and 6, G/G.

 
The genotype frequencies among 82 Caucasian individuals are shown in Table II. The frequencies of C and A alleles at position 2752 are 94 and 6%, respectively. Similarly, the frequencies of A and G alleles at position 2759 are 94 and 6%, respectively, whereas the frequencies of G and C alleles at position 4696 are 15.5 and 84.5%, respectively. The rare allele frequency at position 4696 (15.5%) is higher than that at position 2752 or 2759 (6% each). The genotype frequencies were in agreement with the Hardy–Weinberg equilibrium. The haplotype distributions of A2752/C2752 and A2759/G2759 alleles in our study population were also examined. The frequency of the most common haplotype, C2752A2759, was 0.90 and the frequencies of C2752G2759, A2752A2759 and A2752G2759 were 0.05, 0.04 and 0.01, respectively. χ² analysis of these alleles indicated that there is significant linkage disequilibrium between these two loci (P < 0.05).
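For readers unfamiliar with the quantities involved, the sketch below shows how marginal allele frequencies and the linkage disequilibrium coefficient D relate to a set of two-locus haplotype frequencies. It uses the haplotype frequencies quoted above, which are rounded to two decimal places, so the derived values are approximate and are not a reconstruction of the authors' χ² calculation. The function names are hypothetical.

```python
# Haplotype frequencies as quoted in the text: (allele at 2752, allele at 2759)
haplotypes = {
    ("C", "A"): 0.90,
    ("C", "G"): 0.05,
    ("A", "A"): 0.04,
    ("A", "G"): 0.01,
}


def marginal(haps, locus, allele):
    """Allele frequency at one locus (0 or 1), summed over haplotypes."""
    return sum(freq for hap, freq in haps.items() if hap[locus] == allele)


def ld_coefficient(haps, allele_1, allele_2):
    """D = observed haplotype frequency minus the product of the two
    allele frequencies, i.e. the frequency expected under independence."""
    expected = marginal(haps, 0, allele_1) * marginal(haps, 1, allele_2)
    return haps.get((allele_1, allele_2), 0.0) - expected


print("freq(C2752) =", round(marginal(haplotypes, 0, "C"), 3))
print("freq(A2759) =", round(marginal(haplotypes, 1, "A"), 3))
print("D(C2752, A2759) =", round(ld_coefficient(haplotypes, "C", "A"), 4))
```

A value of D equal to zero would mean that the two sites assort independently; the sign of D depends on which pair of alleles is chosen as the reference haplotype.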


Table II. Distribution of p53R2 3'-UTR genotypes in a Caucasian population (n = 82).
 

    Discussion
Genetic polymorphisms in the human population have been studied in order to gain insight into their influence on the activity of specific genes involved in disease susceptibility. Finding previously unknown polymorphisms has often relied on the detection of a related phenotype or the chance sequencing of variant cDNA (Seidegard et al., 1988; Ali-Osman et al., 1997). This is a time-consuming task, usually requiring months or years at the bench to identify a novel polymorphism. Moreover, many polymorphisms may exist in the human genome that have not been identified and characterized because of methodological problems. The use of the human EST database to find novel genetic polymorphisms provides an opportunity to investigate the consequences of polymorphisms for enzyme activity that might not be discovered using other approaches (e.g. phenotypic approaches). In the present study we have confirmed that the human EST database is a powerful tool for the rapid identification of new potential polymorphisms associated with proteins of specific interest (Ulrich et al., 2000; Board et al., 2001). Using this approach, we have identified and subsequently experimentally verified three new SNPs in the 3'-UTR of the p53R2 gene.

The use of the human ESTs database to identify candidate polymorphisms has advantages that can be exploited to facilitate the development of highly dense genetic maps for the analysis of a human population. One of the main advantages of this approach is that it is undoubtedly rapid and cost-effective, which allows investigators to increase the pace of their research.

Although this approach has obvious advantages, there are some disadvantages and limitations that should be considered. EST sequences are usually generated by single-pass automated sequencing, thus sequencing errors are common (2%) (Hillier et al., 1996). Many of the sequences deposited in the EST database contain errors that produce false positives in the search for polymorphisms. A quality control measure is therefore required to eliminate false positives. The 'more than one EST' rule is employed in this study (Brett et al., 2000; Ulrich et al., 2000), i.e. a candidate polymorphism is considered for further analysis only when more than one EST shows the same potential polymorphic change.
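A rough calculation illustrates why requiring the same change in more than one EST removes most false positives. The sketch below takes the 2% error rate quoted above and adds two simplifying assumptions that are not from the source: errors in different ESTs are independent, and an erroneous base is equally likely to be any of the three alternatives. The function name is hypothetical.

```python
from math import comb


def prob_shared_error(n_ests, min_support=2, error_rate=0.02):
    """Probability that at least min_support of n_ests ESTs show the same
    specific wrong base at one site purely through sequencing error.

    Assumes independent errors and that an error yields each of the three
    alternative bases with equal probability (simplifying assumptions).
    """
    per_base = error_rate / 3.0
    p_fewer = sum(
        comb(n_ests, k) * per_base ** k * (1.0 - per_base) ** (n_ests - k)
        for k in range(min_support)
    )
    return 1.0 - p_fewer


# Chance of a spurious candidate at one site covered by 14 ESTs, for a
# threshold of two ESTs and for a threshold of eight ESTs
print(prob_shared_error(14, min_support=2))
print(prob_shared_error(14, min_support=8))
```

Under these assumptions a coincidence in two ESTs at a given site is already uncommon, and agreement among many ESTs is very unlikely to arise from random error alone. Systematic errors, such as those tied to a particular sequence context, would violate the independence assumption.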

The EST database contains sequences from multiple sources and provides an excellent resource for finding polymorphisms in widely expressed genes. However, this approach may be limited by the number of available ESTs, as the database contains mainly data for widely expressed genes. Another limitation is that ESTs are normally sequenced from the 5'- and 3'-ends, so there are relatively few sequences spanning the central region of complete cDNAs. Thus, polymorphisms in these areas may be discovered less frequently than those in the 5'- and 3'-UTRs (Figure 1). The 3'-end of mRNA is a non-coding region and, although it is not translated into protein, it contains sequence information that determines and maintains mRNA stability. Since p53R2 is a recently identified gene, the role of its 3'-UTR, including any binding sites or other functional sites, is still not clear. However, it is conceivable that polymorphic changes in this region may have an impact on mRNA turnover, and differences in mRNA turnover can modify the steady-state levels of a given mRNA and thus determine protein expression levels.

In summary, the human ESTs database has been demonstrated to be a valuable tool to search for potential polymorphisms by using different sequence alignment strategies. We have used this bioinformatic searching strategy to identify three new SNPs in the 3'-UTR of the human p53R2 gene. The presence of the polymorphisms has been confirmed in a Caucasian population (n = 82). All three new polymorphisms of the p53R2 gene reported here have been deposited in HGVbase (SNP001025927–SNP001025929).


    Acknowledgments
 
We thank Professor Julian M. Hopkin and Dr Pei-Song Gao for kindly providing the samples from healthy individuals. During the period of the study Zheng Ye was supported by an ORS award and a PhD studentship provided by Philip Morris Products SA, Switzerland.


    Notes
 
1 To whom correspondence should be addressed. Tel: +44 1792 205678; Fax: +44 1792 295447; Email: bazheye@swansea.ac.uk


    Reference

    Ali-Osman,F., Akande,O., Antoun,G., Mao,J.X. and Buolamwini,J. (1997) Molecular cloning, characterization and expression in Escherichia coli of full-length cDNAs of three human glutathione S-transferase Pi gene variants. Evidence for differential catalytic activity of the encoded proteins. J. Biol. Chem., 272, 10004–10012.

    Altschul,S.F., Madden,T.L., Schaffer,A.A., Zhang,J., Zhang,Z., Miller,W. and Lipman,D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res., 25, 3389–3402.

    Blackburn,A.C., Tzeng,H., Anders,M.W. and Board,P.G. (2000) Discovery of a functional polymorphism in human glutathione transferase zeta by expressed sequence tag database analysis. Pharmacogenetics, 10, 49–57.

    Board,P.G., Chelvanayagam,G., Jermin,L.S., Tetlow,N., Tzeng,H., Anders,M.W. and Blackburn,A. (2001) Identification of novel glutathione transferase and polymorphic variation by expressed sequence tag database analysis. Drug Metab. Dispos., 29, 544–547.

    Brett,D., Lehmann,G., Hanke,J., Gross,S., Reich,J. and Bork,P. (2000) EST analysis online: WWW tools for detection of SNPs and alternative splice forms. Trends Genet., 16, 416–418.

    Fredman,D., Siegfried,M., Yuan,Y.P., Bork,P., Lehvaslaiho,H. and Brookes,A.J. (2002) HGVbase: a human sequence variation database emphasizing data quality and a broad spectrum of data sources. Nucleic Acids Res., 30, 387–391.

    Gao,P.S., Mao,X.Q., Kawai,M., Enomoto,T., Sasaki,S., Tanabe,O., Yoshimura,K., Shaldon,S.R., Dake,Y., Kitano,H., Coull,P., Shirakawa,T. and Hopkin,J.M. (1998) Negative association between asthma and variants of CC16(CC10) on chromosome 11q13 in British and Japanese populations. Hum. Genet., 103, 57–59.

    Garg,K., Green,P. and Nickerson,D.A. (1999) Identification of candidate coding region single nucleotide polymorphisms in 165 human genes using assembled Expressed Sequence Tags. Genome Res., 9, 1087–1092.

    Hillier,L., Lennon,G., Becker,M., Bonaldo,M.F., Chiapelli,B., Chissoe,S., Dietrich,N., DuBuque,T., Favello,A., Gish,W., Hawkins,M., Hultman,M., Kucaba,T., Lacy,M., Le,M., Le,N., Mardis,E., Moore,B., Morris,M., Parsons,J., Prange,C., Rifkin,L., Rohlfing,T., Schellenberg,K. and Marra,M. (1996) Generation and analysis of 280,000 human expressed sequence tags. Genome Res., 6, 807–828.

    Picoult-Newberg,L., Ideker,T.E., Pohl,M.G., Taylor,S.L., Donaldson,M.A., Nickerson,D.A. and Boyce-Jacino,M. (1999) Mining SNPs from EST databases. Genome Res., 9, 167–174.

    Seidegard,J., Vorachek,W.R., Pero,R.W. and Pearson,W.R. (1988) Hereditary differences in the expression of the human glutathione transferase active on trans-stilbene oxide are due to a gene deletion. Proc. Natl Acad. Sci. USA, 85, 7293–7297.

    Smeds,J., Kumar,R. and Hemminki,K. (2001a) Polymorphic insertion of additional repeat within an area of direct 8 bp tandem repeats in the 5'-untranslated region of the p53R2 gene and cancer risk. Mutagenesis, 16, 547–550.

    Smeds,J., Nava,M., Kumar,R. and Hemminki,K. (2001b) A novel polymorphism (–88 C->A) in the 5' UTR of the p53R2 gene. Hum. Mutat., 17, 82.

    Tanaka,H., Arakawa,H., Yamaguchi,T., Shiraishi,K., Fukuda,S., Matsui,K., Takei,Y. and Nakamura,Y. (2000) A ribonucleotide reductase gene involved in a p53-dependent cell-cycle checkpoint for DNA damage. Nature, 404, 42–49.

    Ulrich,C.M., Bigler,J., Velicer,C.M., Greene,E.A., Farin,F.M. and Potter,J.D. (2000) Searching Expressed Sequence Tag databases: discovery and confirmation of a common polymorphism in the Thymidylate Synthase gene. Cancer Epidemiol. Biomarkers Prev., 9, 1381–1385.

Received on January 14, 2002; accepted on April 11, 2002.

