Volume 53 (2): 177-185, 2005
Copyright © The Histochemical Society, Inc.
Histomathematical Analysis of Clinical Specimens: Challenges and Progress
Pathogenetics Unit, Laboratory of Pathology and Urologic Oncology Branch (GG,JWG,RFC,MAT,MRE-B), and Urologic Oncology Branch (WML), Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, Maryland Correspondence to: Michael R. Emmert-Buck, Pathogenetics Unit, Advanced Technology Center, Laboratory of Pathology and Urologic Oncology Branch, Center for Cancer Research, National Cancer Institute, 8717 Grovemont Circle, Bethesda, MD 20892-4605. E-mail: mbuck{at}helix.nih.gov
Proteomic analysis of clinical tissue specimens is a difficult undertaking. Described here is a multiplex study of protein expression levels in histological sections of human prostate that addresses many of the associated challenges. Whole-mount sections from 10 prostatectomy specimens were studied using 15 antibodies, immunohistochemical staining, digital imaging, and mathematical analysis of the data sets. The approach was successful in stratifying cell lineages present in the samples based on proteomic patterns, including differentiating normal epithelium from cancer. This strategy likely will be a useful method for extending the number of proteins that can be analyzed in clinical cancer specimens using currently available laboratory techniques. (J Histochem Cytochem 53:177–185, 2005)
Key Words: histomathematics; histopathology; mathematics; immunohistochemistry; prostate cancer
THE COMPLETION OF THE HUMAN GENOME PROJECT and the concurrent development of high-throughput expression analysis technologies now permit investigators to perform in-depth molecular profiling studies of cell and tissue samples. Ultimately, these efforts will produce a comprehensive mathematical description of the "expression state" of biological samples based on global measurements of mRNA and protein levels. In other words, individual cell phenotypes will be categorized in terms of expression quantities, patterns, and biochemical pathway status in the context of specific genomic backgrounds. Significant progress in this direction has occurred for cells grown in culture, homogenized tissue samples, and microdissected cell populations. However, comprehensive analysis of histological sections remains problematic. At present, there is no one platform that permits multiplex expression measurements from the complete range of normal and diseased cell types in a tissue section. Therefore, it is important that new technologies and/or strategies for conducting these studies be developed and assessed. Historically, investigators have semiquantitatively measured the level of one (or perhaps two) transcripts or proteins at a time in tissue sections using in situ hybridization or immunohistochemistry (IHC), respectively. More recently, the use of tissue microdissection has facilitated profiling of specific, dissected cell populations. This technique is a useful advance, but the analysis is limited to the relatively few cell types that are procured in a study, and a comprehensive view of histopathology is not obtained. The molecular profiling field will advance most efficiently when investigators have a range of analysis technologies at their disposal, that is, a complete set of tools that can be utilized alone or in combination depending upon the particular goals of a study. 
Thus, in addition to developing new analysis methods, several groups are experimenting with multiplex expression measurements based on conventional methods such as IHC. The hope is to build and expand upon an established platform with which investigators are already experienced. Along these lines, we evaluated the feasibility of performing and analyzing multiplex immunohistochemical data from prostate tissue sections. Whole-mount prostate cases were utilized because each case has several different histological areas of interest that can be stained, examined, and analyzed simultaneously. This same approach can be used for studying many tissue types, including brain, developing embryos, or any organ exhibiting a disease process. The present study represents a step toward generating mathematical descriptions of histopathology.
Selection of Cases
Prostatectomy cases were obtained from the National Institutes of Health and the National Naval Medical Center under an institutional review board-approved protocol. Ten whole-mount prostate cancer cases were ethanol fixed and paraffin embedded as described previously (Gillespie et al. 2002).
Immunohistochemistry
Sections were incubated for 10 min with a biotinylated secondary antibody, and signal was detected by streptavidin-peroxidase using 3-amino-9-ethylcarbazole chromogen as a substrate for peroxidase. Slides were counterstained with hematoxylin. Positive reactions (i.e., positively stained cells) were identified by the presence of a red precipitate. Negative reactions (i.e., negative cells) were identified by the absence of a red precipitate, with only the blue counterstain visible.
Data Collection
The stained sections were photographed (constant tissue area of 0.1 mm²) at a magnification of ×200 with an Olympus microscope and a charge-coupled device (CCD) camera. The camera had a resolution of 2080 × 1542 pixels (CCD color Bayer mosaic; Q-Color-3, Olympus America Inc., Melville, NY). Seven images were taken per slide, one of each of the seven different morphological areas (Figure 1), for a total of 2100 images. Each pathologist took 1050 images: seven histologic areas per slide, multiplied by 15 antibodies, multiplied by 10 cases.
Image Analysis
All images were scanned and analyzed for the total number of cells present, the number of positively stained cells, and the number of non-stained cells (i.e., no specific reaction with the primary Ab) using the ImagePro Analysis System (ImagePro 4.5; Cybernetics, Chevy Chase, MD). Measurement of positively stained cells was performed on each imaged area for all antibodies and was expressed as the mean of [positive cells/total cells (positive + negative cells)].
In digital photomicroscopy, an image is stored as a matrix of N × M elements. Each element of the matrix $(n_i, m_j)$ has a color represented by a combination of the three primary colors of light (RGB: red, green, and blue). The image (hence the matrix) is stored as three separate N × M pixel matrix files. In a 24-bit color image, there are three separate 8-bit samples, each producing a scale of $2^8$, or 256, levels of red, green, or blue. Thus, the total number of colors is $256^3$ (16,777,216). These colors are represented as discrete variables; each color channel is assigned a numerical value between 0 and 255. The color at each pixel within an image is represented digitally by its three separate values indicating the level (or amount) of red, green, and blue contained therein.
Images were examined simultaneously using the ACDSee program (ACD Systems of America; Miami, FL), and the positive staining was evaluated according to the most intensely stained and the least intensely stained image for each antibody. The color data were saved by placing a manual arrow pointer on the red staining in the most intensely stained image (i.e., positively stained cells) and on the blue staining in the least intensely stained case (i.e., negative cells). Every image for each antibody was then screened using ImagePro according to staining intensity to obtain the total positive and negative cell count per image. This manual adjustment was performed because the ImagePro RGB did not separate the colors sufficiently.
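The RGB representation described above can be illustrated with a minimal sketch. It assumes NumPy; the image size and pixel values are invented for illustration and do not come from the study.

```python
import numpy as np

# Hypothetical tiny image: N x M pixels, three 8-bit channels (R, G, B).
N, M = 4, 6
image = np.zeros((N, M, 3), dtype=np.uint8)
image[0, 0] = (200, 30, 40)   # illustrative reddish pixel (e.g., chromogen)
image[1, 2] = (40, 40, 180)   # illustrative bluish pixel (e.g., counterstain)

# The image can be viewed as three separate N x M matrices, one per channel.
red, green, blue = image[..., 0], image[..., 1], image[..., 2]

# Each 8-bit channel has 2**8 = 256 levels (values 0..255),
# so a 24-bit image can encode 256**3 distinct colors.
levels_per_channel = 2 ** 8
total_colors = levels_per_channel ** 3

print(red.shape, levels_per_channel, total_colors)
```

Each pixel's color is fully described by its three channel values, which is what allows red precipitate and blue counterstain to be distinguished numerically.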
The ImagePro watershed separation was used for the image analysis. This is a method to separate objects and assist in the counting process. The size of the counted objects does change; however, because the same process was applied uniformly across all of the cases, it did not alter the obtained final results. Every measurement was transferred to an MS Excel (Microsoft Excel 2000; Seattle, WA) spreadsheet, and the mean staining was calculated according to the formula [positive cells/total cells (positive + negative cells)].
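The staining formula [positive cells/total cells (positive + negative cells)] can be sketched as follows. This is an illustration only; the function names and cell counts are hypothetical and are not taken from the study's spreadsheets.

```python
def positive_fraction(positive_cells, negative_cells):
    """Fraction of positively stained cells: positive / (positive + negative)."""
    total = positive_cells + negative_cells
    if total == 0:
        raise ValueError("no cells counted in this image")
    return positive_cells / total


def mean_positive_fraction(counts):
    """Mean of the per-image positive fractions.

    counts: list of (positive_cells, negative_cells) pairs, one per image.
    """
    fractions = [positive_fraction(p, n) for p, n in counts]
    return sum(fractions) / len(fractions)


# Hypothetical counts for two images of the same histological area.
example_counts = [(30, 70), (50, 50)]
example_mean = mean_positive_fraction(example_counts)
```

Averaging the per-image fractions, rather than pooling all cells, gives each imaged area equal weight regardless of how many cells it contains; whether the original spreadsheet did exactly this is an assumption.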
Data Analysis
$\bar{x}_i$ and $\bar{x}_j$ are the averages of observations i and j, respectively, generating a covariance matrix with the components as calculated in (1).
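For reference, a standard form of the sample covariance that is consistent with this description is given below. The notation is an assumption introduced here: $x_{ik}$ is the k-th measurement of observation i, $n$ is the number of measurements, and $c_{ij}$ is a component of the covariance matrix.

```latex
\[
  c_{ij} = \frac{1}{n-1} \sum_{k=1}^{n} \left( x_{ik} - \bar{x}_i \right) \left( x_{jk} - \bar{x}_j \right),
  \qquad
  \bar{x}_i = \frac{1}{n} \sum_{k=1}^{n} x_{ik}
\]
```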
This represents a multidimensional matrix that makes it difficult to show correlation between observations. Simplification is needed so that patterns are easier to interpret; thus, we applied principal component analysis (PCA) (Raychaudhuri et al. 2000).
After generating a covariance matrix, the eigenvectors of the matrix were calculated. These are orthogonal base vectors (principal components) used to represent the data. There are, in fact, M such vectors; however, one typically works with the vectors that have the largest eigenvalues and therefore explain most of the observed variance (PC #1, #2, and #3).
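The covariance and eigenvector steps described above can be sketched with NumPy. This is a generic PCA illustration, not the Partek Pro implementation used in the study; the function name `pca` and the synthetic data (70 rows by 15 columns, random values) are assumptions for demonstration only.

```python
import numpy as np


def pca(data, n_components=3):
    """Project data onto its leading principal components.

    data: 2-D array, rows = samples (e.g., imaged areas),
          columns = variables (e.g., antibodies).
    """
    centered = data - data.mean(axis=0)          # subtract each column's mean
    cov = np.cov(centered, rowvar=False)         # variables x variables covariance
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigh: symmetric matrix, ascending order
    order = np.argsort(eigvals)[::-1]            # re-sort by descending eigenvalue
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    components = eigvecs[:, :n_components]       # PC #1..#n as columns
    scores = centered @ components               # sample coordinates on the PCs
    explained = eigvals[:n_components] / eigvals.sum()
    return scores, components, explained


# Synthetic stand-in data: 70 samples x 15 variables.
rng = np.random.default_rng(0)
data = rng.random((70, 15))
scores, components, explained = pca(data)
```

The three columns of `scores` correspond to the coordinates one would plot on the three axes of a PCA scatterplot, and `explained` gives the fraction of total variance captured by each retained component.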
One-way analysis of variance (ANOVA) was also used (Lane and Nelder 1982).
For normalization, the GAPDH and histone antibody measurements were combined in a Microsoft Excel spreadsheet. This specific combination of antibodies was the most uniform among all the groups as determined by analysis with the Partek Pro software. The combined GAPDH and histone values served as the denominator, and the reading for each of the remaining antibodies served as the numerator, according to the following formula: [Mean value for a picture/(mean GAPDH value for same area + mean histone value for same area)]. Areas that were negative for GAPDH or histone were excluded from the normalization.
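The normalization formula above can be sketched as a small function. The function name, the use of `None` to mark excluded areas, and the numeric values are assumptions for illustration; the study performed this step in a spreadsheet.

```python
def normalize_to_housekeeping(value, gapdh_mean, histone_mean):
    """Return value / (gapdh_mean + histone_mean) for one imaged area.

    Areas negative for GAPDH or histone (mean of zero here) are excluded,
    which this sketch signals by returning None.
    """
    if gapdh_mean <= 0 or histone_mean <= 0:
        return None
    return value / (gapdh_mean + histone_mean)


# Hypothetical mean positive fractions for one histological area.
normalized = normalize_to_housekeeping(0.3, gapdh_mean=0.5, histone_mean=0.25)
excluded = normalize_to_housekeeping(0.3, gapdh_mean=0.0, histone_mean=0.9)
```

Because both housekeeping readings come from the same histological area as the protein of interest, the ratio acts as an internal control for that area.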
A total of 10 whole-mount human prostate cancer specimens were analyzed using 24 different antibodies. Nine antibodies (p53, Ki-67, p27Kip1, p21WAF1, Her-2-c-erbB2, phospho-EGF receptor, caspase 3, Akt, and phospho Akt) demonstrated positive staining in one or more of the cases, but did not provide highly consistent and reproducible staining patterns and thus were excluded from the study.
Fifteen histological sections from each case were stained by IHC. The antibodies (Abs) were selected based on cell specificity for different histological areas (Figure 1). Cytokeratin and PSA Abs were selected for normal glands, PIN, and carcinoma cells. SMA was used for stroma (composed of an admixture of smooth muscle and fibroblasts). Vimentin was used for mesenchymal-derived tissue. CD3 Ab was chosen for inflammatory infiltrate and S-100 Ab for nerve. Other antibodies were selected based on reports that they are differentially expressed in prostate cancer tissue, including p130 (Claudio et al. 2002).
For each Ab in the study, the percentage of positive cells was established as follows. First, the most intensely stained of the 10 cases was determined by the reviewing pathologists. The percentage of positive cells was then measured in this case using ImagePro, and this index case was subsequently used as a reference for the other nine patient samples in the study. This manual adjustment was performed because the ImagePro RGB did not separate the colors sufficiently. Figure 2 shows a representative case to demonstrate the quantification by the ImagePro program. The staining pattern differed among antibodies: cytoplasmic, membranous, or nuclear. Thus, the method for counting was adjusted to fit each antibody and was based on counting of stained cells.
Figure 3 demonstrates that each of the histological areas could be grouped based on their directionality on the x-, y-, and z-axes. PCA separates datasets according to distance from a base vector (PC #1–3); thus, the fact that the different morphological areas have distinct directionality shows the groups can be clustered based on similarities and differences in IHC staining profiles. With normalization of the data to housekeeping proteins, there was an even greater separation of the histological areas (Figure 4).
In the epithelial group, normal epithelium and carcinoma clustered separately based on analysis of the proteins p130, pERK, PIM1, ERK, and HSP27 (Figure 5), as was expected from the literature (Cornford et al. 2000).
Figure 7 summarizes the differences between the mean values calculated for pathologist JG compared with pathologist RC for each histological area (carcinoma, PIN, hyperplasia, normal epithelium, stroma, inflammation, and nerve) for three different antibodies (GAPDH, vimentin, and PIM1). The graph shows that the difference in scoring between the pathologists was within a range of 10%.
The field of clinical molecular pathology is moving toward high-throughput, quantitative measurements of expression states of cells in tissues, a so-called histomathematical description of phenotypes. However, there are many challenges in this undertaking that need to be addressed and overcome. In the present study, we evaluated the strengths and weaknesses of a proteomic analysis strategy that uses a combination of immunohistochemical staining and newly developed imaging and data analysis programs. The focus of the study was to measure and then investigate the expression levels of a relatively large set of proteins (24) in several distinct cell populations present in human prostate cancer specimens. Whole-mount tissue sections representing entire transverse anatomical planes of the prostate were utilized in the study because they contain a range of cell types, including subclassifications of epithelium such as cancer and premalignant lesions. These samples allow comprehensive survey of all histopathological changes present and represent a strategic difference from studies utilizing tissue microdissection or tissue microarrays, in which one typically attempts to analyze one (or a few) distinct cell population(s), tumor cells, for example (Simone et al. 2000).
The first challenge encountered was the failure to detect many of the target proteins. Although the antibodies for the study were commercially available and advertised to perform well on human tissue samples, 9 of the 24 produced equivocal staining. The use of standard antigen-retrieval methods was not successful in improving performance. The failure to detect more than one-third of the proteins is of concern, especially in light of the fact that they are known to be "highly abundant" in at least a subset of the cell types present in the prostate sections.
This points to a potentially significant problem for the field of proteomics, that is, the inability to measure low- and moderate-abundant proteins in a complex matrix such as a histological section. This difficulty is likely to become increasingly problematic. Many, if not most, of the genes newly discovered by the Human Genome Project are expressed at relatively lower levels of abundance than the "known genes/proteins"; hence, they have escaped discovery for the past several decades by investigators in the laboratory. These proteins likely will be even more difficult to measure with IHC than those in the present study and may not be amenable to this technique at all. Clearly, the inability to detect and accurately measure a significant fraction of the proteome in specific cell phenotypes represents a major hurdle for the proteomics community to overcome in the future. A second difficulty in the present study was the requirement for multiple tissue sections, in other words, the need for a separate histological section for each antibody. This was problematic for each of the different cell types because they often exhibited significant changes as the tissue block was serially sectioned. For example, PIN foci are small, localized groups of cells that can vary qualitatively (based on histological grading) and/or quantitatively (appear/disappear) among the sections. These problems were partly solved by incorporating immediately sequential sections and by close coordination between the reviewing pathologists. Nonetheless, this was a difficulty in the current study and is an inherent problem of analysis strategies that require serial sections for expression measurements.
Further challenges were encountered in performing quantitative measurement of protein levels. Historically, IHC has been analyzed using a semi-quantitative grading scheme, for example, on a scale from 0 to 4 based on somewhat subjective parameters such as staining pattern, frequency, and/or intensity. O'Neill et al. (2004)
Densitometric measurement of staining profiles from images is dependent on background (nonspecific staining reaction) and on electronic settings, including camera and lamp settings, white balance, and slide thickness. A major decision for the investigator is in choosing threshold values. In one approach, images can be converted to gray scale and enhanced for analysis, as was done by Nieruchalska et al. (2003).
Quantification of IHC staining was another issue that needed to be addressed. There are several possible approaches that one can employ. For example, Cunnane et al. (1999)
To normalize expression levels of the non-housekeeping proteins, we divided each value (numerator) by the value produced by the combined staining of GAPDH and histone antibody IHC expression (combined as a denominator), because these readings were stable and evenly expressed among the different morphological groups. This allowed the staining data to be compared with internal controls in each cell population, similar to an investigator normalizing a measurement on an immunoblot using an internal housekeeping protein. This is a beneficial feature of multiplex protein analysis of histology as compared with a standard, "one-protein IHC approach." After normalizing each data point to GAPDH and histone, the differences between the groups became sharper, emphasizing the variance between them (Figure 4).
Once the images were analyzed, the resultant data were imported from Microsoft Excel spreadsheets directly to the Partek Pro program and grouped based on individual point variance in PCA scatterplots. Analysis of the full set of 15 antibodies with the PCA scatterplots generated four separate groups that varied in their expression pattern (different shape of wire mesh) and in their location on the x-, y-, and z-axes (Figure 3). Multiplex analysis was useful in that it provided optimal segregation of the various cellular types, including cancer and corresponding normal epithelium (Figure 5) (Jiang et al. 2002).
In summary, production and analysis of high-throughput proteomic data sets from histology sections is likely to become an essential tool in basic research and clinical evaluation of normal cellular physiology and disease states. The present work indicates this is feasible using currently available laboratory tools. However, the study also points to several technical challenges that the proteomic community needs to overcome before genuine and comprehensive numerical descriptions of cellular phenotypes become a reality.
Received for publication June 24, 2004; accepted September 29, 2004
Bell WC, Myers RB, Hosein TO, Oelschlager DK, Grizzle WE (2003) The response of extracellular signal-regulated kinase (ERK) to androgen-induced proliferation in the androgen-sensitive prostate cancer cell line, LNCaP. Biotechnol Histochem 78:11–16
Bishop JW, Marcelpoil R, Schmid J (2002) Machine scoring of Her2/neu immunohistochemical stains. Anal Quant Cytol Histol 24:257–262
Claudio PP, Zamparelli A, Garcia FU, Claudio L, Ammirati G, Farina A, Bovicelli A, et al. (2002) Expression of cell-cycle-regulated proteins pRb2/p130, p107, p27(kip1), p53, mdm-2, and Ki-67 (MIB-1) in prostatic gland adenocarcinoma. Clin Cancer Res 8:1808–1815
Cornford PA, Dodson AR, Parsons KF, Desmond AD, Woolfenden A, Fordham M, Neoptolemos JP, et al. (2000) Heat shock protein expression independently predicts clinical outcome in prostate cancer. Cancer Res 60:7099–7105
Cunnane G, Bjork L, Ulfgren AK, Lindblad S, FitzGerald O, Bresnihan B, Andersson U (1999) Quantitative analysis of synovial membrane inflammation: a comparison between automated and conventional microscopic measurements. Ann Rheum Dis 58:493–499
De Boer WI, Hiemstra PS, Sont JK, De Heer E, Rabe KF, Van Krieken JH, Sterk PJ (2001) Image analysis and quantification in lung tissue. Clin Exp Allergy 31:504–508
Dhanasekaran SM, Barrette TR, Ghosh D, Shah R, Varambally S, Kurachi K, Pienta KJ, et al. (2001) Delineation of prognostic biomarkers in prostate cancer. Nature 412:822–826
Gillespie JW, Best CJ, Bichsel VE, Cole KA, Greenhut SF, Hewitt SM, Ahram M, et al. (2002) Evaluation of non-formalin tissue fixation for molecular profiling studies. Am J Pathol 160:449–457
Iwafuchi H, Mori N, Takahashi T, Yatabe Y (2004) Phenotypic composition of salivary gland tumors: an application of principle component analysis to tissue microarray data. Mod Pathol 17:803–810
Jiang J, Ulbright TM, Zhang S, Eckert GJ, Kao C, Gardner TA, Koch MO, et al. (2002) Fas and Fas ligand expression is elevated in prostatic intraepithelial neoplasia and prostatic adenocarcinoma. Cancer 95:296–300
Kwong J, Lui K, Chan PS, Ho SM, Wong YC, Xuan JW, Chan FL (2003) Expression study of three secretory proteins (prostatic secretory protein of 94 amino acids, probasin, and seminal vesicle secretion II) in dysplastic and neoplastic rat prostates. Prostate 56:81–97
Lane PW, Nelder J (1982) Analysis of covariance and standardization as instances of prediction. Biometrics 38:613–621
Lindsay LI (2002) A tutorial on Principal Components Analysis. http://www.cs.otago.ac.nz/cosc453/student_tutorials/principal_components.pdf (accessed February 20, 2004)
Nabi G, Seth A, Dinda AK, Gupta NP (2004) Computer based receptogram approach: an objective way of assessing immunohistochemistry of androgen receptor staining and its correlation with hormonal response in metastatic carcinoma of prostate. J Clin Pathol 57:146–150
Nieruchalska E, Strzelczyk R, Wozniak A, Zurawski J, Kaczmarek E, Salwa-Zurawska W (2003) A quantitative analysis of the expression of alpha-smooth muscle actin in mesangioproliferative (GnMes) glomerulonephritis. Folia Morphol (Warsz) 62:451–453
O'Neill PA, Shaaban AM, West CR, Dodson A, Jarvis C, Moore P, Davies MP, et al. (2004) Increased risk of malignant progression in benign proliferating breast lesions defined by expression of heat shock protein 27. Br J Cancer 90:182–188
Patel VM, Heinel LA, Provencio JJ, Vinall PE, Kramer MS, Rosenwasser RH (2002) Validation of image analysis for enzyme histochemical and immunocytochemical staining. Biotechnol Histochem 77:213–221
Raychaudhuri S, Stuart JM, Altman RB (2000) Principal components analysis to summarize microarray experiments: application to sporulation time series. Pac Symp Biocomput 5:452–463
Simone NL, Remaley AT, Charboneau L, Petricoin EF 3rd, Glickman JW, Emmert-Buck MR, Fleisher TA, et al. (2000) Sensitive immunoassay of tissue cell proteins procured by laser capture microdissection. Am J Pathol 156:445–452
Wang S, Saboorian MH, Frenkel EP, Haley BB, Siddiqui MT, Gokaslan S, Wians FH Jr, et al. (2001) Assessment of HER-2/neu status in breast cancer. Automated Cellular Imaging System (ACIS)-assisted quantitation of immunohistochemical assay achieves high accuracy in comparison with fluorescence in situ hybridization assay as the standard. Am J Clin Pathol 116:495–503
Winter EE, Goodstadt L, Ponting CP (2004) Elevated rates of protein secretion, evolution, and disease among tissue-specific genes. Genome Res 14:54–61
Yan H, Zhou W (2004) Allelic variations in gene expression. Curr Opin Oncol 16:39–43
Zellweger T, Ninck C, Mirlacher M, Annefeld M, Glass AG, Gasser TC, Mihatsch MJ, et al. (2003) Tissue microarray analysis reveals prognostic significance of syndecan-1 expression in prostate cancer. Prostate 55:20–29