Subnuclear Proteomics in Colorectal Cancer

Abnormalities in nuclear phenotype and chromosome structure are key features of cancer cells. Investigation of the protein determinants of nuclear subfractions in cancer may yield molecular insights into aberrant chromosome function and chromatin organization and in addition may yield biomarkers for early cancer detection. Here we evaluate a proteomics work flow for profiling protein constituents in subnuclear domains in colorectal cancer tissues and apply this work flow to a comparative analysis of the nuclear matrix fraction in colorectal adenoma and carcinoma tissue samples. First, we established the reproducibility of the entire work flow. In a reproducibility analysis of three nuclear matrix fractions independently isolated from the same colon tumor homogenate, 889 of 1,047 proteins (85%) were reproducibly identified at high confidence (minimally two peptides per protein at 99% confidence interval at the protein level) with an average coefficient of variance for the number of normalized spectral counts per protein of 30%. This indicates a good reproducibility of the entire work flow from biochemical isolation to nano-LC-MS/MS analysis. Second, using spectral counting combined with statistics, we identified proteins that are significantly enriched in the nuclear matrix fraction relative to two earlier fractions (the chromatin-binding and intermediate filament fractions) isolated from six colorectal tissue samples. The total data set contained 2,059 non-redundant proteins. Gene ontology mining and protein network analysis of nuclear matrix-enriched proteins revealed enrichment for proteins implicated in “RNA processing” and “mRNA metabolic process.” Finally, an explorative comparison of the nuclear matrix proteome in colorectal adenoma and carcinoma tissues revealed many proteins previously implicated in oncogenesis as well as new candidates. A subset of these differentially expressed proteins also exhibited a corresponding change at the mRNA level. 
Together, the results show that subnuclear proteomics of tumor tissue is feasible and a promising avenue for exploring oncogenesis.


Molecular & Cellular Proteomics 9:988-1005, 2010.
whereas 15% of CRC cases exhibit microsatellite instability (MIN) (8). Progression of non-malignant precursor lesions, colorectal adenomas, into CIN+ CRC has been associated with a number of specific chromosomal alterations, of which gain of additional copies of chromosome 20q is the most prominent (9,10). Tumor progression is accompanied by important phenotypic changes in neoplastic cells, in particular alterations of nuclear structure such as nuclear size and shape, numbers and sizes of nucleoli, and chromatin texture as observed by the pathologist under the microscope (9,11,12). Yet little is known about the protein constituents underlying these phenotypic characteristics (12). As such, proteins supporting nuclear structure and organization represent a relevant target for proteomics analysis of cancer tissue and precursor lesions. Here we evaluate a proteomics work flow for profiling of subnuclear fractions from early stage CRC tumors and apply this work flow to a comparative analysis of the nuclear matrix fraction of colorectal adenoma and carcinoma tissues.
A focused proteomics work flow for analysis of patient tumor tissue should effectively separate subproteomes of interest from interfering and highly abundant proteins and other tissue components. In addition, the method must be practical, reproducible, and compatible with relatively small amounts of tissue. Here we investigate a biochemical protocol that was originally developed by Fey and co-workers (13,14) in the 1980s. In brief, tissue samples are homogenized in a solution with stabilizers of RNA-protein complexes, and soluble, cytoskeletal, and DNA-associated proteins are extracted. The remaining detergent-, salt-, and DNase-resistant pellet is cleared of filaments and is referred to as the "nuclear matrix" (NM) fraction. The NM has been postulated to be a dynamic RNA/protein network that spans the nucleoplasm and to provide support for higher order chromatin packaging and overall shape of the nucleus (15). Several lines of evidence support roles for the NM in the processing of genetic information and for establishing functional subdomains and macromolecule assemblies in the nucleus (15). Oncogenes or suppressor genes can function differentially in different cell types, and much of this cellular context for transformation may be determined, among other factors, by the NM (16). Interestingly, early studies using two-dimensional gel-based proteomics have indicated that the NM proteome varies significantly between different tissues and cells and in cancer versus healthy tissue (for reviews, see Refs. 7 and 17). However, only a few cancer-related NM proteins have been identified so far. A comprehensive LC-MS/MS-based proteomics analysis of the NM proteome in adenoma-carcinoma progression may shed light on the potential role of the NM in (tissue-specific) cancer pathogenesis; in particular, the NM proteins have been suggested to provide a promising source of protein biomarkers for cancer (17,18). The NMP22/Bladdercheck (Matritech) test measures urine levels of the NM-associated protein nuclear mitotic apparatus protein (NUMA) and has been approved by the United States Food and Drug Administration for follow-up measurements in bladder cancer patients (17).
In addition to the NM fraction, two hitherto uncharacterized fractions are also extracted with this protocol: the "chromatin-binding" (CB) and the "intermediate filament" (IF) fractions. Using differential analysis of the three fractions, we show that the NM fraction is enriched for proteins derived from distinct subnuclear domains. Importantly, this subcellular fractionation approach was reproducible, and a differential comparison of the nuclear matrix proteome in colorectal adenoma and carcinoma tissues uncovered many proteins previously implicated in cancer biology as well as novel candidates. Therefore, the proteomics work flow presented here provides relevant targets for comprehensive profiling of proteins involved in subnuclear structure and organization in early stage CRC tumors and is a promising avenue to explore oncogenesis.

Materials
All basic chemicals were obtained from Sigma. HPLC solvents, LC-MS grade water, acetonitrile, and formic acid were obtained from Biosolve (Valkenswaard, The Netherlands). Porcine sequence grade modified trypsin was obtained from Promega (Leiden, The Netherlands). Complete protease inhibitor mixture tablets (mini, EDTA-free) were purchased from Roche Diagnostics. Protein assay reagent was obtained from Bio-Rad. Precast gradient gels and gel buffers were acquired from Invitrogen.

Clinical Material
CRC tissue was collected at the VU University Medical Center, Amsterdam, The Netherlands, in compliance with institutional regulations for use of leftover material. The tissue samples (50-200 mg) were obtained from surgical resection specimens at the department of pathology, snap frozen in liquid nitrogen, and stored at −80°C. All samples were inspected by a pathologist to ascertain a tumor content of at least 70% cancer cells using hematoxylin- and eosin-stained sections. Biological subtypes of carcinomas (MIN+ versus CIN+) were determined to include both carcinoma types in the analysis. In brief, copy number changes were determined by multiplex ligation-dependent probe amplification (MLPA) as described by Schouten et al. (19). To this end, we used previously designed probes for chromosome arms 8q, 13q, and 20q (20). MLPA data analysis was performed using MLPAnalyzer version 7.0 (21). MIN analysis was performed using the MIN Analysis System (MSI Multiplex System Version 1.1, Promega) as described (22). Clinical details are described in supplemental Table 1.
blade. The frozen tissue pieces were immediately placed in a precooled Potter-Elvehjem tube containing 600 μl of precooled homogenization buffer per 100 mg of tissue (0.5% Triton X-100, 300 mM sucrose, 100 mM NaCl, 10 mM PIPES, pH 6.8, 3 mM MgCl2, 2 mM vanadyl ribonucleoside, 1 mM EGTA, 1 mM PMSF, Complete protease inhibitor mixture (Roche Diagnostics)). The tissue was homogenized for 2 min (50 strokes) on ice using a precooled Potter-Elvehjem homogenizer and electrical drill. Next, the homogenate was filtered through three layers of 350-μm nylon mesh. The homogenate was left to mix on a turning wheel for 5 min (4°C) and centrifuged for 5 min at 750 × g (4°C), and the supernatant was collected (fraction 1, the "cytosolic" fraction containing lipids and soluble proteins). The pellet was resuspended in 600 μl of "extraction buffer" per 100 mg of starting material (same as homogenization buffer but with 250 mM ammonium sulfate instead of 100 mM NaCl). The sample was mixed on a turning wheel for 5 min (4°C) and centrifuged for 5 min at 750 × g (4°C), and the supernatant was collected (fraction 2, the "cytoskeletal" fraction containing salt-extracted tissue elements and proteins). Third, the remaining pellet was resuspended in 400 μl of "digestion buffer" per 100 mg of starting tissue (same as homogenization buffer but with 100 μg/ml RNase-free DNase I and a reduced concentration of 50 mM NaCl). The sample was left at room temperature for 30 min for digestion of chromosomal DNA. The sample was brought back to high-salt conditions by addition of ammonium sulfate solution, left at room temperature for 5 min, and centrifuged for 5 min at 1,000 × g (4°C), and the supernatant was collected (fraction 3, the chromatin-binding fraction). The remaining detergent-, high salt-, and DNase-resistant pellet was resuspended in 400 μl of "disassembly buffer" per 100 mg of starting material (8 M urea, 20 mM PIPES, pH 6.8, 1 mM EGTA, 1 mM DTT, 1 mM PMSF).
The suspended proteins were treated by overnight dialysis against >1,000 volumes of "reassembly buffer" (25 mM imidazole, pH 7.1, 150 mM KCl, 5 mM MgCl2, 0.125 mM EGTA, 1.4 mM DTT, 0.26 mM PMSF). The repolymerized filaments were pelleted by centrifugation for 90 min at 217,000 × g at 20°C. The visible pellet was dissolved in Tris buffer (10 mM Tris-HCl, pH 7.0, 100 mM KCl), shaken vigorously, and incubated overnight at 4°C, after which the majority of protein was dissolved (fraction 4, the intermediate filament fraction). The 217,000 × g supernatant was collected as the final fraction (fraction 5, the nuclear matrix fraction). For the analysis of unfractionated CRC tissue, three CRC tissue samples (~75 mg each) were homogenized in a ratio of 100 mg/1 ml of buffer (7 M urea, 2 M thiourea, 4% CHAPS, and 10 μl/ml protease inhibitor mixture). Protein concentrations of the samples were determined with the Bradford assay (Bio-Rad). For proteomics analysis, 60 μg of CB, IF, and NM fractions and unfractionated CRC lysates were loaded on SDS-polyacrylamide gels.

Gel Electrophoresis
SDS-PAGE was performed with precast 4-12% gradient gels containing Bis-Tris buffer (NuPAGE MES system, Invitrogen). After electrophoresis, the gels were fixed in 50% ethanol containing 3% phosphoric acid and stained with Coomassie Blue R-250. After staining, the gels were washed in Milli-Q water and stored at 4°C until processing for in-gel digestion.

In-gel Digestion
Entire Coomassie-stained gel lanes were cut into 10 bands, and each band was processed for in-gel digestion according to the method of Shevchenko et al. (23). Briefly, bands were washed and dehydrated three times in 50 mM ammonium bicarbonate, pH 7.9 and in 50 mM ammonium bicarbonate, pH 7.9 + 50% ACN, respectively. Subsequently, cysteine bonds were reduced with 10 mM DTT for 1 h at 56°C and alkylated with 50 mM iodoacetamide for 45 min at room temperature in the dark. After two subsequent wash/dehydration cycles, the bands were dried for 10 min in a vacuum centrifuge and incubated overnight with 0.06 μg/μl trypsin at 25°C. Peptides were extracted once in 1% formic acid and subsequently two times in 50% ACN in 5% formic acid. The volume was reduced to 50 μl in a vacuum centrifuge prior to nano-LC-MS/MS analysis.
Injection Scheme Label-free Comparison: Nano-LC-MS/MS data were acquired for the three subnuclear compartments (CB, IF, and NM) in three batches with six colorectal tissues within 3 months under stable instrument performance. Per subnuclear fraction, the injection scheme of the 10 gel bands per sample was as follows: "adenoma1-band1," "adenoma2-band1," "MIN1-band1," "MIN2-band1," "CIN1-band1," and "CIN2-band1" followed by the second gel band for all samples, the third, etc. until band 10. This injection scheme spreads instrumental drift evenly over the biological samples and thereby minimizes bias.
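The band-major run order described above can be sketched in a few lines. This is only an illustration: the function name `injection_order` is hypothetical, and the sample labels are taken from the injection scheme quoted in the text.

```python
from itertools import product

# Sample labels as quoted in the injection scheme; 10 gel bands per sample
SAMPLES = ["adenoma1", "adenoma2", "MIN1", "MIN2", "CIN1", "CIN2"]
N_BANDS = 10


def injection_order(samples, n_bands):
    """Return run labels in band-major order.

    Band 1 of every sample is injected first, then band 2 of every sample,
    and so on, so that instrument drift is spread across the samples.
    """
    return [f"{s}-band{b}" for b, s in product(range(1, n_bands + 1), samples)]


runs = injection_order(SAMPLES, N_BANDS)
print(runs[:7])
```

Iterating over bands in the outer loop and samples in the inner loop is what keeps any given band of all six tissues adjacent in time.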
Tandem Mass Spectrometry: Eluting peptides were ionized at 1.7 kV in a Nanomate Triversa Chip-based nanospray source using a Triversa LC coupler (Advion, Ithaca, NY). Intact peptide mass spectra and fragmentation spectra were acquired on an LTQ-FT hybrid mass spectrometer (Thermo Fisher, Bremen, Germany). Intact masses were measured at resolution 50,000 in the ICR cell using a target value of 1 × 10^6 charges. In parallel, following an FT prescan, the top five peptide signals (charge states 2+ and higher) were submitted to MS/MS in the linear ion trap (3-amu isolation width, 30-ms activation, 35% normalized activation energy, Q value of 0.25, and threshold of 5,000 counts). Dynamic exclusion was applied with a repeat count of 1 and an exclusion time of 30 s.
Database Searching: MS/MS spectra were searched against the human IPI database 3.31 (67,511 entries) using Sequest (version 27, revision 12), which is part of the BioWorks 3.3 data analysis package (Thermo Fisher, San Jose, CA). MS/MS spectra were searched with a maximum allowed deviation of 10 ppm for the precursor mass and 1 amu for fragment masses. Methionine oxidation and cysteine carboxamidomethylation were allowed as variable modifications, two missed cleavages were allowed, and the minimum number of tryptic termini was 1. After database searching, the DTA and OUT files were imported into Scaffold (versions 1.07 and 2.01) (Proteome Software, Portland, OR). Scaffold was used to organize the gel slice data and to validate peptide identifications using the PeptideProphet (24,25) algorithm; only identifications with a probability >95% were retained. Subsequently, the ProteinProphet (24,25) algorithm was applied to group peptide identifications to protein identifications. Protein identifications with a probability of >99% with two peptides or more in at least one of the samples were retained. Proteins that contained similar peptides and could not be differentiated based on MS/MS analysis alone were grouped.
Protein FDR was calculated in pilot experiments for three subnuclear fractions analyzed in a 10-gel band experiment each. For the CB fraction, 1,497 proteins were identified (two or more unique peptides and ProteinProphet score >99%) in the forward database search, and six were identified in the reversed database search, resulting in an FDR (reverse/(reverse + forward)) of 0.00399. Similarly, for the IF fraction, 893 proteins were identified in the forward database search, and four were identified in the reversed database search, and for the NM fraction, 905 forward and six reversed database identifications were found, yielding FDRs of 0.00446 and 0.00659, respectively. The average FDR for these three fractions is 0.00501 or 0.50%.
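The FDR arithmetic above can be reproduced with a short sketch. The formula reverse/(reverse + forward) and the identification counts are taken from the text; the function name `decoy_fdr` and the dictionary layout are illustrative choices only.

```python
def decoy_fdr(forward, reverse):
    """Protein-level FDR as reverse / (reverse + forward), as defined in the text."""
    return reverse / (reverse + forward)


# (forward hits, reversed-database hits) per fraction, as reported in the text
counts = {"CB": (1497, 6), "IF": (893, 4), "NM": (905, 6)}

fdr = {fraction: decoy_fdr(f, r) for fraction, (f, r) in counts.items()}
mean_fdr = sum(fdr.values()) / len(fdr)

for fraction, value in fdr.items():
    print(f"{fraction}: {value:.5f}")
print(f"mean: {mean_fdr:.5f}")
```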
For each protein identified, the number of spectral counts (the number of MS/MS spectra associated with an identified protein) was exported to Excel. Normalized spectral counts were calculated by dividing the spectral counts for an identified protein by the sum of the spectral counts per sample.
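As a minimal sketch of this normalization (the original calculation was done in Excel), each protein's spectral count is divided by the total spectral count of its sample. The function name and the example counts below are made up for illustration.

```python
def normalize_spectral_counts(counts):
    """Divide each protein's spectral count by the sample's total spectral count."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no spectral counts")
    return {protein: count / total for protein, count in counts.items()}


# Made-up spectral counts for one sample
sample = {"LMNB1": 40, "HNRNPA1": 50, "PSPC1": 10}
normalized = normalize_spectral_counts(sample)
print(normalized)
```

After normalization the values of one sample sum to 1, which makes counts comparable between samples with different total numbers of identified spectra.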

MALDI-TOF/TOF MS
Protein bands from Coomassie-stained SDS-polyacrylamide gels were excised manually using a clean scalpel or razor blade and digested as described above. The tryptic peptides extracted from gel slices were concentrated and desalted on reversed-phase C18 Zip-Tips (Millipore) and eluted with 3 μl of α-cyano-4-hydroxycinnamic acid matrix (6.2 mg/l α-cyano-4-hydroxycinnamic acid in 56% (v/v) acetonitrile, 36% MeOH; Agilent Technologies). A sample volume of 0.8 μl from the peptide-matrix mixture was analyzed by MALDI-TOF/TOF MS (4800 MALDI-TOF/TOF, Applied Biosystems). The obtained mass spectra were searched against the human database using a MASCOT search engine (Matrix Science) with a mass tolerance set at 20 ppm for MS1 and 0.6 Da for MS/MS. The protonated trypsin autodigestion products at m/z 842.510, 1,045.564, and 2,211.104 were used for internal calibration of the MALDI-TOF MS spectra. All the proteins listed were identified with a confidence interval of at least 95% from the combined MS and MS/MS data.

Statistical Analyses
Cluster Analysis: Cluster analysis was performed using hierarchical clustering in R. The protein abundances were normalized to zero mean and unit variance for each individual protein. Subsequently, the Euclidean distance measure was used for protein clustering. For sample clustering, the difference between two samples was defined as the sum of differences in individual proteins. Let the detection rates of a protein p in the two samples be λ1 > 0 and λ2 > 0. The difference in p between the two samples is as follows.
$$d_p(\lambda_1, \lambda_2) = (\lambda_1 - \lambda_2)(\ln \lambda_1 - \ln \lambda_2)$$
This measure is the Jeffrey divergence, a symmetric variant of the Kullback-Leibler divergence, between two Poisson distributions with parameters λ1 and λ2. This measure has the advantage that it prevents highly abundant proteins from dominating others in contribution to the total sample difference.
Relative Enrichment of Proteins in Subnuclear CB, IF, and NM Fractions: Enrichment analysis of proteins in the NM fraction was performed by two pairwise comparisons, one with the CB fraction and one with the IF fraction. For each comparison, the Mantel-Haenszel method was used for significance analysis and estimation of a common fold change. A protein was considered enriched in the NM fraction when it was at least 2 times more abundant in the NM fraction than in both the CB and IF fractions with p values less than 0.05.
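To illustrate how a common fold change can be pooled over tissue samples, the sketch below computes the standard Mantel-Haenszel common odds ratio over a set of 2x2 tables, one table per tissue sample. How the tables were actually built from spectral counts is not stated in the text, so the layout used here is an assumption: a = counts of the protein in NM, b = all other counts in NM, c = counts of the protein in the comparison fraction, d = all other counts in the comparison fraction. The function name and the example numbers are hypothetical, and the significance test itself is not shown.

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel common odds ratio over 2x2 tables (a, b, c, d).

    Each table is one stratum (here assumed: one tissue sample). The estimate
    is sum(a*d/n) / sum(b*c/n), with n the total count of the stratum.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # an empty stratum carries no information
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        raise ValueError("common odds ratio is undefined (zero denominator)")
    return numerator / denominator


# Made-up counts for one protein in two tissue samples (NM versus CB)
tables = [(10, 90, 5, 95), (20, 180, 10, 190)]
common_or = mantel_haenszel_or(tables)
print(round(common_or, 3))
```

When a protein accounts for only a small share of the spectra in each fraction, the odds ratio is close to the ratio of its normalized spectral counts, which is why it can serve as a pooled fold change estimate.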
Differential Analysis of Nuclear Matrix Proteome in Colorectal Adenoma and Carcinoma Tissues: Differential analysis of colorectal adenoma, MIN+ CRC, and CIN+ CRC was performed using the β-binomial test. The β-binomial test takes into account the within-sample variation and the between-sample variation in a single statistical model (26). Three pairwise comparisons were performed. The threshold for statistical significance was 0.05. To select for the more robust changes, the regulated proteins with overlapping counts in each two-group comparison were removed. For the supervised cluster analysis, the results of the three comparisons were merged.
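The robustness filter can be sketched as follows. The text does not define "overlapping counts" precisely, so this sketch assumes it means that the ranges (minimum to maximum) of the counts in the two groups share at least one value, with touching ranges treated as overlapping. The function names, p values, and counts are hypothetical, and the β-binomial test itself is not implemented here.

```python
def counts_overlap(group_a, group_b):
    """True when the [min, max] count ranges of the two groups share any value."""
    return min(group_a) <= max(group_b) and min(group_b) <= max(group_a)


def robust_hits(proteins, alpha=0.05):
    """Keep proteins that are significant and whose group count ranges do not overlap.

    `proteins` maps a protein name to (p_value, counts_group_a, counts_group_b).
    """
    return [
        name
        for name, (p_value, a, b) in proteins.items()
        if p_value < alpha and not counts_overlap(a, b)
    ]


# Made-up p values and spectral counts (two adenomas versus four carcinomas)
example = {
    "protA": (0.01, [2, 3], [8, 9, 10, 12]),  # significant, separated: kept
    "protB": (0.03, [2, 9], [8, 9, 10, 12]),  # significant, overlapping: removed
    "protC": (0.20, [1, 1], [8, 9, 10, 12]),  # not significant: removed
}
print(robust_hits(example))
```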

Gene Ontology and Pathway Analysis
Analysis of Enrichment of Nuclear Proteins: To assess the percentage of nuclear proteins in a list of differential proteins, their entries in the UniProt Knowledgebase were checked for any evidence that these proteins can be found in the nucleus. Entries at "General annotation (Comments) Subcellular location," "Ontologies > Keywords > Cellular component," and "Ontologies > Gene ontology (GO) > Cellular component" were all checked.
Relative Enrichment of GO Terms in Subnuclear Domains: IPI names of subnuclear fraction-enriched proteins were compared using the Functional profiling by functional enrichment methods (FatiGO+) tool version 3.2 on the Babelomics server (27). Enriched proteins in the CRC fractions and unfractionated CRC tissue were compared for relative enrichment of GO terms in the GO categories "biological process," "molecular function," and "cellular component," allowing for no redundant GO terms, and significant hits included only false discovery rate-adjusted p values <0.05. In addition, the FatiGO+ tool was used to search for enriched InterPro motifs in the amino acid sequences of the NM-enriched proteins.
Analysis of Enriched Protein Complexes: Gene symbol names of identified proteins that were significantly enriched in the CB, IF, and NM fractions and differential NM proteins of the adenoma-carcinoma comparison were analyzed with the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) tool version 8.0 (28). Protein-protein interaction (PPI) maps were created with default settings, allowing for experimentally verified and predicted PPIs; stronger associations are represented by thicker lines.
Analysis of Cancer-associated Proteins and Pathway Analysis: IPI names of identified proteins were searched with the Ingenuity IPA 7.0 tool (Ingenuity Systems, Inc.) for "Bio functions." Putative cancer association was deduced from the Disorders reference list given in the Disorders and Mutations section of the pertinent protein entry in GeneCards. Manual literature mining was done with the Entrez Gene server.

Assessment of Reproducibility of Entire Work Flow
First, we investigated the reproducibility of the differential extraction protocol that yields the nuclear matrix fraction (13,14) in conjunction with GeLC-MS/MS analysis and spectral counting. To this end, four CRC samples were pooled (total mass, 500 mg), the tissue was homogenized, and the homogenate was separated into three aliquots that were fractionated independently, resulting in three nuclear matrix fractions for analysis. To determine the optimal loading amount, 50, 60, or 70 μg of protein from one NM fraction was loaded onto an SDS-polyacrylamide gel. The maximum amount that could be loaded without smearing was 60 μg. Accordingly, 60 μg of each of the three replicates of NM fractions was separated by SDS-PAGE. The band pattern obtained in each NM fraction was very similar (Fig. 1A). For analysis of reproducibility of protein identification, the three lanes were cut into 10 pieces, aiming for an equal amount of total protein in each gel piece (Fig. 1A). The gel pieces were processed by in-gel digestion and subjected to nano-LC-MS/MS. The total number of identified proteins was 1,047 (with identification criteria of minimally two unique peptides at 95% confidence at the peptide level and >99% confidence at the protein level in minimally one sample) in the total data set consisting of three replicate analyses. The average false discovery rate as revealed by reversed database searching was low, ~0.5% at the protein level. The number of proteins detected in all three replicates was 889 (85% of the total number of identified proteins), indicating good reproducibility (Fig. 1B). See supplemental Table 2 for documentation of protein identifications in the NM triplicate analysis and associated spectral count data.
To evaluate the combined technical variation including tissue fractionation, in-gel digestion, and nano-LC-MS/MS, we used protein spectral counting. The average CV of the number of spectral counts was 27% for the 889 proteins identified in all three replicates (supplemental Table 2). As expected, the CVs varied strongly between proteins, with the proteins with the highest spectral counts producing the lowest CV values (~10%; see supplemental Table 2).
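The per-protein CV and its average over proteins can be computed as in the sketch below. It assumes the CV is the sample standard deviation (n − 1 denominator) divided by the mean of the three replicate counts; the text does not state which standard deviation was used. The function name and the replicate counts are made up.

```python
import statistics


def cv_percent(values):
    """Coefficient of variation in percent: sample standard deviation / mean * 100."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined for a mean of zero")
    return 100 * statistics.stdev(values) / mean


# Made-up normalized spectral counts for three replicate NM fractions
replicates = {
    "abundant_protein": [90, 100, 110],
    "rare_protein": [2, 4, 6],
}
cvs = {protein: cv_percent(counts) for protein, counts in replicates.items()}
average_cv = statistics.mean(cvs.values())
print(cvs, average_cv)
```

In this toy example the abundant protein has a CV of 10% and the rare protein a CV of 50%, mirroring the trend that proteins with high spectral counts give the lowest CV values.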

Identification and Data Mining of NM-enriched Proteins
The proteomes of six colorectal tissues, i.e. two colorectal adenoma, two MIN+ CRC, and two CIN+ CRC, were each fractionated into five subproteomes by biochemical fractionation. Fig. 2A shows an example of the protein band patterns in the five fractions of a CRC tissue as analyzed by SDS-PAGE. The majority of the protein mass (>90%) was collected in the first two fractions using extraction by detergent and high salt, removing membrane/soluble and cytoskeletal proteins, respectively. By subsequent treatment with DNase I, ~5% of the total protein mass was collected in the third fraction, the so-called chromatin-binding fraction. The remaining pellet was dissolved in urea-containing buffer, the urea was dialyzed off, allowing intermediate filament proteins to reassemble into insoluble networks, and the sample was centrifuged at high speed in an ultracentrifuge. The final pellet (fraction 4) and supernatant (fraction 5) each contained ~1% of the total protein mass. The SDS-PAGE profiles of fractions 1 and 2 were very similar, whereas fractions 3-5 differed substantially (see Fig. 2). The profile of the third fraction was characterized by two strong bands of ~16 kDa, identified by MALDI-TOF/TOF MS as histones (H1d and H1a). The fourth fraction was characterized by a cluster of strong bands of 40-50 kDa identified as keratin (cytokeratin 8), actin (cytoplasmic 2), and tubulin (β2 and α3). The fifth fraction was characterized by two protein bands of ~70 kDa that were identified as lamins B1 and A/C, respectively (Fig. 2). In the following, fractions 3, 4, and 5 are referred to as the CB, the IF, and the NM fractions, respectively.

FIG. 2. A, representative protein gel view of five biochemical fractions, including the CB, IF, and NM fractions isolated from a CRC tissue homogenate using the protocol of Fey and co-workers (13,14). The biochemical procedure, the fraction name, and the protein yield per 100 mg of starting material (μg/%) are described above the gel. Selected protein bands (arrows) were identified by MALDI-TOF/TOF MS. B, Venn diagram of confident protein identifications for the CB, IF, and NM fractions from two colorectal adenoma and four carcinoma tissues. Sup., supernatant.
To identify proteins specifically enriched in the NM fraction of the six SDS-PAGE-separated colorectal tissue samples, the entire gel lanes of the CB, IF, and NM fractions were sliced in 10 pieces, and the proteins were subjected to in-gel digestion and analysis by nano-LC-MS/MS. A total of 2,059 proteins was identified in the whole data set with 753 detected in all three fractions of the six tissue samples (Fig. 2B). To obtain a global view of the data, we performed an unsupervised cluster analysis. The cluster analysis yielded three main clusters that corresponded to the three subnuclear fractions. Each fraction cluster showed further subgrouping, largely according to biological tissue types (see Fig. 3 for a heat map view of the cluster analysis).
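The sample-to-sample distance behind this clustering can be sketched as follows. The sketch assumes the standard closed form of the Jeffrey divergence between two Poisson distributions with rates λ1 and λ2, namely (λ1 − λ2)(ln λ1 − ln λ2), summed over proteins as described under Statistical Analyses. The function names and example rates are hypothetical, and the original analysis was carried out in R rather than Python.

```python
import math


def jeffrey_poisson(lam1, lam2):
    """Jeffrey (symmetrized Kullback-Leibler) divergence between Poisson(lam1) and Poisson(lam2)."""
    if lam1 <= 0 or lam2 <= 0:
        raise ValueError("detection rates must be strictly positive")
    return (lam1 - lam2) * (math.log(lam1) - math.log(lam2))


def sample_distance(rates_a, rates_b):
    """Sum the per-protein divergences; both lists must list proteins in the same order."""
    if len(rates_a) != len(rates_b):
        raise ValueError("both samples must cover the same proteins")
    return sum(jeffrey_poisson(a, b) for a, b in zip(rates_a, rates_b))


# Made-up detection rates for three proteins in two samples
sample_1 = [1.0, 4.0, 10.0]
sample_2 = [2.0, 4.0, 5.0]
print(sample_distance(sample_1, sample_2))
```

The measure is symmetric in its two arguments and is zero for a protein with identical rates in both samples, so only proteins that differ between samples contribute to the distance.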
To distinguish candidate core components of the NM from contaminants, we performed a statistical analysis to identify proteins that were significantly enriched in the NM fraction relative to the protein levels in the CB and IF fractions. The fraction enrichment analysis yielded 158 proteins enriched in the NM fraction of a total of 1,153 proteins identified in the NM fraction (criteria: p value <0.05 with minimum 2-fold change). Besides statistical significance, we added the extra filter that the numerical enrichment is consistently observed in at least five of six tissue samples. This allowed us to reduce the number of contaminants while retaining the majority of known NM proteins. See supplemental Table 3 for documentation of the NM-enriched proteins. Highly enriched NM proteins include several heterogeneous nuclear ribonucleoproteins including hnRNPA1, hnRNPA3, and hnRNPA2B1; isoform 1 of far upstream element-binding proteins 1 and 3 (FUBP1 and FUBP3); latent transforming growth factor β-binding proteins 2 and 4 (LTBP2 and LTBP4); three isoforms of charged multivesicular body proteins 1a, 2a, and 4b (CHMP1A, CHMP2B, and CHMP4B); paraspeckle protein 1 (PSPC1); the RNA-binding protein ELAV-like protein (ELAVL1/HUR); transcriptional activator protein Pur-α (PURA); isoform long of TATA-binding protein-associated factor 2N (TAF15); and nucleoporin-like 1 isoform c (NUPL1) (supplemental Table 3). The enriched NM proteins were evaluated by GO mining, InterPro motif search, and PPI mapping (see below).
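The selection criteria for NM-enriched proteins can be written as a simple predicate. This is a sketch of the stated thresholds only (at least 2-fold enrichment and p < 0.05 against both the CB and the IF fraction, plus numerical enrichment in at least five of six tissue samples); the function name, argument names, and the inclusive treatment of exactly 2-fold are assumptions.

```python
def is_nm_enriched(fold_vs_cb, p_vs_cb, fold_vs_if, p_vs_if, samples_enriched,
                   min_fold=2.0, alpha=0.05, min_samples=5):
    """Apply the enrichment criteria described in the text to one protein.

    fold_vs_* : common fold change NM / other fraction
    p_vs_*    : p value of the corresponding comparison
    samples_enriched : number of tissue samples (out of six) showing NM enrichment
    """
    return (
        fold_vs_cb >= min_fold and p_vs_cb < alpha
        and fold_vs_if >= min_fold and p_vs_if < alpha
        and samples_enriched >= min_samples
    )


# Made-up values for three hypothetical proteins
print(is_nm_enriched(4.2, 0.001, 3.1, 0.01, 6))   # passes all criteria
print(is_nm_enriched(4.2, 0.001, 1.5, 0.01, 6))   # fails the fold change versus IF
print(is_nm_enriched(4.2, 0.001, 3.1, 0.01, 4))   # enriched in too few samples
```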
Gene Ontology Mining of NM-enriched Proteins: To evaluate the significant association of specific gene ontologies with proteins identified in the NM fraction, we compared the NM-enriched proteins to a set of 1,244 proteins that were reproducibly identified in unfractionated tissue homogenates of three CRC tissue samples (see supplemental Table 4 for documentation of protein identifications in the unfractionated CRC tissues). To this end, we used the GO mining tool FatiGO+ (27). Interestingly, the NM fraction was uniquely enriched for several GO terms associated with specific nuclear domains and functions. Table I lists the distribution of the significant (adjusted p < 0.05) biological process, molecular function, and cellular component GO terms that were associated with the proteins enriched in the NM fraction relative to unfractionated CRC tissue lysate. The different levels for each ontology are shown in Table I with the significantly different GO terms listed in order of decreasing significance. For example, the NM-enriched proteins were significantly associated with the cellular component term "nucleus" and with the biological functions "RNA processing," "RNA splicing," and "cell adhesion." In line with this, significant molecular functions included "protein binding," "cytoskeletal protein binding," and "RNA binding." Finally, NM-enriched proteins were significantly associated with the following InterPro motifs: IPR000504, an RNA recognition motif (RNP-1, involved in RNA binding); IPR000694, a proline-rich region (involved in protein-protein interactions); and IPR013111, an epidermal growth factor-like motif that is present in many extracellular and membrane proteins.

FIG. 3. The protein abundances were normalized to zero mean and unit variance for each protein. The Euclidean distance measure was used for protein clustering. The sample clustering uses a measure based on the Jeffrey divergence between two Poisson distributions modeling the within-sample variation of each protein in the two samples (see main text for description). Samples from each fraction are clearly clustered together. Three main clusters correspond to the three subnuclear compartments with subclustering according to clinical tissue type. 20q, chromosomal instability including 20q gain (referred to as CIN in the text); ade, adenoma; msi, microsatellite instability (referred to as MIN in the text).

Nuclear Matrix Proteome in Colorectal Adenoma and Carcinoma Tissues
In view of a potentially important role of the nuclear matrix in colorectal adenoma to carcinoma progression, we further analyzed the nuclear matrix proteome obtained from the two adenoma and four carcinoma tissues to identify cancer-associated proteins using β-binomial statistics (26). The results for the IF and CB fractions will be reported elsewhere. Included in the analysis of carcinoma tissues were the two major modes for genomic instability, CIN (85% of cases) and MIN (15% of cases), to further pinpoint candidate CIN-associated proteins. An overview of the identification data and statistical analyses is presented in Table II, and the NM protein fractionation and a summary of the protein identification data are shown in supplemental Fig. 1. The total data set contained 1,153 proteins with about half of the proteins identified in four of six clinical samples (supplemental Fig. 1). We focus here on the comparisons adenoma versus all CRC samples and adenoma versus CIN+ CRC. Tables III and IV display the lists of nuclear annotated and/or NM-enriched proteins as well as the differential NM proteins with corresponding mRNA change in the comparisons of adenoma versus all CRC tissues (Table III) and adenoma versus CIN+ CRC (Table IV). Supplemental Tables 5 and 6 contain the complete lists and associated spectral count data of the significantly regulated proteins. Statistical comparisons revealed that, of a total of 1,153 identified NM proteins, 129 proteins were significantly regulated (28 up- and 101 down-regulated; p < 0.05) in the comparison adenoma versus all CRC tissues (Table III and supplemental Table 5), 169 proteins were significantly regulated (87 up- and 82 down-regulated) in the comparison adenoma versus CIN+ CRC (Table IV and supplemental Table 6), and 237 proteins were significantly regulated (158 up- and 79 down-regulated) in the comparison adenoma versus MIN+ CRC (supplemental Table 7).
Fig. 5 shows a heat map view of the supervised cluster analysis using the differential proteins from the above mentioned comparisons. The tree contains two main clusters: one consists of the two MIN tumors, and the other contains the two adenomas and one of the CIN tumors in one subcluster and the other CIN tumor in a separate branch. Interestingly, the CIN tumor in the adenoma cluster that exhibited an "intermediate" expression profile (Fig. 5 and supplemental Fig. 1A) was a small lymph node-negative tumor (p8 in supplemental Table 1). The majority of the significantly regulated NM proteins exhibited a -fold change >2 (Tables III and IV and supplemental Tables 5-7). Integration with our previously generated transcriptomics data that are based on a large series of clinical samples (31 adenomas and 36 carcinomas; sets available at Gene Expression Omnibus under accession number GSE8067) reveals multiple candidates (26 of 129 and 37 of 169) that exhibited a corresponding change at the mRNA level (p < 0.1) (Tables III and IV and supplemental Tables 5-7). Differential proteins that exhibit a corresponding mRNA change include, for example, versican V1 (VCAN), myeloid cell nuclear differentiation antigen (MNDA), high mobility group protein B3 (HMGB3), proteins S100-A8 and -A9, isoform 1 of PDLIM4, periplakin (PPL), envoplakin (EVPL), ITGB4, nuclear pore complex protein 88 (NUP88), and isoform 1 of RNA-binding protein 14 (TMEM137/RBM14).
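The counts quoted above can be cross-checked with a few lines of arithmetic. The sketch below is only a bookkeeping aid: it sums the up- and down-regulated proteins per comparison, expresses them as a share of the 1,153 identified NM proteins, and computes the share of differential proteins with a corresponding mRNA change. The dictionary layout and the function name are assumptions for illustration; the numbers are those given in the text, and no mRNA concordance is quoted here for the MIN+ comparison.

```python
# Counts reported in the text for the NM fraction (1,153 identified proteins).
TOTAL_NM_PROTEINS = 1153

comparisons = {
    "adenoma vs all CRC": {"up": 28, "down": 101, "mrna_concordant": 26},
    "adenoma vs CIN+ CRC": {"up": 87, "down": 82, "mrna_concordant": 37},
    "adenoma vs MIN+ CRC": {"up": 158, "down": 79, "mrna_concordant": None},
}


def summarize(entry, total=TOTAL_NM_PROTEINS):
    """Return (regulated count, % of all NM proteins, % with mRNA change)."""
    regulated = entry["up"] + entry["down"]
    pct_of_total = 100.0 * regulated / total
    concordant = entry["mrna_concordant"]
    pct_mrna = None if concordant is None else 100.0 * concordant / regulated
    return regulated, pct_of_total, pct_mrna


for name, entry in comparisons.items():
    regulated, pct_of_total, pct_mrna = summarize(entry)
    mrna_text = "not quoted" if pct_mrna is None else f"{pct_mrna:.1f}%"
    print(f"{name}: {regulated} regulated "
          f"({pct_of_total:.1f}% of NM proteins), mRNA-concordant: {mrna_text}")
```

On these figures, roughly one in five differential NM proteins in each of the two highlighted comparisons shows a matching mRNA change.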
To explore the connectivity of the nuclear annotated NM proteins and/or NM-enriched proteins that were differential in the comparisons adenoma versus carcinoma (n = 6 samples) and adenoma versus CIN+ carcinoma (n = 4 samples), we used the STRING tool. The PPI map of the 68 regulated NM-enriched nuclear proteins from the adenoma versus all CRC comparison is shown in Fig. 6 and contains one large cluster of highly connected down-regulated proteins (cluster A) and two smaller, well connected clusters of which one cluster exclusively contains down-regulated proteins (cluster B). The dense cluster of down-regulated proteins (Fig. 6A) consists of 11 nucleoporins and the nuclear pore components nucleoprotein TPR and FIG1 (for details see Table III). The cluster that is directly connected to the nuclear pore protein cluster (Fig. 6C) consists of up-regulated RNA-binding proteins involved in transcriptional regulation (heterogeneous nuclear ribonucleoprotein U-like protein 1 (hnRNPUL1) and DHX9), RNA processing (SYNCRIP), and RNA trafficking (KSRP) as well as down-regulated RNA processing and splicing factors (DDX, BEF, splicing factor arginine/serine-rich 6 (SFRS6), and splicing factor U2AF 65-kDa subunit (U2AF2)). The third cluster at the opposing end of the network (Fig. 6B) consists of down-regulated (structural) proteins, including catenins β-1, α-1, and δ-1 (CTNNB, CTNNA1, and p120); tight junction protein ZO-2 (TJP2); vinculin (VCL); and LIM domain and actin-binding protein 1 (LIMA1). In addition, a number of proteins are poorly connected or unconnected (e.g. VCAN, actin-binding LIM protein 1 (ABLIM1), and neuregulin 1 (NDF); see Fig. 6).
The PPI map of the 96 regulated NM-enriched nuclear proteins from the adenoma-chromosomal instable carcinoma comparison is shown in Fig. 7 and contains two highly connected clusters: one symmetric cluster consists exclusively of down-regulated proteins of the nuclear pore complex (NUPL1, NUP88, NUP133, NUP85, NUP205, NUP98, NUP214, FIG1, and TPR), and the other cluster consists of four up- and six down-regulated proteins involved in RNA processing and splicing. The up-regulated proteins include hnRNPUL1, Y-box-binding protein 1 (YBX1), the transcriptional regulator hnRNPD, and the transcriptional activator ATP-dependent RNA helicase A (DHX9) (Fig. 7). The down-regulated proteins in the second cluster include poly(rC)-binding protein 2 (PCBP2/hnRNPE2), hnRNPH2, protein mago nashi homolog (MAGOH), SFRS6, stromal cell-derived growth factor SF20 (SF2), and U2AF2 (Fig. 7). The nuclear pore and RNA processing clusters are connected through hnRNPUL1, which is directly connected to NUP214 and NUP98, and through YBX1, which is indirectly linked to the nuclear pore cluster through its connection to (up-regulated) nucleolin (NCL), which in turn is connected to NUP98 (Fig. 7). Other up-regulated proteins in close vicinity of the RNA processing-splicing cluster include far upstream element-binding protein 2 (KSRP) and heterogeneous nuclear ribonucleoprotein Q (SYNCRIP), which are directly connected through hnRNPD. The up-regulated RNA-binding protein DAZAP1 is connected to YBX1, and peptidylprolyl cis-trans isomerase NIMA-interacting 4 (PIN4) is indirectly connected through KSRP (Fig. 7). In addition, several up-regulated histones (H2AFY, HIST1H1D, and HIST1H1C) are connected to nucleolin (Fig. 7) as well as MNDA and nucleolar protein NOP5 (NOL5). More diffuse areas of the network include mainly structural proteins such as PDLIM4, VCL, LAMC1, ITGB4, galectin-3-binding protein (90K), and galectin-3 (MAC-2).
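The distinction between "directly" and "indirectly" connected proteins in this description can be made concrete with a small graph search. The sketch below encodes only those connections that are spelled out unambiguously in the paragraph above, and it uses a breadth-first search to find the shortest chain between two proteins. The edge list is a partial hand transcription for illustration, not the STRING output, and the function names are hypothetical.

```python
from collections import deque

# Connections stated in the text for the adenoma vs CIN+ network (Fig. 7).
EDGES = [
    ("hnRNPUL1", "NUP214"),
    ("hnRNPUL1", "NUP98"),
    ("YBX1", "NCL"),
    ("NCL", "NUP98"),
    ("DAZAP1", "YBX1"),
    ("H2AFY", "NCL"),
    ("HIST1H1D", "NCL"),
    ("HIST1H1C", "NCL"),
]


def build_graph(edges):
    """Build an undirected adjacency map from a list of protein pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search; returns one shortest path as a list, or None."""
    if start not in graph or goal not in graph:
        return None
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in sorted(graph[node]):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


graph = build_graph(EDGES)
print(shortest_path(graph, "hnRNPUL1", "NUP98"))  # direct link, one edge
print(shortest_path(graph, "YBX1", "NUP98"))      # indirect link via NCL
```

A path of one edge corresponds to a direct connection, whereas a longer path names the intermediates (here nucleolin) through which two proteins are linked.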

DISCUSSION
Nuclear phenotypic alterations and chromosomal instability are associated with progression toward malignancy. Scaffolding proteins may play an important underlying role in this process. In the past, a specific differential biochemical extraction procedure has been devised to investigate such a nuclear scaffolding network, or matrix (13,14). We are the first to provide a deep, MS/MS identification-based view of the NM fraction, the final product of this procedure. Importantly, through differential comparison against two earlier fractions that contain nuclear proteins, i.e. the chromatin-binding and the intermediate filament fractions, we are for the first time able to pinpoint NM-enriched proteins that may represent core constituents. Moreover, we show that a combination of this differential extraction protocol together with GeLC-MS/MS analysis and spectral counting as a means of quantification is reproducible (average CV% of <30%), allowing us in a proof-of-concept study to identify NM proteins associated with adenoma to chromosomal instable carcinoma progression. We propose that sequential cellular extraction of proteins in combination with nano-LC-MS/MS and protein-protein interaction network analysis provides an interesting avenue for the analysis of (compromised) nuclear complexes in patient tumor tissues.

Differential Extraction of Nuclear Proteins and Identification of NM-enriched Proteins
Previous studies have used differential extraction protocols that are most often based on the procedure elaborated by Penman and co-workers (13,14). For analysis of the NM fraction in diverse cells and tissues, most studies have used 2DGE (17), but these studies have yielded only limited protein identification information to date. To provide a more comprehensive view and to overcome some of the limitations with solubility inherent to 2DGE, we used an alternative method consisting of GeLC-MS/MS on an LTQ-FT MS platform. In brief, the majority of the protein mass (>90%) isolated from tissue was first removed by sequential treatment with detergent, high salt buffers, and DNase, after which the so-called CB (proteins bound to/associated with chromatin), IF (a mixture of cytoskeletal scaffold filaments), and NM (the non-intermediate filament part of an insoluble nuclear scaffold and scaffold filaments) fractions were harvested. We emphasize that the work flow presented here allows for profiling of relatively small tissue samples (~50-100 mg). We extracted an average of 70 µg of NM protein/100 mg of CRC tissue, and such a protein yield is comparable with another study where a total of 300-500 µg of NM protein was extracted from ~1 g of bladder tissue (29). The high performance of our work flow was shown in our reproducibility study where we identified 889 reproducibly detected proteins (CV% < 30%) of a total of 1,047 proteins in the NM fraction from CRC tissues. In the CB, IF, and NM fractions derived from six clinical samples, we identified a total of 2,059 proteins, about half of them with nuclear annotation, a clear enrichment as compared with unfractionated CRC tissue. Combined with our subnuclear compartment enrichment analysis based on six biological samples, we provide, to the best of our knowledge, the most comprehensive view of the NM fraction so far.
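The yield comparison and the reproducibility figures quoted above can be put on a common footing with a short calculation. The following sketch converts both yields to micrograms of NM protein per milligram of tissue, recomputes the reproducibly identified fraction (889 of 1,047), and shows how a coefficient of variation is obtained for one protein across three replicate isolations. The replicate counts are hypothetical, the function names are assumptions, and the use of the sample standard deviation is likewise an assumption, since the exact CV definition is not restated here.

```python
import statistics


def cv_percent(values):
    """Coefficient of variation (%) = sample SD / mean * 100."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined for a mean of zero")
    return 100.0 * statistics.stdev(values) / mean


def yield_ug_per_mg(protein_ug, tissue_mg):
    """Protein yield expressed as micrograms of protein per mg of tissue."""
    return protein_ug / tissue_mg


# Hypothetical normalized spectral counts of one protein in three replicates.
replicate_counts = [10.0, 12.0, 14.0]
print(f"CV = {cv_percent(replicate_counts):.1f}%")

# Figures quoted in the text.
print(f"reproducible fraction = {100 * 889 / 1047:.0f}%")
print(f"this study: {yield_ug_per_mg(70, 100):.2f} ug/mg")
print(f"bladder study: {yield_ug_per_mg(300, 1000):.2f}-"
      f"{yield_ug_per_mg(500, 1000):.2f} ug/mg")
```

Expressed per milligram of tissue, the two yields fall within the same order of magnitude, which is the sense in which they are comparable.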
The biochemically defined NM, a detergent-, salt-, and DNase-resistant nuclear remnant after a series of biochemical extractions developed by Berezney, Coffey, and Penman (13,14), offers a "snapshot" of more than the "NM proper" per se: it includes many factors that are (also) involved in the generation, processing, and export of ribosome subunits and RNA molecules. Many of these are (also) nucleolar constituents. We are interested in the NM proper as a fibrogranular scaffold and organizing entity involved in (regulation of) chromatin structure and function. As a first step toward identifying the NM core components, we have generated a list of proteins enriched in the NM fraction relative to the CB and IF fractions. Many efforts in the past have aimed to identify a set of common or "core" NM proteins as well as tissue- and/or cancer-specific NM proteins; so far, ~300 mammalian NM proteins have been collected in an electronic database (30). However, most studies only describe a limited number of proteins, and core NM components cannot be deduced (7). In the NM fraction of six colorectal tissues, 158 of a total of 1,153 identified proteins were significantly enriched as compared with the CB and IF fractions. By GO mining, we found that these proteins are significantly associated with "mRNA processing" and "RNA metabolic process." In line with these findings, particularly prominent in the molecular interaction map of proteins enriched in the NM was a dense cluster of proteins involved in pre-mRNA processing and, directly connected to it, multiple proteins involved in RNA splicing. How these functions are linked to the core fibrogranular NM scaffold remains to be investigated.
Among the proteins commonly detected by 2DGE in the NM fractions from diverse tissues and cell lines are several hnRNP family members as well as NUMA and lamins as "hallmark" constituents of the NM. Indeed, in our analysis, many members of the hnRNP protein family were truly enriched in the NM fraction as compared with the CB and IF fractions (hnRNPA0, -A1, -A2B1, -A3, -AB, -D, -DL, -F, -H1, -H2, -H3, -L, and -R). It has been concluded from a large variety of experiments that hnRNP family members (such as hnRNPA1 and -B1) are responsible for a minor proportion of the protein content of the NM filaments (15). The nuclear lamins are members of the intermediate filament protein family and, as such, part of the tissue skeleton and therefore obvious candidates as main structural proteins of the NM. Lamins are primarily found at the inner side of the nuclear envelope and are probably not the primary constituents of the interior NM filaments (15); nonetheless, lamins have also been identified in the interior nucleus (31). Here, we have identified lamin A/C as an NM-enriched protein, whereas lamins B1 and B2 are enriched in the IF fraction, supporting several studies that find that specifically lamin A/C plays important nuclear roles (60).
Association of the NM with RNA-binding proteins and RNA metabolism is in line with the notion that the NM is an RNA-protein-based network. Besides "RNA functions," the NM-enriched proteins were also associated significantly with cytoskeletal protein binding, cell adhesion, and "extracellular matrix structural constituent" functions. Although one would not anticipate extracellular matrix, cell adhesion, and cytoskeletal proteins as NM constituents a priori, a nuclear role for at least some of these proteins cannot be ruled out. Indeed, several studies now support the concept of an interconnected protein skeleton in tissue, extending from the innermost structures of the nucleus through the cytoskeleton to the extracellular matrix, and in this way the entire tissue skeleton could be involved in gene regulation (17,32,33). It goes without saying that further work is needed to distinguish contaminants from shuttling proteins and proteins with a (structural) function in the nucleus. As to one promising candidate of this group, there is now sufficient evidence that actin is implicated in several nuclear activities together with a growing number of actin-binding proteins (34). Actin, myosin, cytokeratins, and spectrin have all been identified in NM filaments by immunodetection (35). Indeed, in this study, many actin-binding proteins were enriched in the NM fraction (e.g. PDLIM1 and PDLIM4), and "actin binding" was a significant molecular function term associated with NM-enriched proteins. As such, our results support that in particular actin is a likely candidate as a major NM protein.
Other novel candidate NM proteins enriched in the NM preparations include the three isoforms CHMP1A, CHMP2B, and CHMP4B. Especially the chromatin modifying properties of at least one member of this family (CHMP1), found to be a nuclear matrix protein, are intriguing (36). Other proteins highly enriched in the NM include PSPC1, the RNA-binding protein ELAV-like protein (ELAVL1/HUR), transcriptional activator protein PURA, isoform long of TAF15, and isoform 1 of FUBP1 (up in CIN-CRC; see below) and FUBP3. Interestingly, the cancer stem cell protein nestin was identified in this study as an NM-enriched protein. Nestin may play a role in the trafficking and distribution of intermediate filament proteins and potentially other cellular factors to daughter cells during progenitor cell division (37). Further experiments are needed to determine which of the differentially enriched NM proteins are important structural components of the NM, and which ones are core NM-associated proteins. A combined effort of proteomics, microscopy, and immunodetection will be needed to further unravel the molecular composition and architecture of the NM.

FIG. 5. Heat map view of supervised cluster analysis using differential NM proteins from adenoma-carcinoma, adenoma-CIN, and adenoma-MIN comparisons. Heat map and cluster analysis of the spectral count data of six samples from the NM fraction is shown. A list containing 342 curated proteins resulting from the differential analyses was used. The protein abundances were normalized to zero mean and unit variance for each protein. Two clusters are formed for the adenoma group and MIN group. 20q, chromosomal instability including 20q gain (referred to as CIN in the text); ade, adenoma; msi, microsatellite instability (referred to as MIN in the text).

Nuclear Matrix Proteins Regulated in Colorectal Adenoma and Carcinoma Tissues
To show the potential of our work flow for investigation of the nuclear matrix in the context of chromosomal instability, we further analyzed the nuclear matrix proteome in an exploratory study of adenoma and MIN+ and CIN+ carcinoma tissues. Our analysis revealed 129 and 169 differential proteins in the comparisons adenoma versus all CRC samples and adenoma versus CIN+ CRC, respectively. For clarity, we discuss these proteins in three categories: (i) chromosomal instability; (ii) nuclear protein complexes, nuclear bodies, and structure; and (iii) oncogenes, tumor suppressors, and binding partners.

Chromosomal Instability
The most common form of genomic instability observed in CRC is chromosomal instability, the molecular basis of which remains to be determined. The HMG proteins are associated with chromosomes, albeit not exclusively, and have been implicated, through different mechanisms, in both benign and malignant neoplasias. HMGB1 was previously identified in the NM fraction from colon cancer tissue by 2DGE, and these results suggested enhanced localization to the NM in colon tumors relative to healthy tissue (38). In this study, we confirmed the up-regulation of HMGB1 in the NM fraction of chromosomal instable CRC and show that, besides HMGB1, HMGB2 and HMGB3 were also specifically up-regulated in the CIN+ CRC samples, with HMGB3 up-regulation also being reflected at the mRNA level. HMGB1/2 have been implicated in cellular response to chemotherapy-induced DNA damage (39) and can up-regulate topoisomerase II (40), providing a possible link to chromosomal instability. Indeed, overexpression of HMGB1 induced unbalanced chromosomal rearrangements in human prostate cancer cell lines (41). Furthermore, HMGB1 and topoisomerase II co-localize at AT-rich scaffold/matrix-attached regions that are located at the base of DNA loop domains and thus present juxtapositions for chromosomal rearrangements (42).
B23/nucleophosmin (NPM) is a protein that is often encountered in NM preparations and has been implicated, among other processes, in pre-rRNA processing and ribosomal subunit export from the nucleus (43,44) as well as centrosome duplication (45). Increased NPM expression has been linked to neoplastic colorectal mucosa (both adenomas and carcinomas) (46). We did not find a statistically significant difference between the NPM levels in the NM fraction of adenoma versus chromosomal instable carcinoma, but several of the NPM-interacting factors identified by Maggi et al. (44) are present in our list of differential proteins, being either up-regulated (NCL and DHX9) or down-regulated (CPSF6, hnRNPH2, DDX5, and Nup62). A putative NPM interactor that is highly differential between adenomas and carcinomas in our data set is PIN4/PAR14/parvulin, a peptidyl-prolyl cis-trans isomerase also harboring a DNA binding domain. This protein has been implicated in nucleolar pre-rRNA processing (47). Both PIN4 and NPM can be detected in pre-rRNP complexes, and PIN4 co-localizes with NPM both in interphase nucleoli and in mitotic spindles (47). In view of the latter observation and as PIN4 seems to accumulate at chromosomes during mitosis, it might have a role in proper partitioning of material (pre-rRNA, prenucleolar bodies, or chromosomes) during cell division. PIN4 does not seem to be enriched in NM preparations with respect to chromatin-binding and intermediate filament fractions. In view of the DNA binding and subcellular shuttling activities of PIN4 (48), this may not come as a surprise.

Nuclear Protein Complexes, Nuclear Bodies, and Structure
Spliceosome-Many NM proteins identified before and many of the differential proteins identified here, especially hnRNPs, have been implicated in spliceosome structure and nucleosome function (49,50); some of them are up-regulated, whereas others are down-regulated. An association with cancer (according to literature in the respective entries in GeneCards) exists for many hnRNPs, KHSRP, DHX9, DDX5, SFRS1, and SFRS6, leaving the other differential proteins (e.g. SYNCRIP and hnRNPAB) as potential novel candidates. In a recent proteomics study (51), galectin-3 (GAL3) was shown to stabilize SYNCRIP to maintain proliferation of human colon cancer cells. Interestingly, we found, in addition to up-regulation of SYNCRIP, also up-regulation of GAL3, underscoring a potential interaction between these two proteins in clinical material.
Nucleolus-The nucleolus is a major component of biochemical NM preparations. There is evidence that the nucleolus is the mirror of a series of metabolic changes that characterize cancer cells. Cell entry into the cell cycle is always associated with up-regulation of the nucleolar function and increased nucleolar size (52). Of the nucleolus-associated proteins with a cancer link, we note the strong up-regulation of NCL, a well established NM component. In addition, we detected several other cancer-related NM proteins (Table IV) with a previously reported association to the nucleolus (53), among others NOP5/NOP58 (up in CIN), S100-A8 and -A9 (up in CIN; also with a corresponding mRNA change), HP1BP3 (up in CIN), cysteine- and glycine-rich protein 1 (CSRP1; NM-enriched and up in CIN), 40 S ribosomal protein S19 (RPS19; up in CIN), and 28 S ribosomal protein S12 (RPS12; up in CIN).
Nuclear Pore Complex-The nuclear pore complex (NPC) is closely linked via lamins to (the rest of) the NM. A striking observation of this study is that many NPC proteins were down-regulated in the carcinoma samples, including NUPL1 (which is also an NM-enriched protein), NUP214, NUP98, NUP133, NUP205, NUP88 (also mRNA down), NUP85, NUP93, NUP62 (also mRNA down), NUP54, NUP160, and NUP50 (Tables III and IV). This massive down-regulation of multiple NPC proteins may indicate cellular and nuclear crisis and loss of nuclear integrity. Interestingly, NUP98 is also a modulator of key spindle checkpoint proteins such as Mad and Bub proteins and has been linked to chromosomal instability (for a review, see Ref. 54). Together with TMPO, a putative lamin receptor in the nuclear envelope (which is up-regulated in CIN+ CRC samples), the NPC may form essential anchor points for the nuclear lamina, the outer shell of the NM scaffold.

Oncogenes, Tumor Suppressors, and Binding Partners
Although their relationship to the NM is unclear as yet, catenins α-1, β-1, and δ-1 were all down-regulated in the NM fraction of CRC samples. The adenomatous polyposis coli/β-catenin signaling pathway is of high importance in colon cancer (55,56). Catenin β-1 is involved in the regulation of cell adhesion and in signal transduction through the Wnt pathway. Catenin δ-1 has been implicated in both cell transformation by SRC and ligand-induced receptor signaling through various receptor tyrosine kinases. Together, the differential regulation of these proteins may result in an altered regulation of dynamic actin-based, cytoskeletal, and signaling activities in the CRC tissues. Another example of a protein up-regulated specifically in CIN+ CRC and highly enriched in the NM fraction is FUBP1 (or FUSE-binding protein 1), a DNA helicase that regulates MYC expression by binding to a single-stranded far upstream element (FUSE) upstream of the c-myc promoter (57). It has generally been accepted that c-myc plays an important role in CRC progression, but its exact activators remain poorly understood. Here we identify FUBP1 as a novel candidate. Finally, as another example of an important CIN+ cancer-associated protein found in our NM preparations, the TP53 interactor YBX1 may be mentioned. YBX1 expression is correlated with tumor stage, drug resistance, and patient prognosis for many human tumors, including malignancies of the colorectum (58). A recent study correlated increased expression of YBX1 together with reduced E-cadherin expression to poor patient survival in invasive breast cancer (59). Here we also found correlation of up-regulation of YBX1 with down-regulation of E-cadherin in the CIN+ carcinomas.

Conclusion
We propose that the protocol used here for differential extraction of subnuclear fractions can be used for comparative protein complex profiling in small tissue samples. Important changes in tumor cells are alterations of nuclear structure and chromatin texture. We are the first to provide an extensive analysis of the protein constituents that are reproducibly detected as well as enriched in the NM fraction, and in a proof-of-concept study, we have pinpointed components that exhibit quantitative differences in chromosomal instable colorectal carcinomas (85% of CRC) relative to adenomas and microsatellite instable carcinomas. We anticipate that more extensive nano-LC-MS/MS-based subnuclear proteomics of patient tumor tissue in various stages of progression from colorectal adenoma to chromosomal instable carcinoma, combined with microscopy follow-up, could in future studies establish the missing link between diagnostic investigation of tissue in the pathology laboratory and the underlying cancer genetic mechanisms.

* This work (proteomics infrastructure) was supported by the VUmc-Cancer Center Amsterdam.
This article contains supplemental Fig. 1