Proteome Dynamics during C2C12 Myoblast Differentiation

Mouse-derived C2C12 myoblasts serve as an experimentally tractable model system for investigating the molecular basis of skeletal muscle cell specification and development. To examine the biochemical adaptations associated with myocyte formation comprehensively, we used large scale gel-free tandem mass spectrometry to monitor global proteome alterations throughout a time course analysis of the myogenic C2C12 differentiation program. The relative abundance of ∼1,800 high confidence proteins was tracked across multiple time points using capillary scale multidimensional liquid chromatography coupled to high throughput shotgun sequencing. Hierarchical clustering of the resulting profiles revealed differential waves of expression of proteins linked to intracellular signaling, transcription, cytoarchitecture, adhesion, metabolism, and muscle contraction across the early, mid, and late stages of differentiation. Several hundred previously uncharacterized proteins were likewise detected in a stage-specific manner, suggesting novel roles in myogenesis and/or muscle function. These proteomic data are complementary to recent microarray-based studies of gene expression patterns in developing myotubes and provide a holistic framework for understanding how diverse biochemical processes are coordinated at the cellular level during skeletal muscle development.

Skeletal muscle consists of tightly bundled cylindrical multinucleated myocytes, each packed with ordered tandem arrays of actin-myosin contractile filaments, which together account for nearly 40% of the total mass of the human body. These highly differentiated cells originate late in embryogenesis from somite-derived mesenchymal myoblast precursor cells in a multistep pathway initiated in response to inductive physiological cues (1). Activation of the myogenic program is normally tightly coupled to a network of signal transduction pathways that act to repress or stimulate cell proliferation and differentiation during embryonic development (1). These pathways function, at least in part, by altering the levels of critical target regulatory proteins through both transcriptional and post-transcriptional mechanisms (1). However, the developmental transitions responsible for myoblast commitment, myocyte specification, and possible developmental plasticity are not fully understood and are just now beginning to be described in integrated molecular terms (2).
Cultured undifferentiated fibroblast-like mouse and human myoblast cell lines serve as excellent model systems for investigating the complex biochemical adaptations that underlie the formation of functional myocytes. In the absence of mitogenic stimuli, proliferating myoblasts synchronously withdraw from the cell cycle, elongate, adhere, and finally fuse together to form myotubes exhibiting most, if not all, of the principal mechanobiochemical adaptations associated with contractility (1). Molecular genetic studies of this myogenic program have provided fundamental insight into key regulatory events associated with this profound resetting of cell physiology (2). These include the discovery of critical intracellular signaling pathways and their target sequence-specific transcription factors that control cascading waves of gene expression in terminally differentiating myoblasts, leading to large scale biological reorganization and the formation of functional myofibrils (3,4). Nonetheless, knowledge of the full range of biochemical adaptations associated with myocyte formation remains incomplete, masking the complexity that is likely to exist in vivo.
Genome scale molecular profiling studies offer a unique opportunity to investigate the molecular hierarchy and biochemical logic that governs muscle cell development and physiology. To this end, several groups have reported the use of DNA-based microarray technology to examine global patterns of transcription in cultured myoblasts during the transition from mitotic cell proliferation to terminal differentiation (5-7). These studies have provided evidence of significant changes in gene expression patterns during myogenesis. However, because mRNA transcript levels do not always correlate with corresponding cognate protein levels (8), determination of the full spectrum of biochemical alterations associated with skeletal muscle formation might be best ascertained by monitoring alterations in protein abundance and turnover directly (9,10). Systematic comparison of changes in the proteome composition of mitotic and differentiating myoblasts should also provide insight into the various mechanisms and pathways that underlie the formation of skeletal muscle.
To this end, Tannu and colleagues used two-dimensional (2D) PAGE-based proteomic methods to compare the levels of soluble proteins in mitotic and fully differentiated C2C12 mouse myoblasts (11). Although several notable alterations in protein abundance were uncovered, overall the proteome adaptations detected were surprisingly slight and less dynamic than widely assumed. Given that gel-based methods have limited sensitivity, modest overall dynamic range, and substantive detection biases (12,13), it is likely that many of the proteome perturbations associated with the muscle differentiation program went undetected, particularly those associated with lower abundance proteins.
Recent advances in alternative gel-free proteome profiling methods combining LC with ultrasensitive tandem MS-based peptide sequencing (LC-MS) offer a complementary and likely more effective experimental approach for examining global changes in protein expression as a function of development (14,15). Here we report the results of an extensive LC-MS-based shotgun profiling analysis of alterations in protein expression in differentiating C2C12 myoblasts as a function of the myogenic program. Striking changes were detected in the abundance of hundreds of proteins linked to cell adhesion, intracellular signaling, gene expression, metabolism, and muscle contraction, consistent with a substantive and highly dynamic biochemical remodeling of cell function. These data were broadly consistent with the predictions reported in a recent microarray-based study of myogenic gene expression by Tomczak et al. (7) and provide compelling evidence for the involvement of previously overlooked transcription regulators, signaling factors, and adhesion molecules as well as novel uncharacterized proteins in skeletal muscle development and contractility.

EXPERIMENTAL PROCEDURES
Materials-Ultrapure grade urea, ammonium acetate, ammonium bicarbonate, DTT, EDTA, iodoacetamide, and calcium chloride were obtained from Sigma. Heptafluorobutyric acid was obtained from BioLynx (Brockville, Ontario, Canada). Poroszyme bead immobilized trypsin was purchased from Applied Biosystems (Streetsville, Ontario, Canada). Sequencing grade endoproteinase Lys-C was obtained from Roche Diagnostics (Montreal, Quebec, Canada). HPLC grade acetonitrile, methanol, and distilled water were obtained from Fisher Scientific. SPEC Plus PT C18 solid phase extraction pipette tips were purchased from Ansys Diagnostics (Lake Forest, CA). Tissue culture media, serum, media supplements, and plasticware were obtained from Invitrogen.
Cell Culture and Protein Preparation-Mitotic C2C12 mouse myoblasts (American Type Culture Collection (ATCC), Manassas, VA) were passaged as subconfluent monolayers in Dulbecco's modified Eagle's medium (Invitrogen) supplemented with 20% fetal bovine serum, 200 mM L-glutamine, 10 units/ml penicillin, and 10 μg/ml streptomycin. Confluent (90%) myoblasts were differentiated into myotubes by culturing the cells in Dulbecco's modified Eagle's medium supplemented with 2% horse serum. Crude nuclear extracts of harvested cells were prepared using a commercial kit (Nu-CLEAR™ extraction kit, Sigma). Briefly, cell pellets were resuspended in 5 volumes of hypotonic lysis buffer (10 mM HEPES, pH 7.9, 1.5 mM MgCl2, 10 mM KCl) and incubated on ice for 15 min. The suspension was centrifuged for 5 min at 420 × g, and the resulting pellet was resuspended in 400 μl of lysis buffer. The cells were disrupted using a glass tissue homogenizer with a type B pestle and centrifuged for 20 min at 1,000 × g. The crude nuclear pellet was resuspended in extraction buffer (20 mM HEPES, pH 7.9, 1.5 mM MgCl2, 0.42 M NaCl, 0.2 mM EDTA, 25% glycerol) and incubated for 30 min with gentle shaking followed by several strokes with a glass pestle. The suspension was centrifuged for 5 min at 21,000 × g, and the crude nuclear extract supernatant was used for further analysis.
Protein Digestion-Aliquots of extract (150 μg of total protein) were precipitated with 5 volumes of ice-cold acetone overnight at −20°C. The precipitate was centrifuged for 20 min at 21,000 × g, and the pellet was dissolved in solubilization buffer (8 M urea, 50 mM Tris-HCl, pH 8.5, 2 mM DTT) for 1 h at 37°C. The soluble proteins were alkylated with 5 mM iodoacetamide, diluted to 4 M urea with 50 mM ammonium bicarbonate, pH 8.5, and then digested overnight with endoproteinase Lys-C at 37°C. The following day, the samples were further diluted to 2 M urea with 50 mM ammonium bicarbonate, pH 8.5, supplemented with CaCl2 to a final concentration of 1 mM and digested overnight with Poroszyme immobilized trypsin beads at 30°C with rotation. The resulting peptide mixtures were desalted using solid phase extraction with SPEC-Plus PT C18 cartridges according to the manufacturer's instructions and stored at −80°C until further use. Sample pH was adjusted to <3 by adding formic acid prior to LC-MS analysis.
Multidimensional Protein Identification Analysis-The multidimensional protein identification technology (MudPIT) shotgun profiling methodology reported by Washburn et al. (16) was used essentially as described in Kislinger et al. (17). Briefly, a Surveyor quaternary HPLC pump (Thermo Finnigan, San Jose, CA) was interfaced with a Finnigan LCQ DECA XP ion trap tandem mass spectrometer. A 150-μm inner diameter fused silica microcapillary column (Polymicro Technologies, Phoenix, AZ) was pulled using a P-2000 laser puller (Sutter Instruments, Novato, CA) and packed with ∼10 cm of 5-μm Zorbax Eclipse XDB-C18 (Agilent Technologies, Inc., Mississauga, Ontario, Canada) followed by ∼6 cm of 5-μm Partisphere strong cation exchange resin (Whatman). Samples were loaded manually onto a separate column using a pressure vessel with the loaded column then connected to the HPLC pump using a polyetheretherketone microcross. Each sample was analyzed via a fully automated 15-cycle data-dependent chromatographic analytical method using a low flow (<300 nl/min) set-up essentially as described previously (17).
Informatics-Over 250,000 uninterpreted product ion mass spectra were sequence-mapped against a locally maintained, minimally redundant data base of human and mouse protein sequences (Swiss-Prot and TrEMBL; downloaded from the European Bioinformatics Institute) using the SEQUEST data base search algorithm (18) running on a multiprocessor computer cluster. The false discovery rate was estimated by also searching against an equivalent number of reversed decoy non-sense protein sequences (17). Putative protein sequence matches were evaluated statistically using STATQUEST, a Perl-based computer algorithm that uses an error estimation model to compute the accuracy of individual predictions to minimize the false discovery rate (17). A stringent threshold cutoff (p value, <0.05), corresponding to a 95% or greater likelihood that a given protein was correctly identified, was used to filter the data sets. The 1,865 high confidence proteins were then subgrouped into specific functional categories based on the Gene Ontology (GO) annotation schema (19) using the Perl-based computer program GOClust (17). 1,487 proteins (80%) were linked to one or more GO terms: 1,138 proteins (61%) mapped to the principal GO category biological process, 1,316 proteins (71%) mapped to the GO category molecular function, whereas 1,184 proteins (64%) mapped to the GO category cellular component (Supplemental Table S1).
The abbreviations used are: 2D, two-dimensional; LC-MS, LC coupled to tandem MS-based peptide sequencing; MudPIT, multidimensional protein identification technology; GO, Gene Ontology; RT, reverse transcription; CREB, cAMP-response element-binding protein; CDK, cyclin-dependent kinase; MEF, myocyte-specific enhancer factor; MAPK, mitogen-activated protein kinase; MAPKK, MAPK kinase.
Hierarchical Clustering, Data Visualization, and Cluster Evaluation-The total cumulative spectral count recorded from each protein was interpreted as a semiquantitative measure of relative abundance (20). For the time course analysis, relative protein abundance was estimated by calculating the ratio of the natural logarithm of total spectra detected for each protein relative to that detected in the undifferentiated, asynchronously proliferating (day 0) myoblast cell population. Use of the log scale allows negative expression ratios to be more readily visualized as a -fold change rather than fractional (simple ratio) data. It also tightens the spread of the data points, allowing for more subtle local differences to be detected across a broader dynamic range. Hierarchical clustering was carried out using the Cluster 3.0 freeware software package using the correlation distance metric with average linkage selected. To improve data grouping, a nominal low, non-zero cutoff value (0.01) was substituted for blank values in cases where a protein was not detected in a particular sample. The clustered data profiles were visualized in heat map format using the TreeView software package (21).
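The spectral count transformation described above can be sketched as follows. This is a minimal illustration and not the pipeline actually used in the study: it assumes that the ratio to the day 0 count is formed first and then log-transformed, and the function name `log_ratio_profile` and the example counts are hypothetical. The 0.01 substitution for undetected proteins follows the text.

```python
import math

FLOOR = 0.01  # nominal non-zero value substituted for undetected proteins (from the text)

def log_ratio_profile(counts, reference_index=0, floor=FLOOR):
    """Return ln(count_t / count_reference) for each time point.

    Assumption: the ratio is formed first and then log-transformed.
    Zero or missing (None) counts are replaced by `floor` beforehand.
    """
    filled = [c if c else floor for c in counts]
    ref = filled[reference_index]
    return [math.log(c / ref) for c in filled]

# Hypothetical spectral counts at day 0, 2, 4, 6, and 10
profile = log_ratio_profile([4, 8, 0, 2, 16])
print([round(x, 3) for x in profile])
```

On this scale a doubling and a halving are symmetric about zero, which is the property that makes down-regulated proteins easier to read in a heat map.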
Statistical enrichment of proteins matching to select functional categories (GO terms) within each of the clusters was assessed using the hypergeometric distribution (22), which reflects the probability (p value) that the intersection of a given protein list with any given annotation term occurs more frequently than would be expected by chance. A Bonferroni correction factor was used to correct for multihypothesis testing; the p value deemed significant for an individual test was determined by dividing the preliminary value by the number of tests conducted, thereby accounting for spurious significance due to repeat testing over all the categories in the GO data base. A threshold cutoff value of 10⁻³ was used as a final selection criterion to highlight promising, biologically interesting clusters.
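For readers who want to reproduce this kind of test, the upper-tail hypergeometric probability and the Bonferroni adjustment can be sketched with the Python standard library. The function names and all example numbers other than the 1,865 protein total are hypothetical, and the sketch is not the GOClust implementation.

```python
from math import comb

def hypergeom_enrichment_p(population, annotated, cluster_size, overlap):
    """P(X >= overlap) when `cluster_size` proteins are drawn without
    replacement from `population` proteins, `annotated` of which carry
    the GO term of interest."""
    upper = min(annotated, cluster_size)
    total = comb(population, cluster_size)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, cluster_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level after dividing by the number of tests."""
    return alpha / n_tests

# Hypothetical cluster: 200 of 1,865 proteins, 30 of them among the
# 78 proteins annotated with a given term; 500 GO terms tested (assumed).
p = hypergeom_enrichment_p(1865, 78, 200, 30)
print(p, p < bonferroni_threshold(0.05, 500))
```

Exact integer binomial coefficients are used here so that very small tail probabilities are not distorted by floating-point underflow before the final division.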
Proteomic and Microarray Data Set Comparisons-The complete supplemental gene expression data sets from the microarray study of Tomczak et al. (7), which was based on the mouse MG_U74Av2 and MG_U74Cv2 Affymetrix (Santa Clara, CA) GeneChip platforms, were parsed into a relational data base. Cross-comparison of the respective genomic and proteomic profiles was carried out by first cross-referencing the respective cognate gene products by mapping Swiss-Prot/TrEMBL accessions using annotation tables downloaded from the Affymetrix website. The scaled microarray data were then aligned and co-clustered with the proteomic patterns using the Spearman distance metric. To correct for differences in experimental design and slight temporal shifts, the entire microarray data set was normalized by dividing each data point by the value recorded for either day −2 or day 0 according to the nomenclature of Tomczak et al. (7), whereas the day 0 proteomic time point was used as the reference for calculating the ratio of protein expression. The entire set of mapped gene-protein pairs with their corresponding data values is presented in Supplemental Table S2.
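As an illustration of the distance measure named above, a Spearman distance between an mRNA profile and a protein profile can be computed as one minus the Pearson correlation of the rank-transformed values. The sketch below uses only the standard library; the helper names and the example profiles are hypothetical, and the exact definition used by the clustering software in the study may differ.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman_distance(x, y):
    """1 - Spearman rank correlation: 0 for identical ordering, 2 for reversed."""
    return 1.0 - pearson(average_ranks(x), average_ranks(y))

# Hypothetical normalized profiles across four matched time points
mrna = [1.0, 2.5, 3.0, 8.0]
protein = [0.0, 0.7, 1.4, 2.1]
print(spearman_distance(mrna, protein))
```

Because only ranks enter the calculation, the measure is insensitive to the different dynamic ranges of microarray intensities and spectral counts.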
Reverse Transcription (RT)-PCR-Total RNA was extracted from cells using Tripure (Roche Applied Science). Total RNA (100 ng) was used for RT-PCR essentially as described previously (23). cDNAs were amplified using specific primer pairs based on available NCBI sequence data (Supplemental Table S3) and visualized on 1% agarose gels. Between 22 and 32 amplification cycles were used per reaction to ensure linearity of response.

Proteomic Investigation of an in Vitro Model of Myogenesis-A temporally well defined myogenic differentiation program can be triggered selectively in cultured C2C12 myoblasts upon withdrawal of media-derived growth factors and mitogens (23). When switched to differentiation medium (see "Experimental Procedures"), mitotic C2C12 myoblasts rapidly cease proliferating and initiate a synchronous terminal differentiation program (Fig. 1). The cells also exhibit striking morphological changes over the course of 2-6 days, eventually fusing into mature multinucleated myotubes (i.e. by day 6).
To investigate the global dynamics of protein expression occurring during myogenesis, we used a variant of the gel-free MudPIT method first pioneered by Washburn and colleagues (16) and adapted and optimized for large scale profiling of mammalian cells and tissue by our group (17). Nuclear enriched protein fractions were isolated from both proliferating C2C12 myoblasts and from differentiating myotubes harvested at four different time points (2, 4, 6, and 10 days postserum withdrawal) following induction of the differentiation program. The samples were extensively digested with endoproteinase Lys-C and trypsin, and the peptide mixtures were analyzed by multiple rounds of capillary scale multidimensional liquid chromatography coupled on line to data-dependent fragmentation using a fully automated ion trap instrument (16). This efficient and highly sensitive method of proteome mapping circumvented many of the limitations associated with more traditional 2D gel-based proteomic technology (24), enabling a far more comprehensive characterization of lower abundance proteins in particular (16,17).
Approximately 50,000 fragmentation spectra, encompassing >5,000 peptides selected for collision-induced dissociation, were acquired for each protein fraction. The data base search algorithm SEQUEST (18) was used to match these spectra to peptide sequences present in a minimally redundant data base of mouse and human proteins obtained from Swiss-Prot/TrEMBL (25). The probability assessment algorithm STATQUEST (17) was then used to assign a statistical confidence score to each putative candidate identification. In practice, due to the non-linear distribution of data base confidence scores, the majority of the predictions had extremely high likelihoods of being correct. As seen in Supplemental Fig. S1, the majority of data base matches had predicted likelihood scores greater than 99%. As an independent quality measure, the preponderance of incorrect (false positive) matches was empirically calculated by populating the reference data base with an equal number of mock decoy proteins created by inverting the amino acid orientations of the normal Swiss-Prot/TrEMBL protein sequences. Matches to these "reverse" sequences represent spurious false positives because they are not expected to occur naturally. The final proportion of matches mapping to reverse proteins relative to normal (or "forward") proteins provided an objective criterion for estimating the false discovery rate.
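The reverse decoy strategy can be illustrated with a short sketch. The function names, the toy sequence, and the example counts are hypothetical; the estimate follows the wording above (matches to reverse sequences relative to matches to forward sequences) and is not taken from the STATQUEST code.

```python
def reverse_decoy(sequence):
    """Create a decoy entry by inverting the amino acid order of a protein."""
    return sequence[::-1]

def estimate_fdr(is_decoy_flags):
    """Ratio of decoy (reverse) matches to forward matches.

    `is_decoy_flags` holds one boolean per accepted match, True when the
    match maps to a reversed sequence. Returns NaN if there are no forward
    matches to compare against.
    """
    flags = list(is_decoy_flags)
    decoys = sum(flags)
    forward = len(flags) - decoys
    return decoys / forward if forward else float("nan")

print(reverse_decoy("MKTAYIAK"))  # toy sequence, not a real database entry

# Hypothetical filtered result list: 975 forward and 25 reverse matches
flags = [False] * 975 + [True] * 25
print(round(estimate_fdr(flags), 4))
```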
Quality filtering of large scale expression data sets represents a trade-off between specificity (precision), which reflects the proportion of correct identifications, and sensitivity (recall), which indicates the bona fide proteomic coverage attained. Receiver-Operator-Characteristic plots are often used to assess the effects of varying classification criteria on classification precision and recall. Although prior knowledge of the correct class labels is not available (because we do not know a priori which proteins are in fact present in the samples), in practice one can estimate this trade-off empirically based on the fraction of data base matches to the forward and reverse sequences after applying various quality filters. The Receiver-Operator-Characteristic-like plot shown in Supplemental Fig. S2 shows the effects of applying confidence filters of different stringency to the time course data sets.
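A threshold sweep of this kind can be sketched as below: for each candidate confidence cutoff, count the forward and reverse matches that survive. The function name, the tuple layout, and the example scores are hypothetical and serve only to show the shape of the calculation behind such a plot.

```python
def sweep_thresholds(matches, thresholds):
    """Tabulate surviving forward and decoy matches at each cutoff.

    `matches` is a list of (confidence, is_decoy) tuples. Each returned row
    is (threshold, forward_count, decoy_count, decoy_fraction), where
    decoy_fraction is the share of surviving matches that are decoys.
    """
    rows = []
    for t in thresholds:
        kept = [is_decoy for conf, is_decoy in matches if conf >= t]
        decoys = sum(kept)
        forward = len(kept) - decoys
        fraction = decoys / len(kept) if kept else 0.0
        rows.append((t, forward, decoys, fraction))
    return rows

# Hypothetical scored matches
matches = [(0.99, False), (0.97, False), (0.96, True),
           (0.90, False), (0.86, True), (0.80, True)]
for row in sweep_thresholds(matches, [0.85, 0.95]):
    print(row)
```

Raising the cutoff lowers the decoy fraction but also reduces the number of forward matches retained, which is the precision versus coverage trade-off described above.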
To ensure dependable accuracy, we opted to apply a stringent first pass probability filter (albeit at the expense of reduced detection coverage) corresponding to a minimum of ≥95% confidence to each candidate peptide match. This stringent filtering resulted in the detection of 1,865 high confidence (p value, <0.05) proteins throughout the myogenic differentiation program. Importantly, only ∼2.5% reverse decoy proteins passed this criterion (data not shown), confirming its reliability. A high level summary of the final data set is shown in Table I, whereas a breakdown of the spectral counts detected per protein per time point is provided in Supplemental Table S4. A more complete description of the search results (including peptide sequence, precursor ion mass and charge, and the search algorithm and preliminary confidence scores) is provided in Supplemental Table S5. Several hundred of these high confidence proteins appear to be muscle lineage-specific as they were not detected in previously reported extensive proteomic analyses of unrelated mouse tissues, such as lung and liver (Ref. 17; data not shown). These included a surprisingly large number (∼250) of uncharacterized proteins (i.e. "hypothetical," "unknown," or simply labeled as "Riken" cDNA products).
TABLE I High confidence proteins identified by mass spectrometry
Listed are the total number of proteins with a confidence level higher than 95% and the corresponding spectral counts identified in each cell fraction. MB, myoblasts; rm, reverse match as a measure of false positives (% total).

FIG. 2. Hierarchical clustering of the myogenic proteomic profiles. For this time course analysis, relative protein abundance was calculated based on the ratio of the natural logarithm of the protein spectral count detected in each of the differentiation time points (2-10 days postserum withdrawal) relative to that observed in the proliferating myoblast cell population (i.e. day 0). Cluster membership was evaluated for statistical enrichment to select functional annotation categories (GO terms) to reveal biologically relevant changes in the underlying patterns of protein expression. A few select, statistically significant GO terms are reported. MB, myoblasts; MT, myotubes; Ref., reference sample; d, days.

Data Clustering and Visualization-To better monitor changes in the global pattern of protein expression throughout the differentiation program, we next applied large scale data clustering and visualization algorithms, used widely to monitor genome-wide differences in gene expression (21,26), across the entire set of proteomic profiles. Changes in relative protein abundance were estimated based on the observed ratio of total spectra matching each protein detected in the four differentiation time points relative to the starting undifferentiated myoblast population (see "Experimental Procedures"). Hierarchical clustering of the resulting expression profiles (Fig. 2) revealed successive, overlapping clusters of proteins that were either induced (up-regulated) as a function of the differentiation program or whose relative abundance diminished (down-regulated) concomitant with the cessation of cell division. Because functionally related proteins are often coordinately expressed (26), clusters can be indicative of a shared biological role. Hence we examined cluster membership for evidence of similar biological properties based on enrichment for GO annotation terms describing protein molecular function, biological roles, and/or molecular environment (19). Statistical enrichment was assessed using a hypergeometric distribution test (22) together with a correction factor for multihypothesis testing (see "Experimental Procedures") to estimate the significance of deviations in protein classification. Low p values (<10⁻³ or 0.001) indicated a cluster was significantly enriched for a particular GO term as compared with chance alone.
In this manner, striking changes in protein function were detected throughout the time course (Fig. 2). A first major phase followed immediately after initial withdrawal of serum (i.e. day 2) and was represented by a sharp but transient increase in levels of intracellular signaling and transcriptional regulatory factors (labeled as cluster A; discussed further below). A second phase, consisting of more gradual waves of expression of cell remodeling factors, such as proteases and intercellular adhesion molecules, followed next (cluster B). By day 10, a marked alteration in overall metabolic capacity and cytoskeletal organization was apparent, including dramatically elevated levels of proteins linked to mitochondrial function, fatty acid oxidation, and muscle contraction (cluster C), consistent with the gross morphological and physiological transitions associated with terminal differentiation and formation of muscle-like myotubes (27-29). This phase was also accompanied by commensurate down-regulation of enzymes linked to cell division, protein synthesis, and DNA metabolism (cluster D), coincident with exit from the cell proliferation cycle and cessation of cell growth and division. For a closer examination of alterations in these core processes, the proteins were sorted based on membership in discrete functional categories (GO terms). Because low spectral counts are more likely to be prone to spurious variance (20), more reliable biological inferences can be drawn by looking for trends across the samples (i.e. functional categories showing altered expression across the differentiation time points). As outlined below, analyses of the resulting subgroups helped to highlight biologically interesting patterns of differential protein expression occurring during progression of the myogenic program.
Protein Expression Patterns in Early Myogenic Development (Dividing Myoblasts)-Proliferating myoblasts were found to preferentially express 257 proteins linked to a range of functions associated with cell proliferation, chromosomal replication, and the mitotic cell cycle, several of which are listed in Table II (see Supplemental Table S6 for complete details). These included many proteins mapping to the GO term nucleus (53 proteins, including CREB-binding protein, DNA polymerase, and the transcription factor Notch 4), cell cycle (eight proteins, including Cullin-3 and the cyclin-dependent kinase 1 (CDK1)), DNA replication (eight proteins, including replication factors MCM4 and DNA primase), and chromosome (six proteins, including high mobility group protein 4, telomeric repeat binding factor 2, and the serine/threonine protein kinase splicing factor PRP4).
Additional categories were found to be statistically enriched in the dividing myoblasts (Supplemental Table S6), most notably the GO term DNA binding (25 proteins, including the chromosome-associated kinesin KIF4A, nuclear factor Hcc-1, and TATA-binding protein factor 2N), protein biosynthesis (17 proteins, including subunits of translation initiation factor 3 and multiple 60 S ribosomal proteins), RNA binding (16 proteins, such as fragile X mental retardation protein 1 homolog, ATP-dependent RNA helicase p54, and heterogeneous nuclear ribonucleoprotein M), and cytoskeleton (seven proteins, including α-centractin, dynactin complex 50-kDa subunit, radixin, and profilin I).
Expression Patterns during Early Stage Induction of the Differentiation Program-Substantive changes in protein levels were detected immediately following induction of the myogenic program (day 2) by withdrawal of serum-derived mitogens, consistent with a major shift from cell division to terminal differentiation. Many of these proteins were detected exclusively at this stage, several of which are listed in Table III. These included marked up-regulation of proteins mapping to select functional categories (see Supplemental Table S7 for complete details), including the GO terms nucleus (55 proteins, such as vitamin D3 receptor, CpG-binding protein, mRNA-capping enzyme, and ATP-dependent chromatin remodeling protein), DNA-dependent regulation of transcription (20 proteins, including the Kruppel-like factor 4, homeobox protein Meis1, nuclear receptor corepressor 1, mothers against decapentaplegic (SMAD) 4, and lamin B receptor), and protein binding (35 proteins, such as neurogenic locus Notch homolog protein 3, syntaxin 4, Rab6-interacting protein 2 isoform, myosin 5, and thyroid receptor-interacting protein).

TPCC_MOUSE  ND  ND  ND  ND  2  Troponin C, slow skeletal and cardiac muscles
TRT2_MOUSE  ND  ND  1  1  2  Troponin T, cardiac muscle isoforms
KCRM_MOUSE  ND  ND  ND  2  1  Creatine kinase, M chain
Changes in Expression Associated with Myotube Formation-Viewed collectively, proteins up-regulated during the differentiation response (days 2-10) were enriched for a number of functional categories (Table IV and see Supplemental Table S8 for complete details), including the GO terms nucleus (159 proteins, such as homeobox proteins SIX1 and -4, E-box binding transcription factors 4 and 12, ATP-dependent RNA helicase, emerin, GA-binding protein, and lamin B1), mitochondrion (78 proteins, including acyl-CoA dehydrogenases, ATP synthases, cytochrome-c oxidase, NAD(P) transhydrogenase, fumarate hydratase, isocitrate dehydrogenase, malate dehydrogenase, superoxide dismutase, and ubiquinol-cytochrome-c reductase complex), calcium ion binding (43 proteins, such as calnexin, calsequestrin 1, fibulin-1 and -2, protein kinase C, and nidogen 1 and 2), cell adhesion (29 proteins, including cadherin-13 precursor, fibronectin, and integrins), cytoskeleton (23 proteins, such as vimentin, dystroglycan, focal adhesion kinase 1, plakoglobin, α- and β-sarcoglycan, and α-syntrophin), and muscle development (19 proteins, including skeletal muscle actin, desmin, myosin light chain 1, myosin heavy chain, myosin-binding protein H, troponin C, troponin T, and creatine kinase). Table V highlights notable proteins belonging to several major functional categories significantly down-regulated as a function of cessation of cell division and the process of cell differentiation.
These categories included cell cycle (22 proteins, such as cyclin T, cyclin-dependent kinase 1, Cullin-3, and septins 2 and 5-7, and septin-like protein Sint1), DNA replication (16 proteins, including DNA replication licensing factors MCM2-4, -6, and -7 and various chromatin assembly factors), DNA topoisomerase activity (two proteins, including DNA topoisomerases II α and β isozymes), and chromatin (nine proteins, such as high mobility group 1 and 2 proteins, SWI/SNF-related matrix-associated actin-dependent regulator of chromatin subfamily ...). The largest decreases occurred for proteins involved in protein synthesis and mRNA processing during later stages of myotube formation, consistent with the reduced growth rate of the postmitotic cells (see Supplemental Table S9 for complete details).

Protein Expression Patterns in Fully Differentiated Myotubes-As outlined in Table VI, a number of proteins were detected exclusively in the fully terminally differentiated (day 10) myotubes (see Supplemental Table S10 for details). This set showed significant enrichment for the GO terms mitochondrion (41 proteins, such as electron transfer flavoprotein subunits, short-chain 3-hydroxyacyl-CoA dehydrogenase, peroxiredoxin 5, and ubiquinol-cytochrome-c reductase complex 11-kDa protein), oxidoreductase activity (23 proteins, including fatty aldehyde dehydrogenase, heme oxygenase 1, thioredoxin-dependent peroxide reductase, and prostaglandin G/H synthase 1), electron transport (12 proteins, including acyl-CoA dehydrogenase, long and short chain; cytochrome-c oxidase; NAD(P) transhydrogenase; and inducible nitric-oxide synthase), fatty acid metabolism (nine proteins, including acyl-coenzyme A oxidase 1, short-chain 3-hydroxyacyl-CoA dehydrogenase, long-chain-fatty-acid-CoA ligase 3, and hydroxyacyl-coenzyme A dehydrogenase), and cell adhesion molecules (10 proteins, including cadherin-13, integrin αV, laminin α-2, laminin β-1, CD9 antigen, and vitronectin). These patterns are consistent with the energetic requirements of contractile tissue.
Independent Evaluation of Gene Expression Profiles Using RT-PCR-In a final set of confirmation experiments, we assessed the mRNA expression patterns of gene products predicted to be differentially expressed by our proteomic screening during the transition from myoblast to early stage myotubes. For these studies, we performed RT-PCR using total RNA isolated from the cell cultures for specific amplification of gene transcripts corresponding to five select transcription factors identified as differentially regulated. One of these transcription factors appeared to be down-regulated at the protein level during myogenesis (SMAD3), whereas three were apparently up-regulated in the myotube stage (Notch 3 and homeobox proteins SIX1 and SIX4).
As seen in Fig. 3, the RT-PCR data correlated well with the proteomic results because in most instances the time points producing the most intense RT-PCR products were largely the same as those for which the most elevated proteomic count was recorded. Consistent findings were observed for SIX1, SIX4, SMAD3, and Notch 3. Nevertheless some lag was observed between the relative gene product abundance as predicted by LC-MS and RT-PCR (e.g. protein and mRNA species). Although we cannot exclude sampling error, these temporal shifts most likely reflect differences between mRNA and protein accumulation as well as post-translational control mechanisms (8,30).
Comparison with DNA Microarray Analyses of Global Patterns of Gene Expression-Several microarray studies investigating the global pattern of gene expression in cultured C2C12 myoblasts during virtually identical myogenic differentiation protocols have been published (5-7). We compared our proteomic profiles to the results of the most recent and comprehensive study reported by Tomczak et al. (7), who used the popular Affymetrix GeneChip platform to investigate changes in mRNA levels using a virtually identical differentiation time course analysis. Using data base accession (Swiss-Prot/TrEMBL) cross-referencing, we were able to map and directly compare the proteomic and genomic expression patterns of 547 pairs of cognate protein-mRNA gene products detected in both studies. In cases where corroborating microarray data confirmed gene product expression (see below), we opted to use a slightly lower preliminary confidence threshold cutoff (i.e. 85%+ confidence, with positive mRNA transcript detected in published companion microarray studies (5-7)) with the logical expectation that the number of false positives is greatly reduced in this overlapping set.
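The accession-based cross-referencing described above amounts to intersecting two tables on a shared key. The following is a minimal sketch of that idea, not the procedure actually used in the study: the accession strings, profile values, and the function name `map_cognate_pairs` are all hypothetical, and each data set is assumed to be already reduced to one time course profile per accession.

```python
def map_cognate_pairs(protein_profiles, mrna_profiles):
    """Pair protein and mRNA time courses that share a database accession.

    Both arguments map an accession string to a list of abundance values
    (one value per time point). Only accessions present in both data sets
    are returned.
    """
    shared = sorted(set(protein_profiles) & set(mrna_profiles))
    return {acc: (protein_profiles[acc], mrna_profiles[acc]) for acc in shared}


# Hypothetical toy data: placeholder accessions, not real Swiss-Prot entries.
protein_profiles = {
    "ACC_A": [0, 3, 9],   # spectral counts per time point
    "ACC_B": [7, 2, 0],
    "ACC_C": [1, 1, 2],   # detected by LC-MS only
}
mrna_profiles = {
    "ACC_A": [10.0, 55.0, 210.0],   # probe signal per time point
    "ACC_B": [300.0, 120.0, 40.0],
    "ACC_D": [5.0, 5.0, 6.0],       # detected by microarray only
}

pairs = map_cognate_pairs(protein_profiles, mrna_profiles)
print(sorted(pairs))
```

Only gene products detected on both platforms survive the intersection, which is why the mapped set (547 pairs here) is smaller than either data set alone.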
Co-clustering of the respective time course profiles revealed good overall concordance (Fig. 4A). 497 (91%) of the gene products showed similar temporal trends in expression (i.e. elevated or depressed levels as a function of terminal differentiation) in response to myogenesis after normalizing the microarray data using either the day −2 (Fig. 4A) or day 0 (Fig. 4B) time points as reference to correct for minor differences in translation lag. The coordinately up-regulated gene products were found to be significantly enriched in GO terms related to muscle function, such as muscle development, calcium ion binding, and muscle contraction. In contrast, a large co-cluster of down-regulated gene products was found to be enriched for GO terms related to cell division, such as cell cycle, DNA replication, and chromosome.
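To make the notion of a shared temporal trend concrete, the sketch below labels each profile as rising or falling relative to a reference time point and then splits the mapped pairs into concordant and discordant groups. This sign-based rule is an illustrative simplification chosen here; the study itself relied on co-clustering of normalized profiles. The accessions and values are hypothetical, and the final lines only re-derive the reported percentages from the reported counts.

```python
def trend(profile, ref_index=0):
    """Return +1 if the last time point exceeds the reference, -1 if lower, 0 if equal."""
    delta = profile[-1] - profile[ref_index]
    return (delta > 0) - (delta < 0)


def split_by_concordance(pairs, ref_index=0):
    """Split accessions by whether protein and mRNA trends point the same way."""
    concordant, discordant = [], []
    for accession, (protein, mrna) in sorted(pairs.items()):
        same = trend(protein, ref_index) == trend(mrna, ref_index)
        (concordant if same else discordant).append(accession)
    return concordant, discordant


def percent(part, total):
    """Percentage rounded to the nearest integer."""
    return round(100 * part / total)


# Hypothetical toy pairs: accession -> (protein profile, mRNA profile).
pairs = {
    "ACC_A": ([0, 3, 9], [10.0, 55.0, 210.0]),   # both rise
    "ACC_B": ([7, 2, 0], [300.0, 120.0, 40.0]),  # both fall
    "ACC_C": ([6, 4, 1], [20.0, 45.0, 90.0]),    # protein falls, mRNA rises
}
concordant, discordant = split_by_concordance(pairs)

# Counts reported in the text: 497 concordant and 50 discordant of 547 mapped pairs.
reported = (percent(497, 547), percent(50, 547))
print(concordant, discordant, reported)
```

Changing `ref_index` mimics the choice of reference sample (day −2 versus day 0) used when normalizing the microarray data.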
Although this comparison largely corroborates the microarray and proteomic findings, a closer examination reveals differences in the exact timing of peak abundance in the two data sets, providing evidence of possible post-translational control. Of the remaining 50 largely discordant gene products (∼9% of total mapped), 32 showed reduced protein abundance in the developing myotubes despite evidence for elevated gene expression, whereas 18 showed evidence of protein accumulation despite an apparent reduction in the analogous mRNA transcript level (Fig. 4C). Although experimental noise and artifact could be the basis for these contradictory results, the two clusters were found to be statistically enriched for several functional categories, including the GO terms cytoskeleton, muscle contraction and actin binding, and centric heterochromatin, respectively, suggesting that the differences reflect genuine biological variation in gene product accumulation.
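Statistical enrichment of a GO term within a cluster is commonly assessed with a one-sided hypergeometric test; this section does not state which test was applied, so the sketch below should be read as one plausible formulation rather than the authors' method. The annotation counts in the example are hypothetical; only the cluster size (32) and background size (547) are taken from the text.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """One-sided hypergeometric p-value, P(X >= k).

    N: gene products in the background set
    K: background gene products carrying the GO term
    n: gene products in the cluster
    k: cluster gene products carrying the GO term
    """
    upper = min(n, K)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / comb(N, n)


# Hypothetical example: suppose 40 of the 547 mapped gene products carry a given
# GO term and 9 of them fall within the 32-member discordant cluster.
p_value = hypergeom_pvalue(k=9, n=32, K=40, N=547)
print(f"p = {p_value:.3g}")
```

A small p-value indicates that the term is over-represented in the cluster relative to chance sampling from the background; in practice a multiple-testing correction would also be applied across all GO terms examined.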

DISCUSSION
The development of striated muscle depends on a complex series of integrated mechanisms that ultimately reprogram gene expression and drive cellular reorganization (27-29). This process is initiated through the regulation of a network of intracellular signaling pathways that impinge on select sequence-specific myogenic transcription factors (1,31-34). Although traditional reductionist methods have helped researchers to elucidate key mechanistic aspects of the myogenic program, the development of global expression profiling methods now offers the opportunity to investigate this multilayered process from a holistic, system-wide perspective.
In this study, we aimed to provide as complete and unbiased a molecular overview as possible of the biochemical adaptations associated with the formation of functional myocytes by systematically examining the temporal dynamics of protein expression in a C2C12 model system using shotgun sequencing. The most striking changes in protein expression detected in response to terminal differentiation occurred during two distinct physiological transitions. The first phase coincided with a rapid withdrawal of the myoblasts from the cell cycle following mitogen deprivation, concomitant with a shift from active proliferation to the initiation of differentiation. This postmitotic phase was associated with substantive alterations in the levels of a number of critical signaling factors. The second transition consisted of a series of more gradual, but ultimately more profound, perturbations in a broad range of biological systems driving the morphological conversion of single cells into fused, elongated, multinucleated myotubes. This phase included large scale cytoskeletal rearrangements, enhanced intercellular adhesion, and the maturation of both the contractile apparatus and even entire organelles such as peroxisomes and mitochondria.
Exit from the cell cycle occurs through blockade of the CDK mitotic engine (35,36). Consistent with this expectation, the proliferating myoblast cell population was found to preferentially express high levels of key activators of mitotic division (Table II and Supplemental Table S6), such as CDK1 and Cdc5, without detectable expression of any known CDK inhibitor. On the other hand, the levels of all detectable cell cycle-related proteins decreased markedly following initiation of the differentiation program.
Fig. 4. [...] (7) is shown. 547 pairs of cognate protein-mRNA transcripts were cross-referenced. The data sets were then combined, normalized, and co-clustered, and a heat map was generated. Concordant co-expression was observed for mRNA transcripts normalized to the probe signal levels detected in either day −2 (A) or day 0 (B) samples using the Tomczak et al. (7) nomenclature. Individual clusters were exported and analyzed for significantly enriched GO terms. Statistically significant (10e−5) functional categories are displayed next to each cluster. Discordant expression patterns were observed for a subset of the gene products (C) using either day −2 (not shown) or day 0 (shown here) normalized microarray data. MB, myoblasts.
Muscle-specific Transcription Factors-The major developmental pathways directing skeletal muscle formation are governed by the action of two major classes of transcriptional regulatory factors: namely the MyoD family of muscle-regulatory factors and the myocyte-specific enhancer factor 2 (MEF2) family of MADS box transcription factors. Each family member encodes a basic helix-loop-helix domain that directs binding to specific myogenic DNA-regulatory elements, called E-boxes (sequence, CANNTG), in response to appropriate physiological signals (37). Three key members of these families were identified in this study (Supplemental Table S4) [...] is known to regulate expression of numerous skeletal muscle-specific genes positively, including creatine kinase and troponin (38,39), which were both detected at elevated levels at later time points. Nevertheless we were unable to detect most of the other major myogenic transcription factors, such as myogenin, muscle-regulatory factor 4, Myf5, or MyoD, presumably because these are expressed at low levels below the detection limits of the LC-MS methodology used here.
Previous proteomic studies of the myogenic differentiation program using 2D PAGE-based technology have likewise failed to identify these factors (11), indicating an important limitation of proteomic profiling using current instrumentation.
Nonetheless several other developmentally important transcriptional regulators were identified in this analysis (Table IV), including the homeobox proteins SIX1 and SIX4, whose levels were sharply elevated immediately following serum withdrawal, consistent with previous reports (40). Both factors have been implicated in the control of muscle formation: they bind to an evolutionarily conserved MEF3-binding site upstream of the myogenin promoter and thereby activate myogenin transcription (40), a key step in skeletal muscle development. The MEF3 site is also present in other skeletal muscle-specific genes, including the promoters of genes encoding troponin C (41), creatine kinase (42), and the glycolytic enzyme aldolase A (43,44), all of which were detected as being up-regulated during later stages of myogenic differentiation. Likewise myocyte nuclear factor (also known as Forkhead box protein K1) was detected immediately following serum deprivation (Supplemental Table S4), consistent with an early role in myogenic differentiation (45,46). Myocyte nuclear factor is one of the first members of the winged-helix Hepatocyte Nuclear Factor 3/Forkhead family of transcription factors to be identified as a key regulator of the myogenic program (46,47). It acts by binding to the CCAC box sequence motif found in the promoter region of the myoglobin gene and thereby activating transcription (45).
Intracellular Mediators of Myogenic Signaling-Although a crude nuclear preparation (expected to contain nuclear, contractile, and mitochondrial proteins) was prepared, we were also able to detect a diverse set of proteins involved in signal transduction (see Supplemental Table S4 for details). In particular, key members of the Ras superfamily of signaling proteins, which broadcast a host of signals from activated receptor tyrosine kinases at the cell surface through to the nucleus (48,49), were identified particularly during the initial stages of cell differentiation. For instance, elevated levels of both R-Ras and the Ras-related protein Rab-33B were detected immediately following serum withdrawal. Likewise several critical downstream targets of Ras were induced at early stages of differentiation, including the mitogen-activated protein kinases MAPKK-1 and -2, extracellular signal-regulated kinase-1, and the MAPKK-1-interacting protein 1 as well as AF-6, a putative signaling protein. In contrast, the Ras GTPase-activating-like protein IQGAP1, which is proposed to participate in the reorganization of the actin cytoskeleton during cellular remodeling (50), was found to be expressed throughout all stages of myogenesis. Taken together, these data indicate that significant differences in the accumulation of core components of Ras-MAPK-mediated signaling occur during the myogenic process.
In contrast, components of at least two major signaling pathways known to inhibit myogenic differentiation were found to be down-regulated in response to initiation of the myogenic program (Tables II and III). The first major pathway involves Notch signaling, which inhibits myogenesis by interfering with MEF2 activation (51-53). For instance, Notch 3 and 4 were detected preferentially in the proliferating myoblasts (day −2), consistent with the results of previous microarray studies (6). The second pathway involves the SMAD family of signaling proteins (54,55). The transcription factors SMAD3, -4, and -5 were all detected preferentially early during myogenesis (i.e. in proliferating myoblasts and early stage myotubes), again consistent with a proliferative function.
Several other notable signaling factors (Supplemental Table S4) were preferentially detected in the late stage myotubes, including ArgBP2, DCAMKL1, and Copine III. ArgBP2 is a novel member of the Abelson (Arg/Abl) protein-tyrosine kinase-binding proteins and is predicted to be a substrate of Arg and v-Abl (56). ArgBP2 localizes to the Z-disc in cardiac muscle where it likely influences contractility and elastic properties of cardiac sarcomeres in response to activation of Abelson-linked signaling cascades (56). DCAMKL1 is a serine/threonine protein kinase implicated in Ca2+ signaling and has been shown to control neuronal cell migration in the developing brain (57). On the other hand, Copine III, a putative protein kinase detected in both proliferating and early differentiating cells, has been linked to the regulation of membrane trafficking (58).
Sarcomeric Organization and Muscle Contraction-Myofibrils, which form the bulk of muscle mass, are composed of tandem arrays of sarcomeres, the core structural unit of dynamically interacting chains of actin and myosin filaments responsible for muscle contraction (59). The development and maturation of the contractile apparatus requires a parallel increase in calcium-dependent cycling proteins to handle the rapid fluxes in Ca2+ associated with muscle contractility (60). Consistent with this, myotube formation was associated with elevated levels of calcium transporters, including the Ca2+ release channel (ryanodine receptor), voltage-gated L-type calcium channel, and the sarco(endo)plasmic reticulum calcium ATPase. As expected, a large set of other proteins linked to skeletal muscle excitation/contraction (such as the cytoskeletal factors actin, myosin, troponin, nebulin, titin, desmin, and α-actinin) were likewise enriched in late stage differentiating cells (Table IV and Supplemental Table S8).
The ATP requirements of contracting muscle are extremely high both for ATP-dependent Ca2+ handling and for ATP-dependent myosin cross-bridge formation. As a result of this marked demand, skeletal muscle metabolizes large amounts of glucose, fatty acids, and amino acids to fuel contraction (61). Consistent with this, late stage myotubes were greatly enriched in enzymes linked to basic metabolism, including key rate-limiting energy-producing enzymes such as pyruvate kinase, creatine kinase, ATP synthase, acyl-coenzyme A oxidases, and 3-hydroxyacyl-CoA dehydrogenases (Table VI and Supplemental Table S10).
Cytoskeleton and Extracellular Matrix Reorganization-The muscle cytoskeleton consists of a complex network of filaments and tubules that transmit mechanical and chemical stimuli within and between adjacent myocytes. This dynamic structure also contributes to cell stability during the contraction cycle by anchoring subcellular structures such as mitochondria, nuclei, and myofibrils. The stabilizing mechanotransducer action is supported by membrane-associated proteins, in particular dystrophin, which binds to intracellular actin and extracellular laminin through the dystroglycan complex (62). As expected, myogenesis was associated with the induction of a vast array of cytoskeletal factors, such as desmin and nebulin, as well as muscle-specific components of the dystrophin complex, such as dystroglycan and sarcoglycan.
Reorganization of the contractile apparatus was paralleled by an increased abundance of connective tissue proteins in the developing myocytes, which provide physical stabilization during the extreme tensile forces generated during contraction. These included numerous collagen α isoforms, integrins (α5, αv, α7, β1, and β5), laminins (α-2, α-5, β-1, and γ-1), and fibronectin. In addition, matrix metalloproteases, including glycoproteins, and adhesion factors, such as α- and β-sarcoglycan, were expressed preferentially during cell differentiation or were detected exclusively in terminally differentiated myotubes. Numerous other adhesion or structural proteins, including nidogen and dystroglycan, were also enriched in later stages of differentiation.
Protein Functional Annotation-In this study, we detected expression of ∼25 uncharacterized Riken cDNAs and ∼180 putative hypothetical proteins. Comparison of their expression profiles with those of previously studied proteins can provide clues as to the possible functions or roles of these gene products. In particular, linkage to a specific cluster exhibiting significant functional enrichment suggests that several of these uncharacterized proteins are likely to participate in processes important to proper myogenesis, including the control of mRNA transcription, chromatin modification, and/or cell cycle progression.
Of course, the observed clusters of functionally related proteins highlighted in this report reflect only a portion of all the proteins identified in this study. In addition to suggesting a potential role for novel gene products in skeletal muscle development, our results implicate a number of well characterized proteins that have not been linked previously to muscle cell differentiation. For example, two proteins involved in RNA synthesis (ATP-dependent RNA helicase (DEAD box protein) and UMP/CMP kinase) as well as a protein involved in endocytosis (intersectin-1) were found to be differentially expressed during myogenic differentiation, yet these proteins had not been associated with muscle development previously. A systematic, hypothesis-driven analysis of similarly identified proteins based on reasonable interpretations of their expression characteristics seems likely to bear fruit. Hence our proteomic data can be viewed as a resource for targeted follow-up studies centered on one or more biochemical pathways of particular interest.
Comparison with Microarray-based Investigations of Myogenic Gene Transcription-In the past year, several studies have reported the use of DNA microarray technology to probe changes in gene expression in C2C12 cells coincident with the formation of mature myocytes (5-7). Gratifyingly, the results of the most extensive gene expression study published to date (7) were found to correlate quite well with the proteomic data reported here, with relatively few substantive inconsistencies, at least in terms of overall temporal trends. This concordance was somewhat surprising given the general reports in the field and validates the general conclusions of the two studies and the respective experimental platforms, further encouraging follow-up investigations. Although gene expression profiling studies generally achieved a more extensive coverage (at least 2,895 mRNA transcripts were predicted to be expressed throughout skeletal myogenesis in Ref. 7), our study provides complementary evidence for possible post-translational controls that may serve to further refine the biochemical transitions associated with myogenesis. Moreover, although certain gross incongruities in the observed patterns of protein and transcript levels are likely to have arisen due to technical limitations such as biased detection or other artifacts associated with LC-MS and microarray analyses, many may in fact represent bona fide, biologically meaningful differences, stemming from differential regulation of translation and/or mRNA or protein stability as has also been suggested previously in the literature (63).
Comparison with 2D PAGE Profiling of Myogenesis-A recent study (11) reported the use of 2D PAGE followed by silver staining and quantitative imaging to examine changes in the levels of ∼2,000 protein spots throughout myogenic differentiation. Although the vast majority did not exhibit any detectable differences in terms of relative abundance, ∼100 predicted to be differentially expressed were identified, including some 26 phosphorylated variants. We determined that 39 proteins in total were detected both by Tannu et al. (11) and in this study; 29 of these proteins (74%) showed virtually identical findings (see Supplemental Table S11 for details), whereas only 10 (26%) proteins (annexin V, protein-disulfide isomerase, transcription intermediary factor 1-β, histone acetyltransferase type B, guanine nucleotide-binding protein β subunit, 26 S proteasome-regulatory subunit, Lasp-1, 75-kDa glucose-regulated protein, secreted protein acidic and rich in cysteine (SPARC), and myosin light chain 1) showed significant differences in expression patterns between the two studies. These modest discrepancies most likely arose due to technical differences in tissue culture, protein extraction, or sample preparation. Nevertheless, despite the limited dynamic range afforded by SDS-PAGE, the substantial agreement between these two independent proteomic data sets further validates the main conclusions of this new study and the reliability of gel-free protein expression profiling as a means to investigate fundamental aspects of muscle cell biology. Although gel-based profiling techniques offer certain advantages (64), our results indicate that gel-free shotgun methods offer fundamentally enhanced proteomic coverage (14).